
Hello, I am having an issue with the dataset service pointing to the wrong fileserver URL when running tasks remotely on ClearML Agents in k8s. I was following the onboarding guides available on YouTube. In the video, they run the following commands to upload the data from a local laptop:

clearml-data create --project "Full Overview" --name "Fashion MNIST"
clearml-data add --files fashion_mnist
clearml-data close

In the following video, they clone the task and run it remotely. When I do the same, I keep getting errors from the agent worker saying it is unable to download the dataset, and the URL it references is a localhost address:

2025-06-03 01:27:39,638 - clearml.storage - ERROR - Could not download 
 , err: HTTPConnectionPool(host='localhost', port=8081): Max retries exceeded with url: /Full%20Overview/.datasets/Fashion%20MNIST/Fashion%20MNIST.c4a325608ecd4c40a12535e9eed0f54d/artifacts/state/state.json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f385e826480>: Failed to establish a new connection: [Errno 111] Connection refused')) 
Traceback (most recent call last):
  File "/root/.clearml/venvs-builds/3.12/task_repository/ml-clearml-poc.git/scripts/scratch/train_xgboost_dataver.py", line 19, in <module>
    data_path = Dataset.get(dataset_name="Fashion MNIST", alias="Fashion MNIST").get_local_copy()
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.clearml/venvs-builds/3.12/lib/python3.12/site-packages/clearml/datasets/dataset.py", line 1806, in get
    instance = get_instance(dataset_id)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/.clearml/venvs-builds/3.12/lib/python3.12/site-packages/clearml/datasets/dataset.py", line 1718, in get_instance
    raise ValueError("Could not load Dataset id={} state".format(task.id))
ValueError: Could not load Dataset id=c4a325608ecd4c40a12535e9eed0f54d state

I have been trying to fix this, including setting the CLEARML_FILES_HOST env var on the agent to point to the k8s address of the service. This still does not work. When I look in the ClearML UI, I see that the dataset itself includes references to localhost.
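To make the symptom concrete, here is a minimal sketch (standard library only, no ClearML calls) that checks whether a stored URL points at a loopback host and shows what the same URL would look like with a cluster-reachable host. The URL is rebuilt from the host, port and path in the error message above. The `CLUSTER_FILESERVER` value and both helper names are hypothetical, and this only illustrates the mismatch; it does not change what is registered in ClearML.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names that only resolve to the machine making the request.
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def points_to_localhost(url: str) -> bool:
    """Return True if the URL's host is a loopback name or address."""
    host = urlsplit(url).hostname
    return host is not None and host in LOCAL_HOSTS


def with_host(url: str, new_netloc: str) -> str:
    """Return the same URL with its host:port part replaced by new_netloc."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, new_netloc, parts.path, parts.query, parts.fragment)
    )


# URL rebuilt from the host, port and path reported in the agent's error.
state_url = (
    "http://localhost:8081/Full%20Overview/.datasets/Fashion%20MNIST/"
    "Fashion%20MNIST.c4a325608ecd4c40a12535e9eed0f54d/artifacts/state/state.json"
)

# Hypothetical in-cluster fileserver address; substitute your own service name.
CLUSTER_FILESERVER = "clearml-fileserver.clearml.svc.cluster.local:8081"

print(points_to_localhost(state_url))
print(with_host(state_url, CLUSTER_FILESERVER))
```

If the localhost URL was recorded when the dataset was uploaded from the laptop (which is what the UI seems to suggest, though I have not confirmed it), that would explain why changing CLEARML_FILES_HOST on the agent alone has no effect on an already-uploaded dataset.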

I'm confused as to how this is supposed to be set up, given that I was under the impression that local/remote execution was supposed to be seamless.

  
  
Posted 3 months ago