Answered
Hi, when downloading datasets using the Python SDK on a runner initiated using execute_remotely

Hi, when downloading datasets using the Python SDK on a runner initiated using execute_remotely, I get this issue:

Traceback (most recent call last):
  File ".../train.py", line 124, in <module>
    clearml_ds = Dataset.get(dataset_id=dataset_id)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../dataset.py", line 1806, in get
    instance = get_instance(dataset_id)
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../dataset.py", line 1704, in get_instance
    raise ValueError("Provided id={} is not a Dataset ID".format(task.id))
ValueError: Provided id=440676a069114e0181234c2b00f94f0bb is not a Dataset ID

However, when I rerun the script, it sometimes does work. Why could this issue be intermittent?

I tried recreating the issue by running clearml-data get --id manually on the device, but there it does not occur.
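
In case it helps, the failing call in isolation is roughly this (a minimal sketch; the dataset ID below is just a placeholder for one of ours):

from clearml import Dataset

# Placeholder ID, substitute a real dataset ID from the ClearML UI
dataset_id = "<dataset-id>"

# This is the call that intermittently raises
# ValueError: Provided id=... is not a Dataset ID
clearml_ds = Dataset.get(dataset_id=dataset_id)
print(clearml_ds.project, clearml_ds.name)
print(clearml_ds.get_local_copy())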

  
  
Posted 23 days ago

Answers 5


So running this train script will sometimes work and sometimes give the error from the original post.

  
  
Posted 23 days ago

Also, do you have a code snippet that reproduces this?

  
  
Posted 23 days ago

Hi @<1795626098352984064:profile|SoggyElk61> , is it possible you have multiple environments?
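
A quick way to check (just a rough sketch) is to print the interpreter and ClearML install at the top of the training script, so you can see which environment the agent actually runs it in:

import os
import sys

import clearml

# Which interpreter and which ClearML package the task is really using
print("python:", sys.executable)
print("clearml:", clearml.__version__, clearml.__file__)
# CLEARML_CONFIG_FILE is usually unset unless you point the agent at a specific config
print("config override:", os.environ.get("CLEARML_CONFIG_FILE"))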

  
  
Posted 23 days ago

Is this sufficient information or can I get help elsewhere?

  
  
Posted 21 days ago

Thanks for responding. Here are some details:

The runner is an Ubuntu machine on which a specific user was created. That user has a venv from which we run:

clearml-agent daemon --queue gpu_12gb --detached --gpus 1
clearml-agent daemon --queue gpu_24gb --detached --gpus 0

clearml-agent daemon --queue no_gpu --detached

This user has a clearml.conf file in its home directory. When I run clearml-data commands as this user from the venv, everything works as expected.

I also have a second machine, in this case a VM whose sole purpose is to be a runner. It is started using clearml-agent daemon --queue gpu_24gb --gpus 0 in a service, and I get the same issue there.

The code used to run is:

task = Task.init(
    project_name=config.project,
    task_name=config.task_name,
    output_uri=config.output_uri,
)
task_id = task.task_id
task.get_logger().set_default_upload_destination(uri=config.output_uri)
task.connect(yml)

if config.clearml_queue != "local":
    print(f"Running on ClearML queue: {config.clearml_queue}")
    task.execute_remotely(queue_name=config.clearml_queue)
else:
    print("Running locally")
...

    # Fetch the dataset from ClearML
    print(f"Downloading {dataset_id} for {split_type}")
    clearml_ds = Dataset.get(dataset_id=dataset_id)

    # Then set the alias to the dataset name
    ds.alias = f"{clearml_ds.project}/{clearml_ds.name}"

    # Refetch but set the alias
    clearml_ds = Dataset.get(dataset_id=dataset_id, alias=ds.alias)

    ds_path = clearml_ds.get_local_copy()
    print(f"Downloaded {dataset_id} for {split_type}")

We added the second fetch because we were getting issues with dataset aliases not being set. However, that doesn't matter for this issue, since it crashes on the first get.
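
For completeness, a retry around the first get would look roughly like this (a sketch only, as a stopgap while debugging; it does not address the root cause):

import time

from clearml import Dataset

def get_dataset_with_retry(dataset_id, attempts=3, delay=5.0):
    # Retry Dataset.get a few times since the failure is intermittent
    last_err = None
    for _ in range(attempts):
        try:
            return Dataset.get(dataset_id=dataset_id)
        except ValueError as err:
            last_err = err
            time.sleep(delay)
    raise last_err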

  
  
Posted 23 days ago