Unanswered
Hi, When Downloading Datasets Using The Python Sdk On A Runner Initiated Using


Thanks for responding. Here you can find some references:

The runner is an Ubuntu machine with a dedicated user account. From a venv belonging to that user, we run:

clearml-agent daemon --queue gpu_12gb --detached --gpus 1
clearml-agent daemon --queue gpu_24gb --detached --gpus 0

clearml-agent daemon --queue no_gpu --detached

This user has a clearml.conf file in its home directory. When I run clearml-data commands as this user from the venv everything works as expected.

I also have a second machine, a VM whose sole purpose is to act as a runner. It is started from a service using clearml-agent daemon --queue gpu_24gb --gpus 0, and I get the same issues there.
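
Since the agent runs from a detached daemon on one machine and from a service on the other, it may help to compare what each process actually sees. Below is a minimal sketch, assuming the default config location is clearml.conf in the user's home directory and that the CLEARML_CONFIG_FILE environment variable, when set, points at an alternative file. The function name describe_clearml_environment is hypothetical and is not part of ClearML.

```python
import os
from pathlib import Path


def describe_clearml_environment(env=None, home=None):
    """Collect the settings that decide which clearml.conf a process would read.

    Hypothetical helper: it only inspects the environment and the filesystem
    and does not import or call ClearML itself.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # Assumption: CLEARML_CONFIG_FILE overrides the default ~/clearml.conf.
    override = env.get("CLEARML_CONFIG_FILE")
    config_path = Path(override).expanduser() if override else home / "clearml.conf"

    return {
        "user": env.get("USER") or env.get("LOGNAME") or "",
        "home": str(home),
        "config_override": override,
        "config_path": str(config_path),
        "config_exists": config_path.is_file(),
    }


if __name__ == "__main__":
    for key, value in describe_clearml_environment().items():
        print(f"{key}: {value}")
```

Running this once from the interactive venv and once at the start of the remotely executed script would show whether both resolve to the same user, home directory, and config file.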

The code being run is:

from clearml import Dataset, Task

task = Task.init(
    project_name=config.project,
    task_name=config.task_name,
    output_uri=config.output_uri,
)
task_id = task.task_id
task.get_logger().set_default_upload_destination(uri=config.output_uri)
task.connect(yml)

if config.clearml_queue != "local":
    print(f"Running on ClearML queue: {config.clearml_queue}")
    task.execute_remotely(queue_name=config.clearml_queue)
else:
    print("Running locally")
...

    # Fetch the dataset from ClearML
    print(f"Downloading {dataset_id} for {split_type}")
    clearml_ds = Dataset.get(dataset_id=dataset_id)

    # Then set the alias to the dataset name
    ds.alias = f"{clearml_ds.project}/{clearml_ds.name}"

    # Refetch but set the alias
    clearml_ds = Dataset.get(dataset_id=dataset_id, alias=ds.alias)

    ds_path = clearml_ds.get_local_copy()
    print(f"Downloaded {dataset_id} for {split_type}")

We added the second fetch because we were running into problems with dataset aliases not being set. However, that is unrelated to this issue, since the crash happens on the first Dataset.get call.
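
To pin down where the first call fails on the runner, one option is to log the full traceback before the task exits. This is a minimal sketch using only the standard library; call_with_diagnostics and the logger name are hypothetical and are not ClearML APIs.

```python
import logging
import traceback

logger = logging.getLogger("dataset_fetch")


def call_with_diagnostics(label, fn, *args, **kwargs):
    """Call fn(*args, **kwargs); on failure, log the traceback and re-raise."""
    try:
        return fn(*args, **kwargs)
    except Exception:
        # Log the complete traceback so it appears in the task's console output.
        logger.error("%s failed:\n%s", label, traceback.format_exc())
        raise
```

In the snippet above, the first fetch could then be written as call_with_diagnostics("Dataset.get", Dataset.get, dataset_id=dataset_id), which leaves the behaviour unchanged apart from the extra log entry.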

  
  
Posted 4 days ago