Hi, when downloading datasets using the Python SDK on a runner initiated using


Thanks for responding. Here you can find some references:

The runner is an Ubuntu machine on which a specific user was created. This user has a venv, from which we run:

clearml-agent daemon --queue gpu_12gb --detached --gpus 1
clearml-agent daemon --queue gpu_24gb --detached --gpus 0

clearml-agent daemon --queue no_gpu --detached

This user has a clearml.conf file in its home directory. When I run clearml-data commands as this user from the venv everything works as expected.
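To rule out config resolution, here is a quick check of which clearml.conf the SDK would pick up (a sketch; it assumes the SDK's documented lookup order of the CLEARML_CONFIG_FILE environment variable first, then ~/clearml.conf):

```python
import os

# Where the ClearML SDK would look for its config file:
# the CLEARML_CONFIG_FILE env var wins, otherwise ~/clearml.conf.
cfg = os.environ.get("CLEARML_CONFIG_FILE", os.path.expanduser("~/clearml.conf"))
print(f"config path: {cfg}, exists: {os.path.isfile(cfg)}")
```

Running this as the agent's user (and inside the service environment) shows whether the daemon actually sees the same config we use interactively.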

I also have a second machine, in this case a VM whose sole purpose is to be a runner. It is started with clearml-agent daemon --queue gpu_24gb --gpus 0 in a service, and there I get the same issues.
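For context, the service is set up roughly like this (an illustrative sketch only; the unit name, user, and paths below are placeholders, not our exact unit):

```ini
# /etc/systemd/system/clearml-agent.service (illustrative only)
[Unit]
Description=ClearML agent daemon
After=network-online.target

[Service]
# User= determines which home directory (and thus which clearml.conf) the agent sees
User=clearml
ExecStart=/opt/venv/bin/clearml-agent daemon --queue gpu_24gb --gpus 0
Restart=on-failure

[Install]
WantedBy=multi-user.target
```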

The code we run is:

task = Task.init(
    project_name=config.project,
    task_name=config.task_name,
    output_uri=config.output_uri,
)
task_id = task.task_id
task.get_logger().set_default_upload_destination(uri=config.output_uri)
task.connect(yml)

if config.clearml_queue != "local":
    print(f"Running on ClearML queue: {config.clearml_queue}")
    task.execute_remotely(queue_name=config.clearml_queue)
else:
    print("Running locally")
...

    # Fetch the dataset from ClearML
    print(f"Downloading {dataset_id} for {split_type}")
    clearml_ds = Dataset.get(dataset_id=dataset_id)

    # Then set the alias to the dataset name
    ds.alias = f"{clearml_ds.project}/{clearml_ds.name}"

    # Refetch but set the alias
    clearml_ds = Dataset.get(dataset_id=dataset_id, alias=ds.alias)

    ds_path = clearml_ds.get_local_copy()
    print(f"Downloaded {dataset_id} for {split_type}")

We added the second fetch because we were seeing issues with dataset aliases not being set. That doesn't matter for this issue, though, since it crashes on the first Dataset.get call.

Posted one month ago
412 Views
0 Answers