Hi, when downloading datasets using the Python SDK on a runner initiated using


Thanks for responding. Here you can find some references:

The runner is an Ubuntu machine on which a specific user was created. This user has a venv, from which we run:

clearml-agent daemon --queue gpu_12gb --detached --gpus 1
clearml-agent daemon --queue gpu_24gb --detached --gpus 0

clearml-agent daemon --queue no_gpu --detached

This user has a clearml.conf file in its home directory. When I run clearml-data commands as this user from the venv everything works as expected.
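To rule out config resolution, here is a quick check of which clearml.conf the SDK would pick up (a sketch; it assumes the SDK's documented lookup order of the CLEARML_CONFIG_FILE environment variable first, then ~/clearml.conf):

```python
import os

# Where the ClearML SDK would look for its config file:
# the CLEARML_CONFIG_FILE env var wins, otherwise ~/clearml.conf.
cfg = os.environ.get("CLEARML_CONFIG_FILE", os.path.expanduser("~/clearml.conf"))
print(f"config path: {cfg}, exists: {os.path.isfile(cfg)}")
```

Running this as the agent's user (and inside the service environment) shows whether the daemon actually sees the same config we use interactively.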

I also have a second machine, in this case a VM whose sole purpose is to be a runner. It is started with clearml-agent daemon --queue gpu_24gb --gpus 0 in a service, and there I get the same issues.
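For context, the service is set up roughly like this (an illustrative sketch only; the unit name, user, and paths below are placeholders, not our exact unit):

```ini
# /etc/systemd/system/clearml-agent.service (illustrative only)
[Unit]
Description=ClearML agent daemon
After=network-online.target

[Service]
# User= determines which home directory (and thus which clearml.conf) the agent sees
User=clearml
ExecStart=/opt/venv/bin/clearml-agent daemon --queue gpu_24gb --gpus 0
Restart=on-failure

[Install]
WantedBy=multi-user.target
```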

The code we run is:

task = Task.init(
    project_name=config.project,
    task_name=config.task_name,
    output_uri=config.output_uri,
)
task_id = task.task_id
task.get_logger().set_default_upload_destination(uri=config.output_uri)
task.connect(yml)

if config.clearml_queue != "local":
    print(f"Running on ClearML queue: {config.clearml_queue}")
    task.execute_remotely(queue_name=config.clearml_queue)
else:
    print("Running locally")
...

    # Fetch the dataset from ClearML
    print(f"Downloading {dataset_id} for {split_type}")
    clearml_ds = Dataset.get(dataset_id=dataset_id)

    # Then set the alias to the dataset name
    ds.alias = f"{clearml_ds.project}/{clearml_ds.name}"

    # Refetch but set the alias
    clearml_ds = Dataset.get(dataset_id=dataset_id, alias=ds.alias)

    ds_path = clearml_ds.get_local_copy()
    print(f"Downloaded {dataset_id} for {split_type}")

We added the second fetch because we were seeing issues with dataset aliases not being set. That doesn't matter for this issue, though, since it crashes on the first Dataset.get call.

Posted one month ago
412 Views
0 Answers