Answered
Is There A Way To Use Agent For Dataset Creation Tasks?

Is there a way to use agent for dataset creation tasks?
I want to use agents to create datasets, so I need to pass parameters to the task that creates the dataset.
My flow goes like this:
```
from clearml import Dataset, Task

# args is the dict of script arguments
dataset = Dataset.create(
    dataset_name=args['dataset_name'],
    dataset_tags=args['cml_tags'].split(','),
    dataset_project=args['cml_project_name'])

task = Task.current_task()
task.set_parameters(**args)
```
It breaks on the last line, because `Task.current_task()` returns None.

Is there a way to work around this issue?

  
  
Posted one year ago

Answers 3


You could probably either:
- Start the task first (using Task.init), and then set the parameters if needed
- Attach the dataset to the task itself
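For the first option, a minimal sketch might look like the following. It assumes `args` is a plain dict with the same keys as in the question; the task name and the `dataset_kwargs` helper are hypothetical, added only for illustration. It uses `task.connect(args)` rather than `set_parameters`, since `connect` also lets an agent inject edited values when it runs a clone.

```python
def dataset_kwargs(args):
    """Map the script's argument dict to Dataset.create keyword arguments.

    Hypothetical helper; the key names are taken from the question.
    """
    return {
        "dataset_name": args["dataset_name"],
        "dataset_tags": args["cml_tags"].split(","),
        "dataset_project": args["cml_project_name"],
    }


def main(args):
    # Imported here so the helper above can be used without clearml installed.
    from clearml import Dataset, Task

    # Start the task first, so there is a task to attach the parameters to.
    task = Task.init(
        project_name=args["cml_project_name"],
        task_name="create dataset",  # hypothetical name
        task_type=Task.TaskTypes.data_processing,
    )

    # Register the dict as task parameters. When an agent executes a clone,
    # values edited on the clone are reflected in the returned dict.
    args = task.connect(args)

    dataset = Dataset.create(**dataset_kwargs(args))
    return task, dataset
```

Calling `main(args)` needs a configured ClearML server, so treat this as a sketch rather than a tested recipe.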

  
  
Posted one year ago

GentleSwallow91, you can also use Task.create()
https://clear.ml/docs/latest/docs/references/sdk/task#taskcreate

  
  
Posted one year ago

Well, actually I tried a different approach and it works.
```
import os
from clearml import Dataset, Task, TaskTypes

# args, OBJECT_NAME and DATA_RAW are defined elsewhere in the script

# Master task: this is the one that gets cloned and sent to the agent
task = Task.init(
    project_name=args['cml_project_name'],
    task_type=TaskTypes.data_processing,
    task_name=f'Dataset for {os.path.basename(OBJECT_NAME)}',
    tags=args['cml_tags'].split(','),
    output_uri=args['cml_output_uri'],
    auto_connect_frameworks=True)

# Dataset.create() registers its own, separate task
dataset = Dataset.create(
    dataset_name=os.path.basename(OBJECT_NAME),
    dataset_tags=args['cml_tags'].split(','),
    dataset_project=args['cml_project_name'])

dataset.add_files(DATA_RAW, verbose=True)
# upload data to s3
dataset.upload(output_url=args['cml_output_uri'])
dataset.finalize(verbose=True)
task.close()
```
With this approach there is a master task and a separate task for dataset creation, but cloning the master task and sending it to an agent works just fine.
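The clone-and-send step can also be done from code rather than the UI. The sketch below is only an illustration: it assumes the master task exposed its arguments through `task.connect(args)` with a plain dict (so they show up under the "General" section), and the queue name `default`, the function names, and the override keys are hypothetical.

```python
def as_general_params(overrides):
    """Prefix keys with "General/", the section assumed for a connected dict."""
    return {f"General/{key}": value for key, value in overrides.items()}


def clone_and_enqueue(template_task_id, overrides, queue_name="default"):
    # Imported here so the helper above can be used without clearml installed.
    from clearml import Task

    # Fetch the already-executed master task and make an editable copy of it.
    template = Task.get_task(task_id=template_task_id)
    cloned = Task.clone(source_task=template, name=f"{template.name} (clone)")

    # Change only the given parameters; the rest keep the template's values.
    cloned.update_parameters(as_general_params(overrides))

    # Put the clone on a queue that an agent is listening to.
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned
```

For example, `clone_and_enqueue(task_id, {"dataset_name": "new-data"})` would queue a copy of the master task with a different dataset name, provided an agent is serving that queue.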
  
  
Posted one year ago