Answered
Hi, I Am Running A Script Very Similar To The One In

Hi, I am running a script very similar to the one in this example, except that the data parameter for training is taken from clearml.Dataset.get. I can clone my job and modify my parameters, but the data parameter for model training is now cached and no longer computed by my script. How can I make sure ClearML does not overwrite that parameter?

data = clearml.Dataset.get(dataset_name=ds_name)  # ds_name is a param

model.train(data=data)  # here data is overwritten with the cached value on a cloned job
  
  
Posted 10 months ago

Answers 10


Okay, I'll try that. However, I am using parameters from the argparser to set the task name and project. Can I init with dummy values and update those afterwards?

  
  
Posted 10 months ago

Okay, and afterwards I can use something like task.set_name(args.ds_name)?
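
For reference, a small sketch of the "init with placeholder values, rename later" idea. It assumes Task.set_name and Task.move_to_project are available in your clearml version (worth checking against the SDK reference). The helper name apply_task_identity is made up for illustration. The name is passed as a value, not as a quoted string.

```python
def apply_task_identity(task, name, project=None):
    """Rename a task that was created with placeholder values (hypothetical helper)."""
    # Pass the parsed value itself, e.g. args.ds_name, not a string literal.
    task.set_name(name)
    if project is not None:
        # Assumed API: Task.move_to_project(new_project_name=...)
        task.move_to_project(new_project_name=project)
    return task


# Intended usage (requires clearml and a parsed `args`):
#   task = Task.init(project_name="examples", task_name="placeholder")
#   args = parser.parse_args()
#   apply_task_identity(task, args.ds_name)
```

The helper only needs an object exposing those two methods, so it can be tried against a stand-in before wiring it to a real task.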

  
  
Posted 10 months ago

Hi HurtStarfish47, do you have a basic code snippet that reproduces this behavior?

  
  
Posted 10 months ago

For example:

import argparse
from clearml import Task

task = Task.init(project_name='examples', task_name='PyTorch MNIST train', output_uri=True)

# Training settings
parser = argparse.ArgumentParser(description='PyTorch MNIST Example')
parser.add_argument('--ds-name', default="blabla")
args = parser.parse_args()
  
  
Posted 10 months ago

I'd suggest running Task.init first and then exposing the dataset name through argparse afterwards.
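
A minimal sketch of that ordering, assuming a plain argparse script: Task.init is called before the arguments are parsed, and the dataset is then fetched at runtime from the parsed --ds-name value instead of from a stored path. The project name, task name and default dataset name are placeholders. clearml is imported inside main() only so that the parsing helper can be exercised without it.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --ds-name is the only parameter exposed here."""
    parser = argparse.ArgumentParser(description="Train on a ClearML dataset")
    parser.add_argument("--ds-name", default="blabla")  # placeholder default
    return parser.parse_args(argv)


def main(argv=None):
    # Imported lazily so parse_args() can be used without clearml installed.
    from clearml import Dataset, Task

    # 1. Create the task first (placeholder project and task names).
    task = Task.init(project_name="examples", task_name="train on dataset")

    # 2. Parse afterwards, following the suggestion above, so that
    #    --ds-name is exposed as a parameter of the task.
    args = parse_args(argv)

    # 3. Resolve the dataset on the machine that actually runs the job.
    dataset_path = Dataset.get(dataset_name=args.ds_name).get_local_copy()
    return task, dataset_path
```

Calling main() from the script entry point then gives a task whose dataset name can be edited on a clone, while the local copy is looked up again on each run.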

  
  
Posted 10 months ago

Hi CostlyOstrich36, here's some sample code:

from ultralytics import YOLO
from clearml import Task, Dataset
from jsonargparse import CLI

def train_yolo(ds_name: str = None):
    dataset_path = Dataset.get(dataset_name=ds_name).get_local_copy()
    task = Task.current_task()

    # Create a task only if one is not already active
    if task is None:
        task = Task.init(project_name="YOLO", task_name=ds_name)

    model = YOLO("yolov8n")
    model.train(data=dataset_path)
    
if __name__ == "__main__":
    CLI(train_yolo)

I enqueued a job using this code (with clearml-task). It ran on machine1 and crashed at some point. I reset the job and re-enqueued it, and this time it ran on machine2. For some reason the training started fine on the ClearML dataset, but on the second access to the data (during model.val) it looked for a dataset in /home/machine1/.clearml/cache/storage_manager/datasets/... and the job crashed.

  
  
Posted 10 months ago

For more info, I am using jsonargparse to expose my params to ClearML, but it looks like it's also picking up the params directly from YOLO.

  
  
Posted 10 months ago

Just set defaults

  
  
Posted 10 months ago

Yes 🙂

  
  
Posted 10 months ago

Awesome, thanks!

  
  
Posted 10 months ago