Answered
My Nth Question For The Day

My nth question for the day 🙂

What’s the general pattern for running a pipeline: train a model, evaluate metrics, and publish the model if satisfactory (based on a threshold, for example)?

  
  
Posted 3 years ago

Answers 7


seconded, I'm curious about this also.

  
  
Posted 3 years ago

What’s the general pattern for running a pipeline: train a model, evaluate metrics, and publish the model if satisfactory (based on a threshold, for example)?

Basically I would do the following.

Parameters for the pipeline:
TaskA = the training Task (think of it as our template Task)
Metric = the title/series/sign we want to choose based on, where sign is max/min
Project = the project whose Tasks we compare against, so that we can decide to publish based on the best Metric

Pipeline:
1. Clone TaskA
2. Change TaskA's arguments (if needed)
3. Launch and wait until completed
4. Get the Metric of TaskA's instance (the clone):
value = Task.get_task(task_id='instance_id_111').get_last_scalar_metrics()[Metric.title][Metric.series][Metric.sign]
5. Get all Tasks with metric above/below this one:
tasks = Task.get_tasks(project_name=..., task_name=..., etc...)
tasks = sorted(tasks, key=lambda x: x.get_last_scalar_metrics()[Metric.title][Metric.series][Metric.sign])
6. Pick the best one:
# the sort is ascending, so tasks[-1] is the best task when sign is max (use tasks[0] when sign is min)
# best task, if this is us, publish
if tasks[-1].id == 'instance_id_111':
    tasks[-1].publish()

wdyt?
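
To make the "publish only if we are the best" decision concrete, here is a minimal sketch in plain Python. It does not call ClearML at all: it assumes each task's scalar summary is a nested dict of the form title -> series -> {"last", "min", "max"}, and the helper names (`metric_value`, `best_task_id`, `should_publish`) and the numbers are hypothetical, made up for illustration.

```python
from typing import Dict, Mapping

# Assumed shape of one task's scalar summary:
#   {title: {series: {"last": float, "min": float, "max": float}}}
Scalars = Mapping[str, Mapping[str, Mapping[str, float]]]


def metric_value(scalars: Scalars, title: str, series: str, sign: str) -> float:
    """Read a single value; `sign` is 'max' or 'min', as in the Metric parameter."""
    if sign not in ("max", "min"):
        raise ValueError("sign must be 'max' or 'min'")
    return scalars[title][series][sign]


def best_task_id(all_scalars: Dict[str, Scalars], title: str, series: str, sign: str) -> str:
    """Id of the task with the highest ('max') or lowest ('min') metric value."""
    pick = max if sign == "max" else min
    return pick(
        all_scalars,
        key=lambda task_id: metric_value(all_scalars[task_id], title, series, sign),
    )


def should_publish(candidate_id: str, all_scalars: Dict[str, Scalars],
                   title: str, series: str, sign: str) -> bool:
    """True when the candidate is the best task in the comparison set."""
    return best_task_id(all_scalars, title, series, sign) == candidate_id


# Hypothetical numbers for three tasks in one project (not real results).
project_scalars = {
    "older_run_a": {"validation": {"accuracy": {"last": 0.81, "min": 0.40, "max": 0.83}}},
    "older_run_b": {"validation": {"accuracy": {"last": 0.86, "min": 0.42, "max": 0.88}}},
    "instance_id_111": {"validation": {"accuracy": {"last": 0.90, "min": 0.45, "max": 0.91}}},
}

print(should_publish("instance_id_111", project_scalars, "validation", "accuracy", "max"))
```

In a real pipeline the dict would be filled from the Tasks in the project, and the publish call would only run when `should_publish` returns True. Picking with `max`/`min` directly avoids having to remember which end of a sorted list is the best one.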

  
  
Posted 3 years ago

AgitatedDove14 I'm making some progress on this. My training run saved all of these files, and Task.get_task(param['TaskA']).models['output'][-1] gets me just one of them, training_args.bin. Then [-2] gets me another, rng_state.pth.

If I just get Task.get_task(param['TaskA']).models['output'] , I end up getting a huge list of, like, [<clearml.model.Model object at 0x7fec2841c880>, <clearml.model.Model object at 0x7fec2841c8e0>, <clearml.model.Model object at 0x7fec2841c820>...

So I think I have a solution here, which is to just loop backwards through the list until I find the right file I want to load.
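
The backwards loop could look roughly like the sketch below. It is written against a hypothetical stand-in class so it runs without ClearML; the assumption is that each entry in the output list exposes a `url` attribute ending in the saved file name. The bucket path and the helper name `find_model_by_filename` are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StandInModel:
    """Hypothetical stand-in for a model entry; only the `url` attribute is used."""
    url: str


def find_model_by_filename(models: Sequence, filename: str) -> Optional[object]:
    """Walk the list from newest to oldest; return the first entry whose file name matches."""
    for model in reversed(models):
        # Compare only the last path component of the URL with the wanted file name.
        if model.url.rstrip("/").rsplit("/", 1)[-1] == filename:
            return model
    return None  # nothing with that file name was registered


# Hypothetical output list, oldest first, using the file names mentioned above.
outputs = [
    StandInModel("s3://bucket/run/rng_state.pth"),
    StandInModel("s3://bucket/run/training_args.bin"),
]

print(find_model_by_filename(outputs, "rng_state.pth"))
print(find_model_by_filename(outputs, "pytorch_model.bin"))
```

Returning None for a missing file makes a case like the absent pytorch_model.bin explicit instead of failing later at load time.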

But I just noticed that for some reason pytorch_model.bin isn't there. I'm not sure why that wasn't saved. huh

  
  
Posted 3 years ago

That's cool AgitatedDove14 , will try it out and pester you a bit more. 🙂

  
  
Posted 3 years ago

Very interesting, thanks! I'll look into it!

  
  
Posted 3 years ago

Interesting, I wasn't aware of the possibilities you outline there at the end, where you, like, programmatically pull all the results down for all the tasks. Neat!

A more complex version of this which I'm trying to figure out:

I trained a model using TaskA. I now need to pull that model down from the saved artifacts of TaskA and fine-tune it in TaskB. That fine-tuning in TaskB spits out a metric.
Is there a way to do this all elegantly? Currently my process is to manually download the models from the UI, then manually upload them to S3, then manually pull them down from S3, and then start the code to fine-tune in TaskB.

  
  
Posted 3 years ago

Is there a way to do this all elegantly?

Oh yes there is, this is how TaskB's code will look:

` import torch
from clearml import Task

task = Task.init(project_name=..., task_name='task b')
param = {'TaskA': 'TaskAs ID HERE'}
task.connect(param)
# latest output model registered on TaskA
taska_model = Task.get_task(task_id=param['TaskA']).models['output'][-1]
model = torch.load(taska_model.get_local_copy())

# train (fine-tune) the model here

torch.save(model, 'modelb') `

I might have missed something there, but generally speaking this will let you:
1. Select TaskA as a parameter of TaskB's training process
2. Automagically register TaskA's model as the input model of TaskB
3. Store TaskB's model in the model repository

So basically full lineage with the ability to automate. wdyt?

  
  
Posted 3 years ago