Answered
Hi Good Folks Here! Does ClearML Allow Auto-Rerun Of Failed Jobs, For Example When A Spot Instance Gets Interrupted, Please? (Or Auto-Resume, If Checkpointing Logic In Place)

Hi good folks here! Does ClearML allow auto-rerun of failed jobs, for example when a spot instance gets interrupted, please? (Or auto-resume, if checkpointing logic is in place.)

  
  
Posted 2 years ago

Answers 11


Do Pipelines work with Hyperparameter search, and with single training jobs?

  
  
Posted 2 years ago

This great tool is worth paying for!

  
  
Posted 2 years ago

OK, just making sure 🙂

  
  
Posted 2 years ago

Thanks for the answers 🙂

  
  
Posted 2 years ago

Tagging my colleague @<1529271085315395584:profile|AmusedCat74> who needs this with me 🙂

  
  
Posted 2 years ago

@<1546665634195050496:profile|SolidGoose91> regarding spot instances, are you referring to tasks run through the AutoScaler app? If so, the autoscaler app should detect the failed spot machine and create a new spot machine, which should then start running the specific task that was interrupted.

  
  
Posted 2 years ago

@<1546665634195050496:profile|SolidGoose91> pipelines are yours to implement as you wish: you define what each step will do. However, for hyperparameter search, you have the HPO app, which might be a quicker ready-made solution 🙂

  
  
Posted 2 years ago

Hi @<1546665634195050496:profile|SolidGoose91> , I think this capability exists when running pipelines. The pipeline controller will detect spot instances that failed and will retry running them.
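For reference, a minimal sketch of what a retry policy for failed pipeline steps could look like. It assumes the SDK's `PipelineController` accepts a `retry_on_failure` argument that can be a callable taking `(pipeline, node, retries)` and returning `True` to allow another attempt; the limit of 3, the function name, and the pipeline/project names in the comment are hypothetical and only for illustration.

```python
MAX_RETRIES = 3  # hypothetical limit, chosen only for this example


def retry_failed_step(pipeline, node, retries):
    """Decide whether a failed pipeline step should be relaunched.

    `retries` is the number of retries already made for this step.
    Returns True to allow another attempt, False to give up.
    """
    return retries < MAX_RETRIES


# Assumed wiring (not executed here; it needs the clearml package
# and a configured server):
#   from clearml import PipelineController
#   pipe = PipelineController(
#       name="my-pipeline",        # hypothetical name
#       project="my-project",      # hypothetical project
#       retry_on_failure=retry_failed_step,
#   )
```

A plain integer for the retry count may be enough in simple cases; a callable like this is only useful if the decision should depend on the failed step itself.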

Are you using the PRO or the open source auto scaler?

  
  
Posted 2 years ago

We’re on the PRO πŸ™‚

  
  
Posted 2 years ago

And yes, I was also referring to tasks run by the Autoscaler app (potentially via the HPO app), too.

  
  
Posted 2 years ago

Yes, we love the HPO app, and are using it :)

  
  
Posted 2 years ago