Answered
Hello, I've been reading the docs of HyperParameterOptimizer and various questions in the channel, but couldn't find an answer. I have a working HPO run, but experiments often fail for reasons outside my control. Is there a way to tell the optimizer to re-run these failed experiments? Right now it just continues on and reports only the successful ones.

  
  
Posted one year ago

Answers 5


NervousFrog58 it seems to me this failure will repeat. Wouldn't it make more sense to fix your code so that such errors do not happen, rather than restart a failing experiment?

  
  
Posted one year ago

Hi NervousFrog58
Can you share some more details with us, please?
Do you mean that when an experiment fails, you would like a snippet that resets and relaunches it, the way you would through the UI?
Your ClearML package versions and your logs would be very useful too 🙂

  
  
Posted one year ago

The code is fine; these failures happen because of external circumstances that cannot be controlled.

  
  
Posted one year ago

I see... If you intercept them in your code, you can actually re-enqueue your code at that time...

  
  
Posted one year ago

Yes, either a code snippet or a built-in flag.
I'm using the clearml==1.6.2 package, and the server is running version: 1.1.1-135 • 1.1.1 • 2.14.
In terms of logs, I'm getting:
2022-07-07 16:33:59 [W 2022-07-07 16:33:59,801] Trial 8 failed, because the value None could not be cast to float.
2022-07-07 16:33:59 OptunaObjective result metric=None, iteration None
2022-07-07 16:33:59 [W 2022-07-07 16:33:59,920] Trial 11 failed, because the value None could not be cast to float.
2022-07-07 16:34:00 OptunaObjective result metric=None, iteration None
which is fine, the trials should have failed. I'm just looking for a way to restart them.

  
  
Posted one year ago
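
For anyone landing here later: below is a minimal sketch of the "reset and relaunch" idea from this thread, written as a separate watcher script rather than a built-in optimizer flag (nothing in the thread confirms such a flag exists). The names `requeue_failed`, `requeue_failed_children`, `attempts` and `max_retries` are hypothetical, made up for illustration. It assumes the trial tasks have the optimizer's controller task as their parent, and that a failed trial can be reset and enqueued again the same way the UI does it. Whether the optimizer (Optuna here) takes the re-run's metric into account after it has already logged the trial as failed is not something this thread settles, so treat that as an open question.

```python
from typing import Callable, Dict, Iterable, List


def requeue_failed(
    tasks: Iterable,
    enqueue: Callable[[object], None],
    attempts: Dict[str, int],
    max_retries: int = 2,
) -> List[str]:
    """Reset and re-enqueue each failed task that still has retries left.

    `tasks` are duck-typed: each needs `.id`, `.get_status()` and `.reset(...)`,
    which clearml.Task objects provide. `attempts` maps task id -> retries so far
    and is updated in place. Returns the ids that were re-enqueued.
    """
    requeued = []
    for task in tasks:
        if task.get_status() != "failed":
            continue  # only failed tasks are retried
        if attempts.get(task.id, 0) >= max_retries:
            continue  # retry budget used up, leave the task as failed
        attempts[task.id] = attempts.get(task.id, 0) + 1
        # Clear the previous run's state so the task can be queued again.
        task.reset(set_started_on_success=False, force=True)
        enqueue(task)
        requeued.append(task.id)
    return requeued


def requeue_failed_children(optimizer_task_id, queue_name, attempts, max_retries=2):
    """ClearML wiring. Assumption: trials are children of the optimizer task."""
    from clearml import Task  # imported lazily so the helper above stays library-free

    failed = Task.get_tasks(
        task_filter={"parent": optimizer_task_id, "status": ["failed"]}
    )
    return requeue_failed(
        failed,
        lambda t: Task.enqueue(t, queue_name=queue_name),
        attempts,
        max_retries,
    )
```

You would call `requeue_failed_children(...)` periodically while the HPO run is active, passing the same `attempts` dict each time so a trial that keeps failing is not retried forever.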