Answered
Hi Everyone, I'M Using The

Hi everyone, I'm using the https://api.clear.ml/ server and ran a bunch of experiments using Hydra multirun (sequential runs). Many of these experiments still show status "running" in ClearML even though they have finished, and not all of the plots were uploaded. Is this because the server is a bit overloaded and is timing out when receiving the logs?

  
  
Posted 2 years ago

Answers 11


I'm running them with python my_script.py -m my_parameter=value_1,value_2,value_3 (using Hydra multirun)
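For readers less familiar with multirun: with Hydra's default sweeper, a comma-separated override like the one above is expanded into one job per value (the cartesian product if several overrides are swept), and as far as I know the default local launcher runs those jobs one after another. Below is a rough, stdlib-only sketch of that expansion. It is not Hydra's actual code, and expand_sweep is a hypothetical helper name used only for illustration.

```python
from itertools import product


def expand_sweep(overrides):
    """Expand comma-separated overrides into one override list per job.

    Simplified illustration only: the real Hydra override grammar
    (quoting, lists, range(), choice(), ...) is richer than a plain split.
    """
    keys, choices = [], []
    for item in overrides:
        key, _, value = item.partition("=")
        keys.append(key)
        choices.append(value.split(","))
    # One job per combination of the swept values, in order.
    return [
        [f"{k}={v}" for k, v in zip(keys, combo)]
        for combo in product(*choices)
    ]


if __name__ == "__main__":
    jobs = expand_sweep(["my_parameter=value_1,value_2,value_3"])
    for index, job in enumerate(jobs):
        print(index, job)
```

So the command above corresponds to three separate jobs, each of which shows up as its own experiment.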

  
  
Posted 2 years ago

So as you say, it seems hydra kills these

Hmm let me check in the code, maybe we can somehow hook into it

  
  
Posted 2 years ago

AttractiveCockroach17 could it be Hydra actually kills these processes?
(I'm trying to figure out if we can fix something with the hydra integration so that it marks them as aborted)

  
  
Posted 2 years ago

It doesn't happen with all the tasks of the multirun, as you can see in the photo

  
  
Posted 2 years ago

AttractiveCockroach17 can I assume you are working with the Hydra local launcher?

  
  
Posted 2 years ago

Hi AttractiveCockroach17

Many of these experiments appear with status running on clearml even though they have finished running,

Could it be their process just terminated (i.e. was not properly shut down)?
How are you running these multiple experiments?
BTW: if the server does not see any change in a Task for a while (I think the default is 2 hours), it will automatically mark the Task as aborted
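To make the timeout idea concrete, here is a minimal sketch of what such a staleness check could look like. This is only an illustration of the logic, not the ClearML server's implementation: find_stale_tasks and the task ids are hypothetical, and the 2 hour threshold is just the guess mentioned above.

```python
from datetime import datetime, timedelta

# Assumed threshold, taken from the "I think the default is 2 hours" remark.
STALE_AFTER = timedelta(hours=2)


def find_stale_tasks(last_update_by_task, now, stale_after=STALE_AFTER):
    """Return ids of tasks whose last reported change is older than stale_after."""
    return sorted(
        task_id
        for task_id, last_update in last_update_by_task.items()
        if now - last_update > stale_after
    )


if __name__ == "__main__":
    now = datetime(2023, 1, 1, 12, 0)
    last_seen = {
        "task_a": now - timedelta(minutes=5),  # still reporting
        "task_b": now - timedelta(hours=3),    # silent for too long
    }
    # Only task_b would be flagged and then marked as aborted.
    print(find_stale_tasks(last_seen, now))
```

Under that assumption, a task whose process died silently would keep showing "running" until the threshold passes.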

  
  
Posted 2 years ago

Indeed, I'm looking at their corresponding multirun outputs folder: the logs end early without any error, and the only plots saved are those in ClearML. So as you say, it seems Hydra kills these

  
  
Posted 2 years ago

Each of those runs finished producing 10 plots, but in ClearML only one, a few, or none got uploaded

  
  
Posted 2 years ago

I don't think it will be reproducible with the Hydra example. It was just that I launched around 50 jobs, and some of them failed, maybe because of the parameters (strangely, with no error).
But it's OK for now I guess, I will debug whether the experiments that failed would also fail if run independently
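One way to do that check is to run each parameter value in its own process and look at the exit codes. The sketch below assumes the same my_script.py and my_parameter names as in the command earlier in the thread; run_independently is a hypothetical helper, and the single-value override form (no -m) is an assumption about how the script would be launched for one run.

```python
import subprocess
import sys


def run_independently(commands):
    """Run each command in its own process; return (returncode, stderr tail) pairs."""
    results = []
    for cmd in commands:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        # On POSIX, a negative return code means the process was killed by a signal,
        # which would explain a log that stops without any Python error.
        results.append((proc.returncode, proc.stderr[-500:]))
    return results


if __name__ == "__main__":
    values = ["value_1", "value_2", "value_3"]
    commands = [[sys.executable, "my_script.py", f"my_parameter={v}"] for v in values]
    for cmd, (code, err) in zip(commands, run_independently(commands)):
        print(cmd[-1], "exit code:", code, err.strip())
```

A non-zero or negative exit code for a given value would point at the job itself rather than at the multirun or the server.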

  
  
Posted 2 years ago

yes

  
  
Posted 2 years ago

Okay, so I can't figure out why it would "kill" the new experiments. I mean, it should run them, but is there any "smart stopping" that causes it to kill the process before it ends?
BTW: can this be reproduced with the clearml hydra example?

  
  
Posted 2 years ago