Answered
Hi there, I have several experiments hanging/stuck in the middle or at the end of the training, with the last message logged being:

train INFO: Engine run complete. Time taken: 00:16:18
clearml.reporter - WARNING - Event reporting sub-process lost, switching to thread based reporting

What could be the reason? How can I debug them? (I cannot reproduce the issue locally, and I have no clue where the task could be stuck or why.)

  
  
Posted 7 months ago

Answers 2


Hi @<1523701066867150848:profile|JitteryCoyote63> , how are you running the experiments? What's the OS/platform?

  
  
Posted 7 months ago

Hi @<1523701087100473344:profile|SuccessfulKoala55> I was able to find the issue: I was creating a queue and a worker subprocess that were not properly cleaned up.
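
For anyone hitting the same symptom, here is a minimal sketch of what cleaning up a queue and a worker subprocess can look like. The original code was not posted, so this assumes Python's standard `multiprocessing` module; the names `worker`, `run`, and `_SENTINEL` are hypothetical and only for illustration.

```python
import multiprocessing as mp

_SENTINEL = None  # hypothetical "stop" marker sent to the worker


def worker(task_queue, result_queue):
    """Process tasks until the sentinel arrives, then return so the process exits."""
    for item in iter(task_queue.get, _SENTINEL):
        result_queue.put(item * 2)


def run(items, start_method=None):
    """Run items through one worker process and always clean it up afterwards."""
    ctx = mp.get_context(start_method)  # None selects the platform default
    task_queue = ctx.Queue()
    result_queue = ctx.Queue()
    proc = ctx.Process(target=worker, args=(task_queue, result_queue))
    proc.start()
    try:
        for item in items:
            task_queue.put(item)
        # Drain results before joining: a child that still has unflushed
        # queue data can block on exit and make the parent appear stuck.
        results = [result_queue.get(timeout=10) for _ in items]
    finally:
        task_queue.put(_SENTINEL)   # ask the worker to stop
        proc.join(timeout=10)       # wait for a clean exit
        if proc.is_alive():         # fall back to a forced stop
            proc.terminate()
            proc.join()
        for q in (task_queue, result_queue):
            q.close()               # no more data will be put from this process
            q.join_thread()         # wait for the queue's feeder thread
    return results, proc.exitcode


if __name__ == "__main__":
    print(run([1, 2, 3]))
```

The point of the `finally` block is that the worker is stopped and joined, and both queues are closed, even if the main code raises. Without that, a leftover process or an undrained queue may keep the task alive after training has finished.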

  
  
Posted 7 months ago