Answered
I Am Using ClearML Pro And Pretty Regularly I Will Restart An Experiment And Nothing Will Get Logged To ClearML. It Shows The Experiment Running (For Days) And It's Running Fine On The PC But No Scalars Or Debug Samples Are Shown. How Do We Troubleshoot T

I am using ClearML Pro, and fairly regularly when I restart an experiment nothing gets logged to ClearML. The experiment shows as running (for days), and it is running fine on the PC, but no scalars or debug samples are shown.
How do we troubleshoot this?

  
  
Posted one month ago

Answers 69


Console output and also what you get on the ClearML task page under the console section

  
  
Posted one month ago

Just to make sure, did logging to the ClearML server work previously and then stop working at some point?

  
  
Posted one month ago

When the script hangs at the end, the experiment is marked as failed in ClearML.

  
  
Posted one month ago

So I am only seeing values for the first epoch. It seems like it does not track all of them, so maybe something goes wrong when it tries to log scalars.
I have seen it log only iterations, but setting task.set_initial_iteration(0) seemed to fix that, so it now seems to be logging the correct epoch.
TensorBoard is correct and works. I have never seen an issue in the TensorBoard logs.
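
One way to narrow down whether per-epoch scalars are reaching the server is to report them with an explicit iteration number rather than relying only on automatic TensorBoard capture. The sketch below is an illustration, not a confirmed fix from this thread: `report_epoch_losses` and `RecordingLogger` are hypothetical names, and the only ClearML-shaped assumption is a logger object that exposes `report_scalar(title, series, value, iteration)`. In a real run the logger would come from the task (for example via `task.get_logger()`); an in-memory stand-in is used here so the snippet runs without a server.

```python
class RecordingLogger:
    """Hypothetical stand-in for a ClearML-style logger; stores reports in memory."""

    def __init__(self):
        self.reports = []

    def report_scalar(self, title, series, value, iteration):
        # Same keyword names as the assumed ClearML-style call.
        self.reports.append((title, series, value, iteration))


def report_epoch_losses(logger, losses, first_epoch=0):
    """Report one scalar per epoch, using the epoch index as the iteration.

    `first_epoch` is the epoch number of the first value in `losses`, so a
    restarted run can continue the axis instead of starting again at 0.
    Returns the epoch number the next call should start from.
    """
    for offset, loss in enumerate(losses):
        logger.report_scalar(
            title="loss",
            series="train",
            value=float(loss),
            iteration=first_epoch + offset,
        )
    return first_epoch + len(losses)


if __name__ == "__main__":
    rec = RecordingLogger()
    next_epoch = report_epoch_losses(rec, [0.9, 0.7, 0.5])
    # A continued run picks up the axis where the previous one stopped.
    report_epoch_losses(rec, [0.4, 0.3], first_epoch=next_epoch)
    for report in rec.reports:
        print(report)
```

If explicitly reported values show up for every epoch while the automatically captured ones do not, that points at the capture path rather than at connectivity to the server.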

  
  
Posted one month ago

Yes, I see it in the terminal on the machine.

  
  
Posted one month ago

What happens if you run the reporting example from the ClearML GitHub repository?

  
  
Posted one month ago

Yes, it is logging to the console. When the issue occurs, the script hangs after it completes all the epochs.
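
For a hang like this, a generic Python-level aid (not specific to ClearML, and not confirmed in this thread as pointing to the cause) is to dump the stack of every thread, which shows where the process is blocked. A minimal sketch using the standard-library `faulthandler` module; the function names and the default timeout are hypothetical choices:

```python
import faulthandler


def dump_all_thread_stacks(path):
    """Write the current stack of every Python thread to the file at `path`."""
    with open(path, "w") as fh:
        faulthandler.dump_traceback(file=fh, all_threads=True)


def arm_hang_watchdog(fh, timeout_s=600.0):
    """Dump all thread stacks to `fh` every `timeout_s` seconds until disarmed.

    `fh` must be an open file with a real file descriptor and must stay open
    for as long as the watchdog is armed.
    """
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=fh, exit=False)


def disarm_hang_watchdog():
    """Cancel a watchdog previously armed with arm_hang_watchdog."""
    faulthandler.cancel_dump_traceback_later()
```

Arming the watchdog just before the end of training, with a timeout longer than a normal shutdown takes, would leave a stack dump in the file if the script stalls there.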

  
  
Posted one month ago

The console logging still works. The abort of the task showed up in the log but did not take effect, and the process continued until I killed it.

  
  
Posted one month ago

Not sure why that is related to saving images

  
  
Posted one month ago