Is there any way to post Slack alerts for the frozen experiments?

Is there any way to post Slack alerts for the frozen experiments? (E.g., after a server restart they sometimes get stuck in Running mode, or hit https://github.com/pytorch/pytorch/issues/804 if there is not enough shared memory.)

Posted 3 years ago

Answers 5

Hi DilapidatedDucks58

By default, the Slack monitor service monitors tasks by status. There is no ‘freeze’ status, so it will be a bit hard to monitor for it.

That said, you can always add different filters to the monitoring service so that you get only the specific tasks relevant to you. Maybe add a tag to those tasks and filter according to it? What do you think?

Posted 3 years ago

for me, increasing shm-size usually helps. what does this RC fix?

Posted 3 years ago


is there any way to post Slack alerts for the frozen experiments?

The latest RC should solve the PyTorch data loader issue. Do you want to test it?
pip install clearml==0.17.5rc2

Posted 3 years ago

yeah, that sounds right! thanks, will try

Posted 3 years ago

DilapidatedDucks58 I think it should not be hard to modify the Slack monitoring code to detect frozen tasks - these are essentially tasks that are still in the running state, but whose last_update field has not been updated for more than X minutes or hours (according to your own preference)
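
A rough sketch of that check, in case it helps. It is not the actual Slack monitor code: TaskRecord and find_frozen_tasks are hypothetical names made up for illustration, and the "in_progress" status string is an assumption about how a running task is reported. Only the last_update field comes from the description above. You would need to fill the records from your own task query and send the result to Slack yourself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Iterable, List


@dataclass
class TaskRecord:
    """Hypothetical minimal view of a task (not a ClearML class)."""
    task_id: str
    status: str
    last_update: datetime


def find_frozen_tasks(
    tasks: Iterable[TaskRecord],
    now: datetime,
    max_idle: timedelta,
    running_status: str = "in_progress",  # assumed name of the running state
) -> List[TaskRecord]:
    """Return tasks that are still running but have not been updated for longer than max_idle."""
    return [
        t for t in tasks
        if t.status == running_status and (now - t.last_update) > max_idle
    ]


if __name__ == "__main__":
    now = datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc)
    tasks = [
        TaskRecord("a", "in_progress", now - timedelta(minutes=5)),  # running, recently updated
        TaskRecord("b", "in_progress", now - timedelta(hours=3)),    # running, stale
        TaskRecord("c", "completed", now - timedelta(hours=4)),      # not running, ignored
    ]
    for task in find_frozen_tasks(tasks, now, max_idle=timedelta(minutes=30)):
        print(f"Task {task.task_id} looks frozen, last update at {task.last_update}")
```

The threshold (30 minutes here) is arbitrary. Pick something longer than the slowest legitimate gap between reports in your experiments, otherwise healthy tasks will be flagged as frozen.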

Posted 3 years ago