Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Hi, Is It Possible To Resume An Experiment That Stopped Unexpectedly, By Using A Checkpoint Of The Model?

Hi, Is it possible to resume an experiment that stopped unexpectedly, by using a checkpoint of the model?

Posted 3 years ago
Votes Newest

Answers 5

If you have the check point (see output_uri for automatically uploading it) then you can always load it. Do you mean if you can change the iteration/ step counter? Or do you mean with trains-agent?

Posted 3 years ago

I would clone the first experiment, then in the cloned experiment, I would change the initial weights (assuming there is a parameter storing that) to point to the latest checkpoint, i.e. provide the full path/link. Then enqueue it for execution. The downside is that the iteration counter will start from 0 and not the previous run.

Posted 3 years ago

AstonishingSeaturtle47 , makes sense?

Posted 3 years ago

Yes, Thanks

Posted 3 years ago

Is it considered the same experiment? Is it possible to use the trains-agent? Submit a resume from the UI?

Posted 3 years ago
5 Answers
3 years ago
10 months ago