Hello, everyone!
I have a question regarding ClearML features.
We have run into a situation where some of the agents working on an HPO task die for various reasons: some workers go offline, or resources need to be temporarily detached for other needs.
Thu
AgitatedDove14, let me clarify, as I think you have misunderstood me.
The main reason we need the above-mentioned functionality is that some experiments need to run for a long time, say weeks.
However, these experiments are low priority, so when other, more important experiments appear, we need to temporarily pause (kill, or something else) the running HPO task and reassign the resources to other needs.
Later, when the more important experiments have completed, we can continue the HPO task from the same state.
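To illustrate what "continue from the same state" means here, below is a minimal sketch in plain Python. It does not use the ClearML API; `run_hpo`, `toy_objective`, the JSON state file, and the `max_trials_this_run` parameter (which only simulates the resource being taken away) are all hypothetical names made up for this example. The idea is that completed trials are persisted after every step, so a killed search picks up at the next trial instead of starting over.

```python
import json
import os
import random
import tempfile


def run_hpo(state_path, total_trials, objective, max_trials_this_run=None, seed=0):
    """Run (or resume) a random search, saving state after every trial."""
    # Resume from the saved state if a previous run left one behind.
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    else:
        state = {"trials": []}

    done_this_run = 0
    while len(state["trials"]) < total_trials:
        # Hypothetical stand-in for "the worker was reassigned": stop early.
        if max_trials_this_run is not None and done_this_run >= max_trials_this_run:
            break

        index = len(state["trials"])
        # Seed per trial index so a resumed run samples the same points
        # an uninterrupted run would have sampled.
        rng = random.Random(seed + index)
        params = {"lr": 10 ** rng.uniform(-4, -1)}
        score = objective(params)
        state["trials"].append({"params": params, "score": score})

        # Write to a temp file and rename, so a kill mid-write
        # cannot leave a corrupted state file.
        tmp_path = state_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, state_path)
        done_this_run += 1

    return state


def toy_objective(params):
    """Toy score that is highest when lr is close to 0.01."""
    return -abs(params["lr"] - 0.01)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "hpo_state.json")
    # First run is interrupted after 2 of 6 trials.
    first = run_hpo(path, total_trials=6, objective=toy_objective, max_trials_this_run=2)
    print("trials after interruption:", len(first["trials"]))
    # Later, the same call continues with trial 3 and finishes the search.
    resumed = run_hpo(path, total_trials=6, objective=toy_objective)
    print("trials after resume:", len(resumed["trials"]))
```

What we are looking for is the equivalent behaviour for an HPO task in ClearML: the optimizer state and finished trials are kept, and only the remaining trials run after the resources come back.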
I hope this makes the problem clearer.