Unanswered
Hello, everyone!
I have a question regarding ClearML features.
We run into situations where some of the agents working on an HPO die for various reasons. Some workers go offline, or resources need to be temporarily detached for other needs.
Thu
Hi AgitatedDove14, I get the reported scalars from the web using
    model_task = Task.get_task(task_id=model_task_id)
    scalars = model_task.get_reported_scalars()
then register each of the scalars with something like
    logger.report_scalar(title=metric_key, series=series_val['name'], value=y, iteration=x)
You then have the previously reported scalars back, and I can append the rest of the model training reports to them.
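For anyone wanting to try this, here is a rough sketch of the replay loop described above. It assumes `get_reported_scalars()` returns a nested dict of the form `{title: {series: {'name': ..., 'x': [...], 'y': [...]}}}`, which is what the snippet in this post implies. The helper name `replay_scalars` is made up for illustration and is not part of ClearML.

```python
from typing import Any, Dict


def replay_scalars(scalars: Dict[str, Dict[str, Dict[str, Any]]], logger: Any) -> int:
    """Re-report every (iteration, value) point in `scalars` through `logger`.

    `logger` is anything exposing report_scalar(title, series, value, iteration),
    such as a ClearML Logger. Returns the number of points reported.
    """
    count = 0
    for metric_key, series_map in scalars.items():
        for series_key, series_val in series_map.items():
            # The post reads the series name from series_val['name'];
            # fall back to the dict key if that field is missing (assumption).
            name = series_val.get("name", series_key)
            for x, y in zip(series_val.get("x", []), series_val.get("y", [])):
                logger.report_scalar(title=metric_key, series=name, value=y, iteration=x)
                count += 1
    return count


# Hypothetical usage inside the resumed task (not executed here):
#   from clearml import Task
#   old_task = Task.get_task(task_id=model_task_id)
#   replay_scalars(old_task.get_reported_scalars(), Task.current_task().get_logger())
```

Keeping the loop in a plain function that only needs an object with `report_scalar` makes it easy to dry-run against a stub before pointing it at a real task.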
The workers run across multiple machines, and you can check whether a task has died by looking at the web page.
2 years ago
one year ago