ThickKitten19
Moderator
2 Questions, 9 Answers
  Active since 10 January 2023
  Last activity 11 months ago

Hello, everyone! I have a question regarding ClearML features. We run into the situation when some of the agents that are working on an HPO die for various reasons. Some workers go offline, or resources need to be temporarily detached for other needs. Thu

Hi AgitatedDove14, I get the reported scalars from the web using

    model_task = Task.get_task(task_id=model_task_id)
    scalars = model_task.get_reported_scalars()

then register each of the scalars with something like

    logger.report_scalar(title=metric_key, series=series_val['name'], value=y, iteration=x)

Then you have the reported scalars, to which I am able to append the rest of the model training reports.
Workers are running across multiple machines and you can monitor if a task is dead by looking...
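
The scalar-replay approach described above could be sketched roughly as follows. This is only an illustration, not the poster's actual code: the helper name replay_scalars and the RecordingLogger stand-in are hypothetical, and the nested layout assumed for the scalars dictionary (title, then series, then 'x' and 'y' lists, with an optional 'name') is an assumption based on the snippet above.

```python
from typing import Any, Dict


def replay_scalars(scalars: Dict[str, Dict[str, Dict[str, Any]]], logger: Any) -> int:
    """Re-report every (x, y) point in `scalars` through `logger.report_scalar`.

    `scalars` is assumed to look like {title: {series_key: {"x": [...], "y": [...]}}}.
    Returns the number of points that were re-reported.
    """
    count = 0
    for title, series_map in scalars.items():
        for series_key, series_val in series_map.items():
            # Fall back to the dictionary key if no explicit 'name' is present.
            name = series_val.get("name", series_key)
            for x, y in zip(series_val["x"], series_val["y"]):
                logger.report_scalar(title=title, series=name, value=y, iteration=int(x))
                count += 1
    return count


class RecordingLogger:
    """Hypothetical stand-in for a ClearML logger; it only records the calls."""

    def __init__(self) -> None:
        self.calls = []

    def report_scalar(self, title, series, value, iteration) -> None:
        self.calls.append((title, series, value, iteration))


# With ClearML installed, the real objects would be obtained roughly like this:
#   from clearml import Task
#   scalars = Task.get_task(task_id=model_task_id).get_reported_scalars()
#   logger = Task.current_task().get_logger()
```

Passing the logger in as an argument keeps the replay loop independent of a running ClearML server, so the same function can be tried locally with the recording stand-in before pointing it at a real task.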

one year ago

AgitatedDove14, let me clarify; I think you have misunderstood me.

The main reason we need the above-mentioned functionality is that some experiments need to run for a long time. Let's say weeks.
However, the importance of the experiment is low, so when other, more important experiments appear, we need to temporarily pause (kill or something else) the running HPO task and reassign the resource to other needs.
Later, when the more important experiments have been completed, we can conti...

one year ago

Quick question: when you say the HPO Task, do you mean the HPO controller logic Task (i.e. the one launching the training jobs), or do you mean the actual training job itself (i.e. running with a specific set of parameters decided by the HPO controlling task)?

AgitatedDove14 Sorry, my bad! By HPO task I mean the actual training job itself.
We run the HPO controller logic Task on a separate CPU-only machine, so we can assume that this task is always on. Only the training jobs can go ...

one year ago

AgitatedDove14 I am not restarting the agent itself; I just need to be able to continue the experiment from the same progress point. It can be a different agent. In fact, I am just loading the progress onto another agent within the available queue.

one year ago

I see! Then the command clearml-agent execute --id <task_id here> should reload the reported scalars and the task needs to reload last checkpoints only, right?

That's a good question too! We haven't figured out the best way of continuing for both the grid and Optuna. Can you suggest something?

one year ago