Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Hi I Want To Have Several Boards Connected To The Same Experiment Manager, And Have Agents On The Manager Using These Boards, One Agent For Each Board. I Thought That If I Know What The Agent Is, I Can Assign One Board Per Agent - If The Agent Is 1, Then

I want to have several boards connected to the same experiment manager, and have agents on the manager using these boards, one agent for each board.
I thought that if I know what the agent is, I can assign one board per agent - if the agent is 1, then use IP X, if 2, use IP Y, and so on.

My questions are:

  • Can I know, while running a task, which agent is running it?
  • Can I custom name an agent while spinning it?
Posted 4 months ago
Votes Newest

Answers 11

@<1533619716533260288:profile|SmallPigeon24> the intent behind queues supporting multiple workers is to have many consumers for that queue ("producer") - it does not mean multiple workers can pull the same task from the queue

Posted 3 months ago

Queues can have multiple workers, and that implies multiple instances of a task can run concurrently.

@<1533619716533260288:profile|SmallPigeon24> as long as these are the Exact same instances you can have them runing simultaneously (think multi node training), that said each one should "know" not to report over the others, because of course it will overwrite the reports.

Back to your point on multiple agents:
You cannot have two Tasks in the same queue, that means that a single agent pulls a task that agent needs to "spin it" multiple times on the "remote boards"

Another option is creating multiple Tasks (i.e. cloning), then use the user-properties, or hyper parameters, to tell the Code where to report to (of course this implies manually calling the Logger class, because the auto logging is going directly to the main task)
To me this seems the most logical, as it allows to debug the individual executions as well as have a combined view of the metrics.

Lastly you can always use the comparison feature to have all the individual metrics on the same graph

wdyt ?

Posted 3 months ago

Will it get 1 or 2?

Posted 3 months ago

@<1533619716533260288:profile|SmallPigeon24> why would two workers take the same task? You can only have one agent running a task, so you will always get the last_worker representing the agent running the task

Posted 3 months ago

@<1523701087100473344:profile|SuccessfulKoala55> I disagree. Queues can have multiple workers, and that implies multiple instances of a task can run concurrently.
This is necessary for board farms, or any non-tiny scale of work.

Posted 3 months ago

Hi @<1533619716533260288:profile|SmallPigeon24> , to your questions :

  • Yes. It's under 'worker name'
  • You can set worker name in clearml.conf with agent.worker_name
Posted 4 months ago

@<1523701087100473344:profile|SuccessfulKoala55> No, I mean a chip. A piece of hardware that cannot, on it's on, run an agent and as such an attached computer - in this case, the server - will have an agent accessing it via ssh. In my case, I want to have a "board farm" - multiple boards for running inferences on them, and I'd like to have them all connected to the same server.

Posted 4 months ago

@<1523701070390366208:profile|CostlyOstrich36> Will it work? Assume I have 3 workers.
1 takes a task
2 takes a task
1 checks last_worker

Posted 3 months ago

@<1533619716533260288:profile|SmallPigeon24> you can use the task's last_worker property

Posted 3 months ago

@<1533619716533260288:profile|SmallPigeon24> I assume by "board" you mean "queue"?

Posted 4 months ago

@<1523701070390366208:profile|CostlyOstrich36> Hi!
Thank you very much for the informative answer.
I have a follow-up question on q.1: Is there a pythonic way to retrieve that info mid-run?

Posted 4 months ago
11 Answers
4 months ago
3 months ago