
Hello folks!
I have a pipeline with three tasks: A, B, and C.
I want to set it up so that:

A gets assigned a machine (e.g. based on the queue)
B always gets assigned to the same machine as A (but may run in a different Docker container, etc.)
C will be submitted to a different queue and I don’t care as much

Is there a way to define “task affinity” in this way?

  
  
Posted 2 years ago

Answers 10


AgitatedDove14 much obliged!

  
  
Posted 2 years ago

Really what I need is for A and B to be separate tasks, but guarantee they will be assigned to the same machine so that the ClearML dataset cache on that machine will be warm.

I think what you are looking for is the multi-machine cache (which is fully supported). Basically, mount an NFS/SMB folder from a NAS on all of those machines and configure the cache folder to point to it; then you do not need to worry about affinity, no?
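For example, a minimal clearml.conf sketch of that setup (the NAS mount path is just a placeholder):

    sdk {
        storage {
            cache {
                # point the local cache at the shared NAS mount, so every
                # machine sees the same warm dataset cache
                default_base_dir: "/mnt/nas/clearml-cache"
            }
        }
    }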

Is there a way to group A and B into a sub-pipeline, have the pipeline be queued and executed remotely, but the tasks A and B inside it be treated like local tasks? or something like that?

Actually yes. You could have a pipeline AB that always "executes locally" (meaning it does not schedule itself or its components), where A and B are the components. From the original pipeline's perspective the component is a single Task AB (which is this new pipeline). The only caveat is that pipeline AB and tasks A and B need to be in the same git repo.
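Something along these lines (a rough sketch; the step bodies, project name, and exact wiring are placeholders):

    from clearml.automation import PipelineController

    def step_a():
        # e.g. fetch the dataset, warming the local ClearML cache
        return "dataset-ready"

    def step_b(a_result):
        # e.g. train against the locally cached dataset
        print("A returned:", a_result)

    # Sub-pipeline "AB": a single pipeline Task whose steps run as local
    # subprocesses, so A and B always land on the same machine
    pipe = PipelineController(name="AB", project="examples", version="1.0.0")
    pipe.add_function_step(name="A", function=step_a, function_return=["data"])
    pipe.add_function_step(name="B", function=step_b, parents=["A"],
                           function_kwargs={"a_result": "${A.data}"})
    # run_pipeline_steps_locally=True keeps both steps on whichever machine
    # ends up executing the AB pipeline task itself
    pipe.start_locally(run_pipeline_steps_locally=True)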

  
  
Posted 2 years ago

CostlyOstrich36 yes, for the cache.
AgitatedDove14 I am not sure a queue will be sufficient; it would require a queue per execution of the pipeline.

Really what I need is for A and B to be separate tasks, but guarantee they will be assigned to the same machine so that the ClearML dataset cache on that machine will be warm.

Is there a way to group A and B into a sub-pipeline, have the pipeline be queued and executed remotely, but the tasks A and B inside it be treated like local tasks? or something like that?

  
  
Posted 2 years ago

C will be submitted to a different queue and I don’t care as much

Is there a way to define “task affinity” in this way?

Hi RoughTiger69,
When you say task affinity, do you mean you want C to be executed next to A/B? Affinity as a concept doesn't really exist; it can be abstracted to a queue, where you have agents pulling from multiple queues. Then C can be pushed to one of the queues (in theory you might be able to programmatically control the queue of C), wdyt?
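For the "programmatically control the queue" part, something like this should work (the template task ID and queue name are placeholders):

    from clearml import Task

    # clone the template task for step C, then push the clone to a queue
    # chosen at run time
    c_task = Task.clone(source_task="<C_template_task_id>", name="C run")
    Task.enqueue(c_task, queue_name="other-queue")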

  
  
Posted 2 years ago

Is there a specific reason you would want them executed on the same machine? Cache?

  
  
Posted 2 years ago

Oh, you want the same machine to execute the two tasks/steps?

  
  
Posted 2 years ago

CostlyOstrich36

  
  
Posted 2 years ago

I don’t think so.
In most cases I would have multiple agents pulling from the same queue. I can't have a queue per pipeline execution.
So if I submit A and B to the same queue, it still doesn't guarantee that they will be pulled by the same agent…

  
  
Posted 2 years ago

Is that what you're looking for?

  
  
Posted 2 years ago

Hi RoughTiger69, you can specify a queue per step with the execution_queue parameter of add_function_step
https://clear.ml/docs/latest/docs/references/sdk/automation_controller_pipelinecontroller

Same goes for the Docker image - the docker parameter of add_function_step
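For instance (queue names, image, and step bodies here are made up):

    from clearml.automation import PipelineController

    def step_a(): ...
    def step_b(): ...
    def step_c(): ...

    pipe = PipelineController(name="ABC", project="examples", version="1.0.0")
    # A and B share a queue served by a single agent, so they run on one machine
    pipe.add_function_step(name="A", function=step_a,
                           execution_queue="machine-x-queue")
    pipe.add_function_step(name="B", function=step_b, parents=["A"],
                           execution_queue="machine-x-queue",
                           docker="python:3.10")  # different container, same machine
    # C goes to a different queue, per the original requirement
    pipe.add_function_step(name="C", function=step_c, parents=["B"],
                           execution_queue="other-queue")
    pipe.start(queue="services")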

  
  
Posted 2 years ago