@<1524922424720625664:profile|TartLeopard58> @<1545216070686609408:profile|EnthusiasticCow4>
Notice that when you are spinning multiple agents on the same GPU, the Tasks should request the "correct" fractional GPU container, i.e. if they pick a "regular" container there will be no mem limit.
So something like
CLEARML_WORKER_NAME=host-gpu0a clearml-agent daemon --gpus 0 --docker clearml/fractional-gpu:u22-cu12.3-2gb
CLEARML_WORKER_NAME=host-gpu0b clearml-agent daemon --gpus 0 --docker clearml/fractional-gpu:u22-cu12.3-2gb
Also remember to add --pid=host to your conf file's extra_docker_arguments
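For reference, a minimal sketch of the relevant clearml.conf section on the agent machine (only the key in question is shown):
agent {
    # extra arguments passed verbatim to "docker run"; the fractional-GPU
    # containers need --pid=host when several of them share one physical card
    extra_docker_arguments: ["--pid=host"]
}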
How does it work with k8s? How can I request the two pods to sit on the same GPU?
Is there currently a way to bind the same GPU to multiple queues? I believe the agent complained the last time I tried (which was a bit ago).
@<1545216070686609408:profile|EnthusiasticCow4>
Is there currently a way to bind the same GPU to multiple queues? I believe the agent complained the last time I tried (which was a bit ago)
Run multiple agents on the same GPU, each listening to its own queue:
CLEARML_WORKER_NAME=host-gpu0a clearml-agent daemon --gpus 0
CLEARML_WORKER_NAME=host-gpu0b clearml-agent daemon --gpus 0
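Filling in the queue and container, a sketch (the queue names here are made up; each daemon serves its own queue on the same physical GPU, using the fractional container as above):
CLEARML_WORKER_NAME=host-gpu0a clearml-agent daemon --queue gpu0_queue_a --docker clearml/fractional-gpu:u22-cu12.3-2gb --gpus 0
CLEARML_WORKER_NAME=host-gpu0b clearml-agent daemon --queue gpu0_queue_b --docker clearml/fractional-gpu:u22-cu12.3-2gb --gpus 0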
@<1535069219354316800:profile|PerplexedRaccoon19>
Is it in the OSS version too?
Yep, free of charge ❤
That's great! I look forward to trying this out.
How does it work with k8s?
You need to install the clearml k8s glue, and then on the Task request the container. Notice you need to preconfigure the glue with the correct Job YAML.
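On the Task side, a minimal sketch with the Python SDK (project/task names are placeholders; the container image can also be set in the UI):
from clearml import Task

task = Task.init(project_name="examples", task_name="fractional gpu job")
# request the 2GB fractional-GPU image, so the agent / k8s glue runs this task inside it
task.set_base_docker("clearml/fractional-gpu:u22-cu12.3-2gb")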
I'm also curious if it's possible to bind the same GPU to multiple queues.