ReassuredTiger98 thanks for sharing those threads. I found them very insightful.
Hey AgitatedDove14,
The way clearml is designed is to have queues abstract different types of resources.
Configuring multiple queues and multiple agents based on the resources can be a solution for many use cases. But when the instances are non-homogeneous, there can be too many combinations of resources (number of GPUs, number of cores, disk space, etc.) that work for various workloads. I’m thinking that creating as many agents and queues can get messy for managing a...
Can I assume we are talking Kubernetes under the hood for the resource allocation?
yes
The granularity offered by K8s (and as you specified) is sometimes way too detailed for a user. For example, I know I want 4 GPUs, but 100GB disk space? No idea. Just give me 3 levels to choose from (if any; actually I would prefer a default that is large enough, since this is by definition for temp cache only). The same argument goes for the number of CPUs.
While I agree that over-detailing makes u...
This will be quite easy to implement using the clearml k8s glue: just use user-properties and change the template based on them. I can point to where you need to modify the code.
I’m pretty new to this. So, it’d be great if you can do that.
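To make the "user-properties plus template" idea above concrete, here is a minimal sketch in plain Python. It is not the clearml k8s glue code and does not call any ClearML API: `build_pod_template`, the property names (`gpus`, `disk`, `cpu`), and the tier sizes are all hypothetical, chosen only for illustration. The one thing taken from Kubernetes itself is the resource key naming (`cpu`, `ephemeral-storage`, `nvidia.com/gpu`). The sketch assumes the task's user properties have already been read into a dict, and shows how a few coarse levels could be mapped onto a pod template, with a large default for disk as suggested in the thread.

```python
import copy

# Hypothetical tiers: names and sizes are illustrative, not ClearML defaults.
DISK_TIERS = {"small": "50Gi", "medium": "200Gi", "large": "1Ti"}
CPU_TIERS = {"small": "4", "medium": "16", "large": "32"}

# Hypothetical base pod template, standing in for the one the glue would load.
BASE_TEMPLATE = {
    "spec": {
        "containers": [
            {"name": "task", "resources": {"limits": {}}}
        ]
    }
}


def build_pod_template(user_properties, base=BASE_TEMPLATE):
    """Return a copy of `base` with resource limits set from user properties.

    `user_properties` is a plain dict such as {"gpus": "4", "disk": "medium"}.
    Missing keys fall back to defaults (assumption: large disk, medium CPU).
    """
    template = copy.deepcopy(base)  # never mutate the shared base template
    limits = template["spec"]["containers"][0]["resources"]["limits"]

    disk = user_properties.get("disk", "large")
    cpu = user_properties.get("cpu", "medium")
    if disk not in DISK_TIERS:
        raise ValueError(f"unknown disk tier: {disk!r}")
    if cpu not in CPU_TIERS:
        raise ValueError(f"unknown cpu tier: {cpu!r}")

    limits["ephemeral-storage"] = DISK_TIERS[disk]
    limits["cpu"] = CPU_TIERS[cpu]

    # Only request GPUs when the user asked for at least one.
    gpus = int(user_properties.get("gpus", 0))
    if gpus > 0:
        limits["nvidia.com/gpu"] = gpus

    return template


if __name__ == "__main__":
    # The user only knows they want 4 GPUs; everything else uses defaults.
    print(build_pod_template({"gpus": "4"}))
```

With this shape, a user sets at most three coarse properties on a task instead of picking from one queue per resource combination; where exactly the glue would call such a function is the part that still needs pointing out in the actual code.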