Answered
Does ClearML support running the experiments on any "serverless" environments (i.e. VertexAI, SageMaker, etc.), such that GPU resources are allocated on demand? Alternatively, is there a story for auto-scaling GPU machines based on experiments waiting in

Does ClearML support running the experiments on any "serverless" environments (i.e. VertexAI, SageMaker, etc.), such that GPU resources are allocated on demand?
Alternatively, is there a story for auto-scaling GPU machines based on experiments waiting in the queue and some policy?

  
  
Posted 2 years ago

Answers 3


Does ClearML support running the experiments on any "serverless" environments

Can you please elaborate on what you mean by "serverless"?

such that GPU resources are allocated on demand?

You can define various queues for resources according to whatever structure you want. Does that make sense?

Alternatively, is there a story for auto-scaling GPU machines based on experiments waiting in the queue and some policy?

Do you mean an autoscaler for AWS, for example?

  
  
Posted 2 years ago

Hi IcyJellyfish61, while spinning EKS up and down is not supported (albeit very cool 😄), we have an autoscaler in the applications section that does exactly what you need: it spins EC2 instances up and down according to demand 🙂
If you're using http://app.clear.ml as your server, you can find it at https://app.clear.ml/applications.
Unfortunately, it is not available for the open-source server, only on paid tiers.
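
To make "spin up and down according to demand" concrete, here is a minimal sketch of the kind of decision a queue-driven autoscaler makes on each polling cycle. It is not ClearML's implementation: `ScalingPolicy`, `instances_to_change`, and all parameter names are hypothetical, and real autoscalers typically also wait for an idle timeout before stopping a machine, which this sketch leaves out.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical policy knobs; not ClearML's actual configuration schema."""
    max_instances: int = 4       # hard cap on concurrently running instances
    tasks_per_instance: int = 1  # queued tasks one instance is expected to absorb


def instances_to_change(pending_tasks: int, idle: int, busy: int,
                        policy: ScalingPolicy) -> int:
    """Return how many instances to start (positive) or stop (negative)."""
    # Instances needed to absorb the pending tasks, rounded up.
    needed = -(-pending_tasks // policy.tasks_per_instance)
    if needed > idle:
        # Scale up, but never beyond the configured cap.
        headroom = policy.max_instances - (idle + busy)
        return max(0, min(needed - idle, headroom))
    # Scale down: stop idle instances that pending work does not need.
    return -(idle - needed)


if __name__ == "__main__":
    policy = ScalingPolicy(max_instances=4)
    print(instances_to_change(pending_tasks=3, idle=0, busy=1, policy=policy))
    print(instances_to_change(pending_tasks=0, idle=2, busy=1, policy=policy))
```

The cap is the "policy" part of the question: with one busy instance and a cap of four, three pending tasks start three instances, while an empty queue stops every idle one.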

  
  
Posted 2 years ago

Re. "serverless": I mean running a training task on cloud services such that machines with GPUs for those tasks are provisioned on demand.
That means we don't have to keep a pool of machines with GPUs standing by, and don't have to deal with autoscaling. The cloud provider, upon receipt of such a training task, provisions the machines and runs the training.
This is a common use case, for example in VertexAI.

Regarding autoscaling: yes, for example autoscaling EC2 instances based on pending experiments in the ClearML experiments queue.
Even better would be the ability to autoscale (create and stop) EKS instances.

  
  
Posted 2 years ago