Answered
Anyone Doing SageMaker With ClearML - Something Like The K8s Glue But The Tasks Are Pulled Into SageMaker Training Jobs

Anyone doing SageMaker with ClearML - something like the k8s glue, but where the tasks are pulled into SageMaker training jobs?

  
  
Posted 3 years ago

Answers 15


Do you have any experience, and anything to watch out for?

Yes, for testing start with cheap node instances 🙂
If I remember correctly everything is preconfigured to support GPU instances (aka nvidia runtime).
You can take one of the templates from here as a starting point:
https://aws.amazon.com/blogs/compute/running-gpu-accelerated-kubernetes-workloads-on-p3-and-p2-ec2-instances-with-amazon-eks/
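One thing worth checking early (just a sketch, assuming the standard NVIDIA device plugin and the kubernetes Python client) is that the GPU nodes actually advertise the nvidia.com/gpu resource to the scheduler; otherwise any pod requesting a GPU will stay Pending:

# Sanity check: list allocatable GPUs per node.
# Assumes the NVIDIA device plugin is installed on the cluster and that
# your local kubeconfig points at the EKS cluster.
from kubernetes import client, config

config.load_kube_config()
for node in client.CoreV1Api().list_node().items:
    gpus = node.status.allocatable.get("nvidia.com/gpu", "0")
    print(f"{node.metadata.name}: allocatable nvidia.com/gpu = {gpus}")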

  
  
Posted 3 years ago

Got it. I've never run a GPU workload in EKS before. Do you have any experience, and anything to watch out for?

  
  
Posted 3 years ago

Running multiple k8s_daemon instances, right? i.e. k8s_daemon("1xGPU") and k8s_daemon('cpu')?

  
  
Posted 3 years ago

Basic setup:
glue service per "job template" (i.e. per set of k8s resources, for example a CPU requirement vs. a GPU requirement)
queue per glue service, e.g. a cpu_machine queue and a 1xGPU queue
wdyt?
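Roughly something like this - a sketch only, based on clearml-agent's k8s glue example (clearml_agent.glue.k8s.K8sIntegration); the constructor arguments and the pod-template file names are assumptions, so check the example shipped with your clearml-agent version:

# Two glue services, each serving its own queue with its own pod template.
from clearml_agent.glue.k8s import K8sIntegration

# Service #1: CPU jobs - pod template requests CPU/memory only (hypothetical file name)
cpu_glue = K8sIntegration(template_yaml="cpu_pod_template.yaml")
cpu_glue.k8s_daemon("cpu_machine")    # pulls tasks from the "cpu_machine" queue

# Service #2: GPU jobs - run as a separate process/deployment; its template
# requests nvidia.com/gpu: 1 (hypothetical file name)
# gpu_glue = K8sIntegration(template_yaml="gpu_pod_template.yaml")
# gpu_glue.k8s_daemon("1xGPU")        # pulls tasks from the "1xGPU" queue

Then tasks that need a GPU get enqueued into the "1xGPU" queue, and CPU-only tasks into "cpu_machine".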

  
  
Posted 3 years ago

AgitatedDove14 - any pointers on how to run GPU tasks with the k8s glue? How do I control the queue and differentiate tasks that need CPU vs GPU in this context?

  
  
Posted 3 years ago

I think my main point is: the k8s glue on AKS or GKE basically takes care of spinning up new nodes, since the managed k8s service does that. The AWS autoscaler is kind of a replacement for that, makes sense?

  
  
Posted 3 years ago

SageMaker will make that easy, especially if I have SageMaker as the long-tail choice. Granted, at a higher cost.

  
  
Posted 3 years ago

For different workloads, I need to have different cluster scaler rules and account for different GPU needs.

  
  
Posted 3 years ago

As in: if there are jobs, the first level of scaling is new pods, and the second level is new nodes in the cluster.

  
  
Posted 3 years ago

AgitatedDove14 the AWS autoscaler is not k8s-native, right? That's sort of the point I am getting at.

  
  
Posted 3 years ago

The AWS autoscaler will work with IAM rules as long as you have it configured on the machine itself. For SageMaker job scheduling (I'm assuming this is what you are referring to, and not the notebook) you need to select the instance as well (basically the same as EC2). What do you mean by using the k8s glue - like inherit and implement the same mechanism, but for SageMaker instead of kubectl?

  
  
Posted 3 years ago

AgitatedDove14 - I had not used the autoscaler since it asks for an access key. I'm mainly looking at GPU use cases - with SageMaker one can choose any instance they want and use it, whereas the autoscaler would need a set instance type configured, right? I need to revisit it. Also I want to use the k8s glue, if not for this. Suggestions?

  
  
Posted 3 years ago

BTW is it cheaper than an EC2 instance? Why not use the AWS autoscaler?

  
  
Posted 3 years ago

That should not be complicated to implement. Basically you could run 'clearml-task execute --id <task_id>' as the SageMaker command. Can you manually launch it on SageMaker?
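For illustration, a rough sketch using the SageMaker Python SDK v2 generic Estimator - the image URI, IAM role, instance type and the CLEARML_TASK_ID environment variable are all placeholders/assumptions; the Docker image is assumed to have clearml installed, with an entrypoint that runs the command above (depending on your setup this may be clearml-agent execute --id <task_id> instead):

# Launch an existing ClearML task as a one-off SageMaker training job.
# The image's ENTRYPOINT is assumed to run something like:
#   clearml-task execute --id "$CLEARML_TASK_ID"
from sagemaker.estimator import Estimator

task_id = "<clearml-task-id>"

estimator = Estimator(
    image_uri="<account>.dkr.ecr.<region>.amazonaws.com/clearml-runner:latest",  # placeholder image
    role="<sagemaker-execution-role-arn>",                                       # placeholder IAM role
    instance_count=1,
    instance_type="ml.g4dn.xlarge",             # pick per task (CPU vs GPU)
    environment={"CLEARML_TASK_ID": task_id},   # read by the image entrypoint
)
estimator.fit(wait=False)  # SageMaker provisions the instance, runs the job, then tears it down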

  
  
Posted 3 years ago

Would this be a good use case to have?

  
  
Posted 3 years ago