How can I make a task that does a helm install or kubectl create deployment.yaml?
The task that it launches should contain your code that actually does the helm deployments and anything else; think of the Task as a way to launch a script, and that script can then interact with the cluster directly. The queue itself (i.e. clearml-agent) will not directly deploy helm charts, it will only deploy jobs (i.e. pods)
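As a rough sketch of that idea, the Task is just a script that shells out to helm. Everything here is a placeholder assumption: the chart, release, and namespace names are made up, and it assumes the agent's pod image has helm installed plus in-cluster / kubeconfig credentials available.

```python
# Minimal sketch: a ClearML Task whose "work" is a helm install.
# Assumptions: helm is on PATH inside the agent's container, credentials are
# already configured, and the chart/release/namespace names are placeholders.
import subprocess
from clearml import Task

task = Task.init(project_name="deployments", task_name="helm install example")

# The Task is just a script: it can call helm / kubectl like any other process.
result = subprocess.run(
    ["helm", "upgrade", "--install", "my-release", "my-repo/my-chart",
     "--namespace", "inference", "--wait"],
    capture_output=True, text=True,
)
print(result.stdout)
result.check_returncode()  # fail the Task if the helm command failed
```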
AgitatedDove14 I'm looking at the queue system that ClearML queues offer, which would allow a user to queue a job that deploys an app / inference service. This can be as simple as a pod or a more complete helm chart.
Hi OddShrimp85
You mean something like clearml-serving?
AgitatedDove14 I'm still trying to figure out how to do so, because when I add a task to the queue, the clearml-agent basically creates a pod with the container. How can I make a task that does a helm install or kubectl create deployment.yaml?
Can clearml-serving do helm install or upgrade? We have cases where the ML models do not come from ML experiments in ClearML, but we would still like to tap on the ClearML queue to enable resource queuing.
A more advanced case would be to decide how long this job should run and terminate it after that, to improve GPU utilisation.
To clarify, there might be cases where we get a helm chart / k8s manifests to deploy an inference service, which is a black box to us.
I see. In that case, yes, you could use ClearML queues to do that; as long as you have the credentials, the "Task" is basically just a helm deployment task.
You could also have monitoring code there, so that the same Task is pure logic: spinning up the helm chart, monitoring the usage, and taking it down when it's done
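A rough sketch of that "pure logic" Task, including the time-limited variant asked about above, might look like the following. The release/chart names and the runtime parameter are assumptions, and the monitoring loop is left as a stub.

```python
# Sketch: deploy a helm chart, keep it up for a bounded time, then tear it down.
# Assumptions: helm on PATH, placeholder release/chart names, and a simple
# wall-clock limit exposed as a Task parameter so it can be edited per run.
import subprocess
import time
from clearml import Task

task = Task.init(project_name="deployments", task_name="helm deploy with time limit")
params = task.connect({"max_runtime_minutes": 120})

subprocess.run(
    ["helm", "upgrade", "--install", "my-release", "my-repo/my-chart", "--wait"],
    check=True,
)

deadline = time.time() + params["max_runtime_minutes"] * 60
try:
    while time.time() < deadline:
        # any monitoring logic goes here, e.g. polling the service or kubectl
        time.sleep(60)
finally:
    # tear the deployment down when the limit is reached (or the Task is aborted)
    subprocess.run(["helm", "uninstall", "my-release"], check=False)
```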
Can clearml-serving do helm install or upgrade?
Not sure I follow, how would a helm chart install be part of the ML run? I mean, clearml-serving is itself installed via a helm chart, but that is a "one time" setup: you install clearml-serving once and then, via CLI / python, you send models to be served there. This is not a "deployment per model" scenario, but one deployment serving multiple models, dynamically loaded.
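For the case where the models do not come from ClearML experiments, a minimal sketch of the python side could be registering the external weights so they get a ClearML model ID. This assumes the standard ClearML SDK; the URL, name, and framework below are placeholders, and the exact clearml-serving command for attaching the registered model to an endpoint is left to its documentation.

```python
# Hedged sketch: register a model that was NOT produced by a ClearML experiment,
# so it gets a model ID that clearml-serving (or a deployment Task) can reference.
# Assumptions: the weights are reachable at the given URL; all names are placeholders.
from clearml import InputModel

model = InputModel.import_model(
    weights_url="s3://my-bucket/models/resnet50.onnx",  # external artifact, placeholder URL
    name="externally trained resnet50",
    framework="ONNX",
)
print(model.id)  # the ID you would reference when adding a serving endpoint
```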
To clarify, there might be cases where we get a helm chart / k8s manifests to deploy an inference service, which is a black box to us.
Users may need to deploy this service wherever needed to test it against other software components. This needs GPU resources, which a queue system would let them queue for and eventually get deployed, instead of hard-allocating resources for this purpose.
This can be as simple as a pod or a more complete helm chart.
True, and this could be good for batch processing, but if you want a REST API service then clearml-serving is probably a better fit
Does that make sense?