ClearML Server Deployment Uses Node Storage. If More Than One Node Is Labeled as app=clearml, and You Redeploy or Update Later, Then ClearML Server May Not Locate All Your Data.

Answered

"ClearML Server deployment uses node storage. If more than one node is labeled as app=clearml, and you redeploy or update later, then ClearML Server may not locate all your data." https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_kubernetes.html

Does this mean it is not really supported for production? What happens if the node dies and comes back?

  				
Posted 3 years ago by TrickySheep9


Answers 21

TrickySheep9, does that make sense?

  				
Posted 3 years ago by AgitatedDove14

"Is there GPU support?"

That basically depends on the resources in your pod template YAML. You can have multiple templates, each one "connected" to a different glue instance pulling from a different queue. This way a user can enqueue a Task in a specific queue, say single_gpu; the glue listens on that queue and, for each ClearML Task, creates a k8s job with a single GPU, as specified in the pod template YAML.
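A minimal sketch of the queue-to-template idea described above, in plain Python. This is not the actual k8s glue code: the queue names, the QUEUE_GPUS mapping, BASE_POD_TEMPLATE, and pod_manifest_for are all hypothetical and only illustrate how one queue could correspond to one set of pod resources. The one Kubernetes detail assumed here is the nvidia.com/gpu resource name exposed by the NVIDIA device plugin.

```python
import copy

# Hypothetical mapping: each queue stands for one pod template,
# reduced here to the number of GPUs that template requests.
QUEUE_GPUS = {"single_gpu": 1, "dual_gpu": 2}

# Hypothetical base pod template shared by all queues.
BASE_POD_TEMPLATE = {
    "apiVersion": "v1",
    "kind": "Pod",
    "spec": {
        "restartPolicy": "Never",
        "containers": [{"name": "task", "resources": {"limits": {}}}],
    },
}


def pod_manifest_for(queue: str, task_id: str) -> dict:
    """Build a pod manifest for one Task pulled from `queue` (illustration only)."""
    gpus = QUEUE_GPUS[queue]  # raises KeyError for a queue nobody listens on
    pod = copy.deepcopy(BASE_POD_TEMPLATE)  # never mutate the shared template
    pod["metadata"] = {"name": f"clearml-task-{task_id}"}
    limits = pod["spec"]["containers"][0]["resources"]["limits"]
    limits["nvidia.com/gpu"] = gpus
    return pod


if __name__ == "__main__":
    manifest = pod_manifest_for("single_gpu", "abc123")
    print(manifest["metadata"]["name"])
    print(manifest["spec"]["containers"][0]["resources"]["limits"])
```

In a real deployment the per-queue template would be a YAML file handed to the glue rather than a Python dict; the sketch only shows why enqueuing into a specific queue determines the resources of the resulting job.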

  				
Posted 3 years ago by AgitatedDove14

Yes, TrickySheep9, use the k8s glue from here:
https://github.com/allegroai/clearml-agent/blob/master/examples/k8s_glue_example.py

  				
Posted 3 years ago by AgitatedDove14

AgitatedDove14 - are these instructions out of date? https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_kubernetes_helm.html

  				
Posted 3 years ago by TrickySheep9

One last tiny thing, TrickySheep9: please do let us know how you get on, good or bad, and if you bump into anything unexpected, please do scream and let us know 🙂

  				
Posted 3 years ago by AlertBlackbird30

Hi TrickySheep9
You should probably check the new clearml-server-helm-cloud-ready Helm chart 😉
https://github.com/allegroai/clearml-server-helm-cloud-ready

  				
Posted 3 years ago by AgitatedDove14

Basically, PVCs for all the DBs 🙂

  				
Posted 3 years ago by AgitatedDove14

Thanks!

  				
Posted 3 years ago by TrickySheep9

Beyond this, I have the UI running and have to start playing with it. Any suggestions for agents with k8s?

  				
Posted 3 years ago by TrickySheep9

No 😞

  				
Posted 3 years ago by AgitatedDove14

The repo seems to be

  				
Posted 3 years ago by TrickySheep9

Sure thing

  				
Posted 3 years ago by AgitatedDove14

No, if you need the cloud ready install (which you do), follow the instructions on the repo readme (not the easy single node setup in the docs, which we will be updating soon)
https://github.com/allegroai/clearml-server-helm-cloud-ready

  				
Posted 3 years ago by AgitatedDove14

sure, will do AlertBlackbird30

  				
Posted 3 years ago by TrickySheep9

Wait, let me double check

  				
Posted 3 years ago by AgitatedDove14

agentservice...

Not related. The job of agent-services is to run control jobs, such as pipelines and HPO control processes.

  				
Posted 3 years ago by AgitatedDove14

AlertBlackbird30 - got it running. A few comments:

1. NodePort is set by default despite being a parameter in values.yml. For example:

    webserver:
      extraEnvs: []

      service:
        type: NodePort
        port: 80

2. The Ingress was using 8080 for the webserver, but the service was on 80.
3. I had to change the path in the Ingress to "/*" instead of "/" to get it working for me.

  				
Posted 3 years ago by TrickySheep9

All right got it, will try it out. Thanks for the quick response.

  				
Posted 3 years ago by TrickySheep9

The Helm chart installs an agentservice; how is that related, if at all?

  				
Posted 3 years ago by TrickySheep9

in the repo whereas the docs are https://allegroai.github.io/clearml-server-helm/

  				
Posted 3 years ago by TrickySheep9

Thanks! Is there GPU support? It's not clear from the README, AgitatedDove14.

  				
Posted 3 years ago by TrickySheep9
