Hey! I Just Finished The Movie

Answered

Hey!
I just finished the movie https://www.youtube.com/watch?v=Y5tPfUm9Ghg&list=UUj09XsAWj-RF9kY4UvBJh_A
Spoiler: we have been using Trains on-premises for about half a year, I think, but mainly as a dashboard for our experiments. We use Kubeflow to manage our training pipelines, for easier dataset access, as a K8s node scaler, and as a Jupyter notebook manager (we are running JupyterLab).


  				
Posted 4 years ago by GiddyPeacock64


Answers 18

Hey AgitatedDove14! Thanks!

For now we are using Kale to build the Kubeflow pipeline, and then we run that pipeline, so I'm not sure where the agent fits inside the Kubeflow ecosystem. Can you elaborate?

  				
Posted 4 years ago by GiddyPeacock64

AgitatedDove14, ideally yeah, but we manually add a line to the script generated by Kale before running it.

  				
Posted 4 years ago by GiddyPeacock64

Not only the diff, but also the Git repository itself, because it makes no sense to push the Git repository into the Docker image.

  				
Posted 4 years ago by GiddyPeacock64

And there is no trains-agent on Kubeflow, at least not that I'm aware of.

  				
Posted 4 years ago by GiddyPeacock64

Hi GiddyPeacock64
If you already have a K8s setup and are already using ClearML, add this to your Kubeflow YAML:
trains-agent execute --id <task_id> --full-monitoring
This will install everything your Task needs inside the Docker container. Just make sure that you pass the environment variables that configure ClearML, see here:
https://github.com/allegroai/clearml-server/blob/6434f1028e6e7fd2479b22fe553f7bca3f8a716f/docker/docker-compose.yml#L127
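
A minimal sketch of how such a step could be described, assuming the step is an ordinary Kubernetes-style container entry. Only the `trains-agent execute --id <task_id> --full-monitoring` command comes from the post above; the image name, the task id, and the environment variable names and values are hypothetical placeholders, so check the linked docker-compose file for the variables your version expects.

```python
import json
from typing import Dict, List


def agent_execute_command(task_id: str) -> List[str]:
    """Return the agent command from the post above in argv form."""
    return ["trains-agent", "execute", "--id", task_id, "--full-monitoring"]


def container_spec(task_id: str, image: str, env: Dict[str, str]) -> dict:
    """Build a Kubernetes-style container entry for one pipeline step.

    `image` and `env` are supplied by the caller; nothing here is
    specific to a particular Kubeflow version.
    """
    return {
        "name": f"trains-agent-{task_id}",
        "image": image,
        "command": agent_execute_command(task_id),
        # Sorted so the generated spec is stable between runs.
        "env": [{"name": k, "value": v} for k, v in sorted(env.items())],
    }


if __name__ == "__main__":
    spec = container_spec(
        task_id="abc123",                   # hypothetical task id
        image="my-registry/agent:latest",   # hypothetical image with the agent installed
        env={
            # Assumed variable names and hypothetical values.
            "CLEARML_API_HOST": "http://clearml-apiserver:8008",
            "CLEARML_WEB_HOST": "http://clearml-webserver:8080",
            "CLEARML_FILES_HOST": "http://clearml-fileserver:8081",
        },
    )
    print(json.dumps(spec, indent=2))
```

The printed structure can be pasted into the `containers` list of a pod or step template and adjusted from there.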

  				
Posted 4 years ago by AgitatedDove14

"The first pipeline step is calling init"

GiddyPeacock64 Is this enough to track all the steps?
I guess my main question is: is every step in the pipeline an actual Task/Job, or is it a single small function?
Kubeflow is great for simple DAGs, but when you need to build more complex logic it is usually a bit limited
(for example, there is no visibility into what's going on inside each step, so you cannot make a decision based on that).
WDYT?
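
If each step is meant to show up separately, one option is for every step to call `Task.init` itself rather than only the first one. A rough sketch, assuming the `clearml` package is installed in the step's image; the project, pipeline, and step names are hypothetical, and the naming scheme is just one possible convention.

```python
def step_task_name(pipeline_name: str, step_name: str) -> str:
    """Compose a task name that identifies both the pipeline and the step."""
    return f"{pipeline_name}/{step_name}"


def init_step_task(project_name: str, pipeline_name: str, step_name: str):
    """Call Task.init at the start of a pipeline step.

    The import is done lazily so that step_task_name can be used
    (and tested) on machines without clearml installed.
    """
    from clearml import Task

    return Task.init(
        project_name=project_name,
        task_name=step_task_name(pipeline_name, step_name),
    )


if __name__ == "__main__":
    # Hypothetical names; in a Kale-generated script this would sit at
    # the top of each step's function body.
    print(step_task_name("training-pipeline", "preprocess"))
```

Calling `init_step_task("my-project", "training-pipeline", "preprocess")` at the top of a step would then report that step under its own task name.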

  				
Posted 4 years ago by AgitatedDove14

Hmm, it depends. It can be a small function, but it can also be a task. We are using Kale, so it is easier to group/create tasks and move data from one task to another. Each step is independent, but each one is created from the data of the previous one.

  				
Posted 4 years ago by GiddyPeacock64

Kale just translates the notebooks into the Kubeflow pipeline DSL, so when the pipeline is running we can see the pipeline elements in Kubeflow.

  				
Posted 4 years ago by GiddyPeacock64

And voilà: a full trace, including Git and uncommitted changes, Python packages, and the ability to change arguments from the UI 🙂

  				
Posted 4 years ago by AgitatedDove14

Well, we are using it from our code, and the Trains dashboard works just fine, but this is without the agent. Moreover, Kubeflow uses Docker images for the training pipelines, and therefore Trains cannot capture the diff.

  				
Posted 4 years ago by GiddyPeacock64

GiddyPeacock64 And do you see the Kale (KF) jobs in the Kubeflow UI?

  				
Posted 4 years ago by AgitatedDove14

Hmm... For a quick and dirty integration that would probably do the trick. You could very well issue clearml-task commands on each Kubeflow vertex (is that what they are called?).
What do you think, AgitatedDove14?
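
As a rough illustration of that idea, each vertex could run a small wrapper that assembles a `clearml-task` call. This is only a sketch: it assumes the step's code lives in a Git repository reachable from the cluster, and the repository URL, script path, project name, and queue name below are hypothetical.

```python
import shlex
from typing import List, Optional


def clearml_task_command(
    project: str,
    name: str,
    repo: str,
    script: str,
    queue: str,
    branch: Optional[str] = None,
) -> List[str]:
    """Assemble a clearml-task invocation for one pipeline vertex."""
    cmd = [
        "clearml-task",
        "--project", project,
        "--name", name,
        "--repo", repo,
        "--script", script,
        "--queue", queue,
    ]
    if branch is not None:
        cmd += ["--branch", branch]
    return cmd


if __name__ == "__main__":
    cmd = clearml_task_command(
        project="my-project",                          # hypothetical
        name="preprocess",                             # hypothetical
        repo="https://example.com/org/pipeline.git",   # hypothetical
        script="steps/preprocess.py",                  # hypothetical
        queue="default",                               # hypothetical
    )
    # Print a shell-safe version; a vertex could instead pass cmd to
    # subprocess.run(cmd, check=True).
    print(" ".join(shlex.quote(part) for part in cmd))
```

Because the repository is passed by URL, the code does not have to be baked into the Docker image for the step.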

  				
Posted 4 years ago by GrumpyPenguin23

🙂

  				
Posted 4 years ago by GiddyPeacock64

GiddyPeacock64 Are you sending the jobs from the JupyterLab Kale extension?

EDIT:
Is the pipeline step itself calling Task.init?

  				
Posted 4 years ago by AgitatedDove14

So the main question: is there a best practice for connecting ClearML and Kubeflow?

  				
Posted 4 years ago by GiddyPeacock64

From this movie, it's clear that we are not using all of the Trains (ClearML) functionality, because Kubeflow is managing our pipeline. So, for example, Trains cannot record the Git changes.

  				
Posted 4 years ago by GiddyPeacock64

I'm a movie star 🤩

  				
Posted 4 years ago by GrumpyPenguin23

The first pipeline step is calling init.

  				
Posted 4 years ago by GiddyPeacock64
