Hey All! Great Job On ClearML By The Way

Answered

Hey all! Great job on ClearML by the way 🙂
I’m currently exploring whether it could be a fit for me and my team and I have a few questions:
If I want to send a job from a machine with some version of the code base, my understanding is that the commit hash is picked up and any diff from the remote branch will be applied. Is that correct?
Are the only two options for setting up the right environment for a Task either docker or git+pip?
Do you support caching of git that evolves with the code base to speed it up?
If I want to retrain the same model at a certain cadence on some streaming data (either by time passed or new data accumulated), would that be something that ClearML supports, or would I have to do the automation externally and trigger the agent from there? I’m sure it can be done with something like Airflow, but I’m wondering if you have something else in mind.

  				
Posted 3 years ago by LazyTurkey38


Answers 7

I meant a Python API...

  				
Posted 3 years ago by SuccessfulKoala55

I don’t mean a serving endpoint, just the equivalent of “cloning an experiment” and running it on a different (larger) dataset.

  				
Posted 3 years ago by LazyTurkey38

If I want to retrain the same model at a certain cadence on some streaming data

Do you mean a serving endpoint?

  				
Posted 3 years ago by SuccessfulKoala55

SuccessfulKoala55, you’re saying there’s a built-in scheduler?
If so, where can I find it?

  				
Posted 3 years ago by LazyTurkey38

Hi LazyTurkey38 !
Thanks for your kind words 🙏

my understanding is that the commit hash is picked up and any diff from the remote branch will be applied. Is that correct?

Correct 🙂 - you can get some control over this process or override it, if you'd like, but that's the default behavior.

Are the only two options for setting up the right environment for a Task either docker or git+pip?

You can have your ClearML Agent run the code in docker, based on an image you choose (or a default image, or even a complete, standalone prebuilt image that you can build using the Agent), or you can have your ClearML Agent run the code in a Python virtual environment. In both cases (unless you use a standalone image), the task's requirements and code are installed in the execution sandbox (a venv inside the docker container, or just the venv) and executed there.

Do you support caching of git that evolves with the code base to speed it up?

Yes 🙂 - ClearML Agent has both a venv cache and a VCS cache, so you can get a speedup rather quickly 🙂

  				
Posted 3 years ago by SuccessfulKoala55

In fact, if there is a good Python API to list/duplicate/edit/run experiments by ID, it seems straightforward to do that from Airflow (or any other job scheduler). I’m just wondering if there is some built-in scheduler.
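
To make the "time passed or new data accumulated" idea concrete, here is a rough sketch of the decision an external scheduler (an Airflow task, a cron job, etc.) could make before asking ClearML to run anything. The names `RetrainPolicy`, `should_retrain`, `max_age` and `min_new_rows` are hypothetical and only for illustration; nothing here is part of ClearML.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RetrainPolicy:
    """Hypothetical retraining policy: retrain when either threshold is reached."""
    max_age: timedelta   # retrain if the last training run is at least this old
    min_new_rows: int    # retrain if at least this many new rows have arrived


def should_retrain(policy: RetrainPolicy, last_trained_at: datetime,
                   new_rows: int, now: datetime) -> bool:
    """Return True when enough time has passed or enough new data has accumulated."""
    age_exceeded = (now - last_trained_at) >= policy.max_age
    enough_data = new_rows >= policy.min_new_rows
    return age_exceeded or enough_data
```

The scheduler would call `should_retrain` on each tick and, when it returns True, use the ClearML SDK to duplicate the template experiment and send it to an agent queue.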

  				
Posted 3 years ago by LazyTurkey38

Of course there is 🙂
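
For reference, a minimal sketch of the duplicate/edit/run flow with the ClearML Python SDK, using `Task.get_task`, `Task.clone`, `set_parameter` and `Task.enqueue`. The helper name `clone_and_enqueue`, the `task_cls` argument (there only so the helper can be tried without a server), and the example values such as `General/dataset_path` and the `default` queue are assumptions for illustration, so adjust them to your own setup.

```python
from typing import Any, Dict, Optional


def clone_and_enqueue(
    template_task_id: str,
    overrides: Dict[str, Any],
    queue_name: str,
    new_name: Optional[str] = None,
    task_cls: Any = None,
) -> str:
    """Clone an existing experiment, edit its parameters, and queue it for an agent."""
    if task_cls is None:
        # Imported lazily so the helper can be exercised without a ClearML install.
        from clearml import Task as task_cls

    # Look up the experiment to use as a template, by ID.
    template = task_cls.get_task(task_id=template_task_id)
    # The clone is a new draft task that can still be edited.
    cloned = task_cls.clone(source_task=template, name=new_name)
    # Parameter names include their section prefix, e.g. "General/dataset_path"
    # (a hypothetical name; use whatever your task actually logs).
    for name, value in overrides.items():
        cloned.set_parameter(name, value)
    # Hand the draft to whichever agent is listening on the queue.
    task_cls.enqueue(cloned, queue_name=queue_name)
    return cloned.id
```

Usage would look something like `clone_and_enqueue("<template task id>", {"General/dataset_path": "s3://bucket/larger-dataset"}, queue_name="default")`, where the ID, parameter name, path and queue are placeholders.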

  				
Posted 3 years ago by SuccessfulKoala55
