Answered
Hi, I've Asked

Hi, I've asked MagnificentArcticwolf3, but I think it should be asked here too.
Trains-agent reproduces the environment of an experiment I want to re-run (via docker).
But let's say our environment is static, or that we use, for example, on-demand instances on AWS (whose environment is also static).
Instead of using an image, we could use our existing environment: I could provide trains-agent with the python path and that's it.
What happens currently is that each environment setup takes about 20 minutes (including all package installations and environment creation), which is a really painful bottleneck and budget consumer (imagine 20 minutes for every init in a multi-step pipeline, disastrous).
Is there a way to skip this environment creation and instead just use an existing python path?
Thanks.
Edit: I know this goes against the dockerization ideology, but I think the time gain is significant enough that this should be considered.

  
  
Posted 4 years ago

Answers 5


You probably know you don't have to use docker with your agent. The alternative is to use "ad hoc" virtual environments.

It is a bit tricky: you need to remove the requirements from the queued task's configuration. But you can't remove them all, since in that case the agent will fall back to your project's requirements file (if I remember correctly).
I simply kept a single "lightweight" requirement in the list, just so the agent doesn't go looking for my requirements file.
You also need to set a flag for the agent to inherit system installations. In this mode the agent still creates a new virtual environment, but it can use all of the system-installed packages.
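For reference, below is a minimal sketch of the "keep one lightweight requirement" trick, reusing the Task-editing pattern from the snippet further down this thread. The function name, the placeholder requirement ("six"), and the use of the internal task._edit() call are my own assumptions rather than an official API; the agent-side flag mentioned above is, if I remember correctly, agent.package_manager.system_site_packages: true in the agent's configuration file.

` from trains import Task

def keep_single_requirement(requirement="six"):
    # Replace the recorded pip requirements with one lightweight package, so
    # the agent has almost nothing to install but the list stays non-empty
    # (otherwise it would fall back to the repository's requirements file).
    task = Task.current_task()
    with task._edit_lock:
        task.reload()
        script = task.data.script
        script.requirements["pip"] = requirement
        task._edit(script=script) `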

  
  
Posted 4 years ago

SmarmySeaurchin8 Following up on ColossalDeer61's hint, see this not-too-old thread on reusing globally installed packages: https://allegroai-trains.slack.com/archives/CTK20V944/p1597248476076700?thread_ts=1597248298.075500&cid=CTK20V944

  
  
Posted 4 years ago

SmarmySeaurchin8 we are looking into https://github.com/conda/conda-pack , allowing you to store an entire conda environment in a tar.gz file. With this feature, we could use the "base docker image" as a "base conda environment", basically deploying a fully stored conda env on the agent machine.
One could think of it as a poor wo/man's docker solution. That said, it does have some merits, like the bare-metal connection on Windows/Mac that is missing from the docker solution...
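For a sense of what that could look like, here is a rough sketch using conda-pack's Python API to archive an environment and restore it on another machine. The environment name and target paths are made up, and the restore steps follow conda-pack's documentation rather than anything trains-specific.

` import subprocess
import tarfile

import conda_pack

# Pack an existing conda environment (the name here is hypothetical).
conda_pack.pack(name="my-static-env", output="my-static-env.tar.gz")

# On the agent machine: extract the archive, then fix up the prefix paths
# recorded inside the environment.
with tarfile.open("my-static-env.tar.gz") as archive:
    archive.extractall("/opt/envs/my-static-env")
subprocess.run(["/opt/envs/my-static-env/bin/conda-unpack"], check=True) `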

  
  
Posted 4 years ago

Not sure if this is a valid solution, but if your docker container is already pre-configured with all the requirements, you can set the Task to ignore/disregard your DEV requirements. It should look somewhat like this -

` from trains import Task

def ignore_task_requirements():
    # Comment out the pip requirements recorded on the task, so the agent
    # skips installing them and relies on what is already in the container.
    task = Task.current_task()
    with task._edit_lock:
        task.reload()
        script = task.data.script

        pip_req = script.requirements["pip"]
        pip_req_splitted = pip_req.split("\n")
        # Joining with '\n#' comments out every requirement except the first,
        # leaving a single active entry so the list stays non-empty.
        pip_req_commented = '\n#'.join(pip_req_splitted)
        script.requirements["pip"] = pip_req_commented

        task._edit(script=script) `
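Since the function edits Task.current_task(), it is meant to be called from within the running experiment script itself; exactly where depends on when trains has finished recording the requirements, so treat the placement (and the placeholder project/task names) below as a guess.

` from trains import Task

task = Task.init(project_name="examples", task_name="my-experiment")
# ... experiment code ...
ignore_task_requirements()  # strip the recorded DEV requirements `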
  
  
Posted 4 years ago

UptightCoyote42 nice!
BTW: make sure to clear requirements["conda"]
(not visible in the UI, but it tells the agent which packages were used by conda; this was our effort to see if we could do pip/conda interoperability, not sure if it actually paid off 🙂)
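In code, that is one extra line inside the same task._edit_lock block as the snippet above; a sketch, assuming the "conda" key sits next to "pip" in the requirements dict:

` # Add inside ignore_task_requirements(), before the task._edit(...) call:
script.requirements["conda"] = ""  # or: script.requirements.pop("conda", None) `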

  
  
Posted 4 years ago