Hi WackyRabbit7
If the trains-agent is running in docker mode, you can add it to agent.docker_init_bash_script in the ~/trains.conf file.
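A rough sketch of what such an entry could look like, generated from a list of apt packages. The layout (a list of shell command strings under the agent section) and the package name python3-tk are assumptions for illustration; compare with the sample configuration shipped with your agent version before using it.

```python
import json
from typing import List


def apt_init_commands(packages: List[str]) -> List[str]:
    """Bash commands that install the given apt packages inside the container."""
    return [
        "apt-get update",
        "apt-get install -y " + " ".join(packages),
    ]


def render_agent_section(commands: List[str]) -> str:
    """Render the commands as an agent.docker_init_bash_script entry.

    json.dumps is used only to quote and escape each command string.
    The surrounding layout is an assumption, not taken from the agent docs.
    """
    items = ",\n".join("        " + json.dumps(cmd) for cmd in commands)
    return (
        "agent {\n"
        "    docker_init_bash_script = [\n"
        f"{items}\n"
        "    ]\n"
        "}\n"
    )


if __name__ == "__main__":
    # Print a fragment that could be pasted into ~/trains.conf
    print(render_agent_section(apt_init_commands(["python3-tk"])))
```

The printed fragment would be merged by hand into the existing ~/trains.conf on the machine running the agent.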
Is there a way to do so without touching the config? directly through the Task object?
I'm asking because the DSes we have are working on multiple projects, and they have only one trains.conf file. I wouldn't want them to edit it each time they switch projects
One solution I can think of is having a different image per Task, with the apt-get packages. You can just build a new image based on the one you have, adding the apt-get packages (or switch to one that already has those packages).
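As a minimal sketch of the "new image based on the one you have" idea, the helper below writes out a Dockerfile that extends an existing image with extra apt packages. The base image and package names are placeholders, not values from this thread.

```python
from typing import List


def dockerfile_with_apt(base_image: str, packages: List[str]) -> str:
    """Return Dockerfile text that extends base_image with apt packages."""
    pkgs = " ".join(sorted(packages))
    return (
        f"FROM {base_image}\n"
        "RUN apt-get update && \\\n"
        f"    apt-get install -y --no-install-recommends {pkgs} && \\\n"
        "    rm -rf /var/lib/apt/lists/*\n"
    )


if __name__ == "__main__":
    # Placeholder base image and packages, for illustration only
    print(dockerfile_with_apt("ubuntu:18.04", ["python3-tk", "libsm6"]))
```

Saving the output as a Dockerfile and running docker build -t <docker-image>:<tag> . on the agent machine would produce an image that the Task can then use as its base image.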
Another one is running more than one agent, each with a different trains.conf file, one for each project.
Currently, the Task object doesn't have a parameter for installing packages when running with trains-agent.
Hi WackyRabbit7,
directly through the Task object?
How will the Task object be relevant if you'd like to affect the agent? When the task is running, the agent has already started executing it...
Am I correct in my understanding that what you'd like is for agent.docker_init_bash_script to be "aware" of the Task the agent is going to run?
TimelyPenguin76 if I build a custom image, do I have to host it on Docker Hub for it to run on the agent? If not, how do I make the agent aware of my custom image?
SuccessfulKoala55 The simplest thing I can think of is for Task.execute_remotely to be able to append to the docker_init_bash_script
Maybe even a dedicated argument specifically for apt-get packages, since it is very common to need stuff like that
I believe that is why MetaFlow chose conda as their package manager, because it can take care of these kinds of dependencies (even though I hate conda 😄 )
if I build a custom image, do I have to host it on dockerhub for it to run on the agent?
You don't need to host it, but in this case the machine running the agent should have the image (you can verify on the machine with docker images).
If not how do I make the agent aware of my custom image?
Once the image is set as the base docker image for this task, and the image has been verified on the agent's machine, the agent should be able to use it
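The docker images check can also be scripted. The sketch below assumes the docker CLI is on the PATH of the agent machine; the image names in the usage note are placeholders.

```python
import subprocess
from typing import Set


def parse_image_list(output: str) -> Set[str]:
    """Turn 'repository:tag' lines into a set of image references."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def local_images() -> Set[str]:
    """List images present on this machine (requires the docker CLI)."""
    result = subprocess.run(
        ["docker", "images", "--format", "{{.Repository}}:{{.Tag}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_image_list(result.stdout)


def image_available(image: str, images: Set[str]) -> bool:
    """Check whether an image reference is in the given set of local images."""
    # A reference without an explicit tag is treated as ':latest'
    last_part = image.rsplit("/", 1)[-1]
    ref = image if ":" in last_part else image + ":latest"
    return ref in images
```

On the agent machine, image_available("<docker-image>:<tag>", local_images()) would tell you whether the custom image is already there before a task that needs it is enqueued.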
Okay so that is a bit complicated
In our setup, the DSes don't really care about agents, the agents are being managed by our MLOps team.
So essentially, the use case looks like this:
A data scientist wants to execute some CPU-heavy task. The MLOps team supplied him with a queue name, and the data scientist knows that when he needs something heavy he pushes it there - the DS doesn't know anything about where it is executed, the execution environment is fully managed by the MLOps team. Now if the data scientist needs an apt package, he has no way to access that machine, because it is not in his domain. So as it is now, he will either have to change his trains.conf, which is not ideal because he might need that package only for a specific task, or he will have to contact an MLOps member to prepare a docker image for him on the remote agents.
I think it would be very useful to allow DSes to control that on a task level - so a DS could, without the help of an MLOps member, specify a task-specific apt dependency on his own
I will open an issue about it, because this is a use case that I predict will be very common for us. There are always these annoying apt dependencies (like tkinter and other *-dev packages)
In our team there is a similar requirement: some scripts require external dependencies. We have built several Docker images, and these can be selected within the script itself by using task.set_base_docker("<docker-image>:<tag>")
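A minimal sketch of selecting an image from within the script, assuming the trains package is installed and a server is configured. The project names, image names and the `image_for` / `start_task` helpers are hypothetical; only Task.init and task.set_base_docker come from the library.

```python
# Hypothetical mapping from project name to a pre-built image that already
# contains the apt packages that project needs. All names are placeholders.
PROJECT_IMAGES = {
    "plots": "my-team/ds-tk:1.0",
    "vision": "my-team/ds-opencv:1.0",
}
DEFAULT_IMAGE = "ubuntu:18.04"


def image_for(project: str) -> str:
    """Return the base docker image to use for a project."""
    return PROJECT_IMAGES.get(project, DEFAULT_IMAGE)


def start_task(project: str, name: str):
    """Create a Task and set its base docker image.

    Needs a configured trains server, so it is not called at import time.
    """
    from trains import Task  # imported lazily so image_for works without trains

    task = Task.init(project_name=project, task_name=name)
    task.set_base_docker(image_for(project))
    return task
```

Keeping the mapping in the script means a DS can switch projects without editing trains.conf, as long as the listed images exist on the agent machines.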
UptightCoyote42 - How are these images available to all agents? Do you host them on Docker Hub?
Docker hub is probably not a bad idea. In my case there were only two workstations so I've copied the Dockerfile and rebuilt the image