Okay so that is a bit complicated
In our setup, the DSes don't really care about agents, the agents are being managed by our MLops team.
So essentially if you imagine it the use case looks like that:
A data scientists wants to execute some CPU heavy task. The MLops team supplied him with a queue name, and the data scientist knows that when he needs something heavy he pushes it there - the DS doesn't know nothing about where it is executed, the execution environment is fully managed by the MLOps team. Now if the data scientists needs an apt
package, he has no way to access that machine, because it is not in his domain. So as it is now, he will have either to change his trains.conf
which is not ideal, because he might need that package only for a specific task, or he will have to contact an MLOps member so he would prepare a docker image for him on the remote agents.
I think, it will be very useful, to allow DSes to be able to control that on a task level - so a DS could, without the help of an MLOps member, specify a task-specific apt
dependency on his own
I will open an issue about it, because this is a use case that I predict will be very common for us, there are always these annoying apt
dependencies (like tkinter
and other *-dev
packages)