Well, I went into the code and found several suspicious parts around docker commands:
This is the first part, where the docker commands do not work: https://github.com/allegroai/trains-agent/blob/master/trains_agent/commands/worker.py#L2108
The functor call here probably has its parameters switched: https://github.com/allegroai/trains-agent/blob/master/trains_agent/commands/worker.py#L2122
and then here: https://github.com/allegroai/trains-agent/blob/master/trains_agent/commands/worker.py#L2011
So I suppose I cannot set it right now without modifying your code.
Thanks for the reply. I will check it and get back to you with the result.
Hi WorriedParrot51
Take a look at the Experiment execution section:
there is a script path
and a working directory
the working directory is the base of the git repository (which is cloned inside the docker container)
So if for some reason trains did not properly detect the current working dir, here is what should solve the issue without changing the PYTHONPATH:
script path: ./sub_folder/script.py
working directory: .
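For example, assuming a repository layout roughly like this (folder and file names are just placeholders):
repo_root/
    sub_folder/
        script.py
With the working directory set to '.', the agent runs from the repository root, and the script path './sub_folder/script.py' is resolved relative to it.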
What do you think?
Hi WorriedParrot51
So I think what you need is to map your external code into the docker, is that correct?
Also you want to always set the PYTHONPATH.
You can achieve both by configuring the trains.conf:
Here you can always add a predefined environment variable and mount point, regardless of the docker image or any other docker arguments:
https://github.com/allegroai/trains-agent/blob/master/docs/trains.conf#L98
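For illustration, a sketch of what that could look like in trains.conf, assuming the relevant key is the agent.extra_docker_arguments setting documented around the linked section (the paths and variable values here are hypothetical):
agent {
    # docker arguments the agent always appends, regardless of the task's base image
    extra_docker_arguments: ["-e", "PYTHONPATH=/root/code/sub_folder", "-v", "/host/data:/root/data"]
}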
Will this solve the issue?
WorriedParrot51 I now see ...
Two solutions that I can quickly think of:
1. In the code, add:
import sys
sys.path.append('./my_sub_module')
Assuming you always have to add the sub-directories to make the code work, and assuming they are part of the repository, this is probably the stable solution.
2. In the UI, in the Docker base image field, add -e PYTHONPATH=/folder
or from code (which is exactly what you did)
a clean interface: task.set_base_docker("nvidia/cuda -e PYTHONPATH=/folder")
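To make option 2 concrete, here is a minimal sketch from code (the project/task names and the image are placeholders; the base docker only takes effect when an agent runs the task in docker mode):
from trains import Task

task = Task.init(project_name='examples', task_name='pythonpath demo')
# ask the agent to launch this task inside the given image with PYTHONPATH preset
task.set_base_docker('nvidia/cuda -e PYTHONPATH=/folder')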
M: "So I think what you need is to map your external code into the docker, is that correct?"
K: No. The code that must be on PYTHONPATH is part of the repo the agent clones, but it has to be referenced via PYTHONPATH because multiple modules are imported from that path.
M: "You can achieve both by configuring the trains.conf"
K: To keep this configurable per agent, I was hoping it would be possible to set it in each agent's start-up command (which apparently is not possible right now). Yes, I can change it in the config, but that was not the way I initially wanted to do it.
Thanks for the response anyway. I will dig into the code a bit more, and if I manage to fix it, I will let you know via a pull request.
I have solved the problem by adding docker_arguments to self._extra_docker_arguments on this line: https://github.com/allegroai/trains-agent/blob/master/trains_agent/commands/worker.py#L2105 which works exactly as I need.
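For reference, the change is roughly along these lines (a sketch of the idea, not the exact diff; it just reuses the names mentioned above):
# merge the task's docker_arguments into the arguments the agent always applies
self._extra_docker_arguments = (self._extra_docker_arguments or []) + list(docker_arguments or [])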
Sorry, I have read your reply again and realized that it will probably not solve my problem. The repo I am using contains the script being run, but it also has a lot of Python modules that are used not via relative imports but via PYTHONPATH. That is why I need to point the PYTHONPATH environment variable to the place (plus one subfolder) where the repo is cloned when the agent starts.
ADD: Yep, it did not work. So is there any way the ENV variable can be set, as with docker run (or an env file), on trains-agent startup? (I could create a special docker image for my runner with this variable, but I do not want to build another image just because of one env variable.)
I will try #2 - that could work as well. Thanks. ADD: Yep, works well.