Is there a guide for the configuration required to run with docker?
Yes we do have a guide: https://github.com/allegroai/trains-agent#starting-the-trains-agent-in-docker-mode
You can also specify the image for the docker. In the example the image is
nvidia/cuda, but you can put a specific one for your needs (maybe
I can give it a shot (I’m using conda now). What is the overhead of moving to docker, given that I don’t have “docker hands on experience”?
You don’t really need “docker hands on experience”
Is the docker flow better supported than the conda one?
It’s the same flow, but running inside a docker image
you are right, it shows cuda version 10.2 (even though I only installed cuda 10.1, weird)
do you know why it's 10.2?
and do you know why trains relies on that? (instead of looking at the python environment of the executed script?)
Hi RattySeagull0 ,
If not specified, the value for cuda_version is taken from
nvidia-smi. Can you share your output for
You changed the version from 10.2 to 10.1 and
nvidia-smi output is the same? did you do a restart after the change?
Actually you can. When you clone an experiment, in the
EXECUTION section, you can change the
BASE DOCKER IMAGE to the image you would like the experiment to run with. This way you can use different docker images for different experiments.
You can use the same queue :)
got it thanks!
Is it possible to use different docker images (containing different cuda versions) in different experiments?
or do I have to open different queues for that? (or something like that)
How do you clone the tasks? With
Task.clone? If so, you can use
cloned_task.set_base_docker(<VALUE FOR BASE DOCKER IMAGE>)
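A minimal sketch of that flow, in case it helps. It assumes the `trains` package's `Task.get_task`, `Task.clone`, `set_base_docker` and `Task.enqueue` calls; the helper name `clone_with_docker`, the task id, the image tag and the queue name are all hypothetical placeholders, not values from this thread. The `Task` class is passed in as an argument so the helper can be tried without a running trains-server.

```python
def clone_with_docker(task_cls, source_task_id, docker_image, queue_name, name=None):
    """Clone an experiment, set its base docker image, and enqueue the clone.

    task_cls is expected to be trains.Task (assumption); it is injected so the
    helper can also be exercised with a stand-in class.
    """
    # Fetch the experiment we want to copy.
    source = task_cls.get_task(task_id=source_task_id)
    # Create a draft copy of it.
    cloned = task_cls.clone(source_task=source, name=name)
    # Same effect as editing BASE DOCKER IMAGE in the EXECUTION section.
    cloned.set_base_docker(docker_image)
    # The same queue can serve clones that use different images.
    task_cls.enqueue(cloned, queue_name=queue_name)
    return cloned


# Hypothetical usage (requires a configured trains installation):
#   from trains import Task
#   clone_with_docker(Task, "<source task id>",
#                     "nvidia/cuda:10.1-runtime-ubuntu18.04", "default")
```

Calling the helper twice with two different image tags and the same queue name would give two experiments that run in different containers from one queue, as long as the agent serving that queue runs in docker mode.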
Didn’t use it so far, but I will start 🙂
The version of the cudatoolkit is 10.1 inside the experiment, and trains tries to work with 10.2, probably for the same reason that nvidia-smi displays 10.2
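A likely explanation for the mismatch: the "CUDA Version" field in the nvidia-smi header reports the highest CUDA version the installed driver supports, not the toolkit installed on the system or in the conda environment, so a machine with only the 10.1 toolkit can still show 10.2 there. Below is a rough sketch for reading that field from Python. The function names are made up for illustration, and this is not how trains itself reads the value.

```python
import re
import shutil
import subprocess


def parse_driver_cuda_version(smi_text):
    """Return the 'CUDA Version' value from nvidia-smi's header, or None."""
    match = re.search(r"CUDA Version:\s*(\d+\.\d+)", smi_text)
    return match.group(1) if match else None


def driver_cuda_version():
    """Run nvidia-smi (if it is on PATH) and parse the version it reports."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_driver_cuda_version(result.stdout)


# Illustrative header line in the usual nvidia-smi layout (not from this thread):
sample = "| NVIDIA-SMI 440.33.01    Driver Version: 440.33.01    CUDA Version: 10.2     |"
print(parse_driver_cuda_version(sample))
```

Comparing this value with the cudatoolkit version inside the experiment's environment shows whether the two actually differ.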
Ohhh I thought you changed it from 10.2 to 10.1, my mistake.
What do you get for
BTW, what about running trains-agent in docker mode? That can solve all your cuda issues
when my system was "clean" I installed cuda 10.1 (never installed cuda 10.2), hope I'm not mistaken
weird, I will try to find out why that is
what do you mean by "change"?
Is it something that I can configure from the call to task.init? (my goal is that I won't be required to change it manually)