wrt 1 and 3: my bad, i had too high expectations for the default Docker image 🙂, i thought it was ready to run tensorflow out of the box, but apparently it isn't. I managed to run my rounds with another image.
wrt 2: yes, i already switched to `conda` and added `tensorflow-gpu` as a dependency, as i do in my local environment, but the environment that is created doesn't have access to the GPUs, as the other one does. How can i set the base python versi...
yes, in general, i want to control the behavior of `git clone`. Is it possible?
Docker version 19.03.7, build 7141c199a2 on Linux, btw
task name, at the end, if it helps
ok, i got the problem, it isn't really related to spaces or local vs remote. It is the presence of characters like `!`. Indeed the artifacts on GCS are created converting `!` to `%21` and are tracked successfully like this on the server. When the request is sent to actually download the artifacts or to see pictures in Debug samples, the `%21` is converted back to `!` and there is no such object in GCS with `!`. Hope it's clear. Not a big deal to me, can just avoid spe...
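To make the mismatch concrete, here is a minimal sketch using only the Python standard library. It is not the server's or the GCS client's actual code: the dict is a hypothetical stand-in for the bucket, and the name `exp!/plot.png` is made up for illustration.

```python
from urllib.parse import quote, unquote

# Hypothetical object name containing '!' (not a real task or artifact).
original_name = "exp!/plot.png"

# Upload side: the key is percent-encoded, so '!' is stored as '%21'.
stored_key = quote(original_name)

# Hypothetical stand-in for the bucket: keys exactly as they were stored.
bucket = {stored_key: b"image bytes"}

# Download side: the key is decoded before the lookup, so it no longer matches.
decoded_key = unquote(stored_key)

print(stored_key)             # exp%21/plot.png
print(decoded_key)            # exp!/plot.png
print(decoded_key in bucket)  # False
print(stored_key in bucket)   # True
```

Looking up the key exactly as it was stored succeeds, which would explain why the objects exist on GCS but can't be fetched from the UI.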
yes, looks like it. Is it possible?
What's the exact project/task name?
And what is the output_uri?
task_name="Run from CD + FS"
the output_uri isn't set, but the fileserver is set to the GCS location in `trains.conf` and indeed the artifacts and the metrics are correctly stored where they are supposed to be
Are you working with venv or docker mode?
sorry, important info! Docker mode
Also notice that if you need all gpus you can pass
yes, i know, but i need to use 2 out of 4 for a queue
what i can say is that when tasks are running locally the task name can have spaces, but when executed remotely it cannot. I tried to remove the spaces in a remote execution and the artifacts are linked without problems (in both cases they are created just fine on GCS, it's just a matter of linking them in the Server UI)
indeed, i managed to make a `docker run` command work with the fix you mentioned (`docker run --gpus '"device=1,2"' nvidia/cuda:9.0-base nvidia-smi`), but `trains-agent` just appends to `--gpus device=` and there is no way to make the quoting like this
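For reference, a small sketch of how that quoting could be produced programmatically. This is not how `trains-agent` builds its command; `gpu_args` is a hypothetical helper, and the only assumption is that the value passed to `--gpus` has to keep its inner double quotes, as in the command above.

```python
import shlex


def gpu_args(device_ids):
    """Return the argv pair that selects specific GPUs for `docker run`.

    Hypothetical helper for illustration, not part of trains-agent.
    """
    # The double quotes are part of the value itself, so the comma-separated
    # device list reaches Docker as one quoted field.
    devices = ",".join(str(i) for i in device_ids)
    return ["--gpus", '"device={}"'.format(devices)]


argv = ["docker", "run", *gpu_args([1, 2]), "nvidia/cuda:9.0-base", "nvidia-smi"]

# shlex.join (Python 3.8+) renders the list as a shell command line and adds
# the outer single quotes around the value that contains double quotes.
command_line = shlex.join(argv)
print(command_line)
# docker run --gpus '"device=1,2"' nvidia/cuda:9.0-base nvidia-smi
```

If the command is launched from Python, passing `argv` as a list to `subprocess.run` skips the shell entirely, so only the inner double quotes are needed.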
thanks! I'll consider