CleanPigeon16, just making sure: is Docker installed and configured on the host machine (i.e. the Azure machine)?
Right... apparently nvidia-docker wasn't set up. Thanks!
In the agent's clearml.conf file, set agent.docker_force_pull to true.
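For reference, a minimal sketch of how that setting might look, assuming the usual clearml.conf layout where agent options sit inside an agent section (the comment is illustrative, and the rest of the section is left as it is):

```
agent {
    # ask the agent to pull the Docker image before running a task
    docker_force_pull: true
}
```

The agent most likely needs to be restarted for the change to take effect.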
You can also try running this on the machine running the ClearML agent: docker pull nvidia/cuda:10.1-runtime-ubuntu18.04
Which Docker image do you use? Can you try pulling the image manually?
Nope, the experiment is stuck in the RUNNING state.
CLEARML_DOCKER_IMAGE=nvidia/cuda:10.1-runtime-ubuntu18.04
How do I pull the image using the agent?
Hi CleanPigeon16.
Do you get anything in the UI regarding this failure (in the RESULTS -> CONSOLE section)?