I start clearml-session on my Mac this way: clearml-session --queue gpu --docker registry.gitlab.com/periplo-innovation/project-precog/clearml_config
Good idea. I can just ssh into the container that is executing the task, right?
PyTorch is configured on the machine that’s running the agent. It’s also in the requirements
I am doing clearml-agent --docker … --foreground --gpus 1
Let me get the exact error for you
This issue was resolved by setting the correct clearml.conf (replacing localhost with a public hostname for the server) 🙂
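For anyone hitting the same thing, a rough sketch of how to spot leftover localhost entries. The helper name `find_localhost_entries` is made up for illustration, and the default path `~/clearml.conf` is an assumption about where the config lives on your machine:

```shell
# Hypothetical helper: print (with line numbers) every line of a ClearML
# config file that still mentions localhost.
# Assumption: the config lives at ~/clearml.conf unless a path is passed in.
find_localhost_entries() {
    grep -n 'localhost' "${1:-$HOME/clearml.conf}"
}
```

Running `find_localhost_entries` on the agent machine lists the entries that would need to be changed to the server's public hostname. No output means nothing in the file points at localhost.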
The agent is started with this command: clearml-agent --debug daemon --queue gpu --gpus 0 --foreground --docker <gitlab org registry>/project-precog/clearml_config
Well I don’t want that! My local machine is a Mac with no GPU. But I want to execute my code on a server with GPUs. I don’t want my local environment, I want the one configured for the agent!
Upgraded, the issue persists
Sure, will do tomorrow
The issue was that nvidia-docker2 was not installed on the machine where I was trying to run the agent. Following this guide fixed it:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker
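A quick sanity check for this, as a sketch only. The helper name `has_nvidia_runtime` is made up for illustration. It assumes that Docker lists an `nvidia` runtime in its `docker info` output once nvidia-docker2 is set up. A miss is a hint rather than proof, since setups differ:

```shell
# Hypothetical helper: succeeds if the text on stdin mentions "nvidia"
# (case-insensitive), e.g. in the "Runtimes:" line of `docker info`.
has_nvidia_runtime() {
    grep -qi 'nvidia'
}

# Usage on the agent machine (needs a running Docker daemon):
#   docker info 2>/dev/null | has_nvidia_runtime && echo "nvidia runtime found"
```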
Ok, it makes sense. But it’s running in docker mode and it is trying to ssh into the host machine and failing
Yes, I created a token and put it into agent.git_pass
Is there some minimal example of a docker env agent I can run, just to see that it works?
Btw it seems the docker runs in network=host