I suggest running it in docker mode with a Docker image that already has CUDA installed.
Hi @RattyBluewhale45, what version of PyTorch are you specifying?
Isn't the problem that CUDA 12 is being installed?
I think it tries to get the latest one. Are you using the agent in docker mode? You can also control this via clearml.conf with agent.cuda_version.
Just try it as is first with this Docker image, and verify that the code can access the CUDA driver, independent of the agent.
In the config file it should be something like this, I think: agent.cuda_version="11.2"
Solved that by setting docker_args=["--privileged", "--network=host"]
Just to make sure, run the code on the machine itself to verify that Python can actually detect the driver.
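For that check, something like the script below could work. It is only a rough sketch, not part of ClearML: the function name `cuda_report` and the dictionary keys are made up for illustration, and it assumes `nvidia-smi` is on the PATH when a driver is installed. It asks `nvidia-smi` for the driver version and then, if PyTorch is importable, reports which CUDA version that build targets and whether it can see a GPU.

```python
import shutil
import subprocess


def cuda_report():
    """Collect basic facts about GPU visibility from this Python process.

    Hypothetical helper for illustration; the key names are arbitrary.
    """
    report = {
        "nvidia_smi_found": False,
        "driver_version": None,
        "torch_installed": False,
        "torch_cuda_build": None,
        "torch_cuda_available": False,
    }

    # Step 1: ask the driver directly, without involving PyTorch.
    smi = shutil.which("nvidia-smi")
    if smi:
        report["nvidia_smi_found"] = True
        try:
            out = subprocess.run(
                [smi, "--query-gpu=driver_version", "--format=csv,noheader"],
                capture_output=True,
                text=True,
                timeout=30,
                check=True,
            )
            lines = out.stdout.strip().splitlines()
            report["driver_version"] = lines[0].strip() if lines else None
        except (subprocess.SubprocessError, OSError):
            # nvidia-smi exists but failed, e.g. the driver is not loaded.
            pass

    # Step 2: check what the installed PyTorch build can see.
    try:
        import torch
    except ImportError:
        return report

    report["torch_installed"] = True
    # CUDA version the wheel was built against; None for CPU-only builds.
    report["torch_cuda_build"] = torch.version.cuda
    report["torch_cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Run it once directly on the machine and once inside the Docker image. If `driver_version` shows up but `torch_cuda_available` is False, the installed PyTorch build most likely does not match the driver (for example, a CUDA 12 wheel on a driver that only supports CUDA 11), which is the case where pinning agent.cuda_version should help.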