Hello ClearML friends. I'm trying to set up a ClearML agent on my workstation to queue jobs on my GPU.
$ pip3 install clearml-agent
$ clearml-agent init
$ clearml-agent daemon --detached --gpus 0 --queue <my queue name> --docker nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04

However, when I try to queue a job, I get:
docker: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/create: dial unix /var/run/docker.sock: connect: permission denied.
See 'docker run --help'.
2021-06-14 16:20:10 Process failed, exit code 126

How can I give the ClearML agent permission to run Docker commands? I enabled non-root access to Docker for my user (the same user I used to deploy the ClearML agent). Did I miss something?
Setup on the workstation:
Ubuntu 20.04
GPU 0 - Titan RTX
GPU 1 - GeForce RTX 2080 Ti
Python 3.8.5
CUDA Version: 11.2
(Note: I noticed that my CUDA versions don't match, 10.1 in the Docker image versus 11.2 on the host. However, I would have expected that to produce a CUDA-related error, not a Docker socket permissions error.)
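
For anyone hitting the same permission error, here is a minimal diagnostic sketch, not a confirmed fix. It assumes the usual cause: the shell that launched the agent does not yet have the docker group active, since a newly added group normally only applies after logging in again. The function name check_socket_access is hypothetical and only for illustration; the socket path is taken from the error message above.

```shell
#!/bin/sh
# Report whether the current process can use a given Docker socket path.
# Prints one of: missing, denied, ok.
# (check_socket_access is a hypothetical helper, not part of ClearML or Docker.)
check_socket_access() {
    sock="$1"
    if [ ! -e "$sock" ]; then
        echo "missing"
    elif [ -r "$sock" ] && [ -w "$sock" ]; then
        echo "ok"
    else
        echo "denied"
    fi
}

# Groups active in this session. If "docker" is not listed here, processes
# started from this shell (including a detached agent) will not have it either.
id -nG

# Socket path taken from the error message in the post.
check_socket_access /var/run/docker.sock
```

If this prints "denied" in the shell used to start the agent, logging out and back in and then restarting the agent from the new session would be the first thing to try.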