Hmm, maybe I missed some UI element, I can't locate any ID.
Now my problem is that clearml-agent picks up the job but fails to run the docker.
First thing to make sure is that this is indeed your default queue's ID - perhaps the agent configuration is incorrect and the agent is connecting to a different server?
Do you see any change in the URL if you click on your "test" queue?
Well, go to the Workers and Queues section in the WebApp, click on Queues, then click on your default queue - the queue ID should appear in the URL
I'm not sure, but I suspect it might be an issue... perhaps AgitatedDove14 knows?
Digest: sha256:407714e5459e82157f7c64e95bf2d6ececa751cca983fdc94cb797d9adccbb2f
Status: Downloaded newer image for nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
docker: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.
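The error above comes from the NVIDIA container hook rather than from clearml-agent itself, so it may help to reproduce it outside the agent. Here is a minimal Python sketch, assuming Docker 19.03 or newer (for the `--gpus` flag) and the same image as in the log. The helper names `gpu_check_command` and `docker_can_see_gpu` are made up for illustration and are not part of ClearML.

```python
import shutil
import subprocess
from typing import List

# Image name taken from the log above.
IMAGE = "nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04"


def gpu_check_command(image: str = IMAGE) -> List[str]:
    """Build a `docker run` command that runs nvidia-smi in a GPU-enabled container."""
    return ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]


def docker_can_see_gpu(image: str = IMAGE) -> bool:
    """Return True if the container starts and nvidia-smi exits with code 0."""
    if shutil.which("docker") is None:
        # Docker CLI is not on PATH, so nothing can be checked.
        return False
    result = subprocess.run(
        gpu_check_command(image), capture_output=True, text=True
    )
    if result.returncode != 0:
        # Print Docker's stderr so it can be compared with the agent's error.
        print(result.stderr.strip())
    return result.returncode == 0
```

If `docker_can_see_gpu()` returns False and prints the same `nvidia-container-cli` error, the problem is most likely in the Docker/NVIDIA setup on the host and not in the agent configuration.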
Hi EnviousStarfish54, did you use --foreground? By default, the agent will output its log to a log file, unless explicitly requested to do otherwise
Sorry, let me get back to you tomorrow. Maybe I did something wrong, now the entire UI crashes
Yes, I did use foreground.
I tested on an older "trains" server, and it shows log lines like this if no job is picked up, while my new "clearml-agent" shows nothing:
No tasks in queue bb1bb1673f224fc98bbc8f86779be802
No tasks in Queues, sleeping for 5.0 seconds
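To check that the agent is polling the same queue you see in the WebApp, the ID can be pulled out of a log line like the one above and compared with the ID from the URL. This is a rough Python sketch. It assumes queue IDs are always 32 lowercase hex characters (based only on the single ID shown here), and the function names are hypothetical.

```python
import re
from typing import Optional

# Assumption: queue IDs are 32 lowercase hex characters, as in the log line above.
QUEUE_ID_PATTERN = re.compile(r"No tasks in queue ([0-9a-f]{32})")


def queue_id_from_log(line: str) -> Optional[str]:
    """Return the queue ID from a 'No tasks in queue <id>' line, or None."""
    match = QUEUE_ID_PATTERN.search(line)
    return match.group(1) if match else None


def agent_polls_queue(log_line: str, webapp_queue_id: str) -> bool:
    """Compare the ID in the agent log with the ID copied from the WebApp URL."""
    return queue_id_from_log(log_line) == webapp_queue_id.strip().lower()
```

If the two IDs differ, the agent is probably listening on a different queue, or talking to a different server, than the one shown in the UI.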
Not sure why my elasticsearch & mongodb crashed. I had to remove and recreate all the docker containers. After that, clearml-agent works fine too
Hi EnviousStarfish54
Docker on Windows with NVIDIA runtime support is only available with WSL (I think)
https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-wip
https://medium.com/@dalgibbard/docker-with-gpu-support-in-wsl2-ebbc94251cf5
I am running on a Windows 10 machine, is this not compatible?