The 2nd attached screenshot is for URL http://localhost:8080/settings/workspace-configuration .
Hi AgitatedDove14 , I am trying to run clearml-session on my laptop, but it stays at "Waiting for environment setup to complete [usually about 20-30 seconds]" for several minutes. How can I debug and resolve it?
I do not see any error in https://app.community.clear.ml/projects/368fb3c4fcdd419e8b597ed100c29d69/experiments/bf78f1c303c74062986384cd74f0e542/info-output/log?columns=selected&columns=type&columns=name&columns=status&columns=project.name&columns=users&colu...
AgitatedDove14 Yes, I have an agent running. Otherwise, it would stay at "Waiting for remote machine allocation . [Status]"
How do I check the TCP connection?
BTW, I just ran clearml-session again, and now it stops with the error "docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].". Does this mean I cannot use a remote machine without a GPU?
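One way to check the TCP connection from the laptop is to try opening a socket to the remote machine's SSH port. The following is a minimal sketch using only the Python standard library; the helper name `tcp_port_open` is hypothetical, and the host and port in the usage comment are placeholders to be replaced with the address and port that clearml-session reports.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


# Usage (placeholder values, not taken from a real setup):
#   tcp_port_open("192.0.2.10", 10022)
```

A `False` result only says the port could not be reached from this machine; it does not distinguish a firewall, a wrong IP address, or an SSH server that has not started yet.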
AgitatedDove14 Yes, thanks, it seems relevant. So how do I run it without docker? We'd like to try it without docker first.
AgitatedDove14 Hi, thanks for the response.
I tried to change the IP address as indicated above, but now clearml-session is showing the error: ssh: connect to host 10.19.20.15 port 10022: Connection refused
Info to help you reproduce, FYI:
clearml-session: version 0.3.2
Ubuntu: version 20.04.2 LTS
docker specified for the interactive session: nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
command-line used to spin the agent: clearml-session --docker nvidia/cuda:10.1-cudnn7-runtim...
AgitatedDove14 Hi, for the remote machine, I'm switching to an Ubuntu server + docker + NVIDIA GPU, instead of using Windows. I run the clearml-agent with docker on the Ubuntu server.
Now everything looks fine on the server after I start clearml-session on my laptop: the SSH/VSCode/Jupyter servers are created and I get the URLs.
However, my laptop shows this error:
` Remote machine is ready
Setting up connection to remote session
Starting SSH tunnel
ssh: connect to host 172.17...
Hi AgitatedDove14
I tried the commands you suggested. The first command works fine, but the second command failed with the following message: docker: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.
Our remote machi...
SmugDolphin23 Thanks! get_task_log is working fine 🎉
I just wonder why download_task_log is not working. Is it something that needs to be fixed?
You're right.
I have set up 2 self-hosted clearml servers, and I'm accessing both through SSH tunneling at the same address ( http://localhost:8080 ). After logging out from the 1st clearml website, the 2nd website did redirect me to the login page.
So the earlier error in the 2nd web app happened because it was using a wrong cookie from the 1st web app.
CostlyOstrich36
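The cookie mix-up described above follows from how browsers scope cookies: by host, not by port, so two servers tunneled to the same host name can receive each other's cookies. Below is a small sketch of that rule; the helper names `cookie_host` and `may_share_cookies` are hypothetical, and the check deliberately ignores cookie `Path`, `Domain`, and `Secure` attributes.

```python
from urllib.parse import urlsplit


def cookie_host(url: str) -> str:
    """Return the host a browser uses to scope cookies; the port is ignored."""
    host = urlsplit(url).hostname  # lower-cased, without port
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    return host


def may_share_cookies(url_a: str, url_b: str) -> bool:
    """True when both URLs have the same host, so host-only cookies can overlap."""
    return cookie_host(url_a) == cookie_host(url_b)
```

Under this rule, `http://localhost:8080` and `http://localhost:8081` overlap, while `http://localhost:8080` and `http://127.0.0.1:8080` do not. Opening the second server under a different host name, or in a separate browser profile, is one possible way to keep the sessions apart; this is an assumption and has not been verified against ClearML.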
CostlyOstrich36 Thanks! After logging in, everything works without any error.
So it seems the root cause is just that I hadn't logged in for the first time?
I'm curious too.
It did redirect when I set it up a few days ago (on a different Ubuntu server). However, this time it does not seem to redirect me. How could I further debug or investigate this issue?