How can I check if my clearml-agent is running properly?

Answered

Hi, how can I check if my clearml-agent is running properly? I set up a local server to test, but it seems the agent does not pick up any jobs.

In the UI, I saw that the new agent was registered (it shows up in the "Workers" page).

The terminal looks a bit weird. After this message, no new log output appears and it looks stuck.

Running in Docker mode (v19.03 and above) - using default docker image: nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04

  				
Posted 3 years ago by EnviousStarfish54


Answers 17

Do you see this message now as well?

  				
Posted 3 years ago by SuccessfulKoala55

Sorry, let me get back to you tomorrow. Maybe I did something wrong; now the entire UI has crashed.

  				
Posted 3 years ago by EnviousStarfish54

Yes, I did use --foreground.

I tested with an older "trains" server, and its agent prints log lines like the ones below when no job is picked up, while my new "clearml-agent" shows nothing.

No tasks in queue bb1bb1673f224fc98bbc8f86779be802
No tasks in Queues, sleeping for 5.0 seconds

  				
Posted 3 years ago by EnviousStarfish54

This is the only log I see.

  				
Posted 3 years ago by EnviousStarfish54

Hi EnviousStarfish54, did you use --foreground? By default, the agent writes its log to a log file unless explicitly told otherwise.
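As a rough illustration of the flag mentioned above, the following Python sketch assembles a command line that starts the agent in the foreground so its log goes to the terminal. The helper name `build_agent_command` is hypothetical, and the queue name "default" is only an assumption for the example.

```python
from typing import List


def build_agent_command(queue: str, docker: bool = True, foreground: bool = True) -> List[str]:
    """Assemble a `clearml-agent daemon` command line as an argument list.

    Hypothetical helper for illustration; it only builds the list and
    does not start the agent.
    """
    cmd = ["clearml-agent", "daemon", "--queue", queue]
    if docker:
        # Docker mode, using the agent's default docker image.
        cmd.append("--docker")
    if foreground:
        # Keep the agent attached to the terminal so its log is printed there.
        cmd.append("--foreground")
    return cmd
```

Passing the result to `subprocess.run(build_agent_command("default"))` would then run the agent attached to the current terminal, assuming `clearml-agent` is installed and configured.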

  				
Posted 3 years ago by SuccessfulKoala55

Not sure why my Elasticsearch and MongoDB crashed. I had to remove and recreate all the Docker containers. After that, clearml-agent works fine too.

  				
Posted 3 years ago by EnviousStarfish54

Hi EnviousStarfish54
Docker on Windows with NVIDIA runtime support is only available through WSL (I think).
https://docs.nvidia.com/cuda/wsl-user-guide/index.html#installing-wip
https://medium.com/@dalgibbard/docker-with-gpu-support-in-wsl2-ebbc94251cf5

  				
Posted 3 years ago by AgitatedDove14

I am running on a Windows 10 machine. Is this not compatible?

  				
Posted 3 years ago by EnviousStarfish54

Digest: sha256:407714e5459e82157f7c64e95bf2d6ececa751cca983fdc94cb797d9adccbb2f
Status: Downloaded newer image for nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
docker: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:459: container init caused: Running hook #0:: error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: initialization error: driver error: failed to process request: unknown.
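For anyone triaging a similar failure, here is a minimal Python sketch that flags Docker errors coming from the NVIDIA container hook rather than from the image or the agent itself. The function name is hypothetical, and the check is only a string heuristic based on the message quoted above, not an official diagnostic.

```python
def looks_like_nvidia_runtime_failure(docker_stderr: str) -> bool:
    """Heuristic: True if the Docker error text mentions nvidia-container-cli.

    Hypothetical helper; it inspects text only and does not query Docker.
    """
    text = docker_stderr.lower()
    return "nvidia-container-cli" in text and "error" in text
```

A match suggests the container failed while setting up GPU access on the host, which is consistent with a GPU driver or runtime problem rather than a ClearML configuration issue.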

  				
Posted 3 years ago by EnviousStarfish54

The first thing to check is that this is indeed your default queue's ID. Perhaps the agent configuration is incorrect and the agent is connecting to a different server?
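A minimal Python sketch of that comparison, assuming the agent log contains idle lines in the "No tasks in queue <id>" format quoted elsewhere in this thread. The function names are hypothetical, and the expected ID has to be copied from the WebApp by hand.

```python
import re
from typing import Set

# Matches the idle message format quoted in this thread,
# e.g. "No tasks in queue <32 hex characters>".
QUEUE_LINE = re.compile(r"No tasks in queue ([0-9a-f]{32})")


def polled_queue_ids(log_text: str) -> Set[str]:
    """Return every queue ID the agent reports polling in the log text."""
    return set(QUEUE_LINE.findall(log_text))


def polls_expected_queue(log_text: str, expected_id: str) -> bool:
    """True if the queue ID taken from the WebApp appears in the agent log."""
    return expected_id.lower() in polled_queue_ids(log_text)
```

If `polls_expected_queue` returns False for the ID shown in the WebApp, the agent is probably polling a queue on a different server or a different queue than intended.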

  				
Posted 3 years ago by SuccessfulKoala55

Well, go to the Workers and Queues section in the WebApp, click on Queues, then click on your default queue. The queue ID should appear in the URL.

  				
Posted 3 years ago by SuccessfulKoala55

  				

Now my problem is that clearml-agent picks up the job but fails to run the Docker container.

  				
Posted 3 years ago by EnviousStarfish54

Do you see any change in the URL if you click on your "test" queue?

  				
Posted 3 years ago by SuccessfulKoala55

Hmm, maybe I missed some UI element; I can't locate any ID.

  				
Posted 3 years ago by EnviousStarfish54

How do I confirm this?

  				
Posted 3 years ago by EnviousStarfish54

I'm not sure, but I suspect it might be an issue... perhaps AgitatedDove14 knows?

  				
Posted 3 years ago by SuccessfulKoala55
