Hello, I Hope You Can Help Me With This:

Answered

Hello, I hope you can help me with this:
First, I have a self-hosted ClearML server running on my own machine; it is up and running with no issues.

The task I want to run is executed inside a Docker container on a remote DGX machine. To kick off an experiment, I would normally run:
docker run -i registry/project/image_name:tag bash -c "python path/to/file.py"

I installed clearml-agent on the DGX and configured it with my ClearML server. If I add these two lines to my file.py:
from clearml import Task
task = Task.init(project_name="my project", task_name="my task")
nothing shows up in the web UI after the docker run. On the other hand, if I drop Docker and run python path/to/file.py locally, it succeeds and I can see everything in the web UI.

PS: I cannot get rid of Docker because we work in a containerized environment, and in the end the task will be executed inside a Job in a Kubernetes cluster.

I tried running clearml-agent daemon --queue default --docker registry/project/image_name:tag --force-current-version first, then the docker run command, BUT the output says the results are at:
ClearML results page: etc etc
and not on my own ClearML server.

I would be grateful if anyone could guide me through this; surely I am missing something.
Ideally I want to use the clearml-task command to kick off my experiment, if possible.

Thank you.

  				
Posted 3 years ago by AgitatedTurtle16


Answers 20

Thank you for the quick response. So should I mount the clearml.conf file inside the container?

  				
Posted 3 years ago by AgitatedTurtle16

Hello again, and sorry for the delay.
I tried what you told me and got it to work, but one issue remains: I want to use SSH when cloning the repo:

clearml-task --project name --name task_name --repo git@gitlab.com:username/project.git --commit commit_sha --script path/to/script.py --queue queue

This doesn't work; it fails with:

Error: Script entrypoint file '/home/usename/git@gitlab.com:username/project.git/path/to/script.py' could not be found.

  				
Posted 3 years ago by AgitatedTurtle16

Are you running a ClearML Agent on your DGX?

  				
Posted 3 years ago by SuccessfulKoala55

So the flow, if you want to use clearml-task, is as follows:
1. You install ClearML Agent on your worker machine (the DGX, in your case). The agent monitors the ClearML Server for a specific queue (or queues) and waits for tasks to be enqueued there. The agent should be configured with the correct clearml.conf file in order to be able to access the server.
2. You use clearml-task to create new tasks. clearml-task will create a task as you specify and enqueue it to the queue of your choice.
3. The agent will pick up the task and start executing it on the machine, using the same configuration file you provided to the agent.
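
The flow above could be sketched roughly as the two commands below. This is only an illustration built from the flags already used in this thread: the queue name, project name, task name, and the HTTPS repository URL are placeholders, not values confirmed by anyone here. The script only prints the commands so they can be reviewed before being run on the right machine.

```shell
# Placeholder values (assumptions for illustration only).
QUEUE="default"
IMAGE="registry/project/image_name:tag"
REPO="https://gitlab.com/username/project.git"   # hypothetical repo URL

# Step 1, on the worker (the DGX): start an agent in docker mode
# that listens on the queue. It needs a clearml.conf pointing at your server.
AGENT_CMD="clearml-agent daemon --queue ${QUEUE} --docker ${IMAGE}"

# Step 2, from any machine with a configured clearml install (e.g. a CI job):
# create the task and enqueue it on the same queue.
TASK_CMD="clearml-task --project my_project --name my_task --repo ${REPO} --script path/to/script.py --queue ${QUEUE}"

# Print both commands for review instead of executing them.
echo "$AGENT_CMD"
echo "$TASK_CMD"
```

The point to check is that both commands name the same queue; otherwise the agent never sees the task.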

  				
Posted 3 years ago by SuccessfulKoala55

AgitatedTurtle16, could you check with the latest clearml RC? (I remember a similar issue was fixed.)
pip install clearml==0.17.5rc3
Then run again:
clearml-task ...

  				
Posted 3 years ago by AgitatedDove14

Yes

  				
Posted 3 years ago by AgitatedTurtle16

As for (2), I'm not sure I understand the use case. When using ClearML Agent, the agent takes experiments that are waiting in a queue, so I don't follow why you first run the agent and then run the docker container manually next to it.

  				
Posted 3 years ago by SuccessfulKoala55

Hi AgitatedTurtle16,
For (1), it sounds as if the ClearML SDK running inside the docker container simply can't find the clearml.conf configuration file.

  				
Posted 3 years ago by SuccessfulKoala55

Thank you, it worked, so I am halfway there.
Is there a solution where I can use the clearml-task command directly? It would help to kick off the experiment from GitLab CI.

clearml-task --project ++ --name ++ --docker ++ -- --script ++

I don't think it is possible, right? Given that I would have to mount the config file every time?

  				
Posted 3 years ago by AgitatedTurtle16

Make sense?

  				
Posted 3 years ago by SuccessfulKoala55

This is consistent with the
ClearML results page: etc etc
message, since ClearML's default mode is to use the demo server if no other server is configured.

  				
Posted 3 years ago by SuccessfulKoala55

clearml-task will create the task, then you need an agent to execute it. As long as the agent has the correct configuration, it will work.

  				
Posted 3 years ago by SuccessfulKoala55

How can I do that? Honestly, I am confused about how to make it work in my case.

  				
Posted 3 years ago by AgitatedTurtle16

Exactly 🙂

  				
Posted 3 years ago by SuccessfulKoala55

Thank you, I will try it and ping you later. Many thanks.

  				
Posted 3 years ago by AgitatedTurtle16

For configuring a specific docker image to use when running tasks in the ClearML Agent Docker mode, see default_docker here: https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_agent_install_configure.html#adding-clearml-agent-to-a-configuration-file

  				
Posted 3 years ago by SuccessfulKoala55

The ClearML agent can be configured to run in docker mode, meaning it runs tasks inside docker containers (you can specify which docker image the agent uses when running the tasks).

  				
Posted 3 years ago by SuccessfulKoala55

ClearML looks for this file in the home folder (i.e. ~ )
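
A quick way to confirm this from inside the container is to check whether the file is actually present in the home folder. The helper below is a small sketch; the function name has_clearml_conf is made up for illustration and is not part of ClearML.

```shell
# Report whether clearml.conf exists in the given home directory.
# Returns 0 when the file is found, 1 otherwise.
has_clearml_conf() {
  home_dir="$1"
  if [ -f "${home_dir}/clearml.conf" ]; then
    echo "found: ${home_dir}/clearml.conf"
  else
    echo "missing: ${home_dir}/clearml.conf"
    return 1
  fi
}

# Check the current user's home folder.
has_clearml_conf "$HOME" || echo "no clearml.conf in the home folder"
```

Running this inside the image (for example through docker run ... bash -c) shows what the SDK would see there. If the file is missing, that matches the behaviour described in this thread, where results went to the demo server instead of the self-hosted one.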

  				
Posted 3 years ago by SuccessfulKoala55

Use the -v option 🙂
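
For the docker run command from the question, a bind mount with -v might look like the sketch below. It assumes the host's clearml.conf sits in the home folder and that the process inside the container runs as root (so its home is /root); both are assumptions, so adjust the paths if your image uses a different user. The :ro suffix mounts the file read-only. The script only prints the resulting command for review.

```shell
# Assumption: the working clearml.conf is in the host user's home folder.
CONF_ON_HOST="${HOME}/clearml.conf"
# Assumption: the container runs as root, so its home folder is /root.
CONF_IN_CONTAINER="/root/clearml.conf"
IMAGE="registry/project/image_name:tag"

# host_path:container_path:ro  (read-only bind mount of the config file)
MOUNT="${CONF_ON_HOST}:${CONF_IN_CONTAINER}:ro"

DOCKER_CMD="docker run -i -v ${MOUNT} ${IMAGE} bash -c 'python path/to/file.py'"

# Print the command instead of executing it.
echo "$DOCKER_CMD"
```

With the file mounted, Task.init inside the container should pick up the same server settings that work when running the script locally.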

  				
Posted 3 years ago by SuccessfulKoala55

See here: https://allegro.ai/clearml/docs/docs/use_cases/clearml_agent_use_case_examples.html#running-workers

  				
Posted 3 years ago by SuccessfulKoala55
