Hello Everyone! Today I Found The Following Challenge. I Do Run Locally The Clearml Server And The Clearml Agent. I Would Like To Have The Option Of Running Some Experiments In Docker Mode(--Docker). For Example, When Someone Uses A Different Version Of

Answered

Hello everyone!

Today I ran into the following challenge. I run the ClearML server and the ClearML agent locally. I would like to have the option of running some experiments in docker mode (--docker), for example when someone uses a different version of Python or any other dependency. Does that make sense? As far as I understand, the Docker image has to be built already. Do I understand everything correctly?

The Docker image should have all the dependencies + ClearML + the ClearML agent.
How do I report all the metrics to my local ClearML server? Should I copy my config and do port forwarding when running the container? Is there an example?
What about the dataset? Do I just mount the data path when running the container?

  				
Posted 2 years ago by ExasperatedCrocodile76


Answers 23

Here are the logs.

  				
Posted 2 years ago by ExasperatedCrocodile76

I can see the container in docker ps, but it seems like it never gets to code execution. I have no idea where that comes from. It seems like somewhere it gets "pip" + "pip".

  				
Posted 2 years ago by ExasperatedCrocodile76

This is not a typical setup...

  				
Posted 2 years ago by SuccessfulKoala55

Yes, the API server is on the same machine, running in a container:
web_server: http://localhost:8080
api_server: http://localhost:8008
files_server: http://localhost:8081

  				
Posted 2 years ago by ExasperatedCrocodile76

Is the apiserver running in a docker container on the same machine?

  				
Posted 2 years ago by SuccessfulKoala55

ExasperatedCrocodile76 the agent is responsible for installing itself, clearml and any other requirement you have on the task when starting the docker container. The agent also makes sure the same settings it uses (server address, credentials etc.) are passed to the task running inside the docker, so you just need to make sure the agent is configured properly

  				
Posted 2 years ago by SuccessfulKoala55

What do you mean by requirements for the docker? You can set the default docker image in clearml.conf, but you can always specify a different docker image at the Task level, which will override the default.

  				
Posted 2 years ago by CostlyOstrich36

I'm at the point where it looks like the clearml-agent is stuck. This is how I execute the agent: clearml-agent daemon --queue "default" --gpus 0 --foreground --docker. After the last message, "Successfully installed: <dependencies>", nothing really happens. I am attaching the logs from the experiment, and I am also providing the config:

  				
Posted 2 years ago by ExasperatedCrocodile76

This is usually the part where the agent starts to run within the container... By the way, what is "pippip"?

  				
Posted 2 years ago by SuccessfulKoala55

OK, I will try this feature and let you know if I see any problems. Thank you! 🙂

  				
Posted 2 years ago by ExasperatedCrocodile76

So you are probably right. I ran nc -vz localhost 8080:
Output when run locally, not in docker: Connection to localhost (127.0.0.1) 8080 port [tcp/http-alt] succeeded!
Output from bash inside the docker container: localhost [127.0.0.1] 8080 (http-alt) : Connection refused
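
The nc check above can also be scripted. The following is a minimal Python sketch using only the standard library; the helper name port_open and the 2-second timeout are illustrative assumptions, not part of ClearML. The ports are the ones quoted in this thread.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (similar to `nc -vz`)."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Ports taken from the server configuration quoted in this thread.
    for name, port in [("web_server", 8080), ("api_server", 8008), ("files_server", 8081)]:
        state = "reachable" if port_open("localhost", port) else "NOT reachable"
        print(f"{name} localhost:{port} is {state}")
```

Running it once on the host and once inside the task container should show the same difference as the two nc outputs: inside a container on the default bridge network, localhost refers to the container itself, not to the host.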

  				
Posted 2 years ago by ExasperatedCrocodile76

I know, I just meant most people don't install the server on the same machine they use for running experiments in docker mode 🙂
In any case, if you make sure the docker containers are using the same docker network, you could refer to the apiserver as http://apiserver:8008
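
As a rough illustration of this suggestion, the sketch below swaps the host part of a server URL while keeping the scheme and port. The helper name with_host is hypothetical; the hostname apiserver is the one mentioned in the reply above and would only resolve if the task container and the server containers share the same docker network.

```python
from urllib.parse import urlsplit, urlunsplit


def with_host(url: str, new_host: str) -> str:
    """Return url with its hostname replaced, keeping scheme, port, and path.

    Note: any user:password part of the original URL is dropped in this sketch.
    """
    parts = urlsplit(url)
    # Rebuild the network location, preserving an explicit port if present.
    netloc = new_host if parts.port is None else f"{new_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # api_server URL as configured on the host, taken from this thread.
    print(with_host("http://localhost:8008", "apiserver"))  # http://apiserver:8008
```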

  				
Posted 2 years ago by SuccessfulKoala55

Are you sure the server is reachable from within the docker container using the provided URL?

  				
Posted 2 years ago by SuccessfulKoala55

ExasperatedCrocodile76 Hi, try passing "--network=host" in --docker_args.
Example:

clearml-task --project project --name name --script run.py --queue queue --requirements requirements.txt --docker python:3.7.13-bullseye --docker_args "--cpus=8 --memory=16g --network=host"

  				
Posted 2 years ago by EnviousPanda91

In that case, you probably can't use localhost and you have to use the docker network to access it

  				
Posted 2 years ago by SuccessfulKoala55

Hi ExasperatedCrocodile76 ,

When running in docker mode the agent should handle all the points you raised above and just work 🙂

  				
Posted 2 years ago by CostlyOstrich36

So, I just define all the requirements for the docker in the default_docker section of clearml.conf?

  				
Posted 2 years ago by ExasperatedCrocodile76

Re: when people do not install the server on the same machine, how does it work for them? I can't reach apiserver / clearml-apiserver.
After a fresh installation of clearml-agent and clearml, I still have the same problem.

Example: I have a simple Python script and have defined default_docker in clearml.conf. When I clone this experiment and run it from the ClearML dashboard, my clearml-agent running in docker mode should execute the task in docker. However, it gets stuck after the dependencies in the docker are successfully installed.

I tried to set the API addresses in clearml.conf based on the docker IP addresses (from docker inspect), but I am still stuck there.

  				
Posted 2 years ago by ExasperatedCrocodile76

Also, if you run the server on the same machine and the ports are exposed outside of the docker network, you can just reference localhost?

  				
Posted 2 years ago by SuccessfulKoala55

Well, if the machine you're installing on has a public name, you typically just use it.

  				
Posted 2 years ago by SuccessfulKoala55

Do I have to do some port forwarding or add extra parameters? Copy clearml.conf into the docker container, and so on? It does not seem to be done automatically.

  				
Posted 2 years ago by ExasperatedCrocodile76

  				

I just set up my server following this URL: https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server_linux_mac/

  				
Posted 2 years ago by ExasperatedCrocodile76
