(only works for pytorch because they have different wheels for different cuda versions)
Ah, perfect. Did not know this. Will try! Thanks again! 🙂
You'll need to set the agent key and secret using environment variables, as explained here (in step #11): https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_linux_mac.html#deploying
It is not explained there, but do you mean `CLEARML_API_ACCESS_KEY: ${CLEARML_API_ACCESS_KEY:-}` and `CLEARML_API_SECRET_KEY: ${CLEARML_API_SECRET_KEY:-}`?
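If so, a minimal sketch of how those could be populated before bringing the stack up (the key values and compose file path here are placeholders):
```bash
# placeholder values - use the credentials generated in the profile page
export CLEARML_API_ACCESS_KEY="<agent access key>"
export CLEARML_API_SECRET_KEY="<agent secret key>"

# docker-compose substitutes them into the agent-services container environment
docker-compose -f /opt/clearml/docker-compose.yml up -d
```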
However, to use conda as package manager I need a docker image that provides conda.
Oh, so I think I know what might have happened
I was wrong: I think it uses the agent.cuda_version, not the local env CUDA version.
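If so, it should be possible to pin that in clearml.conf rather than relying on auto-detection; a sketch (the version numbers are just examples):
```
agent {
    # force the CUDA/cuDNN version the agent resolves packages against,
    # instead of auto-detecting it from the local environment
    cuda_version: 11.1
    cudnn_version: 8.0
}
```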
Is it also possible to specify a different user/api_token for different hosts? For example, I have a GitHub and a private GitLab that I both want to be able to access.
ReassuredTiger98 my apologies, I just realized you can use ~/.git-credentials for that. The agent will automatically map the host .git-credentials into the docker :)
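For illustration, a ~/.git-credentials with entries for both hosts might look like this (the usernames, tokens and GitLab host are placeholders):
```
https://github-user:ghp_exampletoken@github.com
https://gitlab-user:glpat-exampletoken@gitlab.example.com
```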
I got the idea from an error that came up when the agent was configured to use pip and tried to install BLAS (for PyTorch, I guess).
I just updated my server to 1.0 and now the services agent is stuck in restarting:
No (this is deprecated and was removed because it was confusing)
https://github.com/allegroai/clearml-agent/blob/cec6420c8f40d92ab1cd6cbe5ca8f24cf351abd8/docs/clearml.conf#L101
It seems like the services-docker is always started with Ubuntu 18.04, even when I use `task.set_base_docker("continuumio/miniconda:latest -v /opt/clearml/data/fileserver/:{}".format(file_server_mount))`
Ah, very cool! Then I will try this, too.
You can simply generate another set of credentials in the profile page, and set them up in these environment variables.
Alternatively, you can add another fixed user, and use its username/password for these values
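For reference, a sketch of adding such a fixed user in the server's apiserver.conf (the username/password are placeholders):
```
auth {
    fixed_users {
        enabled: true
        users: [
            {
                username: "agent"
                password: "change-me"
                name: "Agent Services"
            }
        ]
    }
}
```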
Also, what kind of authentication are you using? Fixed users?
In that case I suggest you turn on the venv cache; it will accelerate conda environment building because it caches the entire conda env.
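A sketch of what turning it on in the agent's clearml.conf might look like (the limits and cache path are just example values):
```
agent {
    venvs_cache: {
        # maximum number of cached environments
        max_entries: 10
        # minimum free disk space (GB) required to add a cache entry
        free_space_threshold_gb: 2.0
        # set a path to enable the cache
        path: ~/.clearml/venvs-cache
    }
}
```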
```
docker-compose ps
          Name                        Command                State                    Ports
clearml-agent-services   /usr/agent/entrypoint.sh       Restarting
clearml-apiserver        /opt/clearml/wrapper.sh ap ... Up         0.0.0.0:8008->8008/tcp, 8080/tcp, 8081/tcp
clearml-elastic          /usr/local/bin/docker-entr ... Up         9200/tcp, 9300/tcp
clearml-fileserver       /opt/clearml/wrapper.sh fi ... Up         8008/tcp, 8080/tcp, 0.0.0.0:8081->8081/tcp
clearml-mongo            docker-entrypoint.sh --set ... Up         27017/tcp
clearml-redis            docker-entrypoint.sh redis ... Up         6379/tcp
clearml-webserver        /opt/clearml/wrapper.sh we ... Up         0.0.0.0:8080->80/tcp, 8008/tcp, 8080/tcp, 8081/tcp
```
I see. I was just wondering what the general approach is. I think PyTorch used to ship the pip package without CUDA packaged into it. So with conda it was nice to only install CUDA in the environment and not the host. But with pip, you had to use the host version as far as I know.
In my case I use the conda freeze option and do not even have CUDA installed on the agents.
BTW: the agent will resolve pytorch based on the installed CUDA version.
Now the pip packages seems to ship with CUDA, so this does not seem to be a problem anymore.
In the new version, we made it so that the default agent credentials embedded in the ClearML Server are disabled if the server is not in open mode (i.e. it requires a username/password to log in). This is because having those default credentials available in this mode basically means anyone without a password can send commands to the server (since these credentials are hard-coded)
Oh, you're right - I'll make sure we add it there 😄