Hello! Thanks again for all the hard work you've put in to bring such a great MLOps tool to the community! I have a question about ClearML agent config file, specifically about

Answered

Hello! Thanks again for all the hard work you've put in to bring such a great MLOps tool to the community!
I have a question about the ClearML agent config file, specifically about the default_docker:{arguments: ''} parameter. I create new tasks in two different ways: in a script using the Task.init() API, and by cloning an existing task in the UI.

When the task is created through Task.init(), it works as expected and uses the docker arguments from the config file (default_docker:{arguments: ''}). But if I clone the task, clear everything under the CONTAINER tab (both the IMAGE and ARGUMENTS parameters), and then trigger the experiment through the UI, it only uses IMAGE from the config file, and ARGUMENTS stays empty.

I was under the impression that the agent would use the arguments from the config file when the task doesn't specify an image and arguments, but apparently it doesn't. Is there a way to make it use both the image and the arguments from the config file?

  				
Posted 2 years ago by BurlyRaccoon64


Answers 11

Hi BurlyRaccoon64, where is this config file located? Also, when you trigger the experiment from the UI, where is it executed? Do you have clearml-agent configured somewhere? If so, where is it running?

  				
Posted 2 years ago by SuccessfulKoala55

I have a GPU machine on the local network where clearml-agent is running. I send tasks for execution to the queue configured on the agent, either through the UI or from a script that calls Task.execute_remotely(queue_name=...) on another machine in the same network. The config file is located in the /home/{username} folder on the machine where the agent is running.

  				
Posted 2 years ago by BurlyRaccoon64

Do you perhaps have some arguments as part of the docker image itself?

  				
Posted 2 years ago by SuccessfulKoala55

The image was built with the following command:
docker build --build-arg PAT=$(shell echo ${PAT}) -t $(IMAGE_NAME) .
Do you think the use of --build-arg may be a problem here? My understanding was that the ARGUMENTS parameter is used together with docker run to start a container and has nothing to do with image build arguments.

  				
Posted 2 years ago by BurlyRaccoon64

"when I trigger such experiment through the UI after that it only uses IMAGE from config file"

BurlyRaccoon64 I was actually asking about the value you have configured for the default image in the config file

  				
Posted 2 years ago by SuccessfulKoala55

Ahh, sorry about that. I have both image and arguments values in the config file:
default_docker: {
    image: {our_custom_image_name}
    arguments: ["--ipc=host", "-v", "/home/{username}/clearml.conf:/workdir/clearml.conf", "-v", "/home/{username}/.ssh:/root/.ssh", "-e", "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1"]
}
And, as I said, when the task is cloned and sent for execution in the UI (with nothing in the IMAGE or ARGUMENTS fields on the execution tab initially), only the IMAGE field is taken from the config file and filled with {our_custom_image_name}, while the ARGUMENTS field stays empty.

  				
Posted 2 years ago by BurlyRaccoon64
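
For readers comparing the two forms: the config file holds the docker arguments as a list, while the task's ARGUMENTS field holds a single string. A minimal sketch of moving between the two, assuming ordinary shell-style quoting is an acceptable mapping (this is an assumption, not confirmed ClearML behaviour). The helper names `arguments_to_string` and `string_to_arguments` are hypothetical.

```python
import shlex

# Values copied from the default_docker arguments list quoted above;
# {username} is the placeholder used in the thread, not a real path.
CONFIG_ARGUMENTS = [
    "--ipc=host",
    "-v", "/home/{username}/clearml.conf:/workdir/clearml.conf",
    "-v", "/home/{username}/.ssh:/root/.ssh",
    "-e", "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1",
]


def arguments_to_string(arguments):
    """Join a list of docker arguments into one shell-quoted string."""
    return shlex.join(arguments)


def string_to_arguments(text):
    """Split a shell-style argument string back into a list."""
    return shlex.split(text)


as_string = arguments_to_string(CONFIG_ARGUMENTS)
print(as_string)

# The conversion is lossless: splitting the string gives back the same list.
assert string_to_arguments(as_string) == CONFIG_ARGUMENTS
```

A string produced this way could be pasted into the ARGUMENTS field of a template task by hand, which avoids relying on the config-file fallback at all.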

But when a new task is created in the code with Task.init() both fields are parsed correctly

  				
Posted 2 years ago by BurlyRaccoon64

And the original task (the one you clone) has the correct image/arguments?

  				
Posted 2 years ago by SuccessfulKoala55

Yes, and if I clone it as it is, everything works as expected. But if I clone it and clear both IMAGE and ARGUMENTS, only IMAGE is filled in when the task is sent for execution. You may ask why I need to clear anything if plain cloned tasks work fine. The reason is that we have some old template tasks, added by other members of my team before we switched to a custom image and an agent running in docker mode, that have empty IMAGE and ARGUMENTS fields. They won't be executed correctly because of the issue above.

  				
Posted 2 years ago by BurlyRaccoon64

I'm trying to understand if the clear itself is bad... After you do the clear, can you perhaps check the task's entry in mongodb?

  				
Posted 2 years ago by SuccessfulKoala55

Hi again SuccessfulKoala55, sorry for the late response. Thanks for your help so far! I understand that it's a weird problem and it probably won't be resolved in this discussion, but just in case, I've checked the mongodb entry for the task after cloning, after clearing the fields, and after sending it for execution.
After clone:
{ "_id" : "b94657b00adf41dd971426f51d7b9373", "container" : { "image" : "{image_name}", "arguments" : "--ipc=host -v /home/{username}/clearml.conf:/workdir/clearml.conf -e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1", "setup_shell_script" : "" } }
After clear:
{ "_id" : "b94657b00adf41dd971426f51d7b9373", "container" : { "image" : "", "arguments" : "", "setup_shell_script" : "" } }
After enqueue:
{ "_id" : "b94657b00adf41dd971426f51d7b9373", "container" : { "image" : "{image_name}", "arguments" : "", "setup_shell_script" : "" } }
(The actual image name and username are replaced with {image_name} and {username} in this message.)

I've sent it just in case. Anyway, I appreciate that you've already spent a lot of your time on this, so it's OK to drop it, I guess.

  				
Posted 2 years ago by BurlyRaccoon64
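
The three MongoDB snapshots above suggest that, at enqueue time, an empty image is filled in from default_docker while empty arguments are left untouched. The sketch below puts that observed behaviour next to the behaviour the question expected. It models only what the thread reports and is not ClearML's actual code; `observed_fallback` and `expected_fallback` are hypothetical names.

```python
# Defaults as they would appear once resolved from the agent's config file
# (placeholders {image_name} and {username} are kept as in the thread).
DEFAULTS = {
    "image": "{image_name}",
    "arguments": (
        "--ipc=host -v /home/{username}/clearml.conf:/workdir/clearml.conf "
        "-e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1"
    ),
}

# The task's container section after clearing both fields in the UI.
AFTER_CLEAR = {"image": "", "arguments": "", "setup_shell_script": ""}


def observed_fallback(container, defaults):
    """Reported behaviour: only an empty image falls back to the default."""
    result = dict(container)
    if not result.get("image"):
        result["image"] = defaults["image"]
    return result


def expected_fallback(container, defaults):
    """Expected behaviour: every empty field falls back to its default."""
    result = dict(container)
    for key in ("image", "arguments"):
        if not result.get(key):
            result[key] = defaults[key]
    return result


after_enqueue = observed_fallback(AFTER_CLEAR, DEFAULTS)
wanted = expected_fallback(AFTER_CLEAR, DEFAULTS)

# Matches the "After enqueue" snapshot: image set, arguments still empty.
assert after_enqueue == {
    "image": "{image_name}", "arguments": "", "setup_shell_script": ""
}
# What the question hoped for: both fields populated.
assert wanted["arguments"] == DEFAULTS["arguments"]
```

Until the fallback covers arguments as well, one possible workaround is to write the values explicitly onto the old template tasks instead of leaving the fields empty, for example through the UI or, if your SDK version provides it, `Task.set_base_docker(docker_image=..., docker_arguments=...)`.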
