Hello! The agent-services present in ClearML server's docker-compose is only for cleanup tasks, right? For training I would need to run another instance of clearml-agent alongside the docker-compose?

Answered

Hello!

The agent-services present in ClearML server's docker-compose is only for cleanup tasks, right? For training, I would need to run another instance of clearml-agent alongside the docker-compose?

  				
Posted 3 years ago by ShinyRabbit94


Answers 7

ShinyRabbit94, hi 🙂

Yes. Please note that the machine you run the agent on needs to have all the resources required to run your experiments (GPU, etc.).

  				
Posted 3 years ago by CostlyOstrich36

To add to Natan's answer, you can run anything on the services Docker container, depending on the hardware. We don't recommend training with it, as the server's machine might get overloaded. What you can do is simple stuff like cleanup or other routine jobs 🙂

  				
Posted 3 years ago by AnxiousSeal95

Maybe it is a misunderstanding on my side? I thought:
`Task.enqueue(task, queue_name="training_queue")` is what starts the execution of the task. Do I need another function?

  				
Posted 3 years ago by ShinyRabbit94

Does it enqueue the task? From what you posted, it should simply create a task and then enqueue it, without any further action.
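One way to check whether the task actually reached the queue is to read its status back from the server. This is a minimal sketch, assuming `clearml` is installed and configured; `explain_status`, `check_task`, and the hint texts are hypothetical helpers for illustration, not part of ClearML, and the status strings are the ones `Task.get_status()` is expected to return:

```python
# Hypothetical lookup table: ClearML task status string -> what it likely means.
STATUS_HINTS = {
    "created": "task exists but was never enqueued",
    "queued": "task is waiting in a queue; no agent has picked it up yet",
    "in_progress": "an agent is currently executing the task",
    "completed": "the task finished running",
    "failed": "the task ran and failed; check its console log",
}


def explain_status(status: str) -> str:
    """Return a short hint for a task status string."""
    return STATUS_HINTS.get(status, f"unhandled status: {status}")


def check_task(task_id: str) -> str:
    """Fetch a task by ID and explain its current status."""
    # Imported lazily so explain_status() works without clearml installed.
    from clearml import Task

    task = Task.get_task(task_id=task_id)
    return explain_status(task.get_status())
```

If `check_task(...)` keeps reporting that the task is queued, the likely cause is that no agent is listening on that queue.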

  				
Posted 3 years ago by CostlyOstrich36

Thank you! Is there a way to test the agent on a machine without a GPU?
When running this little script, I can see my agent installing the requirements, but it never seems to start running the task.

    from clearml import Task

    task = Task.create(
        project_name="train",
        task_name="train",
        requirements_file="./requirements.txt",
        repo="",
    )
    task.set_script(entry_point="./test.py")
    Task.enqueue(task, queue_name="training_queue")

The logs are as follows:
` Starting Task Execution:

ClearML results page:

Leaving process id 1863263
DONE: Running task '89359e55ffe942a3bfa7cc72b2e0357d', exit status 0 `

  				
Posted 3 years ago by ShinyRabbit94

It seems the agent does not like working with scripts located inside a git repository. I moved the requirements and the script to a folder without a .git, and it works now. Thank you!

  				
Posted 3 years ago by ShinyRabbit94

Enqueue simply puts the task in the queue to be picked up by an agent. You need an agent running and listening to the queue for it to be picked up 🙂
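To make the split concrete: the script only enqueues, while a separate `clearml-agent daemon` process listening on the same queue does the running. This is a minimal sketch, assuming the queue name `training_queue` from this thread; `agent_command` and `enqueue_existing_task` are hypothetical helpers, and `--cpu-only` is the agent flag I would expect to use on a machine without a GPU:

```python
import shlex


def agent_command(queue: str, cpu_only: bool = False) -> str:
    """Build the shell command that starts an agent listening on `queue`."""
    args = ["clearml-agent", "daemon", "--queue", queue]
    if cpu_only:
        # Assumption: run the agent without GPU access.
        args.append("--cpu-only")
    return shlex.join(args)


def enqueue_existing_task(task_id: str, queue: str) -> None:
    """Put an existing task in `queue`; this does NOT execute it."""
    # Imported lazily so agent_command() works without clearml installed.
    from clearml import Task

    task = Task.get_task(task_id=task_id)
    Task.enqueue(task, queue_name=queue)


if __name__ == "__main__":
    # Print the command to run in a separate terminal on the worker machine.
    print(agent_command("training_queue", cpu_only=True))
```

The printed command is meant to be run on the machine that should do the work; the enqueued task stays in the queue until such an agent pulls it.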

  				
Posted 3 years ago by CostlyOstrich36
