Hi, I Am Running A File Like This

Answered

Hi, I am running a file like this:
python train_it.py

Task.force_requirements_env_freeze(False)
task = Task.init(project_name='playground', task_name='base')
task.set_base_docker('ultralytics/yolov5:latest')
args = {'alpha': 0.3}
args = task.connect(args)
task.execute_remotely(queue_name='gpu_glue_q')

This gets executed in a remote k8s pod. Since I have already set a Docker image, I don't want all the packages in requirements.txt to be installed again in the remote pod. Even though I edited requirements.txt to contain only clearml (and removed the rest), it still installed all the packages (PyTorch etc.).
What settings should I change?

  				
Posted 3 years ago by DeliciousBluewhale87


Answers 14

Well, you need to make sure the agent runs with the same settings. Since these are in the agent section, they won't affect your locally running SDK.

  				
Posted 3 years ago by SuccessfulKoala55

Hi AgitatedDove14, this isn't the issue. With or without specifying the queue, I get this error with the "Create version" but not with the "Init version".
I wonder whether this is an issue with using the Create version together with execute_remotely().

  				
Posted 3 years ago by DeliciousBluewhale87

Hi DeliciousBluewhale87,
In your agent configuration file, set agent.package_manager.system_site_packages: true

  				
Posted 3 years ago by SuccessfulKoala55

Do this by mapping a clearml.conf file to the agent

  				
Posted 3 years ago by SuccessfulKoala55

Hi SuccessfulKoala55,
I already set it to true some time ago.

  				
Posted 3 years ago by DeliciousBluewhale87

Using clearml-task, I am able to pass in the exact requirements.txt file. I am not sure how to accomplish the same thing when running python train_it.py with the execute_remotely() option.
AgitatedDove14

  				
Posted 3 years ago by DeliciousBluewhale87

Hi DeliciousBluewhale87
You can achieve the same results programmatically with Task.create
https://github.com/allegroai/clearml/blob/d531b508cbe4f460fac71b4a9a1701086e7b6329/clearml/task.py#L619
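For illustration, here is a minimal sketch of what that could look like. It assumes a clearml version whose Task.create accepts the script, docker, packages and requirements_file arguments; the helper names build_task_spec and create_and_enqueue are hypothetical and not part of clearml. The project, image and queue names are the ones from the question.

```python
from typing import Any, Dict, List, Optional


def build_task_spec(
    script: str,
    docker_image: str,
    packages: Optional[List[str]] = None,
    requirements_file: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for clearml's Task.create (hypothetical helper)."""
    spec: Dict[str, Any] = {
        "project_name": "playground",
        "task_name": "base",
        "script": script,
        "docker": docker_image,
        # Assumption: the script has no Task.init call of its own.
        "add_task_init_call": True,
    }
    # Pass either an explicit requirements file or an explicit package list,
    # so the agent installs only what is listed.
    if requirements_file is not None:
        spec["requirements_file"] = requirements_file
    elif packages is not None:
        spec["packages"] = packages
    return spec


def create_and_enqueue(spec: Dict[str, Any], queue_name: str):
    """Create the task on the server and enqueue it for an agent."""
    from clearml import Task  # imported lazily; requires a configured ClearML setup

    task = Task.create(**spec)
    Task.enqueue(task, queue_name=queue_name)
    return task


spec = build_task_spec(
    "train_it.py", "ultralytics/yolov5:latest", packages=["clearml"]
)
# create_and_enqueue(spec, queue_name="gpu_glue_q")  # needs a reachable ClearML server
```

Note that a task created this way is never running locally, so there is no execute_remotely() call involved; it is simply enqueued.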

  				
Posted 3 years ago by AgitatedDove14

Hmm... AgitatedDove14 any idea?

  				
Posted 3 years ago by SuccessfulKoala55

Notice that in your execute_remotely() call you did not specify a queue to put the current Task into.
execute_remotely() stops the currently running code and puts the newly created task into the specified queue. If you do not specify a queue, it just aborts the task and waits for you to enqueue it manually.
To solve it:
task.execute_remotely(queue_name='my_queue')

  				
Posted 3 years ago by AgitatedDove14

That's in the agent, not in your local settings when running the task, right?

  				
Posted 3 years ago by SuccessfulKoala55

The warning just lets you know that the current process has stopped and that it is being launched on a remote machine.
What am I missing? Is the agent failing to run the job that you create manually?
(Notice that when creating a job manually, there is no "execute_remotely"; you just enqueue it, as it is not actually "running locally".)
Does that make sense?

  				
Posted 3 years ago by AgitatedDove14

Hi AgitatedDove14, I have attached my create version alongside the init version.
When I enqueue both the init and create versions into my clearmlQueue, the create version doesn't seem to execute at all.
It just logs "2021-05-26 16:02:13,053 - clearml - WARNING - Terminating local execution process" and reports that it completed successfully.

  				
Posted 3 years ago by DeliciousBluewhale87

The above screenshot is from my local settings... My agents run in the k8s system (like in a pod)

  				
Posted 3 years ago by DeliciousBluewhale87

I just downloaded the logs from the failed task. It seems I have set agent.package_manager.system_site_packages: true in the agent as well.

  				
Posted 3 years ago by DeliciousBluewhale87
