How Can I Add My requirements.txt File To The Pipeline Instead Of Each Task?

I think I'm missing the connection between the hash-ids and the txt file, or in other words, why does the txt file contain full paths rather than relative paths?

  				
Posted 2 years ago by AgitatedDove14

I have a pipeline which I am able to run locally. The pipeline has a pipeline controller along with 4 tasks: download data, training, testing and predict. How do I execute this whole pipeline remotely so that each task is executed sequentially?

  				
Posted 2 years ago by DiminutiveToad80

Can you explain how running two agents would help me run the whole pipeline remotely? Sorry if it's a very basic question

  				
Posted 2 years ago by DiminutiveToad80

Correct. Note that you need two agents: one for the pipeline (logic) and one for the pipeline components.
That said, you can run two agents on the same machine 🙂

  				
Posted 2 years ago by AgitatedDove14
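
A minimal sketch of the two-agent setup described above, assuming one agent listens on a queue named "services" (for the pipeline logic) and another on a queue named "default" (for the steps). The project name, pipeline name, and the assumption that a base task already exists for each step are hypothetical, not taken from the thread.

```python
STEPS = ["download_data", "training", "testing", "predict"]


def step_parents(steps):
    """Map each step to its parent list so the steps run one after another."""
    return {name: ([steps[i - 1]] if i > 0 else []) for i, name in enumerate(steps)}


def build_and_start_pipeline(project="example-project"):
    # Imported here so the sketch can be loaded without a configured ClearML setup.
    from clearml import PipelineController

    pipe = PipelineController(name="example-pipeline", project=project, version="1.0.0")
    # Steps are sent to this queue; one agent should be listening on it.
    pipe.set_default_execution_queue("default")
    for name, parents in step_parents(STEPS).items():
        pipe.add_step(
            name=name,
            parents=parents,
            base_task_project=project,
            base_task_name=name,  # assumes an existing task with this name
        )
    # The pipeline logic is sent to this queue; a second agent should be listening on it.
    pipe.start(queue="services")
```

Because each step lists the previous one as its parent, the controller only launches a step once its parent has completed, which should give the sequential execution asked about earlier in the thread.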

Run clearml-agent and enqueue the pipeline? What am I missing?

  				
Posted 2 years ago by AgitatedDove14

but actually that path doesn't exist and it is giving me an error

So you are saying you only uploaded the "meta-data" i.e. a text file with links to the files, and this is why it is missing?

Is there a way to change the path inside the .txt file to clearml cache, because my images are stored in clearml cache only

I think a good solution would be to store the paths in the txt file as relative paths, i.e. ./data/folder instead of /Users/adityachaudhry/data/folder...

  				
Posted 2 years ago by AgitatedDove14
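
One way the relative-path suggestion could look before the dataset is uploaded is sketched below. The helper name `make_paths_relative` is hypothetical, and whether the training code resolves `./` entries relative to the list file is an assumption worth checking for your YOLO version.

```python
from pathlib import Path


def make_paths_relative(list_file, dataset_root):
    """Rewrite absolute entries in list_file as './...' paths relative to dataset_root."""
    list_file, dataset_root = Path(list_file), Path(dataset_root)
    rewritten = []
    for line in list_file.read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        path = Path(entry)
        if path.is_absolute():
            try:
                entry = "./" + path.relative_to(dataset_root).as_posix()
            except ValueError:
                pass  # entry is outside dataset_root, leave it unchanged
        rewritten.append(entry)
    list_file.write_text("\n".join(rewritten) + "\n")
    return rewritten
```

Running this once on each image-list file before uploading means the stored file no longer refers to a folder that only exists on the original machine.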

Yes, exactly like a Task (a pipeline is a type of task)
'''
from clearml import Task
cloned_pipeline = Task.clone(source_task=pipeline_uid_here)  # ID of the pipeline task
Task.enqueue(...)
'''

  				
Posted 2 years ago by AgitatedDove14
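
A slightly fuller sketch of cloning and enqueuing a pipeline, assuming the pipeline logic should go to a queue named "services" (an assumption; use whichever queue your pipeline agent listens on). The function name and the clone name are hypothetical.

```python
def clone_and_enqueue_pipeline(pipeline_task_id, queue_name="services"):
    """Clone an existing pipeline (controller) task and send the clone to a queue."""
    # Imported here so the sketch can be loaded without a configured ClearML setup.
    from clearml import Task

    cloned = Task.clone(source_task=pipeline_task_id, name="cloned pipeline")
    # An agent listening on queue_name picks up the clone and runs the pipeline logic.
    Task.enqueue(task=cloned, queue_name=queue_name)
    return cloned.id
```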

For running the pipeline remotely I want the path to be like /Users/adityachaudhry/.clearml/cache/......

I'm not sure I follow, if you are getting a path with all your folders from get_local_copy , that's exactly what you are looking for, no?

  				
Posted 2 years ago by AgitatedDove14

My git repo only contains the hash-ids which are used to download the dataset into my local machine

  				
Posted 2 years ago by DiminutiveToad80

One more thing: in my git repo there is a dataset folder that contains hash-ids, and these hash-ids are used to download the dataset. When I am running the pipeline remotely, the files/images are downloaded into the cloned git repo inside the .clearml/venvs, but when I check inside that venvs folder there are no images present.

  				
Posted 2 years ago by DiminutiveToad80

Is there a way to change the path inside the .txt file to clearml cache, because my images are stored in clearml cache only

  				
Posted 2 years ago by DiminutiveToad80

OK, thanks!

  				
Posted 2 years ago by DiminutiveToad80

Is there a way to clone the whole pipeline, just like we clone tasks?

  				
Posted 2 years ago by DiminutiveToad80

The issue I am facing is that when I call get_local_copy(), the dataset (used for training YOLOv8) is downloaded into the clearml cache (my image dataset contains images, labels, .txt files which have the paths to the images, and a .yaml file). The downloaded .txt files show that the image files are downloaded in the git repo present inside the clearml venvs, but actually that path doesn't exist and it is giving me an error

  				
Posted 2 years ago by DiminutiveToad80
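
If the list files were already uploaded with absolute paths, one possible workaround on the remote side is to re-root each entry onto the folder returned by get_local_copy(). This is only a sketch: the helper name `reroot_paths` is hypothetical, and it assumes you know the root folder the paths were written under on the original machine.

```python
from pathlib import Path, PurePosixPath


def reroot_paths(list_file, old_root, new_root):
    """Rewrite entries under old_root so they point at the same files under new_root."""
    old_root, new_root = PurePosixPath(old_root), Path(new_root)
    fixed = []
    for line in Path(list_file).read_text().splitlines():
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        try:
            entry = str(new_root / PurePosixPath(entry).relative_to(old_root))
        except ValueError:
            pass  # entry is not under old_root, leave it unchanged
        fixed.append(entry)
    Path(list_file).write_text("\n".join(fixed) + "\n")
    return fixed
```

Here `new_root` would be the path returned by get_local_copy() on the machine running the task.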

Is there a way to work around this?

  				
Posted 2 years ago by DiminutiveToad80

Hi @DiminutiveToad80
You mean the pipeline logic? It should autodetect the imports of the logic function (like any Task.init call)
You can however call Task.force_requirements_env_freeze and pass a local requirements.txt
Make sure to call it before creating the Pipeline object

  				
Posted 2 years ago by AgitatedDove14
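
A minimal sketch of the ordering described above, assuming a requirements.txt in the working directory. The pipeline and project names are hypothetical, and this covers the pipeline logic only, as in the answer.

```python
def create_pipeline_with_requirements(requirements_file="requirements.txt"):
    """Pin the pipeline logic's packages from a file, then create the pipeline."""
    # Imported here so the sketch can be loaded without a configured ClearML setup.
    from clearml import PipelineController, Task

    # Must run before the pipeline object is created, otherwise it has no effect on it.
    Task.force_requirements_env_freeze(force=True, requirements_file=requirements_file)

    return PipelineController(
        name="example-pipeline",    # hypothetical name
        project="example-project",  # hypothetical project
        version="1.0.0",
    )
```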

When I am running the pipeline remotely, I am getting the following error message:

There appear to be 6 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

  				
Posted 2 years ago by DiminutiveToad80

Yeah, you can ignore those. This is some Python GC stuff; it seems to be related to the OS and Python version

  				
Posted 2 years ago by AgitatedDove14

I want to understand what's happening at the backend. I want to know how running the pipeline logic and the tasks on separate agents is going to sync everything up

  				
Posted 2 years ago by DiminutiveToad80

I am uploading the dataset (for YOLOv8 training) as an artifact. When I download the artifact (.zip file) from the UI, the path to the images is something like /Users/adityachaudhry/.clearml/cache/......, but when I call .get_local_copy() I get the folder structure where my images are stored locally on my system as the path. For running the pipeline remotely I want the path to be like /Users/adityachaudhry/.clearml/cache/......

  				
Posted 2 years ago by DiminutiveToad80

@DiminutiveToad80 I think you need some background on how the agents work.

  				
Posted 2 years ago by AgitatedDove14

So for my project I have a dataset present in my local system, when I am running the pipeline remotely is there a way the remote machine can access it?

  				
Posted 2 years ago by DiminutiveToad80

, when I am running the pipeline remotely is there a way the remote machine can access it?

Well, for the dataset to be accessible, you need to upload it with the Dataset class. Then the remote machine can call Dataset.get(...).get_local_copy() to get the actual data on the remote machine

  				
Posted 2 years ago by AgitatedDove14
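
The upload-then-fetch flow described above might look roughly like this. The dataset and project names are hypothetical placeholders, and the two functions are meant to run on different machines (upload locally, fetch inside the remote task).

```python
def upload_dataset(local_folder, name="example-dataset", project="example-project"):
    """Run on the local machine: register the folder as a ClearML dataset."""
    # Imported here so the sketch can be loaded without a configured ClearML setup.
    from clearml import Dataset

    dataset = Dataset.create(dataset_name=name, dataset_project=project)
    dataset.add_files(path=local_folder)
    dataset.upload()    # push the files to the configured storage
    dataset.finalize()  # close the dataset version so it can be consumed
    return dataset.id


def fetch_dataset(name="example-dataset", project="example-project"):
    """Run inside the remote task: download (or reuse) a cached copy of the dataset."""
    from clearml import Dataset

    # Returns the path of a local folder holding the dataset files on this machine.
    return Dataset.get(dataset_name=name, dataset_project=project).get_local_copy()
```

Any paths the training code needs should then be built from the folder returned by `fetch_dataset`, rather than from paths that were valid on the uploading machine.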

So I should clone the pipeline, run the agent and then enqueue the cloned pipeline?

  				
Posted 2 years ago by DiminutiveToad80

Answers 24