Answered
How Can I Add My requirements.txt File To The Pipeline Instead Of Each Task?

How can I add my requirements.txt file to the pipeline instead of to each task?

  
  
Posted 7 months ago

Answers 24


The issue I am facing is that when I call get_local_copy(), the dataset (used for training YOLOv8) is downloaded into the ClearML cache. My image dataset contains images, labels, .txt files that hold the paths to the images, and a .yaml file. The downloaded .txt files indicate that the image files are in the git repo inside the ClearML venvs, but that path doesn't actually exist, and it gives me an error.

  
  
Posted 7 months ago

One more thing: in my git repo there is a dataset folder that contains hash-ids, and these hash-ids are used to download the dataset. When I run the pipeline remotely, the files/images are downloaded into the cloned git repo inside .clearml/venvs, but when I check inside that venvs folder there are no images present.

  
  
Posted 7 months ago

For running the pipeline remotely I want the path to be like /Users/adityachaudhry/.clearml/cache/......

I'm not sure I follow. If you are getting a path with all your folders from get_local_copy, that's exactly what you are looking for, no?

  
  
Posted 7 months ago

My git repo only contains the hash-ids, which are used to download the dataset onto my local machine.

  
  
Posted 7 months ago

but actually that path doesn't exist and it is giving me an error

So you are saying you only uploaded the "meta-data" i.e. a text file with links to the files, and this is why it is missing?

Is there a way to change the path inside the .txt file to the ClearML cache, because my images are stored only in the ClearML cache?

I think a good solution would be to store the path in the txt file as a relative path, i.e. instead of /Users/adityachaudhry/data/folder... store it as ./data/folder
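A minimal sketch of the relative-path idea, assuming the .txt files list one image path per line and that the absolute prefix used when the dataset was created is known. The function name `relativize_paths` and the sample paths are hypothetical, not part of ClearML or YOLOv8:

```python
from pathlib import PurePosixPath


def relativize_paths(lines, absolute_root):
    """Rewrite absolute image paths as paths relative to absolute_root.

    Blank lines and lines that are not under absolute_root are returned unchanged.
    """
    root = PurePosixPath(absolute_root)
    result = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            result.append(stripped)
            continue
        try:
            # relative_to raises ValueError when the path is not under root
            relative = PurePosixPath(stripped).relative_to(root)
            result.append("./" + str(relative))
        except ValueError:
            result.append(stripped)
    return result


# Hypothetical paths, as recorded on the machine that created the dataset
original = [
    "/Users/someone/project/data/images/img_001.jpg",
    "/Users/someone/project/data/images/img_002.jpg",
]
print(relativize_paths(original, "/Users/someone/project"))
```

With relative entries, the same .txt file can be used wherever the dataset lands, including the folder returned by get_local_copy() on the remote machine. How the training code resolves relative entries (against the .txt file's folder or the working directory) is worth verifying for your setup.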

  
  
Posted 7 months ago

Is there a way to change the path inside the .txt file to the ClearML cache, because my images are stored only in the ClearML cache?

  
  
Posted 7 months ago

Hi @<1610083503607648256:profile|DiminutiveToad80>
You mean the pipeline logic? It should autodetect the imports of the logic function (like any Task.init call).
You can, however, call Task.force_requirements_env_freeze and pass a local requirements.txt.
Make sure to call it before creating the Pipeline object.
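A rough sketch of that call order, assuming a requirements.txt next to the script. The function name `build_pipeline` and the pipeline and project names are placeholders for illustration only:

```python
def build_pipeline(requirements_path="requirements.txt"):
    # Imported here so the sketch can be loaded even where clearml is not installed
    from clearml import PipelineController, Task

    # Call this first: it tells ClearML to use the given requirements file
    # instead of auto-detecting the imported packages
    Task.force_requirements_env_freeze(force=True, requirements_file=requirements_path)

    # Only afterwards create the pipeline object (names below are hypothetical)
    pipe = PipelineController(
        name="my-pipeline",
        project="my-project",
        version="1.0.0",
    )
    return pipe
```

Calling build_pipeline() needs a configured ClearML server; the point of the sketch is only the ordering of the two calls.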

  
  
Posted 7 months ago

Is there a way to clone the whole pipeline, just like we clone tasks?

  
  
Posted 7 months ago

OK, thanks!

  
  
Posted 7 months ago

I have a pipeline that I am able to run locally. It has a pipeline controller along with 4 tasks: download data, training, testing and predict. How do I execute this whole pipeline remotely so that each task is executed sequentially?

  
  
Posted 7 months ago

Correct. Note that you need two agents: one for the pipeline (logic) and one for the pipeline components.
That said, you can run two agents on the same machine 🙂
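One way to picture the two-agent setup, as a sketch only: the controller is enqueued on one queue and the steps on another, with one agent listening on each. The queue names, project name, and task names below are assumptions based on the four tasks described in this thread, and `run_pipeline_remotely` is a hypothetical helper:

```python
# Agents would be started separately (possibly on the same machine), e.g.:
#   clearml-agent daemon --queue services   (runs the pipeline logic)
#   clearml-agent daemon --queue default    (runs the pipeline components)


def run_pipeline_remotely(controller_queue="services", step_queue="default"):
    # Imported here so the sketch can be loaded even where clearml is not installed
    from clearml import PipelineController

    project = "my-project"  # hypothetical project name
    pipe = PipelineController(name="my-pipeline", project=project, version="1.0.0")

    # Steps without their own queue are sent to this queue (second agent)
    pipe.set_default_execution_queue(step_queue)

    # Each step lists the previous one as its parent, so the steps run in sequence
    previous = None
    for step in ["download data", "training", "testing", "predict"]:
        pipe.add_step(
            name=step,
            parents=[previous] if previous else None,
            base_task_project=project,
            base_task_name=step,  # assumes an existing task with this name
        )
        previous = step

    # Enqueue the controller itself (first agent)
    pipe.start(queue=controller_queue)
```

The controller mostly waits and launches steps, so its agent needs few resources, while the agent serving the step queue does the heavy work.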

  
  
Posted 7 months ago

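Yes, exactly like a Task (a pipeline is a type of task):
```
from clearml import Task
# "pipeline_uid_here" stands for the ID of the pipeline controller task
cloned_pipeline = Task.clone(source_task="pipeline_uid_here")
Task.enqueue(cloned_pipeline, queue_name="<your_queue>")  # queue name is a placeholder
```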

  
  
Posted 7 months ago

So I should clone the pipeline, run the agent and then enqueue the cloned pipeline?

  
  
Posted 7 months ago

Run clearml-agent and enqueue the pipeline? What am I missing?

  
  
Posted 7 months ago

@<1610083503607648256:profile|DiminutiveToad80> I think you need some background on how the agents work.

  
  
Posted 7 months ago

I want to understand what's happening in the backend. How does everything stay in sync when the pipeline logic and the tasks run on separate agents?

  
  
Posted 7 months ago

Can you explain how running two agents would help me run the whole pipeline remotely? Sorry if it's a very basic question.

  
  
Posted 7 months ago

When I run the pipeline remotely, I get the following error message:

There appear to be 6 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

  
  
Posted 7 months ago

For my project I have a dataset on my local system. When I run the pipeline remotely, is there a way for the remote machine to access it?

  
  
Posted 7 months ago

Yeah, you can ignore those. This is some Python GC stuff that seems to be related to the OS and Python version.

  
  
Posted 7 months ago

I am uploading the dataset (for YOLOv8 training) as an artifact. When I download the artifact (.zip file) from the UI, the path to the images is something like /Users/adityachaudhry/.clearml/cache/......, but when I call .get_local_copy() I get the local folder structure where the images are stored on my own system. For running the pipeline remotely I want the path to be like /Users/adityachaudhry/.clearml/cache/......

  
  
Posted 7 months ago

when I am running the pipeline remotely is there a way the remote machine can access it?

Well, for the dataset to be accessible, you need to upload it with the Dataset class. Then the remote machine can call Dataset.get(...).get_local_copy() to get the actual data on the remote machine.
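A minimal sketch of that flow, assuming the data sits in a local folder. The helper names `upload_dataset` and `fetch_dataset`, and the default dataset and project names, are hypothetical:

```python
def upload_dataset(local_folder, dataset_name="yolov8-data", dataset_project="my-project"):
    """Run once on the machine that holds the data; returns the new dataset ID."""
    # Imported here so the sketch can be loaded even where clearml is not installed
    from clearml import Dataset

    ds = Dataset.create(dataset_name=dataset_name, dataset_project=dataset_project)
    ds.add_files(path=local_folder)  # register the files in the folder
    ds.upload()                      # push the files to storage
    ds.finalize()                    # close the dataset version
    return ds.id


def fetch_dataset(dataset_id):
    """Run inside the remote task; returns the local folder holding a cached copy."""
    from clearml import Dataset

    return Dataset.get(dataset_id=dataset_id).get_local_copy()
```

The folder returned by fetch_dataset() is a path on the remote machine, so any paths stored inside the dataset files need to be relative to that folder rather than absolute paths from the uploading machine.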

  
  
Posted 7 months ago

Is there a way to work around this?

  
  
Posted 7 months ago

I think I'm missing the connection between the hash-ids and the txt file. In other words, why does the txt file contain full paths rather than relative paths?

  
  
Posted 7 months ago