Answered
As Soon As I Refactor My Project Into Multiple Folders, Where On Top-Level I Put My Pipeline File, And Keep My Tasks In A Subfolder, The Clearml Agent Seems To Have Problems:

As soon as I refactor my project into multiple folders, where on top-level I put my pipeline file, and keep my tasks in a subfolder, the clearml agent seems to have problems:

  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/commands/worker.py", line 3225, in install_requirements_for_package_api
    raise ValueError("Could not install task requirements!\n{}".format(e))
ValueError: Could not install task requirements!

Is there anything I need to take care of when using multiple directories?

  
  
Posted 4 months ago

Answers 5


Yes, I do have my files in the git repo. Although I have not quite understood which part it takes from the remote git repo, and which part it takes from my local system.
It seems that one also needs to explicitly hand in the git repo in the pipeline and task definitions via PipelineController, since otherwise the agent starts the task process in some arbitrary working directory where, of course, it cannot find any of my own modules.

And a second thing: When starting a pipeline, it seems that ClearML would take my local virtual environment and extract its dependencies from that, independent of the requirements.txt or pyproject.toml I have in my repo. I noticed this because I happened to have installed my own project locally into my virtual env (let's call it foo==0.1). When I started the pipeline, the agent looked for a package called foo==0.1, which it (obviously) didn't find on PyPI, and so the pipeline failed. Can I tell ClearML not to use my local virtual env, but to install what it needs only from the requirements.txt or pyproject.toml files?

And a third thing: The pipeline installs dependencies, and each task installs dependencies as well. How do I avoid redundant dependency installations?

  
  
Posted 4 months ago

Yes, I do have my files in the git repo. Although I have not quite understood which part it takes from the remote git repo, and which part it takes from my local system.

It will do a "git pull" on the remote machine and then apply any uncommitted changes it has stored in the Task.

It seems that one also needs to explicitly hand in the git repo in the pipeline and task definitions via PipelineController,

Correct, unless the pipeline logic and the steps are in the same git repo. You can verify that by clicking on the details of each step and checking what is listed under the repo section in the execution tab.

And a second thing: When starting a pipeline, it seems that ClearML would take my local virtual environment and extract its dependencies from that, independent of the requirements.txt

Correct. If you want to disable this behaviour, pass packages=False to the decorator and the agent will default to the packages in your git repo.
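The two suggestions in this thread (naming the git repo for each step, and passing packages=False) could be combined roughly as below. This is a sketch only: the repo URL, the branch name, and the helper as_component are hypothetical, and the repo / repo_branch keyword names reflect my understanding of PipelineDecorator.component, so check them against the ClearML version you run.

```python
# Settings shared by every pipeline step. All values are placeholders.
COMPONENT_KWARGS = {
    "repo": "https://example.com/my-team/my-project.git",  # hypothetical URL
    "repo_branch": "main",                                  # hypothetical branch
    # Per the answer above: do not snapshot the local virtual env,
    # let the agent fall back to the packages listed in the git repo.
    "packages": False,
}


def as_component(func):
    """Wrap func as a pipeline step using the shared settings (hypothetical helper)."""
    # Imported lazily so this module can be read without clearml installed.
    from clearml import PipelineDecorator

    return PipelineDecorator.component(**COMPONENT_KWARGS)(func)
```

Keeping the settings in one dictionary means every step points at the same repo and branch, which makes the "repo" section in each step's execution tab easy to compare.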

  
  
Posted 4 months ago

Hi @<1724960468822396928:profile|CumbersomeSealion22> , what was the structure that worked previously for you and what is the new structure?

  
  
Posted 4 months ago

Actually, it already fails if my tasks themselves import modules that I wrote myself. For example, let's say I have the following files in my repo:

pipeline.py
task1.py
task2.py
utils.py

and pipeline.py imports task1 and task2, and e.g. task1 imports utils. Then I get a ModuleNotFoundError on the remote agent.
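For illustration, the same symptom can be reproduced locally without ClearML. The sketch below assumes a flat layout like the one above, with placeholder file contents, and simply starts a fresh interpreter from two different working directories. It mirrors the error only; it is not a description of how the agent launches a task.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def try_import_task1(cwd: Path) -> subprocess.CompletedProcess:
    """Run `import task1` in a fresh interpreter started from cwd."""
    return subprocess.run(
        [sys.executable, "-c", "import task1"],
        cwd=cwd,
        capture_output=True,
        text=True,
    )


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "repo"            # stands in for the checked-out repo
    elsewhere = Path(tmp) / "elsewhere"  # stands in for an unrelated working dir
    repo.mkdir()
    elsewhere.mkdir()
    (repo / "utils.py").write_text("VALUE = 42\n")
    (repo / "task1.py").write_text("import utils\nprint(utils.VALUE)\n")

    inside = try_import_task1(repo)        # repo root is on the import path
    outside = try_import_task1(elsewhere)  # repo root is not on the import path

print("from repo root:", inside.stdout.strip())
print("from elsewhere:", outside.stderr.strip().splitlines()[-1])
```

Started from the repo root the import succeeds; started from anywhere else it fails with ModuleNotFoundError, because the interpreter only searches the working directory and the installed packages.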

  
  
Posted 4 months ago

Hi @<1724960468822396928:profile|CumbersomeSealion22>

As soon as I refactor my project into multiple folders, where on top-level I put my pipeline file, and keep my tasks in a subfolder, the clearml agent seems to have problems:

Notice that you need to specify the git repo for each component. If you have a process (step) with more than a single file, you have to have those files inside a git repository, otherwise the agent will not be able to bring them to the remote machine

  
  
Posted 4 months ago