Answered
I Uncommented The Line

I uncommented the line path: ~/.clearml/venvs-cache in the remote agent’s clearml.conf, but the remote agent still keeps re-installing all pkgs in the venv. (The agent uses the requirements.txt.) Any idea why?
I also confirmed that there is a cached venv created, and it says so in the logs
Adding venv into cache: /home/pchalasani/.clearml/venvs-builds/3.8

  
  
Posted 2 years ago

Answers 28


I mean it is in Pip mode and the agent installs deps from git repo that it pulls

  
  
Posted 2 years ago

I see, so there’s no way to launch a variant of my last run (with say some config/code tweaks) via CLI, and have it re-use the cached venv?

  
  
Posted 2 years ago

I mean from the requirements.txt it gets from the git repo

  
  
Posted 2 years ago

… but I have a feeling they will not give me the “instant venv activation” behavior I’m looking for.

  
  
Posted 2 years ago

Yes, after installing, it listed the installed packages in the console, with the version of each

  
  
Posted 2 years ago

But “cloning” via UI runs an exact copy of the code/config, not a variant, unless I edit those via UI (which is not ideal). So it looks like the following workflow, which is trivial to do locally, is not possible via remote agents:
  • run exp
  • tweak code/configs in IDE, or tweak configs via CLI
  • have it re-run in the exact same venv (with no install overhead etc)
So maybe the remote agents are more meant for enqueuing a whole collection of settings (via code) and checking back in a few hours (in which case we don’t care about pip install times)?

  
  
Posted 2 years ago

So net-net, does this mean it’s behaving as expected, or is there something I need to do to enable “full venv cache”? It spends nearly 2 mins, starting from
created virtual environment CPython3.8.10.final.0-64 in 97ms creator CPython3Posix(dest=/home/pchalasani/.clearml/venvs-builds/3.8, clear=False, global=False)
and then printing several lines like this:
` Successfully installed pip-20.1.1
Collecting Cython
Using cached Cython-0.29.30-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.9 MB)
Installing collected packages: Cython
Successfully installed Cython-0.29.30
Trying pip install: /home/pchalasani/.clearml/venvs-builds/3.8/task_repository/repo.git/requirements.txt
Collecting numpy==1.23.1
Using cached numpy-1.23.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
Installing collected packages: numpy
Successfully installed numpy-1.23.1
2022-07-16 14:11:47
Found PyTorch version torch==1.12.0; python_full_version >= "3.7.0" and python_version >= "3.7" matching CUDA version 0
Collecting torch==1.12.0+cpu
File was already downloaded /home/pchalasani/.clearml/pip-download-cache/cu0/torch-1.12.0+cpu-cp38-cp38-linux_x86_64.whl
Successfully downloaded torch

Collecting aiofiles==0.8.0
Using cached aiofiles-0.8.0-py3-none-any.whl (13 kB)
Collecting aiohttp-retry==2.5.2
Using cached aiohttp_retry-2.5.2-py3-none-any.whl (8.0 kB)
and finally prints
Adding venv into cache: /home/pchalasani/.clearml/venvs-builds/3.8
Running task id [a699bbe2e5314cb9bd598ba0bfae8ef0]:
and
Summary - installed python packages:
pip:

  • absl-py==1.1.0
  • aim==3.11.2
  • aim-ui==3.11.2
  • aimrecords==0.0.7
  • aimrocks==0.2.1
    ... `
It seems to me that if it really used the cached virtual env, it shouldn’t be spending 2 mins “installing” already cached/downloaded pkgs. I was expecting it to simply activate the cached venv and get down to running the script. So I think I am missing something here.
  
  
Posted 2 years ago

Oh, I think I know what I missed. When I set --project … --name … they did not match the names I used when I did Task.init() in my code

  
  
Posted 2 years ago

will try what you suggested above

  
  
Posted 2 years ago

So net-net does this mean it’s behaving as expected,

It is as expected.
If no "Installed Packages" are listed, then it cannot pull a cached venv (because requirements.txt is not a full env, and it never analyzed it).
It does, however, create a venv cache based on it (after installing it).
The clone of this Task (i.e. right click in the UI, clone the experiment, enqueue it) will use the cached copy because the full packages are listed in the "Installed Packages" section of the Task.
Make sense?

BTW:

it shouldn’t be spending 2 mins “installing” already cached/downloaded pkgs.

Unfortunately, pip is not very efficient at installing packages, even if they are already downloaded, hence the cache in the first place.
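
To illustrate why a full "Installed Packages" list matters for the cache, here is a minimal sketch of a venv cache keyed on the exact pinned package list. This is not ClearML's actual implementation; `cache_key`, `lookup_venv`, the in-memory `cache` dict, and the example path are all hypothetical. The point is only that an empty or unpinned requirements list gives nothing stable to look up, while a complete pip freeze does.

```python
import hashlib
from typing import Dict, List, Optional


def cache_key(python_version: str, frozen_packages: List[str]) -> Optional[str]:
    """Return a stable key for a fully pinned environment, or None if it is not fully pinned."""
    packages = sorted(p.strip().lower() for p in frozen_packages if p.strip())
    # Without a complete "name==version" list there is nothing reliable to match on.
    if not packages or any("==" not in p for p in packages):
        return None
    text = "\n".join([python_version] + packages)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def lookup_venv(cache: Dict[str, str], python_version: str, frozen_packages: List[str]) -> Optional[str]:
    """Return the cached venv path for this exact environment, if one exists."""
    key = cache_key(python_version, frozen_packages)
    return cache.get(key) if key is not None else None


cache: Dict[str, str] = {}
first_run: List[str] = []  # hypothetical first run: "Installed Packages" is still empty
frozen = ["numpy==1.23.1", "Cython==0.29.30", "aiofiles==0.8.0"]  # recorded after install

miss = lookup_venv(cache, "3.8", first_run)  # nothing to look up, so a full install happens
cache[cache_key("3.8", frozen)] = "~/.clearml/venvs-cache/example"  # hypothetical cache entry
hit = lookup_venv(cache, "3.8", list(reversed(frozen)))  # a clone with the same list reuses it
```

In this sketch the first run misses because its package list is empty, and the cloned run hits because it carries the full pinned list, regardless of the order in which the packages are listed.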

  
  
Posted 2 years ago

A quick note for others who may visit this… it looks like you have to do:
Task.force_requirements_env_freeze(force=True, requirements_file="requirements.txt")
to ensure any changes in requirements.txt are reflected in the remote venv

  
  
Posted 2 years ago

Actually that did not help. Still same behavior

  
  
Posted 2 years ago

I see, so there’s no way to launch a variant of my last run (with say some config/code tweaks) via CLI, and have it re-use the cached venv?

Try:
clearml-task ... --requirements requirements.txt
You can also clone / override args with
clearml-task --base-task-id <ID-of-original-task-post-agent> --args ...
See full doc: https://clear.ml/docs/latest/docs/apps/clearml_task/
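
For a CLI-driven workflow, the clone-and-override command above can be assembled from a small script. The following is a rough sketch, assuming that `clearml-task` also accepts `--name` and `--queue` and that `--args` takes space-separated key=value pairs; check `clearml-task --help` on your installation. The helper `build_rerun_command`, the task name, and the queue name are hypothetical.

```python
import shlex
from typing import Dict, List


def build_rerun_command(base_task_id: str, name: str, queue: str, overrides: Dict[str, object]) -> List[str]:
    """Build a clearml-task command that clones an executed Task with new argument values."""
    cmd = ["clearml-task", "--base-task-id", base_task_id, "--name", name, "--queue", queue]
    if overrides:
        # Assumed format: --args followed by key=value pairs.
        cmd.append("--args")
        cmd.extend(f"{key}={value}" for key, value in overrides.items())
    return cmd


cmd = build_rerun_command(
    base_task_id="<ID-of-original-task-post-agent>",
    name="variant of last run",  # hypothetical task name
    queue="default",  # hypothetical queue name
    overrides={"lr": 0.003, "batch_size": 64},
)
print(shlex.join(cmd))
# To actually launch it: subprocess.run(cmd, check=True)
```

Because the base Task was already executed by an agent, its "Installed Packages" list is complete, which is what allows the cached venv to be reused.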

  
  
Posted 2 years ago

BTW: why use the CLI? The idea of ClearML is that it becomes part of the code, even during development: you add "Task.init(...)" at the beginning of the code, which creates the Tasks and logs them as part of development. This means that executing them is essentially cloning and enqueuing in the UI. Of course, you can also automate it directly as part of the code.

  
  
Posted 2 years ago

This works great, thanks AgitatedDove14 👍

  
  
Posted 2 years ago

Actually with base-task-id it uses the cached venv, thanks for this suggestion! Seems like this is equivalent to cloning via UI.
And I will look into the non-cli workflow you’re suggesting.

  
  
Posted 2 years ago

HurtWoodpecker30

The agent uses the requirements.txt )

what do you mean by that? Aren't the packages listed in the "Installed packages" section of the Task?
(or is it empty when starting, i.e. it uses the requirements.txt from the github, and then the agent lists them back into the Task)

  
  
Posted 2 years ago

Actually with base-task-id it uses the cached venv, thanks for this suggestion! Seems like this is equivalent to cloning via UI.

exactly !

But “cloning” via UI runs an exact copy of the code/config, not a variant,

You can override the commit/branch and get the latest ...

run exp tweak code/configs in IDE, or tweak configs via CLI have it re-rerun in exact same venv (with no install overhead etc)

So you can actually launch it remotely directly from the code:
Basically, at any point in your code (after you see everything is okay and not crashing), add:
task.execute_remotely(queue_name="my_execution_queue_here")
This will stop the current execution and relaunch it on the remote agent. This process also automatically captures the packages you are using in your own setup, so it does not rely on a "requirements.txt" that people often forget to update. Obviously it will also cache the venv.
wdyt?
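
A minimal sketch of how this could be wired to a command-line switch, assuming the documented `Task.execute_remotely(queue_name, clone, exit_process)` signature. The helper `maybe_run_remotely`, the `--remote` and `--queue` flags, and the project/task names are hypothetical.

```python
import argparse


def maybe_run_remotely(task, remote: bool, queue_name: str) -> bool:
    """Hand the run over to a remote agent when `remote` is set; otherwise keep running locally."""
    if not remote:
        return False
    # clone=False stops the local run and continues on the agent, logging to the same Task.
    # With a real Task and exit_process=True, the local process exits inside this call.
    task.execute_remotely(queue_name=queue_name, clone=False, exit_process=True)
    return True


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--remote", action="store_true")
    parser.add_argument("--queue", default="my_execution_queue_here")
    args = parser.parse_args()

    from clearml import Task  # imported here so the helper above has no hard dependency

    task = Task.init(project_name="examples", task_name="remote demo")  # hypothetical names
    # ... quick local sanity checks go here ...
    maybe_run_remotely(task, args.remote, args.queue)
    # ... training code: runs locally, or on the agent after the relaunch ...

# Call main() from your script's entry point.
```

Keeping the hand-off in a small helper means the same script serves both local debugging and remote execution, with the environment captured from the local setup rather than from requirements.txt.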

  
  
Posted 2 years ago

I'm not familiar with the “installed packages” list in the task

  
  
Posted 2 years ago

Thanks for the quick response. Will look into this later, I think I understand

  
  
Posted 2 years ago

HurtWoodpecker30 in order to activate the venv cache, the agent uses the full "pip freeze" it stores in the "Installed Packages" section. This means that when you clone a Task that was already executed, you will see it using the cached venv.
(BTW: the packages themselves are cached locally, meaning no time is spent on downloading, just on installing, but this is also time consuming, hence the full venv cache feature.)
Make sense?

  
  
Posted 2 years ago

I think I’m starting to “get” this 🙂

  
  
Posted 2 years ago

I use a CLI arg remote=True so depending on that it will run locally or remotely.

  
  
Posted 2 years ago

The cached venv path .clearml/venvs-cache is not mentioned anywhere in the logs

  
  
Posted 2 years ago

Nice ! 🙂
btw: clone=True means creating a copy of the running Task, but basically there is no need for that; with clone=False, it will stop the running process and launch it on the remote host, logging everything on the original Task.

  
  
Posted 2 years ago

I have a strong attachment to a workflow based on CLI, nice zsh auto-suggestions, Hydra and the like. Hence why I moved away from dvc 🙂

  
  
Posted 2 years ago

That seems pretty powerful, will give that a try, thanks

  
  
Posted 2 years ago

I used
task.execute_remotely(queue_name=..., clone=True)
and indeed it instantly activates the venv on the remote. I assume clone=True is fine

  
  
Posted 2 years ago