I mean it is in Pip mode and the agent installs deps from git repo that it pulls
I see, so there’s no way to launch a variant of my last run (with say some config/code tweaks) via CLI, and have it re-use the cached venv?
I mean from the requirements.txt it gets from the git repo
… but I have a feeling they will not give me the “instant venv activation” behavior I’m looking for.
Yes, after installing, it listed the installed packages in the console, with the version of each
But “cloning” via UI runs an exact copy of the code/config, not a variant, unless I edit those via UI (which is not ideal). So it looks like the following workflow that is trivial to do locally is not possible via remote agents:
run exp
tweak code/configs in IDE, or tweak configs via CLI
have it re-run in the exact same venv (with no install overhead etc.)
So maybe the remote agents are meant more for enqueuing a whole collection of settings (via code) and checking back in a few hours (in which case we don’t care about pip install times)?
So net-net does this mean it’s behaving as expected, or is there something I need to do to enable “full venv cache”? It spends nearly 2 mins starting from
created virtual environment CPython3.8.10.final.0-64 in 97ms creator CPython3Posix(dest=/home/pchalasani/.clearml/venvs-builds/3.8, clear=False, global=False)
and then printing several lines like this
` Successfully installed pip-20.1.1
Collecting Cython
Using cached Cython-0.29.30-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.9 MB)
Installing collected packages: Cython
Successfully installed Cython-0.29.30
Trying pip install: /home/pchalasani/.clearml/venvs-builds/3.8/task_repository/repo.git/requirements.txt
Collecting numpy==1.23.1
Using cached numpy-1.23.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
Installing collected packages: numpy
Successfully installed numpy-1.23.1
2022-07-16 14:11:47
Found PyTorch version torch==1.12.0; python_full_version >= "3.7.0" and python_version >= "3.7" matching CUDA version 0
Collecting torch==1.12.0+cpu
File was already downloaded /home/pchalasani/.clearml/pip-download-cache/cu0/torch-1.12.0+cpu-cp38-cp38-linux_x86_64.whl
Successfully downloaded torch
Collecting aiofiles==0.8.0
Using cached aiofiles-0.8.0-py3-none-any.whl (13 kB)
Collecting aiohttp-retry==2.5.2
Using cached aiohttp_retry-2.5.2-py3-none-any.whl (8.0 kB) `
and finally prints
` Adding venv into cache: /home/pchalasani/.clearml/venvs-builds/3.8
Running task id [a699bbe2e5314cb9bd598ba0bfae8ef0]: `
and
` Summary - installed python packages:
pip:
- absl-py==1.1.0
- aim==3.11.2
- aim-ui==3.11.2
- aimrecords==0.0.7
- aimrocks==0.2.1
... `
It seems to me that if it really used the cached virtual env, it shouldn’t be spending 2 mins “installing” already cached/downloaded pkgs. I was expecting it to simply activate the cached venv and get down to running the script. So I think I am missing something here.
Oh, I think I know what I missed. When I set --project … --name …, they did not match the names I used when I called Task.init() in my code.
So net-net does this mean it’s behaving as expected,
It is as expected.
If no "Installed Packages" are listed, then it cannot pull a cached venv (because requirements.txt is not a full env, and it never analyzed it).
It does, however, create a venv cache based on it (after installing it).
A clone of this Task (i.e. right-click in the UI, clone the experiment, enqueue it) will use the cached copy, because the full package list is in the "Installed Packages" section of the Task.
Make sense?
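To make the reasoning above concrete, here is a minimal sketch of why a cache lookup needs a fully pinned package list. It is an illustration only and not the agent's actual implementation: the function names, the hashing scheme, and the example path are all assumptions.

```python
import hashlib
from typing import Dict, Iterable, Optional


def venv_cache_key(frozen_packages: Iterable[str], python_version: str) -> Optional[str]:
    """Return a deterministic key for a fully pinned package list.

    Hypothetical helper: returns None when the list is empty or not fully
    pinned, mirroring the case where "Installed Packages" is not populated.
    """
    lines = sorted(p.strip().lower() for p in frozen_packages if p.strip())
    if not lines or any("==" not in p for p in lines):
        return None  # nothing reliable to match against, so no cache lookup
    payload = "\n".join([python_version] + lines)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def lookup_venv(cache: Dict[str, str], frozen_packages: Iterable[str],
                python_version: str) -> Optional[str]:
    """Return the cached venv path for this exact environment, if any."""
    key = venv_cache_key(frozen_packages, python_version)
    return cache.get(key) if key else None


# First run: no package list yet, so there is nothing to look up.
print(lookup_venv({}, [], "3.8"))

# After the first run the full freeze is known and can be used as a key.
freeze = ["Cython==0.29.30", "numpy==1.23.1"]
cache = {venv_cache_key(freeze, "3.8"): "/path/to/cached/venv"}  # hypothetical path
print(lookup_venv(cache, freeze, "3.8"))
```

The point of the sketch is only that the first run from a bare requirements.txt has no complete list to match on, while a clone of an executed Task does.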
BTW:
it shouldn’t be spending 2 mins “installing” already cached/downloaded pkgs.
Unfortunately, pip is not very efficient at installing packages, even if they are already downloaded, hence the cache in the first place.
A quick note for others who may visit this… it looks like you have to do:
Task.force_requirements_env_freeze(force=True, requirements_file="requirements.txt")
to ensure any changes in requirements.txt are reflected in the remote venv
Actually that did not help. Still same behavior
I see, so there’s no way to launch a variant of my last run (with say some config/code tweaks) via CLI, and have it re-use the cached venv?
Try:
clearml-task ... --requirements requirements.txt
You can also clone / override args with
clearml-task --base-task-id <ID-of-original-task-post-agent> --args ...
See full doc: https://clear.ml/docs/latest/docs/apps/clearml_task/
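For anyone scripting this, a small sketch of assembling the clone-and-override command from Python. The helper name, the argument names (`lr`, `epochs`), the task name and the queue name are hypothetical; only `--base-task-id` and `--args` come from the message above, and `--name` / `--queue` are assumed to behave as described in `clearml-task --help`, so check that before relying on them.

```python
import shlex
from typing import List, Mapping


def build_clearml_task_cmd(base_task_id: str, name: str,
                           overrides: Mapping[str, object], queue: str) -> List[str]:
    """Build the argv for cloning an executed task with argument overrides.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    cmd = ["clearml-task", "--base-task-id", base_task_id,
           "--name", name, "--queue", queue]
    if overrides:
        cmd.append("--args")
        # Each override is passed as key=value
        cmd.extend(f"{key}={value}" for key, value in overrides.items())
    return cmd


cmd = build_clearml_task_cmd(
    "<ID-of-original-task-post-agent>",  # placeholder, as in the message above
    "variant-run",                       # hypothetical task name
    {"lr": 0.01, "epochs": 5},           # hypothetical argument names
    "default",                           # hypothetical queue name
)
print(shlex.join(cmd))
# To actually launch it: subprocess.run(cmd, check=True)
```

Because the new task is a clone of one the agent already executed, it carries the full package list, which is what lets the agent reuse the cached venv.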
BTW: why use the CLI? The idea of ClearML is that it becomes part of the code, even in the development process. This means adding "Task.init(...)" at the beginning of the code, which creates the Tasks and logs them as part of development. Executing them is then essentially cloning and enqueuing in the UI. Of course, you can also automate it directly as part of the code.
This works great, thanks AgitatedDove14 👍
Actually with base-task-id it uses the cached venv, thanks for this suggestion! Seems like this is equivalent to cloning via UI.
And I will look into the non-cli workflow you’re suggesting.
HurtWoodpecker30
The agent uses the requirements.txt
What do you mean by that? Aren't the packages listed in the "Installed packages" section of the Task?
(or is it empty when starting, i.e. it uses the requirements.txt from the github, and then the agent lists them back into the Task)
Actually with base-task-id it uses the cached venv, thanks for this suggestion! Seems like this is equivalent to cloning via UI.
exactly !
But “cloning” via UI runs an exact copy of the code/config, not a variant,
You can override the commit/branch and get the latest ...
run exp, tweak code/configs in IDE or tweak configs via CLI, have it re-run in exact same venv (with no install overhead etc)
So you can actually launch it remotely directly from the code:
Basically at any point in your code (after you see everything is okay and not crashing), add:
task.execute_remotely(queue_name="my_execution_queue_here")
This will stop the current execution, and relaunch it on the remote agent. This process also automatically captures the packages you are using in your own setup, so it does not rely on "requirements.txt" that people often forget to update. Obviously it will also cache the venv.
wdyt?
I'm not familiar with the “installed packages” list in the task.
Thanks for the quick response. Will look into this later, I think I understand.
HurtWoodpecker30 in order to activate the venv cache, the agent uses the full "pip freeze" stored in the "installed packages" section. This means that when you clone a Task that was already executed, you will see it using the cached venv.
(BTW: the packages themselves are cached locally, meaning no time is spent on downloading just on installing, but this is also time consuming, hence the full venv cache feature).
Make sense ?
I use a CLI arg remote=True so depending on that it will run locally or remotely.
The cached venv path .clearml/venvs-cache is nowhere mentioned in the logs
Nice ! 🙂
btw: clone=True means creating a copy of the running Task, but basically there is no need for that. With clone=False, it will stop the running process and launch it on the remote host, logging everything on the original Task.
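A minimal sketch of the "remote flag" pattern discussed in this thread, written against any object that exposes `execute_remotely` so it can be tried without a server. The helper name and the `remote` parameter are assumptions; the `queue_name`, `clone` and `exit_process` keywords are the ones on ClearML's `Task.execute_remotely`, but treat the exact behavior as something to confirm in the docs.

```python
from typing import Any


def maybe_run_remotely(task: Any, remote: bool, queue_name: str,
                       clone: bool = False) -> bool:
    """Hand the run over to a remote agent when `remote` is set.

    Hypothetical helper. With clone=False the original Task is reused,
    as described above; with clone=True a copy of the Task is enqueued.
    Returns True if remote execution was requested, False otherwise.
    """
    if not remote:
        return False  # keep running locally
    # On a real ClearML Task with exit_process=True, the local process
    # is expected to stop here and the agent picks the task up.
    task.execute_remotely(queue_name=queue_name, clone=clone, exit_process=True)
    return True


# Intended usage with ClearML (not executed here; names are placeholders):
#   from clearml import Task
#   task = Task.init(project_name="my_project", task_name="my_experiment")
#   maybe_run_remotely(task, remote=True, queue_name="my_execution_queue_here")
```

Calling it right after `Task.init(...)` and a quick sanity check of the script keeps the local part of the run short before the hand-off.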
I have a strong attachment to a workflow based on CLI, nice zsh auto-suggestions, Hydra and the like. Hence why I moved away from dvc 🙂
That seems pretty powerful, will give that a try, thanks
I used
task.execute_remotely(queue_name=..., clone=True)
and indeed it instantly activates the venv on the remote. I assume clone=True is fine