As a workaround I just stick the epoch number in the series argument of report_scatter2d, with the same title name
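A minimal sketch of that workaround, in case it helps: each epoch is reported as its own series under one shared title, so the epochs end up in the same plot. The helper name `report_epoch_scatter`, the `epoch_NNN` naming, the fixed `iteration=0`, and the `RecordingLogger` stand-in (there only so the snippet runs without a ClearML server) are assumptions for illustration, not something from the thread.

```python
class RecordingLogger:
    """Hypothetical stand-in that records calls instead of sending them.

    With ClearML installed you would pass a real logger instead, e.g.
    Task.current_task().get_logger().
    """

    def __init__(self):
        self.calls = []

    def report_scatter2d(self, title, series, scatter, iteration=None, **kwargs):
        self.calls.append(
            {"title": title, "series": series, "scatter": scatter, "iteration": iteration}
        )


def report_epoch_scatter(logger, title, epoch, points):
    """Report one epoch's points as a separate series under a shared title."""
    # Assumption: zero-padded names so the series sort in epoch order.
    series = f"epoch_{epoch:03d}"
    logger.report_scatter2d(
        title=title,      # same title for every epoch
        series=series,    # the epoch number goes into the series name
        scatter=points,   # list of (x, y) pairs
        iteration=0,      # assumption: iteration is kept fixed
    )
    return series


if __name__ == "__main__":
    logger = RecordingLogger()
    for epoch in range(3):
        report_epoch_scatter(logger, "embeddings", epoch, [(0.0, float(epoch))])
    print([call["series"] for call in logger.calls])
```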
Of course conda needs to be installed, it is using a pre-existing conda env, no?! What am I missing?
It's not a conda env, just a regular venv (Poetry in this specific case)
And the assumption is the code is also there?
yes. The user is responsible for the entire setup. the agent just executes python <path to script> <current hpo args>
Regardless, it would be very convenient to add a flag to the agent that points it to an existing virtual environment and bypasses the entire setup process. This would make it easier to ramp up new users to clearml who don't want the bells and whistles and would just like to run a simple HPO from an existing env (which may not even exist as part of a git repo)
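To make the request concrete, here is a rough sketch of the "simple agent" behaviour described above: it runs python <path to script> <current hpo args> with the interpreter of an environment the user has already set up, and nothing else. This is not how clearml-agent is implemented; the function names and the `--name=value` argument format are assumptions for illustration.

```python
import subprocess


def build_command(python_bin, script, hpo_args):
    """Build: <python of existing venv> <path to script> <current hpo args>."""
    cmd = [str(python_bin), str(script)]
    for name, value in hpo_args.items():
        # Assumption: the training script accepts --name=value style arguments.
        cmd.append(f"--{name}={value}")
    return cmd


def run_trial(python_bin, script, hpo_args):
    """Run one trial in the existing environment and return its exit code.

    No venv is created and no packages are installed; the user is
    responsible for the entire setup, including the code being present.
    """
    return subprocess.run(build_command(python_bin, script, hpo_args)).returncode


if __name__ == "__main__":
    # Hypothetical paths, only to show the resulting command line.
    print(build_command("/opt/venv/bin/python", "train.py", {"lr": 0.01, "batch_size": 32}))
```

Passing the interpreter path explicitly is what lets the same launcher work for a plain venv, a Poetry env, or anything else, since nothing here depends on conda or pip.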
I see what you mean. So in a simple "all-or-nothing" solution I have to choose between potentially starving either the single node tasks (high priority + wait) or multi-node tasks (wait for a time when there are enough available agents and only then allocate the resource).
I actually meant NCCL. nvcc is the CUDA compiler
NCCL communication can be both inter- and intra- node
It's a very convenient way of doing a parameter sweep with minimal setup effort
another question - when running a non-dockerized agent and setting CLEARML_AGENT_SKIP_PIP_VENV_INSTALL, I still see things being installed when the experiment starts. Why does that happen?
I think so. IMHO all API calls should maybe reside in a different module since they usually happen inside some control code
You mean running everything on a single machine (manually)?
Yes, but not limited to this.
I want to be able to install the venv on multiple servers and start the "simple" agents on each one of them. You can think of it as some kind of one-off agent for a specific (distributed) hyperparameter search task
Thanks AgitatedDove14. I'll try that
great!
Is there a way to add this for an existing task's draft via the web UI?
that was my next question
How does this design work with a stateful search algorithm?
An easier fix for now will probably be some kind of warning to the user that a task is created but not connected
oops. I used create instead of init 😳
The legacy version worked just before I mv'ed the folder, but now (after reverting to the old name) that doesn't work either 😢
I'm trying to achieve a workflow similar to the one in wandb for parameter sweeps, where there are no venvs involved other than the one created by the user
the hack doesn't work if conda is not installed
how does this work with HPO?
the tasks are generated in advance?