Hi VirtuousFish83
Apologies for the state of the docs :) It sounds complicated, but it should actually be relatively simple. From what I understand, you already have the server set up and your code integrated. The question is: can you see an experiment in the UI? If you can, right-click it, clone the experiment, edit the parameters, and send it for execution (enqueue). If the experiment is not in the UI, you can either (1) run the code with the Task.init call, which will automatically populate everything you need (Python packages, git repo, etc.), or (2) manually provide the necessary definition with clearml-task (i.e. git repo, Python packages if you are not using requirements.txt, arguments, etc.).
Was this what you were looking for?
Not exactly. I want to launch the script (create a new experiment, not clone an existing one in the UI). How can I do it?
Ohh, two options:
Option A: from the script itself you can do:
from clearml import Task
task = Task.init(...)
task.execute_remotely(queue_name='default')
Then run the script locally; it will run until the execute_remotely call, quit the process, and re-launch it on the "default" queue.
Option B:
Use clearml-task:
$ clearml-task --folder <where the script is> --project ...
See https://github.com/allegroai/clearml/blob/master/docs/clearml-task.md#launching-a-job-from-a-local-script
Thanks! I think .execute_remotely() is exactly what I need.
The task is registered and is started by the agent, the env seems to be installed well, but then it fails on /home/ubuntu/.clearml/venvs-builds/3.8/bin/python: can't open file 'fastai_classifier.py': [Errno 2] No such file or directory
Do you have any idea what could be wrong? Does the agent launch the script in the wrong working dir? Is the repo not copied? (The script is inside a private git repo, which clearml detects correctly.)
I also tried launching the script from the root of the repo but it's the same: /home/ubuntu/.clearml/venvs-builds/3.8/bin/python: can't open file 'train/classification/fastai_classifier.py': [Errno 2] No such file or directory
Could it be that the code is not in a git repository? clearml supports either a single script or a git repository, but not a collection of standalone files. wdyt?
The script is inside a git repo (and it's the one I launch; I would get an ImportError if something else were missing).
VirtuousFish83 Hmm, that is odd. Could you send the full log?
I think I found the problem, if the file is untracked by git, it is not saved by clearml
However, I have another problem: my git repo is installed with pip install -e . and I import it in my script, but on a task executed by a clearml-agent the module appears not to be installed.
Does clearml-agent install the repo with pip install -e . if it should? (i.e. my local repo is installed with pip install -e . where I launch my script, which calls Task.init and .execute_remotely().)
if the file is untracked by git, it is not saved by clearml
Yep :)
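A quick way to catch this before launching is to ask git which files it does not track. The snippet below is only a sketch under stated assumptions: it relies on the standard `git ls-files --others --exclude-standard` command, and the helper names (`list_untracked`, `is_untracked`) are made up for illustration and are not part of clearml.

```python
import subprocess
from pathlib import PurePosixPath


def list_untracked(repo_root="."):
    """Return the raw output of git listing untracked, non-ignored files."""
    result = subprocess.run(
        ["git", "ls-files", "--others", "--exclude-standard"],
        cwd=repo_root,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def is_untracked(script, ls_files_output):
    """Check whether `script` (relative to the repo root) appears in that output."""
    untracked = {line.strip() for line in ls_files_output.splitlines() if line.strip()}
    return PurePosixPath(script).as_posix() in untracked


# Hypothetical usage from the repo root, before calling the training script:
# if is_untracked("train/classification/fastai_classifier.py", list_untracked()):
#     print("Script is untracked: git add and commit it first")
```

Paths with unusual characters may be quoted by git, so treat this as a convenience check rather than a guarantee.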
Does clearml-agent install the repo with pip install -e .
It is supported, but the path to the repo cannot be absolute (as it will probably be something else in the agent env)
You can add "git+https://github.com ...." to the "installed packages". The root path of your repository is always added to the PYTHONPATH when the agent executes it, so in theory there is no need to install it with pip. wdyt?
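To see why having the repo root on PYTHONPATH can stand in for pip install -e ., here is a small self-contained sketch. It does not use clearml at all: it builds a throwaway "repo" in a temp directory, sets PYTHONPATH by hand (mimicking what the agent is described as doing), and tries the import from a subfolder. The package name `my_repo_pkg` and the function name are hypothetical.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def can_import_from_repo_root(package="my_repo_pkg"):
    """Return True if `package` imports from a subfolder when PYTHONPATH is the repo root."""
    with tempfile.TemporaryDirectory() as repo_root:
        # Fake repository layout: <root>/<package>/__init__.py and <root>/train/
        pkg_dir = Path(repo_root) / package
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("VALUE = 1\n")
        train_dir = Path(repo_root) / "train"
        train_dir.mkdir()

        # Put the repo root on PYTHONPATH, then import from inside the subfolder.
        env = dict(os.environ, PYTHONPATH=repo_root)
        result = subprocess.run(
            [sys.executable, "-c", f"import {package}"],
            cwd=train_dir,
            env=env,
        )
        return result.returncode == 0
```

This only covers plain imports; anything that pip install -e . sets up beyond importability (console scripts, declared dependencies) is not reproduced by PYTHONPATH alone.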
Hmm, apparently if I launch the script from the root of the repo (CWD: myrepo, python train/classif-custom/train.py) it works, but from its own dir it doesn't (CWD: myrepo/train/classif-custom, python train.py).
You can change the CWD folder: if you put . in the working dir it will be the root of the git repo, but you can use any subfolder. Obviously you need to change the script path to match the folder, e.g. ./folder/script.py etc.
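As a rough illustration of how the working dir and the script path have to agree, the sketch below computes the script path for a chosen working dir. It assumes the script path is interpreted relative to the working dir and that both are given relative to the repo root; the function name `entry_point_for` is hypothetical and this is not clearml's own logic.

```python
from pathlib import PurePosixPath


def entry_point_for(script_in_repo, working_dir="."):
    """Return (working_dir, script_path), with script_path relative to working_dir.

    Both arguments are paths relative to the root of the git repo.
    Raises ValueError if the script does not live under working_dir.
    """
    script = PurePosixPath(script_in_repo)
    if working_dir in ("", "."):
        # Working dir is the repo root: the script path is used unchanged.
        return (".", str(script))
    cwd = PurePosixPath(working_dir)
    return (str(cwd), str(script.relative_to(cwd)))


# Working dir at the repo root:
print(entry_point_for("train/classif-custom/train.py"))
# Working dir in the script's own folder:
print(entry_point_for("train/classif-custom/train.py", "train/classif-custom"))
```

With the repo root as the working dir the script keeps its full path; with the subfolder as the working dir only the file name remains.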
I think I didn't understand: if I'm not at the root of the repo, do I have to specify the working dir?
What does the folder structure look like, and where are the "package" and the entry script?