As for (2), I'm not sure I understand the use case. When using ClearML Agent, the agent picks up experiments waiting in a queue, so I'm not sure what you intend by first running the agent and then running the docker container manually next to it.
Thank you, it worked, so I am halfway through.
Is there a solution where I can use the clearml-task command directly? It would help to kick off the experiment from the GitLab CI.
++ -- --script
I don't think it is possible, right? Given that I would have to mount the config file every time?
So the flow, if you want to use clearml-task, is as follows:
You install ClearML Agent on your worker machine (DGX, in your case). The agent monitors the ClearML Server for one or more specific queues and waits for tasks to be enqueued there. The agent should be configured with the correct clearml.conf file so that it can access the server.
You use clearml-task to create new tasks. clearml-task creates a task as you specify and enqueues it to the queue of your choice. The agent picks up the task and starts executing it on the machine, using the same configuration file you provided to the agent.
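Since you mentioned kicking this off from GitLab CI, here is a minimal sketch of how the task-creation step could be wrapped in a small Python helper. It only assembles the clearml-task command line from the flags already used in this thread (--project, --name, --repo, --commit, --script, --queue). The helper name build_clearml_task_command, the example.com repository URL, and all other values are hypothetical placeholders, and it assumes clearml-task is installed and a valid clearml.conf is available on the machine that runs it.

```python
import shlex
from typing import List, Optional


def build_clearml_task_command(
    project: str,
    name: str,
    repo: str,
    script: str,
    queue: str,
    commit: Optional[str] = None,
) -> List[str]:
    """Return the argv list for a clearml-task call that creates and enqueues a task.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    cmd = [
        "clearml-task",
        "--project", project,
        "--name", name,
        "--repo", repo,
        "--script", script,
        "--queue", queue,
    ]
    # Pin a specific commit only when one is given.
    if commit:
        cmd += ["--commit", commit]
    return cmd


if __name__ == "__main__":
    # All values below are placeholders, not real project settings.
    cmd = build_clearml_task_command(
        project="name",
        name="task_name",
        repo="https://example.com/username/project.git",
        script="path/to/script.py",
        queue="queue",
        commit="commit_sha",
    )
    # Print the command for inspection instead of executing it.
    print(shlex.join(cmd))
    # To actually submit the task, something like
    # subprocess.run(cmd, check=True) could be used here.
```

The agent side stays unchanged: it keeps listening on the queue and picks up whatever the CI job enqueues.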
Hello again, and sorry for the delay.
I tried what you told me and got it to work, but one issue I have is that I want to use SSH when cloning the repo:
clearml-task --project name --name task_name --repo firstname.lastname@example.org:username/project.git --commit commit_sha --script path/to/script.py --queue queue
This doesn't work; it fails with:
Error: Script entrypoint file '/email@example.com:username/project.git/path/to/script.py' could not be found.
To configure a specific docker image to use when running tasks in the ClearML Agent Docker mode, see default_docker here: https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_agent_install_configure.html#adding-clearml-agent-to-a-configuration-file