ShinyRabbit94, Hi 🙂
Yes. Please note that the machine you run the agent on needs to have all the resources required to run your experiments (GPU, etc.).
Enqueue simply puts the task in the queue to be picked up by an agent. You need an agent running and listening to the queue for it to be picked up 🙂
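For what it's worth, here is a minimal sketch of how you could check whether an agent actually picks the task up after enqueuing. It relies on `Task.enqueue` and `task.get_status()` from the ClearML SDK; the helper name `enqueue_and_watch` and its `task_cls`, `timeout_sec` and `poll_sec` parameters are made up for illustration, and it assumes the status string is `"queued"` while the task waits in the queue.

```python
import time


def enqueue_and_watch(task, queue_name, task_cls=None, timeout_sec=60, poll_sec=5):
    """Enqueue `task`, then poll until it leaves the 'queued' state or we time out.

    Hypothetical helper, not part of ClearML. Returns the last status seen.
    """
    if task_cls is None:
        # Imported lazily so this sketch can be loaded without clearml installed.
        from clearml import Task as task_cls

    # Enqueuing only places the task in the queue; it does not execute anything.
    task_cls.enqueue(task, queue_name=queue_name)

    deadline = time.monotonic() + timeout_sec
    while True:
        status = task.get_status()
        # Anything other than "queued" means an agent pulled the task
        # (or the task was otherwise moved out of the queue).
        if status != "queued" or time.monotonic() >= deadline:
            return status
        time.sleep(poll_sec)
```

If the returned status is still `"queued"`, most likely no agent is listening on that queue; starting one with `clearml-agent daemon --queue training_queue` should get the task picked up.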
To add to Natan's answer, you can run anything on the services docker, depending on the hardware. We don't recommend training with it, as the server's machine might get overloaded. What you can do is simple stuff like cleanup or other routine jobs 🙂
Maybe it is some sort of misunderstanding on my side? I thought `Task.enqueue(task, queue_name="training_queue")` is what starts the execution of the task. Do I need another function?
It seems the agent does not like working with scripts located inside a git repository. I moved the requirements and the script into a folder without a .git and it works now, thank you!
Thank you! Is there a way to test the agent on a machine without a GPU?
When running this little script, I can see my agent installing the requirements, but it never seems to start running the task:
` task = Task.create(project_name="train", task_name="train", requirements_file="./requirements.txt", repo="")
task.set_script(entry_point="./test.py")
Task.enqueue(task, queue_name="training_queue") `
The logs are as follows:
` Starting Task Execution:
ClearML results page:
Leaving process id 1863263
DONE: Running task '89359e55ffe942a3bfa7cc72b2e0357d', exit status 0 `
Does it enqueue the task? From what you posted, it should simply create a task and then enqueue it, without any further action.