Hey Guys,
Context: I was able to host the ClearML server (web server, file server, API server) in an AWS EC2 instance via terraform IaC and GitHub actions and the web server is accessible at the http://<Public-IP of the EC2 instance>:8080/
as I followed referring to this documentation page , And I was able to run experiments from my local and make API calls to the ClearML server hosted in EC2 and the experiments ran fine in this setup and were getting updated in the web server UI as well and teh experiments ar erunning on teh EC2 instance's compute.
Problem: Then I tried exploring ClearML Agentic runs for remote execution of tasks (a demo python file) for which I created a queue and ran clearml-agent daemon --queue <queue_name>
, which assigned a worker to it, and I then enqueued a task to that queue, and the worker picked that task up for execution, and it starts spinning up the same environment as in my local machine's VSCode by installing the dependencies cloning the GitHub repo and specific branch, but it isn't able to clone the whole thing properly I think so
I would really appreciate some help.
Here is the full log: