And regarding model deployment - do you mean serving the model through a serving engine such as Triton?
Can you try deleting the cache folder? It should be somewhere around ~/.clearml
You can find the different cache folders that ClearML uses in ~/clearml.conf
Just to make sure I understand: during the run the Plotly plots are shown on your computer but they don't appear in the UI, and they only show up after the run is finished?
Are your runs long?
Hi @<1739455977599537152:profile|PoisedSnake58> , you can run the agent in docker mode as long as the image is available on your machine. You can also use clearml-agent build, please see more here - None
Hi @<1670964701451784192:profile|SteepSquid49> , that sounds like the correct setup :)
What were you thinking of improving or do you have some pain points in your current setup?
How would the ec2 instance get the custom package code to it?
In the HPO application I see the following explanation:
'Maximum iterations per experiment after which it will be stopped. Iterations are based on the experiments' own reporting (for example, if experiments report every epoch, then iterations=epochs)'
Hi BoredBat47 , use the --foreground flag to see the logs :)
I don't think so. However, you can use the API as well :)
What version of ClearML are you using? Is there anything special about this git repository?
From the looks of this example this should be connected automatically actually
https://github.com/allegroai/clearml/blob/master/examples/frameworks/hydra/hydra_example.py
I think I misunderstood your problem at the start. Let me take another look :)
Hi JitteryCoyote63 , I think this is what you're looking for:
https://clear.ml/docs/latest/docs/references/sdk/task#move_to_project
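A minimal sketch of that call - the task id and target project name below are placeholders, and the lines are commented out because they need a configured ClearML client:

```python
# from clearml import Task
# task = Task.get_task(task_id="<task_id>")   # placeholder task id
# task.move_to_project("my_team/new_project") # placeholder project path
```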
Hi PetiteRabbit11 , can you please elaborate on what you mean?
Hi CooperativeOtter46 ,
I think the best way would be to use the API (you can use the SDK as well, but I don't think filtering by time is as easy there)
Use the https://clear.ml/docs/latest/docs/references/api/tasks#post-tasksget_all API call to get all the tasks according to the time frame & filtering you want, and then sum up the run time from all the experiments returned :)
To use the SDK see here:
https://clear.ml/docs/latest/docs/references/sdk/task#taskget_all
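To illustrate the summing step, here's a hedged sketch - the APIClient call is commented out since it needs a configured ClearML server and credentials, the project id is a placeholder, and the summing itself is plain Python over the started/completed timestamps:

```python
from datetime import datetime, timedelta, timezone

# Sketch only: the actual API call needs a configured ClearML server,
# so it is shown commented out. The project id is a placeholder.
# from clearml.backend_api.session.client import APIClient
# client = APIClient()
# tasks = client.tasks.get_all(
#     project=["<project_id>"],
#     status=["completed"],
#     only_fields=["id", "started", "completed"],
# )

def total_runtime(tasks):
    """Sum (completed - started) over tasks that have both timestamps."""
    total = timedelta()
    for t in tasks:
        if t.get("started") and t.get("completed"):
            total += t["completed"] - t["started"]
    return total
```

With the returned tasks as dicts, total_runtime(tasks) gives the combined run time of all matching experiments.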
Can you try with Task.connect()?
https://clear.ml/docs/latest/docs/references/sdk/task#connect
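Something like this sketch - the project/task names are placeholders, and the Task.init/connect calls are commented out because they need a configured ClearML server:

```python
# Hyperparameters to attach to the task - a plain dict; once connected,
# values edited in the web UI propagate back into it when run by an agent.
params = {"learning_rate": 0.001, "batch_size": 32}  # example values

# from clearml import Task
# task = Task.init(project_name="examples", task_name="connect-demo")
# task.connect(params)
```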
Are you running inside a docker?
I don't think you need to mix. For example, if you have a pre-prepared environment then it should be something like:
export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=<PATH_TO_ENV_BINARY>
You'll need to assign an agent to run on the queue, something like this: 'clearml-agent daemon --foreground --queue services'
Then add a screenshot of the info section
Just to confirm: you ran the exact same agent command, once with --docker and once without, and the one without managed to reach the files?
Can you try running it via agent without the docker?
Regarding 1 & 2 - I suggest always keeping the API docs handy - https://clear.ml/docs/latest/docs/references/api/definitions
I love using the API since it's so convenient. So to get to business -
To select all experiments from a certain project you can use tasks.get_all with filtering according to the API docs (I suggest you also use the web UI as reference - if you hit F12 you can see all the API calls and their responses. This can really help to get an understanding of its capabilities ...
Before injecting anything into the instances you need to spin them up somehow. This is done by the running application using the credentials provided - so the credentials need to be supplied to the AWS application.
Hi @<1750327622178443264:profile|CleanOwl48> , you need to set the output_uri in Task.init() - for example, set it to True to upload to the files server, or to a string (such as an s3:// URI) if you want to use S3.
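For example (a minimal sketch - the project/task names and bucket path are placeholders, and the calls are commented out since they need a configured ClearML setup):

```python
# from clearml import Task
# task = Task.init(
#     project_name="examples",
#     task_name="output-uri-demo",
#     output_uri=True,  # True -> upload models/artifacts to the files server
# )
#
# Or point at object storage instead (placeholder bucket path):
# task = Task.init(
#     project_name="examples",
#     task_name="output-uri-demo",
#     output_uri="s3://my-bucket/models",
# )
```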