Can clearml-serving do a helm install or upgrade? We have cases where the ML models do not come from ML experiments in ClearML, but we would like to tap on the ClearML queue to enable resource queuing.
I figured out that it may be possible to do this: experiment_task = Task.current_task() followed by OutputModel(experiment_task).update_weights('http://model.pt') to attach it to the ClearML experiment task.
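A minimal sketch of that idea, assuming the weights already live at a remote URL and the code runs inside a ClearML task. The helper names (is_remote_uri, register_external_weights) and the list of accepted schemes are hypothetical; passing the URL through the register_uri argument of update_weights is my assumption about the cleanest way to attach an already-uploaded file, not something confirmed in this thread.

```python
from urllib.parse import urlparse

# Assumption: schemes we are willing to register; adjust to your storage setup.
REMOTE_SCHEMES = {"http", "https", "s3", "gs", "azure"}


def is_remote_uri(uri: str) -> bool:
    """Return True if the string looks like a remote URI with a known scheme."""
    parsed = urlparse(uri.strip())
    return parsed.scheme in REMOTE_SCHEMES and bool(parsed.netloc)


def register_external_weights(weights_uri: str):
    """Attach externally produced weights to the currently running ClearML task."""
    if not is_remote_uri(weights_uri):
        raise ValueError(f"Not a remote URI: {weights_uri!r}")

    # Imported lazily so the URI check above works without clearml installed.
    from clearml import OutputModel, Task

    task = Task.current_task()
    if task is None:
        raise RuntimeError("No ClearML task is running; call Task.init() first.")

    model = OutputModel(task=task)
    # register_uri points at a file that is already uploaded, so nothing is copied.
    model.update_weights(register_uri=weights_uri.strip())
    return model
```

Calling register_external_weights('http://model.pt') after Task.init() should then show the model under that task, provided the URL is reachable from wherever the model is later consumed.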
Thanks @<1523701205467926528:profile|AgitatedDove14> . What I could think of is to write a task that runs a Python subprocess to do "helm install". In that Python script, we could point to / download the helm chart from somewhere (e.g. NFS, S3).
Does this sound right to you?
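Roughly what I have in mind, as a sketch only. The release name, chart path, namespace, project name and queue name are all made up for illustration, and it assumes the helm binary and cluster credentials are available on the machine where the agent runs the task.

```python
import subprocess
from typing import List, Optional


def build_helm_command(release: str, chart: str, namespace: str = "default",
                       values_file: Optional[str] = None) -> List[str]:
    """Build a `helm upgrade --install` command as an argument list."""
    cmd = ["helm", "upgrade", "--install", release, chart, "--namespace", namespace]
    if values_file:
        cmd += ["--values", values_file]
    return cmd


def deploy(release: str, chart: str, namespace: str = "default",
           values_file: Optional[str] = None) -> int:
    """Run helm and return its exit code (0 means success)."""
    cmd = build_helm_command(release, chart, namespace, values_file)
    completed = subprocess.run(cmd, check=False)
    return completed.returncode


def run_as_clearml_task(queue_name: str = "deploy") -> int:
    """Wrap the deployment in a ClearML task so it can wait in a queue."""
    from clearml import Task  # lazy import: only needed when actually queuing

    task = Task.init(project_name="deployments", task_name="helm install my-app")
    # Stops the local run and enqueues the task; the agent re-runs this script.
    task.execute_remotely(queue_name=queue_name)
    return deploy("my-app", "./charts/my-app", namespace="serving")
```

Using `helm upgrade --install` covers both the first install and later upgrades with one command, which is why the sketch does not call `helm install` directly.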
Another thing I was wondering is whether we could pass the helm charts / files when we use the ClearML SDK, so we could skip the step of pushing them to NFS/S3.
This is what I got, and I see an HTTP 400 error in the console.
When I run it as a regular remote task, it works. But when I run it as a step in a pipeline, it cannot access the same folder on my local machine.
It gets rerouted to http://app.clearml.home.ai/dashboard , with the same network error.
Do you have an example of how I can define the packages to be installed for every step of the pipeline?
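For reference, this is the kind of thing I am after: a sketch assuming the packages argument of PipelineController.add_function_step is the right place for per-step requirements. The step functions, package lists, project name and queue name are placeholders.

```python
# Hypothetical per-step requirements; each step gets only what it needs.
STEP_PACKAGES = {
    "preprocess": ["pandas"],
    "train": ["torch", "cupy"],
}


def preprocess(rows: int = 10):
    # Imports live inside the step because each step runs as its own task.
    import pandas as pd
    return pd.DataFrame({"x": range(rows)})


def train(frame):
    import torch
    values = torch.tensor(frame["x"].values, dtype=torch.float32)
    return float(values.mean())


def build_pipeline(queue: str = "default"):
    """Assemble a two-step pipeline with a separate package list per step."""
    from clearml import PipelineController  # lazy import, needs clearml installed

    pipe = PipelineController(name="packages-demo", project="examples", version="0.0.1")
    pipe.add_function_step(
        name="preprocess",
        function=preprocess,
        function_kwargs={"rows": 100},
        function_return=["frame"],
        packages=STEP_PACKAGES["preprocess"],
        execution_queue=queue,
    )
    pipe.add_function_step(
        name="train",
        function=train,
        # "${preprocess.frame}" refers to the return value of the previous step.
        function_kwargs={"frame": "${preprocess.frame}"},
        function_return=["score"],
        packages=STEP_PACKAGES["train"],
        execution_queue=queue,
    )
    return pipe
```

Calling build_pipeline().start() would then launch it, with the agent installing each step's list before running that step.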
A more advanced case would be to decide how long this job should run and terminate it after that. This is to improve GPU utilisation.
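One way this could be done inside the task itself, as a plain-Python sketch that does not rely on any ClearML feature: run the job as a subprocess with a timeout. The function name and the stand-in command are hypothetical.

```python
import subprocess
import sys
from typing import Sequence

# Stand-in for a real long-running GPU job (hypothetical).
SLEEPY_JOB = [sys.executable, "-c", "import time; time.sleep(30)"]


def run_with_time_limit(cmd: Sequence[str], limit_seconds: float) -> bool:
    """Run cmd; return True if it finished in time, False if it was killed."""
    try:
        # On timeout, subprocess.run kills the child process before raising.
        subprocess.run(list(cmd), timeout=limit_seconds, check=False)
        return True
    except subprocess.TimeoutExpired:
        return False
```

For example, run_with_time_limit(SLEEPY_JOB, 3600) would give the job at most an hour, after which the process is killed and the GPU is freed for the next task in the queue.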
Not exactly sure yet, but I would think a user tag for deployed makes sense, as it should be a deliberate user action. An additional system state is required too, since a deployed state should have some prerequisite system state.
I would also like to ask if ClearML has different states for a task, a model, or even different task types? Right now I don't see any differences. Is this a deliberate design?
I guess we need to understand the purpose of the various states. So far I only see "archive, draft, publish". Did I miss any?
Hi @<1523701070390366208:profile|CostlyOstrich36> , basically
- I uploaded a dataset using ClearML Datasets. The output_uri points to my S3, thus the dataset is stored in S3. My S3 is set up with HTTP only.
- When I retrieve the dataset for training using Dataset.get(), I encounter an SSL cert error, as the URL used to retrieve the data is https://<s3url>/... instead of s3://<s3url>/... , which is HTTP. This is weird, as the dataset URL is without https.
- I am not too sure why and I susp...
Thanks AgitatedDove14 and TimelyMouse69. The intention was to have some traceability between the two setups. I think the best way is to enforce a naming convention (for project and name) so we know how they are related? Any better suggestions?
I see. I was wondering whether there is any advantage to doing it either way.
Hi SuccessfulKoala55 Thanks for pointing me to this repo. Was using this repo.
I couldn't find in this repo whether we still need to label the node app=clearml, as was mentioned in the deprecated repo. From the values.yaml, though, the node selector is empty. Would you be able to advise?
How is the clearml data handled now then? Thanks
https://clear.ml/docs/latest/docs/integrations/storage/
Try adding the <path to your cert> for s3.credentials.verify.
@<1523701070390366208:profile|CostlyOstrich36> Yes. I'm running on k8s
Hi Bart, yes. Running with inference container.
@<1523701070390366208:profile|CostlyOstrich36> Is this the output_uri, or where do I put this URL?
Yes. But I'm not sure which agent is running. I only know how to stop it if I have the agent ID.
Nice. It is actually dataset.id.
And just a suggestion, which maybe I can post as a GitHub issue too.
It is not very clear what the purpose of the project name and the name is, even after I read the --help. Perhaps this is something that can be made clearer when updating the docs?
@<1523701205467926528:profile|AgitatedDove14> I'm looking at the queue system that ClearML offers, which would allow a user to queue a job to deploy an app / inference service. This can be as simple as a pod or a more complete helm chart.
Nice. That should work. Thanks
For example, I built my docker image from an image on Docker Hub. In this image, I installed the torch and cupy packages. But when I run my experiment in this image, the packages are not found.
Yes, I ran the experiment inside.
Thanks AgitatedDove14. Specifically, I wanted to use my own ClearML server and Triton. Thus, I attempted to use --engine-container-args during launch, but got an error saying there is no such flag. I looked into --help, but I guess it is not updated yet.
SuccessfulKoala55 Nope. I didn't even get to enter my name. I suspect there is some mistake in mapping the data folder.
I was using the template in https://github.com/allegroai/clearml-helm-charts to deploy.