Yeah, currently we are evaluating Seldon.. But was wondering whether the ClearML Enterprise version would do something similar?
Okie.. was checking in the forum (if anyone knows anything) before asking them..
Mostly DL, but I suppose there could be ML use cases also
We have to do it on-premise.. Cloud providers are not allowed for the final implementation. Of course, for now we use the cloud to test out our ideas.
Hi SuccessfulKoala55, okie..
1) Actually, I am using AWS now. I am trying to set up the ClearML server in K8s. However, the clearml-agents will just be another EC2 instance/docker image.
2) For phase 2, I will try the ClearML AWS AutoScaler service.
3) At this point, I think I will have a crack at JuicyFox94 's solution as well.
Maybe I should have stated our main goal earlier: we are data scientists who need an MLOps environment to track and also run our experiments..
nice.. this looks a bit friendly.. 🙂 .. Let me try it.. Thanks
sure, I'll post some questions once I wrap my mind around it..
Essentially, while running on k8s_glue, I want to pull the docker image/container, then pip install the additional requirements.txt into them...
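One way to do this is through the agent's configuration rather than the image itself. As a minimal sketch, assuming a recent clearml-agent where the `agent.docker_init_bash_script` key is available (the exact key name may differ in your agent version, so check your `clearml.conf` reference), you can run setup commands inside the container before the task starts:

```
# clearml.conf (agent section) -- key name assumed, verify against your agent version
agent {
    # commands executed inside the spawned container before the task runs
    docker_init_bash_script: [
        "pip install -r /path/to/requirements.txt",
    ]
}
```

Note that the agent will also install the task's own recorded requirements on top of this, so the init script is mainly for extra system/pip dependencies the base image is missing.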
Thanks JuicyFox94 .
Not really from devops background, Let me try to digest this.. 🙏
We have k8s on EC2 instances in the cloud. I'll try it there tomorrow and report back..
Hi, using the pipeline examples, with step1_dataset_artifact.py, step2_data_processing.py, step3_train_model.py ==> pipeline_controller.py
In the above example, the pipeline_controller is stringing together 3 python files; could it instead string together 3 containers? Of course, we can manually compile each into a docker image, but does ClearML have some similar approach baked in?
let me run the clearml-agent outside the k8s system.. and get back to you
Btw, this is just the example code from clearml repo..
No no, I mean I can now export a csv file into clearml-data. I was wondering if it is possible to export directly from a SQL database.
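As far as I know there is no direct SQL connector in ClearML Data, so a common workaround is a two-step export: dump the query result to a CSV file, then version that file with the `clearml-data` CLI. A minimal sketch, using stdlib `sqlite3` as a stand-in for your real database (the table name, query, and file path are placeholders):

```python
# Hedged sketch: dump a SQL query result to CSV, then version it with clearml-data.
import csv
import sqlite3

def export_query_to_csv(conn, query, csv_path):
    """Run `query` on `conn` and write the result set (with a header row) to `csv_path`."""
    cur = conn.execute(query)
    headers = [col[0] for col in cur.description]
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(headers)
        writer.writerows(cur.fetchall())
    return csv_path

if __name__ == "__main__":
    # In-memory sqlite stands in for your actual database connection.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE samples (id INTEGER, label TEXT)")
    conn.executemany("INSERT INTO samples VALUES (?, ?)", [(1, "cat"), (2, "dog")])
    export_query_to_csv(conn, "SELECT * FROM samples", "samples.csv")
    # Then version the file with the clearml-data CLI, e.g.:
    #   clearml-data create --project MyProject --name sql-export
    #   clearml-data add --files samples.csv
    #   clearml-data close
```

The same pattern works with any DB-API connection (psycopg2, mysql-connector, etc.), since only `execute`/`fetchall` are used.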
Maybe more of data repository than a model repository...
Just to add on, I am using minikube now.
Hi AgitatedDove14 , imho links are def better, unless someone decides to archive their Tasks.. Just wondering about the possibility only..
It is like generating a report per Task level (esp for Training Jobs).. It's like packaging a report out per Training job..
This is where I downloaded the log. Seems like some docker issue, though I can't seem to figure it out. As an alternative, I spawned a clearml-agent outside the k8s environment and it was able to execute well.
When I push a job to an agent node, I got this error:
"Error response from daemon: network None not found"
Hi, sorry for the delayed response. Btw, all the pods are running all good.
Hi, will proceed to close this thread. We found some issue with the underlying docker on our machines. We've now shifted to another k8s cluster of EC2 instances in AWS.