Thanks for your swift response CostlyOstrich36!
We're a startup where about 10 people will use ClearML as the experiment logging backend, with agents running on 4 on-prem GPU machines. We strive to always have experiments running so the GPUs don't sit idle, but this isn't always the case.
Alright, so the Redis instance isn't too mission-critical (I'll probably deploy this with the helm chart). The Mongo and Elastic are necessary and I'd like to deploy these as managed instances in AWS. Do you have ...
Hi Luca. We have ClearML deployed through ArgoCD and have the following configs:
Chart.yaml
` apiVersion: v2
name: clearml
description: A Helm chart for Kubernetes
version: 0.0.1
dependencies:
  - name: clearml
    version: 3.5.1
    repository: https://allegroai.github.io/clearml-helm-charts `
values.dev.yaml
` # insert your own config here `
Both of the above files are pushed to our own private GitOps repository.
Apply the following file with kubectl:
clearml-argocd.yaml
` apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
... `
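To round this out, here is a minimal sketch of what the full Application manifest can look like; the repo URL, path, and namespaces are placeholders for your own GitOps setup (the ArgoCD fields themselves are standard):
` apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: clearml
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/gitops.git  # placeholder: your private gitops repo
    path: clearml                                    # folder holding Chart.yaml + values.dev.yaml
    targetRevision: main
    helm:
      valueFiles:
        - values.dev.yaml
  destination:
    server: https://kubernetes.default.svc
    namespace: clearml
  syncPolicy:
    automated: {}                                    # let ArgoCD sync automatically `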
Hi CostlyOstrich36. Would it also be possible to set those values through env vars?
Because I am using the Helm chart ( https://github.com/allegroai/clearml-helm-charts/blob/06070a5c20691aaf83fc919b1bf07a822c212d5a/charts/clearml/values.yaml#L330 ) on Kubernetes and thus far can only configure it through env variables
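For what it's worth, a sketch of what injecting such env vars through the chart can look like. The `CLEARML_*_SERVICE_HOST` variables are ones the clearml-server images read for their backends; the `extraEnvs` field name is an assumption here, so check the chart's values.yaml for the exact key:
` apiserver:
  extraEnvs:
    # Point the apiserver at external backends (hostnames are placeholders)
    - name: CLEARML_MONGODB_SERVICE_HOST
      value: "my-mongo.internal"
    - name: CLEARML_ELASTIC_SERVICE_HOST
      value: "my-elastic.internal" `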
If you already have GPU autoscaling nodes in your k8s cluster, you could also give the k8s glue agent a go: https://github.com/allegroai/clearml-helm-charts/blob/9c15a8a348898aed5504420778d0e815b41642e5/charts/clearml/values.yaml#L300
With the correct tolerations/nodeSelectors you can have k8s take care of the autoscaling for you by just spinning up a new pod; see the sketch below.
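For illustration, a sketch of what that could look like in the values file. The `k8sagent.podTemplate` section matches the keys mentioned later in this thread, but the node label and taint key are assumptions about your GPU node pool:
` k8sagent:
  podTemplate:
    # Only schedule agent pods on GPU nodes (label is a placeholder)
    nodeSelector:
      accelerator: nvidia-gpu
    # Tolerate the taint your GPU/autoscaler nodes carry (assumed taint key)
    tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
    # Requesting a GPU makes the cluster autoscaler add a GPU node when none is free
    resources:
      limits:
        nvidia.com/gpu: 1 `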
Could be the cause of your error.
You'll probably run into issues as soon as you want to start running experiments from private repos.
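For reference, clearml-agent picks up git credentials from the `CLEARML_AGENT_GIT_USER` / `CLEARML_AGENT_GIT_PASS` environment variables; how you inject them depends on the chart, so this snippet assumes an `extraEnvs`-style field and a hypothetical Secret:
` k8sagent:
  extraEnvs:
    - name: CLEARML_AGENT_GIT_USER
      value: "deploy-token-user"        # illustrative
    - name: CLEARML_AGENT_GIT_PASS
      valueFrom:
        secretKeyRef:
          name: clearml-git-credentials # hypothetical Secret
          key: password `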
Ah I see it! I made a mistake in the helm chart 🙈
No problem! Thank you for finding a bug in the chart 🤓
I have some other improvements to the k8sagent I want to submit a PR for soon, so be sure to monitor the chart repo for updates!
SmugHippopotamus96 the new version of the helm chart should fix all the issues you mentioned!
Oh btw, did you restart the k8sagent pod after applying the new template? Because if not, the k8sagent pod is still using the old version.
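(Something like `kubectl rollout restart deployment <k8sagent-deployment> -n <namespace>` should do it; the exact deployment name depends on your release, so check `kubectl get deployments` first.)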
Nice!
One small remark: the open-source contributions are not mentioned in the release notes 😇
https://github.com/elastic/elasticsearch-py/issues/1666
And you'll run into this ☝
Do not go with the AWS managed Mongo & ES; I'm afraid both will not work and are a pain to set up, speaking from experience.
Worked like a charm, thanks SuccessfulKoala55 !!! 😄
Should have been `tolerations: []`, I'll send a PR soon to fix it.
In the meantime you can solve it by setting `k8sagent.podTemplate.tolerations: []` in your values file.
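In a values file, that nests as:
` k8sagent:
  podTemplate:
    tolerations: [] `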