
Hi Jake, unfortunately I realized we put a load balancer in front, so any address like address.domain would answer a ping
Yes, I still see those errors, but queues are working :)
Martin I told you I can't access the resources in the cluster unfortunately
and one more question, in the values, I also see the values for the default tokens:
```yaml
credentials:
  apiserver:
    # -- Set for apiserver_key field
    accessKey: "5442F3443MJMORWZA3ZH"
    # -- Set for apiserver_secret field
    secretKey: "BxapIRo9ZINi8x25CRxz8Wdmr2pQjzuWVB4PNASZqCtTyWgWVQ"
  tests:
    # -- Set for tests_user_key field
    accessKey: "ENP39EQM4SLACGD5FXB7"
    # -- Set for tests_user_secret field
    secretKey: "lPcm0imbcBZ8mwgO7tpadutiS3gnJD05x9j7a...
```
And I see that it is moved to the k8s_scheduler one instead (though I see that in the "default" queue I do have jobs)
thanks for the help!
Hi Jack, yes we had to customize the default one for some tools we use internally
thanks, yes it makes sense!
ah I see, I'll give it a try then
but for sigterm you should be able to set cleanup steps, no?
Hi SuccessfulKoala55, I can confirm that the "id-like" queue created by ClearML actually corresponds to the id of the "k8s_scheduler" queue. So it looks like, instead of the experiment being submitted to the scheduler and then enqueued to the right queue, a new queue whose name corresponds to the id of k8s_scheduler is created instead.
Hope this helps 🙂
What I still don't get is how you would create different queues targeting different nodes with different GPUs, and have each of them use the appropriate CUDA image.
Looking at the template, I don't understand how that's possible.
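Just to make it concrete, this is the kind of thing I'm trying to express (purely a sketch of the idea; the field names below are my guesses, not the actual chart schema):
```yaml
# hypothetical sketch only: one agent "flavor" per GPU pool, each with its own
# queue, node selection and CUDA base image; the real chart keys may differ
agents:
  - queue: gpu-a100
    image: nvidia/cuda:11.6.2-runtime-ubuntu20.04   # example CUDA image
    nodeSelector:
      gpu-type: a100
  - queue: gpu-t4
    image: nvidia/cuda:11.2.2-runtime-ubuntu20.04   # example CUDA image
    nodeSelector:
      gpu-type: t4
```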
thanks a lot! So as long as we have the storageclass in our kubernetes cluster configured correctly, the new helm chart should work out of the box?
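(For our part, what I mean by configured correctly is simply having a default StorageClass the chart's PVCs can bind to, e.g. something along these lines, with the provisioner swapped for whatever your cluster uses:)
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"   # make it the default
provisioner: kubernetes.io/aws-ebs   # example only; use your cluster's provisioner
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
```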
I can see the outputs from argo, so I know if some resource has been created but I can't inspect the full logs,
the ones I have available are all records similar to `No tasks in queue 80247f703053470fa60718b4dff7a576`
Thanks CostlyOstrich36. I was thinking more of an environment setting; for example, the documentation mentions the "--cpu-only" flag (which I am not sure I can use, since I am using the Helm charts from AllegroAI and I don't think I can override the command), or setting the env var NVIDIA_VISIBLE_DEVICES to an empty string (which I did, but I can still see the message)
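For reference, this is roughly the shape of what I tried for the env var (the exact values path for extra environment variables in the agent chart is my assumption, I'd have to check it):
```yaml
# hypothetical sketch: add NVIDIA_VISIBLE_DEVICES="" to the agent pod(s)
extraEnvs:
  - name: NVIDIA_VISIBLE_DEVICES
    value: ""
```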
And yes, I am using the agents that come with the Helm chart from Clearml repository
As much as possible, I'd like to take the burden off the shoulders of the people writing their models
Hi Josh, the agents are running on top of K8s (I used the helm chart to deploy them, it uses K8s glue).
I'll add a sleep so that I have time to enter the pod, and get the clearml.conf and will send you the diff in a few minutes
because while I can run kubectl commands from within the agent pod, clearml doesn't seem to pick the right value:
```
2022-08-05 12:09:47
task 29f1645fbe1a4bb29898b1e71a8b1489 pulled from 51f5309bfb1940acb514d64931ffddb9 by worker k8s-agent-cpu
2022-08-05 12:12:59
Running kubectl encountered an error: Unable to connect to the server: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2022-08-05 15:15:07
task 29f1645fbe1a4bb29898b1e71a8b1489...
```
Effectively kubectl commands don't work from within the agent pod, I'll try to figure out why
yes, the curl returned a 503 error
Thanks for pitching in JuicyFox94. For the connectivity, I used the "public" names for the various servers
(e.g. we set clearml.internal.domain.name, clearml-apiserver.internal.domain.name and clearml-fileserver.internal.domain.name)
So in the agent values.yaml I set the parameters:
```yaml
# -- Reference to Api server url
apiServerUrlReference: " "
# -- Reference to File server url
fileServerUrlReference: " "
# -- Reference to Web server url
webServerUrlReference: " "
```
to ...
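With the internal hostnames from above filled in, it looks roughly like this (the scheme and the exact fileserver hostname are illustrative, use whatever you actually exposed):
```yaml
# -- Reference to Api server url
apiServerUrlReference: "http://clearml-apiserver.internal.domain.name"
# -- Reference to File server url
fileServerUrlReference: "http://clearml-fileserver.internal.domain.name"
# -- Reference to Web server url
webServerUrlReference: "http://clearml.internal.domain.name"
```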
The workaround that works for me is:
- clone the experiment that I run on my laptop
- in the newly cloned experiment, modify the hyperparameters and configurations to my need
- in user properties, set "k8s-queue" to "cpu" (or the name of the queue I want to use)
- enqueue the experiment to the same queue I just set...
When I do like that in the K8sGlue pod for the cpu queue I can see that it has been correctly picked up:
```
No tasks in queue 54d3edb05a89462faaf51e1c878cf2c7
No tasks in Queues, sleeping fo...
```
Hi Martin, thanks. My doubt is:
- if I configure the pods for the different nodes manually, how do I make the ClearML server aware that those agents exist? This step is really not clear to me from the documentation (it talks about a user, and it uses interactive commands, which would mean entering the agents manually)
- I will also try the k8s glue, but I would first like to understand how to configure a fixed number of agents manually (rough sketch of what I mean below)
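Something like this is what I have in my head (completely hypothetical values, just to illustrate what I mean by "a fixed number of agents"):
```yaml
# hypothetical, not the actual chart schema: N long-lived agent pods
# polling a single queue is what I mean by "a fixed number of agents"
agent:
  replicaCount: 3     # three always-running agent pods
  queues:
    - default         # the queue they poll; polling is how the server "sees" them
```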
JuicyFox94 apparently to make it work I'll have to add a "kubeconfig" file, but I can't see any obvious way to mount it in the agent pod, am I wrong?
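What I had in mind was something like mounting it from a Secret, e.g. (the extraVolumes/extraVolumeMounts keys are my assumption about the chart, and agent-kubeconfig is a Secret I'd create myself):
```yaml
# hypothetical sketch: mount a kubeconfig stored in a Secret into the agent pod
extraVolumes:
  - name: kubeconfig
    secret:
      secretName: agent-kubeconfig   # Secret with a "config" key holding the kubeconfig
extraVolumeMounts:
  - name: kubeconfig
    mountPath: /root/.kube
    readOnly: true
```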
I think it's because the proxy env vars are not passed to the container (I thought they were the same as the extraArgs from the agentservice, but it doesn't look like that's the case)
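In other words, I'd expect to need something along these lines on the agent container itself (key path and proxy host are placeholders; the important part is that NO_PROXY covers the in-cluster API server):
```yaml
# hypothetical sketch: pass the proxy settings to the agent container explicitly
extraEnvs:
  - name: HTTP_PROXY
    value: "http://proxy.internal.domain.name:3128"   # placeholder proxy
  - name: HTTPS_PROXY
    value: "http://proxy.internal.domain.name:3128"
  - name: NO_PROXY
    value: "kubernetes.default.svc,10.0.0.0/8,.internal.domain.name"
```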
Thanks Martin, so if I understand correctly, when I run the clearml-agent init command (I have to check the syntax) and provide the apiserver, webserver and fileserver URLs, the agents will be registered with the ClearML cluster?