SarcasticSquirrel56
Moderator
16 Questions, 144 Answers
  Active since 10 January 2023
  Last activity 7 months ago

Reputation

0

Badges 1

137 × Eureka!
0 Votes
14 Answers
936 Views
Hi folks, I have installed ClearML on Kubernetes using the Helm chart, but I had to specify three different domains for the UI, apiserver and fileserver. Is t...
2 years ago
0 Votes
17 Answers
1K Views
2 years ago
0 Votes
15 Answers
1K Views
2 years ago
0 Votes
19 Answers
1K Views
Hi folks, one question: I have a script that looks like: import clearml as cml import numpy as np from sklearn.linear_model import LogisticRegression from sk...
2 years ago
0 Votes
11 Answers
1K Views
Good morning folks, I am setting up ClearML on a (self-hosted) K8s cluster using the https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearm...
2 years ago
0 Votes
7 Answers
1K Views
2 years ago
0 Votes
6 Answers
1K Views
2 years ago
0 Votes
4 Answers
1K Views
I do have one question about using the Helm chart: is there any way to specify the users in the values.yaml?
2 years ago
0 Votes
13 Answers
1K Views
Hi folks, I have a question related to the storage of artifacts, as it is not entirely clear to me where to configure it. If I read the documentation https:/...
2 years ago
0 Votes
6 Answers
995 Views
Hi folks, good morning 🙂 In our setup we have a set of queues that do not use any GPU resources. Yet, when I run an experiment in such queues, we see a Warn...
2 years ago
0 Votes
6 Answers
1K Views
Hi folks, I have a question about something that is not clear to me from reading the documentation at https://clear.ml/docs/latest/docs/clearml_agent/ From what I ...
2 years ago
0 Votes
8 Answers
1K Views
2 years ago
0 Votes
31 Answers
20K Views
Hi folks, I just deployed a ClearML agent using the Helm chart. I have a few doubts: after the deployment, I see a new queue called k8s_scheduler, which I di...
2 years ago
0 Votes
31 Answers
22K Views
2 years ago
0 Votes
31 Answers
21K Views
2 years ago
0 Votes
31 Answers
23K Views
2 years ago
0 Hi Folks I Have A Problem I Can't Understand. Plots Are Not Shown When Experiments Are Executed From The UI. For Example, If I Run The Code On My Laptop, And I Go To The Experiment Page I Can See Correctly The Plots: But If I Then Clone The Task, And Ex

This is the list of all the environment variables (starting with CLEARML) available in the Pod spawned by the K8s Glue Agent:
` CLEARML_MONGODB_PORT_27017_TCP_PORT
CLEARML_FILESERVER_PORT_8081_TCP_ADDR
CLEARML_ELASTIC_MASTER_PORT_9200_TCP
CLEARML_APISERVER_PORT_8008_TCP_PROTO
CLEARML_FILESERVER_PORT_8081_TCP_PORT
CLEARML_ELASTIC_MASTER_SERVICE_PORT_TRANSPORT
CLEARML_WEBSERVER_PORT_80_TCP
CLEARML_ELASTIC_MASTER_SERVICE_PORT
CLEARML_MONGODB_PORT_27017_TCP_ADDR
CLEARML_FILESERVER_PORT_8081_TCP_P...

2 years ago
0 Good Morning Folks, I Am Setting Up Clearml On A (Self-Hosted) K8S Cluster Using The

What I still don't get is how you would create different queues that target different nodes with different GPUs, each using the appropriate CUDA image.
Looking at the template, I don't understand how that's possible.

2 years ago
0 Good Morning Folks, I Am Setting Up Clearml On A (Self-Hosted) K8S Cluster Using The

Thanks, I'll try to understand how the default agent that comes with the Helm chart is configured, and then copy that setup to define a different one.

2 years ago
0 Hi Folks, Occasionally When I Clone A Job And Enqueue It, Instead Of Being Processed By The Expected Queue, A New Queue (With Some Id That Looks Like An Hash) Is Created Instead, And The Experiment Hangs In A "Pending" State. When This Happens, If I Abor

If I now abort the experiment (which is in a pending state and not running) and re-enqueue it to the CPU queue, with no parameter modifications this time,
I see that it is sent to the right queue, and after a few seconds the job enters a running state and completes correctly.

2 years ago
0 Good Morning Folks, I Am Setting Up Clearml On A (Self-Hosted) K8S Cluster Using The

I guess that to achieve what I want, I could disable the agent in the Helm chart's values.yaml
and then define pods for each of the agents on their respective nodes.

2 years ago
0 Hi All! Question Around Resource Management Using

Hi Martin, I admit I don't know about MIG; I'll have to ask some of our engineers.

As for the memory, yes, the reasoning is clear. The main thing we'll have to work out is how to define the limits, because we have nodes with quite different resources available, and this might get tricky, but I'll try and let's see what happens 🙂

We actually plan to create different queues for different types of workloads; we are still observing the actual usage to decide which types of workloads make sense for us.

2 years ago
0 Hi Everyone, I've Seen That When Re-Running A Script It Sometimes Overwrites A Previous Task In The Dashboard Instead Of Creating A New Task. How Does ClearML Decide Whether To Create A New Task Or Overwrite An Existing One?

My understanding is that Task.init has a reuse_last_task_id parameter (or similar name) that defaults to True. In that case, if your experiment wasn't "published", it will be overwritten (matched by project and experiment name). However, if you do publish it, a new experiment will be created.
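To make the point above concrete, here is a minimal sketch of how one might ask for a fresh task on every run by passing `reuse_last_task_id=False` to `Task.init`. The helper `task_init_kwargs`, the project and task names, and the `RUN_CLEARML_EXAMPLE` environment variable are hypothetical and only there for illustration; the real call is skipped unless `clearml` is installed and that variable is set.

```python
import os


def task_init_kwargs(project: str, name: str, always_new: bool) -> dict:
    """Build keyword arguments for clearml's Task.init (hypothetical helper).

    With always_new=True, reuse_last_task_id is set to False, which asks
    ClearML for a new task instead of reusing the previous unpublished one.
    """
    return {
        "project_name": project,
        "task_name": name,
        "reuse_last_task_id": not always_new,
    }


# Placeholder project and task names, not taken from the thread.
kwargs = task_init_kwargs("examples", "logistic-regression", always_new=True)
print(kwargs)

try:
    from clearml import Task  # only needed for the real call below
except ImportError:
    Task = None

# Guarded by a hypothetical opt-in variable so the sketch runs without a server.
if Task is not None and os.environ.get("RUN_CLEARML_EXAMPLE") == "1":
    task = Task.init(**kwargs)
```

Leaving `always_new=False` keeps the default behaviour described in the post, where an unpublished task with the same project and name may be reused.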

2 years ago
0 Hi Folks, Occasionally When I Clone A Job And Enqueue It, Instead Of Being Processed By The Expected Queue, A New Queue (With Some Id That Looks Like An Hash) Is Created Instead, And The Experiment Hangs In A "Pending" State. When This Happens, If I Abor

I have tried this several times and the behaviour is always the same. It looks like, when I modify some hyperparameter and then enqueue the experiment to a queue, things don't work unless I have first set the value of k8s-queue to the name of the queue that I want to use. If I don't modify the configuration (e.g. I abort or reset the job and enqueue it again, or clone and enqueue it without modifying the hyperparameters), then everything works as expected.

2 years ago
0 Hi All! Question Around Resource Management Using

Hi Martin, thanks for the explanation! I work with Maggie and help with the ClearML setup.

Just to be sure, currently, the PodTemplate contains:

resources:
  limits:
    nvidia.com/gpu: 1

You are suggesting to also add something like:

requests:
  memory: "100Mi"
limits:
  memory: "200Mi"

Is that correct?

On a related note, I am a bit puzzled by the fact that all the 4 GPUs are visible.
In the https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/ , i...
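As a rough illustration of the fragment discussed in this post, the sketch below builds the combined `resources` block as a plain Python dictionary and checks that the memory request does not exceed the memory limit, which Kubernetes requires. The helper names `to_bytes` and `build_resources` are hypothetical, and where exactly this block sits inside the ClearML PodTemplate is an assumption not confirmed by the thread.

```python
import json

# Multipliers for the binary suffixes handled by this sketch (not exhaustive).
_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as "100Mi" to a number of bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count without a suffix


def build_resources(gpus: int, mem_request: str, mem_limit: str) -> dict:
    """Return a container resources block with GPU and memory settings."""
    if to_bytes(mem_request) > to_bytes(mem_limit):
        raise ValueError("memory request must not exceed the memory limit")
    return {
        "resources": {
            "requests": {"memory": mem_request},
            "limits": {"memory": mem_limit, "nvidia.com/gpu": gpus},
        }
    }


# Values taken from the post: one GPU, 100Mi requested, 200Mi limit.
resources = build_resources(gpus=1, mem_request="100Mi", mem_limit="200Mi")
print(json.dumps(resources, indent=2))
```

The printed JSON maps one to one onto the YAML keys quoted in the post, so it can be compared against the template by eye.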

2 years ago
0 Hi Folks, Any Of You Has Experience In Deploying Clearml To Kubernetes Using Argocd? I Managed To Make It Run Pointing It To The Clearml-Charts-Repo, It Recognizes The Helm Chart And It Works. But I Am Struggling A Bit To Write My Own Definition To Make I

I see in bitnami's gh-pages branch a file, https://github.com/bitnami-labs/sealed-secrets/blob/gh-pages/index.html , that performs the redirect and contains:

` <html> <head> <meta http-equiv="refresh" content="0; url=..."> </head> <p><a href="...">Redirect to repo index.yaml</a></p> </html> `

A similar file is missing in the `clearml-helm-chart` `gh-pages` branch.

2 years ago
0 Good Morning Folks, I Am Setting Up Clearml On A (Self-Hosted) K8S Cluster Using The

Hi AgitatedDove14, I have spent some time going through the Helm charts, but I admit it still isn't clear to me how things should work.

I see that with the default values (mostly what I am using), the K8s Glue agent is deployed (which is what you suggested to use).

2 years ago
0 Good Morning Folks, I Am Setting Up Clearml On A (Self-Hosted) K8S Cluster Using The

Thanks Martin, so if I understand correctly, when I run the clearml-agent init command (I have to check the syntax) and provide the apiserver, webserver and fileserver URLs, they'll be registered to the ClearML cluster?

2 years ago
0 Hi Folks, Occasionally When I Clone A Job And Enqueue It, Instead Of Being Processed By The Expected Queue, A New Queue (With Some Id That Looks Like An Hash) Is Created Instead, And The Experiment Hangs In A "Pending" State. When This Happens, If I Abor

Also, if I clone an experiment on which I had to set the k8s-queue user property manually to run it on a queue, say cpu, and enqueue it to a different queue, say gpu, the property is not updated, and the experiment is enqueued in a queue with a random hash-like name. Before enqueuing it, I either have to delete the attribute or set it to the right queue name to have it run in the right queue.
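The workaround described above can be sketched on a plain dictionary standing in for the task's user properties. This is only a model of the two options mentioned (delete the `k8s-queue` property, or set it to the target queue); the helper `prepare_for_queue` and the dictionary layout are hypothetical, and real ClearML user properties carry more structure than a flat mapping.

```python
def prepare_for_queue(user_properties: dict, target_queue: str, drop: bool = False) -> dict:
    """Return a copy of the properties that is consistent with target_queue.

    drop=True removes the k8s-queue property; otherwise it is overwritten
    with the name of the queue the task is about to be enqueued to.
    """
    props = dict(user_properties)  # leave the caller's mapping untouched
    if drop:
        props.pop("k8s-queue", None)
    else:
        props["k8s-queue"] = target_queue
    return props


# A cloned task still carrying the property from its earlier run on "cpu".
cloned = {"k8s-queue": "cpu"}
fixed = prepare_for_queue(cloned, "gpu")
print(fixed)
```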

2 years ago
0 Hi Folks I Have A Problem I Can't Understand. Plots Are Not Shown When Experiments Are Executed From The UI. For Example, If I Run The Code On My Laptop, And I Go To The Experiment Page I Can See Correctly The Plots: But If I Then Clone The Task, And Ex

OK, it wasn't the clearml.conf settings...

In the deployment I was referring to the fileserver, apiserver, etc. by their internal Kubernetes DNS names.
I changed them to the ones exposed to the users (the same ones I have in my local clearml.conf) and things work.

But I can't really figure out why that would be the case...

2 years ago