SarcasticSquirrel56
Moderator
16 Questions, 144 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

137 × Eureka!
0 Votes 6 Answers 1K Views
Hi folks, good morning 🙂 In our setup we have a set of queues that do not use any GPU resources. Yet, when I run an experiment in such queues, we see a Warn...
2 years ago
0 Votes 19 Answers 1K Views
Hi folks, one question: I have a script that looks like: import clearml as cml import numpy as np from sklearn.linear_model import LogisticRegression from sk...
2 years ago
0 Votes 13 Answers 2K Views
Hi folks, I have a question related to the storage of artifacts, as it is not entirely clear to me where to configure it. If I read the documentation https:/...
2 years ago
0 Votes 7 Answers 1K Views
2 years ago
0 Votes 6 Answers 2K Views
Hi folks, I have a question on something that's not clear to me from reading the documentation at https://clear.ml/docs/latest/docs/clearml_agent/ From what I ...
3 years ago
0 Votes 4 Answers 1K Views
I do have one question about using the Helm chart: is there any way to specify the users in the values.yaml?
2 years ago
0 Votes 14 Answers 1K Views
Hi folks, I have installed ClearML on Kubernetes using the Helm chart, but I had to specify three different domains for the ui, apiserver and fileserver. Is t...
2 years ago
0 Votes 15 Answers 2K Views
2 years ago
0 Votes 11 Answers 1K Views
Good morning folks, I am setting up ClearML on a (self-hosted) K8s cluster using the https://github.com/allegroai/clearml-helm-charts/blob/main/charts/clearm...
2 years ago
0 Votes 17 Answers 2K Views
2 years ago
0 Votes 6 Answers 2K Views
3 years ago
0 Votes 8 Answers 1K Views
2 years ago
0 Votes 31 Answers 78K Views
3 years ago
0 Votes 31 Answers 74K Views
Hi folks, I just deployed a ClearML agent using the Helm chart. I have a few doubts: after the deployment, I see a new queue called k8s_scheduler, which I di...
2 years ago
0 Votes 31 Answers 84K Views
2 years ago
0 Votes 31 Answers 84K Views
2 years ago
0 Hi All! Question Around Resource Management Using

Hi Martin, I admit I don't know about MIG; I'll have to ask some of our engineers.

As for the memory, yes, the reasoning is clear. The main thing we'll have to work out is how to define the limits, because we have nodes with quite different resources available, and this might get tricky, but I'll try and let's see what happens 🙂

We actually plan to create different queues for different types of workloads; we are still observing what the actual usage is, to decide which types of workloads make sense for us.

2 years ago
0 Hi Folks, I Have A Question On Something That It's Not Clear To Me Reading The Documentation At

but I don't understand the comment on GPUs, as the documentation makes a lot of references to GPU configurations for agents

3 years ago
0 Hi Folks, Occasionally When I Clone A Job And Enqueue It, Instead Of Being Processed By The Expected Queue, A New Queue (With Some Id That Looks Like An Hash) Is Created Instead, And The Experiment Hangs In A "Pending" State. When This Happens, If I Abor

If I now reset the experiment and enqueue it to the gpu queue (but in the experiment, the user-properties configuration for k8s-glue is still set to cpu), the experiment is left in a Pending state... and in the K8sGlue Agent for the gpu queue, I can see an error similar to the one in the cpu agent....

` No tasks in Queues, sleeping for 5.0 seconds
No tasks in queue 75174e0e7ac047f195ab4dce6e9f03f7
No tasks in Queues, sleeping for 5.0 seconds
FATAL ERROR:
Traceback (most recent call...

2 years ago
0 Hi Folks, One Question: I Have A Script That Looks Like:

Thanks Martin! If I end up having some time I'll dig into the code and check if I can bake something!

2 years ago
0 Hi Folks, One Question: I Have A Script That Looks Like:

OK, so... when executed locally "train" prints:
` train:
SepalLength SepalWidth PetalLength PetalWidth Species
122 7.7 2.8 6.7 2.0 2.0
86 6.7 3.1 4.7 1.5 1.0
59 5.2 2.7 3.9 1.4 1.0
4 5.0 3.6 1.4 0.2 0.0
77 6.7 3.0 5.0 1.7 1.0
.. ... ... ... ... ......

2 years ago
0 Hi All! Question Around Resource Management Using

Hi Martin, thanks for the explanation! I work with Maggie and help with the ClearML setup.

Just to be sure, currently, the PodTemplate contains:

    resources:
      limits:
        nvidia.com/gpu: 1

and you are suggesting also adding something like:

    requests:
      memory: "100Mi"
    limits:
      memory: "200Mi"

Is that correct?
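
To make the question concrete, here is a minimal Python sketch of what the combined `resources` section would look like if the two fragments quoted above were merged. The helper name `merge_resources` is hypothetical and only illustrates the merge; it is not part of ClearML or the Helm chart, and the values are the ones quoted in this message.

```python
import json


def merge_resources(current, extra):
    """Merge two Kubernetes-style `resources` mappings, section by section."""
    merged = {}
    for section in ("requests", "limits"):
        # Later values win if both mappings define the same key.
        values = {**current.get(section, {}), **extra.get(section, {})}
        if values:
            merged[section] = values
    return merged


# Fragment currently in the PodTemplate, as quoted above.
current = {"limits": {"nvidia.com/gpu": 1}}
# Memory settings being asked about, as quoted above.
proposed = {"requests": {"memory": "100Mi"}, "limits": {"memory": "200Mi"}}

resources = merge_resources(current, proposed)
print(json.dumps({"resources": resources}, indent=2))
```

The point of the sketch is only that the GPU limit and the memory limit end up side by side under the same `limits` key, with the memory request in a separate `requests` key.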

On a related note, I am a bit puzzled by the fact that all the 4 GPUs are visible.
In the https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/ , i...

2 years ago
0 Hi Folks, One Question: I Have A Script That Looks Like:

Oh I see... for some reason I thought that all the dependencies of the environment would be tracked by ClearML, but it's only the ones that actually get imported...

If locally one detects that pandas is installed and can be used to read the csv, wouldn't it be possible to store this information in the clearml server so that it can be implicitly added to the requirements?

2 years ago
0 Hi Folks, Occasionally When I Clone A Job And Enqueue It, Instead Of Being Processed By The Expected Queue, A New Queue (With Some Id That Looks Like An Hash) Is Created Instead, And The Experiment Hangs In A "Pending" State. When This Happens, If I Abor

1. No, there's no task with a name of cpu or gpu... Where can I find the id of the queue to check?
2. What do you mean by initial log dumps, the very early rows when it's being deployed?

Anyway, sure I can send it to you, but I just turned off my laptop :) and won't be able to for a few days.

2 years ago
0 Hi Everyone, I've Seen That When Re-Running A Script It Sometimes Overwrites A Previous Task In The Dashboard Instead Of Creating A New Task. How Does ClearML Decide Whether To Create A New Task Or Overwrite An Existing One?

My understanding is that Task.init has a reuse_last_task_id parameter (or a similar name) that defaults to True. In that case, if your experiment wasn't "published", it will be overwritten (based on project and experiment name). However, if you do publish it, a new experiment will be created.

2 years ago
0 Hi Folks, One Question: I Have A Script That Looks Like:

Thanks Martin. I'll add this and check whether it fixes the issue, but I don't quite get this, though. The local code doesn't need to import pandas, because the get method returns a DataFrame object that has a .loc method.
I was expecting the remote experiment to behave similarly, so why do I need to import pandas there?

2 years ago
0 Hi Folks, I Just Deployed A ClearML Agent Using The Helm Chart. I Have A Few Doubts:

Indeed, kubectl commands don't work from within the agent pod; I'll try to figure out why

2 years ago
0 Hi Folks, Occasionally When I Clone A Job And Enqueue It, Instead Of Being Processed By The Expected Queue, A New Queue (With Some Id That Looks Like An Hash) Is Created Instead, And The Experiment Hangs In A "Pending" State. When This Happens, If I Abor

The workaround that works for me is:
1. clone the experiment that I ran on my laptop
2. in the newly cloned experiment, modify the hyperparameters and configurations to my needs
3. in user properties, set "k8s-queue" to "cpu" (or the name of the queue I want to use)
4. enqueue the experiment to the same queue I just set...

When I do it that way, in the K8sGlue pod for the cpu queue I can see that it has been correctly picked up:
` No tasks in queue 54d3edb05a89462faaf51e1c878cf2c7
No tasks in Queues, sleeping fo...

2 years ago