CooperativeKitten94
Moderator
0 Questions, 55 Answers
Active since 01 August 2024
Last activity one year ago
Reputation: 0
0 Hi

@<1736194540286513152:profile|DeliciousSeaturtle82> when you copy the folder onto the new pod, does it crash almost instantly?

one year ago
0 Hello! I Had Trouble Running Clearml-Agent On K8S. I Fixed It By Modifying The Helm Chart To Allow Specifying Runtimeclassname (Which Is Needed When Using Nvidia Gpu Operator). I Did This,

Hello @<1523708147405950976:profile|AntsyElk37> 🙂
You are right, the spec.runtimeClassName field is not supported in the Agent at the moment; I'll work on your Pull Request ASAP.
Could you elaborate a bit on why you need the Task Pods to specify the runtimeClass in order to use GPUs?
Usually, you'd need to set the Pod's container resources with, for example, resources.limits."nvidia.com/gpu": 1, and the NVIDIA Device Plugin would itself assign the correct device to the container. Will that work?
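For illustration, here is a minimal sketch of how that limit could be expressed for the Task Pods through the agent's values (the basePodTemplate key path is an assumption; please verify it against the chart's values.yaml):

agentk8sglue:
  basePodTemplate:
    # assumed override applied to the Pods the agent spawns for Tasks;
    # the NVIDIA Device Plugin assigns a GPU based on this limit
    resources:
      limits:
        nvidia.com/gpu: 1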

5 months ago
0 Hello! I Had Trouble Running Clearml-Agent On K8S. I Fixed It By Modifying The Helm Chart To Allow Specifying Runtimeclassname (Which Is Needed When Using Nvidia Gpu Operator). I Did This,

Hi @<1523708147405950976:profile|AntsyElk37> - Yes, having the runtimeClass makes sense. I'll be handling your PR soon 🙂

5 months ago
0 Hi Guys, I Have A Question About Elasticsearch Connection, We Are On Kubernetes Environment (Clearml Is Deployed With Helm Chart), In The Secret We Have This :

If that doesn't work, try removing the auth from the connection string and instead define two extraEnvs for the apiserver:

apiserver:
  extraEnvs:
    - name: CLEARML_ELASTIC_SERVICE_USERNAME
      value: "elastic"
    - name: CLEARML_ELASTIC_SERVICE_PASSWORD
      value: "toto"
11 months ago
0 I'M Using The Clearml Helm Charts, And So Far So Good, Now I'M Looking To Give One Of Our Agents Access To A Private Github Repo, But Could Not Find Where To Configure This ->

Do you mean the Python version that is installed on the clearml agent itself? Or do you mean the Python version available in tasks that will be run from the agent?

one year ago
0 Hello, I Am First Timer In Clearml And Try To Deploy Locally A Clear Ml Server (Successfully) And Then Agent In My Kubernetes Cluster. I Follow The Helm Chart From "Helm Repo Add Clearml

Oh, okay, I'm not sure this will be the only issue, but you'll need these credentials to be valid, since they are used by the ClearML Agent to connect to the ClearML Server 🙂
The easiest way to generate credentials is to open the ClearML UI in the browser, log in with an Admin user, then navigate to Settings (click the user icon in the top right corner). From there, go to "Workspace", click "Create new credentials", and use the values provided.
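Those credentials then go into the ClearML Agent chart values, for example (the values below are placeholders):

clearml:
  # access key / secret key generated from the ClearML UI (Settings > Workspace)
  agentk8sglueKey: "<ACCESS_KEY>"
  agentk8sglueSecret: "<SECRET_KEY>"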

one month ago
0 Hello, I Am First Timer In Clearml And Try To Deploy Locally A Clear Ml Server (Successfully) And Then Agent In My Kubernetes Cluster. I Follow The Helm Chart From "Helm Repo Add Clearml

Can you try with these values? The changes are: not using clearmlConfig, not overriding the image (using the default instead), and not defining resources.

agentk8sglue:
  apiServerUrlReference: "<API server URL>"
  clearmlcheckCertificate: false
  createQueueIfNotExists: true
  fileServerUrlReference: "<fileserver URL>"
  queue: default
  webServerUrlReference: "<web server URL>"

clearml:
  agentk8sglueKey: 8888TMDLWYY7ZQJJ0I7R2X2RSP8XFT
  agentk8sglueSecret: oNODbBkDGhcDscTENQyr-GM0cE8IO7xmpaPdqyfsfaWear...
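
As a usage note, assuming the chart was added from the clearml Helm repo as clearml/clearml-agent, these values could then be applied with something like:

helm upgrade --install clearml-agent clearml/clearml-agent -n <NAMESPACE> -f values.yaml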
one month ago
0 Hello, I Am First Timer In Clearml And Try To Deploy Locally A Clear Ml Server (Successfully) And Then Agent In My Kubernetes Cluster. I Follow The Helm Chart From "Helm Repo Add Clearml

So if you now run helm get values clearml-agent -n <NAMESPACE>, where <NAMESPACE> is the value you have in the $NS variable, can you confirm this is the full and only output? Of course, the $VARIABLES will have their real values.

agentk8sglue:
  # Try newer image version to fix Python 3.6 regex issue
  image:
    repository: allegroai/clearml-agent-k8s-base
    tag: "1.25-1"
    pullPolicy: Always
  apiServerUrlReference: "http://$NODE_IP:30008"
  fileServerUrlReference: "ht...
one month ago
0 Hello, I Am First Timer In Clearml And Try To Deploy Locally A Clear Ml Server (Successfully) And Then Agent In My Kubernetes Cluster. I Follow The Helm Chart From "Helm Repo Add Clearml

In your last message, you are referring to pod security context and admission controllers enforcing some policies such as a read-only filesystem. Is that the case in your cluster?
Or was this the output of a GPT-like chat? If so, please do not use LLMs to generate values for the Helm installation, as they usually do not produce a useful or valid config.

one month ago
0 Hello, I Am First Timer In Clearml And Try To Deploy Locally A Clear Ml Server (Successfully) And Then Agent In My Kubernetes Cluster. I Follow The Helm Chart From "Helm Repo Add Clearml

I understand; I'd just like to make sure that this is the root issue and there's no other bug. If so, you can then think about how to automate it via the API.

one month ago
0 Hello, I Am First Timer In Clearml And Try To Deploy Locally A Clear Ml Server (Successfully) And Then Agent In My Kubernetes Cluster. I Follow The Helm Chart From "Helm Repo Add Clearml

It's a bit hard for me to provide support here with the additional layer of Argo.
I assume the server is working fine and you can open the ClearML UI and log in, right? If so, would it be possible to extract only the Agent part out of Argo and install it through standard Helm?

one month ago
0 Hello, I Am First Timer In Clearml And Try To Deploy Locally A Clear Ml Server (Successfully) And Then Agent In My Kubernetes Cluster. I Follow The Helm Chart From "Helm Repo Add Clearml

Hi @<1811208768843681792:profile|BraveGrasshopper38> , following up on your last message, are you running in an OpenShift k8s cluster?

one month ago
0 Hi Everyone, I'M Deploying Clearml Server Helm Chart With 3 Ingress Controller For Api, Web And Files. I Saw That One Of The Api Backend Is Unhealthy And I Get This Error In Apiserver Pod:

Hi @<1798162812862730240:profile|PreciousCentipede43> 🙂
When you say

one of the api backend is UNHEALTHY

do you mean you have multiple replicas of the apiserver component (i.e., you set the value apiserver.replicaCount to more than 1) and one of them is not ready?
Could you please share the output of the kubectl describe command for the ClearML apiserver Deployment?
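For reference, something like the following should show that Deployment's status and events (the Deployment name is an assumption and depends on your release name):

kubectl describe deployment clearml-apiserver -n <NAMESPACE>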

4 months ago
0 Hi! I’Ve Checked The Docs For Clearml-Helm Charts, And I’Ve Seen The Possibility To Add Additional Volumes/Volume Mounts. Could You Please Provide An Example Of How To Do It Properly?

Hi @<1523701907598610432:profile|ReassuredArcticwolf33> - Are you referring to the clearml helm chart or to the clearml-agent one?
In either case, the respective values.yaml file is self-documented and contains examples. Here is an example of adding additional volumes and volume mounts to the apiserver component of the clearml chart:

apiserver:
  # -- # Defines extra Kubernetes volumes to be attached to the pod.
  additionalVolumes:
    - name: ramdisk
      empty...
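
For a fuller picture, here is a sketch of what such a configuration could look like (the ramdisk emptyDir and mount path are illustrative; please verify the exact keys against the chart's values.yaml):

apiserver:
  # extra Kubernetes volumes attached to the apiserver pod
  additionalVolumes:
    - name: ramdisk
      emptyDir:
        medium: Memory
  # where those volumes are mounted inside the apiserver container
  additionalVolumeMounts:
    - name: ramdisk
      mountPath: /tmp/ramdisk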
8 months ago
0 Hello! I Had Trouble Running Clearml-Agent On K8S. I Fixed It By Modifying The Helm Chart To Allow Specifying Runtimeclassname (Which Is Needed When Using Nvidia Gpu Operator). I Did This,

Hi @<1523708147405950976:profile|AntsyElk37> - There are a few points missing for the PR to be completed; let's follow up on GitHub. See my comments on the PR.

5 months ago
0 Anyone Know How I Can Make ClearML Server Serve File URLs With An External Domain, And Not The Internal Kubernetes Cluster Hostnames? Running Both The Server And Agents In K8s On-Prem, The URL Given Is Not Reachable Because It Tries To Present It As The URL

So, when the UI gets a debug image, it uses the URL for that image, which was created at runtime by the running SDK (launched by the Agent, in this case), and therefore uses the fileserver URL provided by the agent.
You will need to pass the external reference:

agentk8sglue:
  fileServerUrlReference: "<external fileserver URL>"

and work around the self-signed cert. You could try mounting your custom certificates into the Agent using volumes and volumeMounts, storing your certificate in a ConfigMap or similar.
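As a rough sketch (the volume/volumeMount keys under agentk8sglue, the ConfigMap name, and the mount path are all hypothetical here; check the agent chart's values.yaml for the exact overrides it exposes):

agentk8sglue:
  # hypothetical keys: mount a ConfigMap holding your CA / self-signed certificate
  additionalVolumes:
    - name: custom-ca
      configMap:
        name: clearml-ca-cert
  additionalVolumeMounts:
    - name: custom-ca
      mountPath: /usr/local/share/ca-certificates/clearml-ca.crt
      subPath: ca.crt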

3 months ago
0 Anyone Know How I Can Make ClearML Server Serve File URLs With An External Domain, And Not The Internal Kubernetes Cluster Hostnames? Running Both The Server And Agents In K8s On-Prem, The URL Given Is Not Reachable Because It Tries To Present It As The URL

@<1726047624538099712:profile|WorriedSwan6> - When deploying the ClearML Agent, could you try passing the external fileserver URL in the configuration you previously mentioned? Like this:

agentk8sglue:
  fileServerUrlReference: "<external fileserver URL>"
3 months ago
0 Hello, I Am First Timer In Clearml And Try To Deploy Locally A Clear Ml Server (Successfully) And Then Agent In My Kubernetes Cluster. I Follow The Helm Chart From "Helm Repo Add Clearml

Oh no worries, I understand 😄
Sure, if you could share the whole values and configs you're using to run both the server and the agent, that would be useful.
Also, what about the other Pods from the ClearML server, are there any other crashes or similar errors referring to a read-only filesystem? Are the server and agent installed on the same K8s node?

one month ago
0 Hello, I Am First Timer In Clearml And Try To Deploy Locally A Clear Ml Server (Successfully) And Then Agent In My Kubernetes Cluster. I Follow The Helm Chart From "Helm Repo Add Clearml

Also, in order to simplify the installation, can you use a simpler version of your values for now? Something like this should work:

agentk8sglue:
  apiServerUrlReference: "<API server URL>"
  clearmlcheckCertificate: false
  createQueueIfNotExists: true
  fileServerUrlReference: "<fileserver URL>"
  queue: default
  resources:
    limits:
      cpu: 500m
      memory: 1Gi
    requests:
      cpu: 100m
      memory: 256Mi
  webServerUrlReference: "<web server URL>"

clearml:
  agentk8sglueKey: <NEW_KEY>...
one month ago