Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Answered
Hello All, I Installed Self-Hosted Server And Queue(Cosumes 1 Gpu) On Kubernetes. I Have An Issue Regarding Gpu Monitoring. I Checked The Process Is Using Gpu In The Pod, But Gpu Usage Is Not Being Displayed On Workers & Queues Dashboard, Whereas Cpu Usag

Hello all,
I installed self-hosted server and queue(cosumes 1 gpu) on kubernetes.
I have an issue regarding gpu monitoring.
I checked the process is using gpu in the pod, but gpu usage is not being displayed on WORKERS & QUEUES Dashboard, whereas CPU usage is. what is wrong?
image

  
  
Posted one year ago
Votes Newest

Answers 40


pod log is too long. would it be ok if i upload pod log file here??

  
  
Posted one year ago

@<1524922424720625664:profile|TartLeopard58> the agent running the task is v1.5.2 (as shown in the log), so the whole point is lost - we need to see the v1.5.3rc2 or v1.5.3rc3 running there... how did you set up the helm chart for the new agent?

  
  
Posted one year ago

This is clearml-agent helm chart values.yaml file i used to install

  
  
Posted one year ago

I set CLEARML_AGENT_UPDATE_VERSION=1.5.3rc2 ` in agentk8sglue.basePodTemplate.env as i mentioned
image

  
  
Posted one year ago

Try using K8S_GLUE_POD_AGENT_INSTALL_ARGS=1.5.3rc2

  
  
Posted one year ago

I tried using K8S_GLUE_POD_AGENT_INSTALL_ARGS=1.5.3rc2 instead of CLEARML_AGENT_UPDATE_VERSION=1.5.3rc2 , but it’s same. doesn’t read gpu usage.. 🥲
image

  
  
Posted one year ago

image

  
  
Posted one year ago

This was so we can see the agent log

  
  
Posted one year ago

Can you share it?

  
  
Posted one year ago

here is the agent, task log file~!

  
  
Posted one year ago
44K Views
40 Answers
one year ago
one year ago
Tags