Answered
Hello All, I Installed Self-Hosted Server And Queue(Cosumes 1 Gpu) On Kubernetes. I Have An Issue Regarding Gpu Monitoring. I Checked The Process Is Using Gpu In The Pod, But Gpu Usage Is Not Being Displayed On Workers & Queues Dashboard, Whereas Cpu Usag

Hello all,
I installed a self-hosted server and a queue (consumes 1 GPU) on Kubernetes.
I have an issue with GPU monitoring.
I verified that the process is using the GPU inside the pod, but GPU usage is not displayed on the WORKERS & QUEUES dashboard, whereas CPU usage is. What is wrong?
image

  
  
Posted 2 years ago

Answers 40


The pod log is too long. Would it be OK if I upload the pod log file here?

  
  
Posted 2 years ago

@TartLeopard58 the agent running the task is v1.5.2 (as shown in the log), so the whole point is lost. We need to see v1.5.3rc2 or v1.5.3rc3 running there. How did you set up the helm chart for the new agent?

  
  
Posted 2 years ago

This is the clearml-agent helm chart values.yaml file I used for the install.

  
  
Posted 2 years ago

I set CLEARML_AGENT_UPDATE_VERSION=1.5.3rc2 in agentk8sglue.basePodTemplate.env, as I mentioned.
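
For reference, the setting described here would look roughly like this in the chart's values.yaml. This is a minimal sketch, assuming `env` under `agentk8sglue.basePodTemplate` takes the standard Kubernetes list of name/value entries; the value is copied as written in this post, and nothing in the thread confirms the exact format the agent expects.

```yaml
# Sketch only: key path and variable name are taken from the post above;
# the list-of-name/value layout is an assumption based on Kubernetes env syntax.
agentk8sglue:
  basePodTemplate:
    env:
      - name: CLEARML_AGENT_UPDATE_VERSION
        value: "1.5.3rc2"  # quoted so YAML keeps it as a string
```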
image

  
  
Posted 2 years ago

Try using K8S_GLUE_POD_AGENT_INSTALL_ARGS=1.5.3rc2

  
  
Posted 2 years ago

I tried using K8S_GLUE_POD_AGENT_INSTALL_ARGS=1.5.3rc2 instead of CLEARML_AGENT_UPDATE_VERSION=1.5.3rc2, but the result is the same: GPU usage still isn't being read. 🥲
image

  
  
Posted 2 years ago

image

  
  
Posted 2 years ago

This was so we can see the agent log

  
  
Posted 2 years ago

Can you share it?

  
  
Posted 2 years ago

Here is the agent and task log file!

  
  
Posted 2 years ago