Answered

"5451af93e0bf68a4ab09f654b222ccae": {
  "1b790a3da2e8d6cd939cf271694fe81b": { "metric": ":monitor:gpu", "variant": "gpu_0_utilization", "value": 0.0, "min_value": 0.0, "max_value": 3.542 },
  "409d4e6ad9b69b3224fceeac6e265ddc": { "metric": ":monitor:gpu", "variant": "gpu_0_mem_used_gb", "value": 0.0, "min_value": 0.0, "max_value": 0.0 },
  "74646afee0e0ab18d3cbd08ce1ff6aa3": { "metric": ":monitor:gpu", "variant": "gpu_0_mem_usage", "value": 0.002, "min_value": 0.002, "max_value": 54.739 },
  "abdb01e1de566d2165e902fe0839465e": { "metric": ":monitor:gpu", "variant": "gpu_0_mem_free_gb", "value": 47.461, "min_value": 21.482, "max_value": 47.461 },

Do we know if gpu_0_mem_usage and gpu_0_mem_used_gb both show the current GPU usage?
How can I tell from this how much GPU is reserved for the task while the task is in progress?

Should (gpu_0_mem_used_gb) / (gpu_0_mem_used_gb + gpu_0_mem_free_gb), multiplied by 100, give the GPU memory usage percentage?
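
A minimal sketch of that arithmetic, assuming both inputs are in the same unit (GB here) and that used + free equals the total memory seen by the monitor. The function name `memory_used_percent` is hypothetical and not part of ClearML. Whether the result matches the reported gpu_0_mem_usage depends on how the monitor derives that value, which is not confirmed here.

```python
def memory_used_percent(used_gb: float, free_gb: float) -> float:
    """Return used memory as a percentage of (used + free).

    Hypothetical helper for illustration; not a ClearML API.
    """
    total_gb = used_gb + free_gb
    if total_gb <= 0:
        # Without a positive total, the ratio is undefined.
        raise ValueError("used_gb + free_gb must be positive")
    return 100.0 * used_gb / total_gb


# Values taken from the "value" fields in the snapshot above.
print(memory_used_percent(0.0, 47.461))
```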

  
  
Posted 2 years ago

Answers 7


Hi DrabCockroach54

Do we know if gpu_0_mem_usage and gpu_0_mem_used_gb both show the current GPU usage?

The first is the percentage of memory used at a given moment, and the second is the memory used in GiB. Both refer to the video memory.

How can I tell from this how much GPU is reserved for the task while the task is in progress?

What do you mean by how much is reserved? Are you running with an agent?

  
  
Posted 2 years ago

We are running the workers on bare metal and clearml-server on Kubernetes. I was trying to find out what the min and max values for the above metrics are.

What do you mean by how much is reserved? Are you running with an agent?

  
  
Posted 2 years ago

Yeah, exactly. The Scalar tab has those, but I need to add an alert: if GPU utilization or GPU memory is not in use while an experiment is in progress, raise an alert. Can I also get GPU usage over a time frame via the API?

  
  
Posted 2 years ago

AgitatedDove14

  
  
Posted 2 years ago

Thanks for the reply. If gpu_0_mem_usage is the % of GPU memory in use, what is gpu_0_utilization?

Is gpu_0_utilization also in % then?

  
  
Posted 2 years ago

Can I also get GPU usage over a time frame via the API?

task.get_reported_scalars
But this will get you all the scalars. I think the next version of the server supports asking for a specific one as well.
How are you implementing the alert monitoring?
Is it a stateless process starting every X minutes, or a stateful process that keeps running and monitoring?
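
For the stateless variant, a rough sketch of such a check could look like the following. It assumes `get_reported_scalars()` returns a nested dict of the form {metric title: {variant: {"x": [...], "y": [...]}}}; the helper names `fetch_scalars` and `gpu_looks_idle`, the threshold, and the window size are hypothetical choices for illustration, not part of ClearML.

```python
from typing import Any, Mapping


def fetch_scalars(task_id: str) -> Mapping[str, Any]:
    """Fetch all reported scalars for a task (needs clearml installed and configured)."""
    from clearml import Task  # imported lazily so the rest runs without clearml

    return Task.get_task(task_id=task_id).get_reported_scalars()


def gpu_looks_idle(
    scalars: Mapping[str, Any],
    metric: str = ":monitor:gpu",
    variant: str = "gpu_0_utilization",
    last_n: int = 5,          # hypothetical window size
    threshold: float = 1.0,   # hypothetical "idle" cutoff, in percent
) -> bool:
    """Return True if the last `last_n` samples are all below `threshold`.

    Assumes the layout {metric: {variant: {"x": [...], "y": [...]}}}.
    """
    series = scalars.get(metric, {}).get(variant, {})
    values = list(series.get("y", []))[-last_n:]
    if not values:
        # No samples reported yet: do not raise an alert on missing data.
        return False
    return all(v < threshold for v in values)


# Example with hand-written data in the assumed layout.
sample = {":monitor:gpu": {"gpu_0_utilization": {"x": [0, 1, 2], "y": [50.0, 0.0, 0.0]}}}
print(gpu_looks_idle(sample, last_n=2))
```

A periodic job would call `fetch_scalars` for each in-progress task and alert when `gpu_looks_idle` returns True. Treating missing data as "not idle" is a design choice that avoids false alarms right after a task starts.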

  
  
Posted 2 years ago

Is gpu_0_utilization also in % then?

Correct 🙂

I was trying to find, what are those min and max value for above metrics.

Oh, that makes sense. Note that you can get the values over time, so you can track the usage over the experiment's lifetime (you can of course also see it in the Scalar tab of the experiment).

  
  
Posted 2 years ago