Answered

Two Questions Today. First, Is There Some Way To Calculate The Number Of Gpu-Hours Used For A Project? Could I Select All Experiments And Count Up The Number Of Gpu-Hours/Gpu-Weeks? I Realize I Could Do This Manually By Looking At The Gpu Utilization Grap

Two questions today. First, is there some way to calculate the number of GPU-hours used for a project? Could I select all experiments and count up the number of GPU-hours/GPU-weeks? I realize I could do this manually by looking at the GPU utilization graphs, etc., but I've got about 100-200 experiments I'd like to sum up

  				
Posted 3 years ago by SmallDeer34

Answers 21

You can do this quite easily with some code and the API 🙂

  				
Posted 3 years ago by CostlyOstrich36

Here's the hours/days version, corrected now lol:
    gpu_hours = {}
    gpu_days = {}
    for gpu_type, gpu_time_seconds in gpu_seconds.items():
        gpu_time_hours = gpu_time_seconds / 3600
        gpu_hours[gpu_type] = gpu_time_hours
        gpu_days[gpu_type] = gpu_time_hours / 24

  				
Posted 3 years ago by SmallDeer34

CostlyOstrich36 I get some weird results for "active duration".

For example, several of the experiments show that their active duration is more than 90 days, but I definitely didn't run them that long.

  				
Posted 3 years ago by SmallDeer34

SmallDeer34 Hi 🙂
I don't think there is a way out of the box to see GPU hours per project, but it would be a pretty cool feature! Maybe open a GitHub feature request for this.

As for how to calculate this, I think an easier solution for you would be to sum up the runtime of all experiments in a certain project rather than reading it off the GPU utilization graphs.

  				
Posted 3 years ago by CostlyOstrich36

Ok, will do once I get back to the office, thanks for the heads up! 🙂

  				
Posted 3 years ago by CostlyOstrich36

OK, definitely fix that in the snippet, lol

  				
Posted 3 years ago by SmallDeer34

The UI also uses the API, so any data you see in the UI you can extract directly from the API. That's why I personally love using it so much for similar tasks.

  				
Posted 3 years ago by CostlyOstrich36

Hang on, CostlyOstrich36 I just noticed that there's a "project compute time" on the dashboard? Do you know how that is calculated/what that is?

  				
Posted 3 years ago by SmallDeer34

Regarding 1 & 2 - I suggest always keeping the API docs handy - https://clear.ml/docs/latest/docs/references/api/definitions

I love using the API since it's so convenient. So, to get down to business -
To select all experiments from a certain project you can use tasks.get_all with filtering according to the API docs. (I suggest you also use the web UI as a reference - if you hit F12 you can see all the API calls and their responses, which can really help you understand its capabilities.) I'm not sure the runtime sits in the database as an exact number, but you can easily calculate it as the 'completed' time minus the 'started' time of each experiment.
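
A minimal sketch of the 'completed' minus 'started' calculation in plain Python, for illustration only. The dictionaries below are hypothetical stand-ins for task records, and the assumption that `started` and `completed` arrive as ISO 8601 strings is mine, not something confirmed by the API docs; adapt the parsing to whatever the API actually returns.

```python
from datetime import datetime


def runtime_seconds(started, completed):
    """Return completed - started in seconds, or 0.0 if either value is missing."""
    if not started or not completed:
        return 0.0  # e.g. an experiment that never started or is still running
    delta = datetime.fromisoformat(completed) - datetime.fromisoformat(started)
    return max(delta.total_seconds(), 0.0)


# Hypothetical records; real ones would come from the tasks.get_all response.
example_tasks = [
    {"started": "2021-06-01T10:00:00", "completed": "2021-06-01T12:30:00"},
    {"started": "2021-06-02T08:00:00", "completed": None},  # still running
]

total_runtime = sum(
    runtime_seconds(t["started"], t["completed"]) for t in example_tasks
)
print(f"total runtime: {total_runtime / 3600} hours")
```

Tasks with a missing timestamp are counted as zero here; whether that is the right choice for aborted or still-running experiments is a judgment call.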

  				
Posted 3 years ago by CostlyOstrich36

I suppose the flow would be something like:
1. select all experiments from project x with iterations greater than y,
2. pull the runtime for each one,
3. add them all up.
I just don't know what API calls to make for 1 and 2.
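
The flow above could be sketched like this once the data is in hand. This only shows the filter-then-sum logic, not the API calls themselves: `ExperimentRecord` and its field names are hypothetical and stand in for whatever the API returns.

```python
from dataclasses import dataclass


@dataclass
class ExperimentRecord:
    """Hypothetical container for the per-experiment values we care about."""
    name: str
    last_iteration: int
    runtime_seconds: float


def total_runtime_hours(records, min_iterations=0):
    """Sum the runtime, in hours, of experiments with more than min_iterations."""
    selected = [r for r in records if r.last_iteration > min_iterations]
    return sum(r.runtime_seconds for r in selected) / 3600


# Made-up example data
records = [
    ExperimentRecord("a", 1000, 7200.0),
    ExperimentRecord("b", 0, 3600.0),   # never got past iteration 0, skipped
    ExperimentRecord("c", 50, 1800.0),
]
print(total_runtime_hours(records))
```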

  				
Posted 3 years ago by SmallDeer34

Hang on,

I just noticed that there's a "project compute time" on the dashboard? Do you know how that is calculated/what that is?

Are you referring to the example in services?

  				
Posted 3 years ago by CostlyOstrich36

CostlyOstrich36 I made a code snippet for you:
```
import random
from collections import defaultdict

from clearml import Task

# figuring out the project ID
project_list = Task.get_projects()  # get all the projects
project_id = Task.get_project_id("your project name here")

# getting all the tasks for a project
tasks = Task.get_all(project=[project_id]).response.tasks

# look at one random task to see which attributes are available
task = random.choice(tasks)
print(dir(task))
print(task.runtime)

# loop through and get approximate maximum gpu-seconds by type
gpu_seconds = defaultdict(int)
for task in tasks:
    if task.runtime.get("gpu_count", 0) > 0:
        gpu_type = task.runtime.get("gpu_type")
        active_duration = task.active_duration or 0  # in seconds; treat a missing value as 0
        print(f"active_duration of {gpu_type} is {active_duration} seconds")
        gpu_seconds[gpu_type] += active_duration

# Result of printing gpu_seconds will look like:
# defaultdict(int,
#             {'A100-PCIE-40GB': 3563829,
#              'GeForce RTX 2080 Ti': 119211,
#              'Quadro P5200': 80997,
#              'TITAN RTX': 484239,
#              'Tesla P100-PCIE-16GB': 99454,
#              'Tesla T4': 193,
#              'Tesla V100-SXM2-16GB': 1278})
```

  				
Posted 3 years ago by SmallDeer34

Ah, I see. I'm guessing the UI is summing up the runtimes of the experiments in the project.

  				
Posted 3 years ago by CostlyOstrich36

CostlyOstrich36 nice, thanks for the link. I know that in "info" on the experiments dashboard it includes gpu_type and started/completed times, I'll give it a go based on that

  				
Posted 3 years ago by SmallDeer34

I might not be able to get to that but if you create an issue I'd be happy to link or post what I came up with, wdyt?

Taking a look at your snippet, I wouldn't mind submitting a PR for such a cool feature 🙂

  				
Posted 3 years ago by CostlyOstrich36

One has an active duration of 185502. Dividing that by 60 gives you minutes. Oh, I did the math wrong: I need to divide by 60 again to get hours.
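
For reference, the arithmetic for that one experiment works out as below. This is a small sketch that assumes the active duration is reported in seconds; `seconds_to_units` is just an illustrative helper name.

```python
def seconds_to_units(seconds):
    """Convert a duration in seconds to (minutes, hours, days)."""
    minutes = seconds / 60
    hours = minutes / 60
    days = hours / 24
    return minutes, hours, days


minutes, hours, days = seconds_to_units(185502)
print(f"{minutes:.1f} minutes = {hours:.2f} hours = {days:.2f} days")
```

So 185502 seconds is roughly 51.5 hours, a bit over two days, rather than months.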

  				
Posted 3 years ago by SmallDeer34

If you get GPU-hours per project stats it would be really cool if you added this as a pull request

  				
Posted 3 years ago by CostlyOstrich36

Or examples of, like, "select all experiments in project with iterations > 0"?

  				
Posted 3 years ago by SmallDeer34

Good point! Any pointers to API docs to start looking?

  				
Posted 3 years ago by SmallDeer34

I might not be able to get to that but if you create an issue I'd be happy to link or post what I came up with, wdyt?

  				
Posted 3 years ago by SmallDeer34

CostlyOstrich36 at the bottom of the screenshot it says "Compute Time: 440 days"

  				
Posted 3 years ago by SmallDeer34
