I have a question regarding reducing execution time of pulling results from the server with the Python API

Answered

I have a question regarding reducing execution time of pulling results from the server with the python API.
As part of a pipeline, after running HPO I pull all the results from my optimizer task, along with all the scalars associated with them, and it is very slow: ~30 min for 1000 experiments.
I am running something like:
    top_tasks = an_optimizer.get_top_experiments(n_exp)
    task_scalars = dict()
    for task in top_tasks:
        task_scalars[task.id] = task.get_last_scalar_metrics()

Is there a way to make it faster?

  				
Posted 3 years ago
DepressedChimpanzee34


Answers 24

Kind of on the same topic, it would be very useful to have some kind of verbosity option, e.g. a progress bar for get_top_experiments().

  				
Posted 3 years ago
DepressedChimpanzee34

DepressedChimpanzee34 , Hi!

Is the code snippet you provided the part you want to speed up? Also, I'll check regarding the verbosity 🙂

  				
Posted 3 years ago
CostlyOstrich36

DepressedChimpanzee34 something along the lines of:
    from multiprocessing.pool import ThreadPool

    p = ThreadPool()

    def get_last_metric(t):
        return t.get_last_scalar_metrics()

    task_scalars_list = p.map(get_last_metric, top_tasks)
    p.close()

We parallelized the network connections, as I'm assuming the delay is in the fetching.

  				
Posted 3 years ago
AgitatedDove14

AgitatedDove14 thanks, I actually experimented with a similar parallel pool approach, but the overhead seems to even out the benefit.
Is there something you can think of for the first part though, pulling all the experiments with get_top_experiments()?

  				
Posted 3 years ago
DepressedChimpanzee34

You can try pulling just the "metric" section of the Task, but I cannot imagine the network bandwidth is the issue?
Could it be load on the clearml-server (i.e. it needs to handle lots of requests)?

  				
Posted 3 years ago
AgitatedDove14

You can try a direct API call for all the Tasks together:
Task._query_tasks(task_ids=[IDS here], only_fields=['last_metrics'])
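A minimal sketch of how the single-query idea above could be wrapped, so that one request per batch of IDs replaces one request per task. The helper name `fetch_last_metrics`, the `query_fn` parameter, and the `batch_size` default are all hypothetical and not part of ClearML; `query_fn` is assumed to return objects exposing `.id` and `.last_metrics`.

```python
from typing import Any, Callable, Dict, Iterable, List


def fetch_last_metrics(
    task_ids: Iterable[str],
    query_fn: Callable[[List[str]], List[Any]],
    batch_size: int = 500,  # hypothetical default, tune for your server
) -> Dict[str, Any]:
    """Map task id -> last_metrics using one query per batch of ids."""
    ids = list(task_ids)
    result: Dict[str, Any] = {}
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # Assumption: each returned entry exposes `.id` and `.last_metrics`.
        for entry in query_fn(batch):
            result[entry.id] = getattr(entry, "last_metrics", None)
    return result
```

With ClearML, `query_fn` could presumably be something like `lambda ids: Task._query_tasks(task_ids=ids, only_fields=['last_metrics'])`, i.e. the call suggested above. Note that `_query_tasks` is a private method, so its behavior may change between versions.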

  				
Posted 3 years ago
AgitatedDove14

thanks, I'll try this. Is there an efficient way to get the IDs first?

  				
Posted 3 years ago
DepressedChimpanzee34

this?
ids = [t.id for t in top_tasks]

  				
Posted 3 years ago
AgitatedDove14

I mean an efficient way to get top_tasks in the first place

  				
Posted 3 years ago
DepressedChimpanzee34

that is the heaviest part for me

  				
Posted 3 years ago
DepressedChimpanzee34

optimizer.get_top_experiments(n)

  				
Posted 3 years ago
DepressedChimpanzee34

Hmm, check if this one works:
    optimizer._get_child_tasks_ids(
        parent_task_id=optimizer._job_parent_id or optimizer._base_task_id,
        order_by=optimizer._objective_metric._get_last_metrics_encode_field(),
        additional_filters={'page_size': int(top_k), 'page': 0})

If it does, let's PR it as a dedicated function.

  				
Posted 3 years ago
AgitatedDove14

It seems to be orders of magnitude faster!

  				
Posted 3 years ago
DepressedChimpanzee34

I have a small question about the response structure. Each of the metrics has this structure:
    metric_id: {
        ...
        "value": 0.0006447011,
        "min_value": 8.6326945e-06,
        "max_value": 0.001049518,
        ...
    }

What does value refer to? The last reported value?

  				
Posted 3 years ago
DepressedChimpanzee34

AgitatedDove14, for creating a dedicated function I would suggest also including the actual sampled point in the HP space. This would be the most common use case, and essentially the reason for running the HPO: understanding the sensitivity of metrics with respect to hyper-parameters.

  				
Posted 3 years ago
DepressedChimpanzee34

For me, at the moment, it means "manually" filtering the keys I've put in for the HP space. I find it a bit strange that they are not saved as part of the optimizer object.
The optimizer_task seems to have an attribute called hyper_parameters, but it's empty in my case.

  				
Posted 3 years ago
DepressedChimpanzee34

for creating a dedicated function I would suggest also including the actual sampled point in the HP space.

Could you expand?

This would be the most common use case, and essentially the reason for running the HPO understanding the sensitivity of metrics with respect to hyper-parameters

Does this relate to:
https://github.com/allegroai/clearml/issues/430

"manually" filtering the keys I've put in for the HP space. I find it a bit strange that they are not saved as part of the optimizer object..

What do you mean?

  				
Posted 3 years ago
AgitatedDove14

AgitatedDove14, I am referring to a generic HPO scenario where you define some HP space, let's say:
    param1 = np.linspace(lower_bound, upper_bound, n)
    param2 = np.linspace(lower_bound, upper_bound, n)
Then you run an optimization that samples this HP space.
For each trial a sample is pulled from the space, an experiment is performed, and you get a score. To analyze the behavior of your objective, you want to understand the relation between the params and the objective score.
So if you pull the trials' metrics, you most likely want to know which HP point they belong to.
The bottom line is that when pulling results you are interested in the metric values + the HP point (param1=value, param2=value, ...) of the trial.

  				
Posted 3 years ago
DepressedChimpanzee34

AgitatedDove14, what I meant by manually filtering: at the moment, to combine the metric values with the HP point, I pull all the parameters and then manually filter on the HP keys (manually = I have to plug them in myself, since they are not part of the optimizer object).

  				
Posted 3 years ago
DepressedChimpanzee34

AgitatedDove14 , the issue you mention does not relate to this discussion

  				
Posted 3 years ago
DepressedChimpanzee34

I pull all the parameters, and then manually filter on the HP keys (manually=I have to plug them in, they are not part of optimizer object)

So is this an improvement to the optimizer._get_child_tasks_ids(...) interface?
E.g. return a structure like:
    [
        {
            'id': task_id,
            'hp1': value,
            'hp2': value,
            'hp3': value,
            'objective': dict(title='title', series='series', value=42),
        },
    ]
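Until such an interface exists, records of that shape can be assembled on the client side. The sketch below is an illustration only: the function name `build_trial_records` and the input format (one dict per trial with `id`, `params`, and `objective_value` keys) are hypothetical, and it assumes the HP keys are supplied by the caller, as described earlier in the thread.

```python
from typing import Any, Dict, Iterable, List, Mapping


def build_trial_records(
    trials: Iterable[Mapping[str, Any]],
    hp_keys: Iterable[str],
    title: str,
    series: str,
) -> List[Dict[str, Any]]:
    """Combine each trial's id, selected hyper-parameters and objective value."""
    keys = list(hp_keys)
    records: List[Dict[str, Any]] = []
    for trial in trials:
        params = trial["params"]
        record: Dict[str, Any] = {"id": trial["id"]}
        # Keep only the parameters that define the HP search space.
        record.update({k: params[k] for k in keys if k in params})
        record["objective"] = dict(
            title=title, series=series, value=trial["objective_value"]
        )
        records.append(record)
    return records
```

One caveat of this flat layout: a hyper-parameter literally named `id` or `objective` would collide with the fixed keys, so nesting the hyper-parameters under their own key may be the safer design.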

  				
Posted 3 years ago
AgitatedDove14

AgitatedDove14, definitely so, this is very generic and very useful.
In many cases the objective is just one of multiple metrics of interest, so I would almost always want to combine it with the rest of the scalar metrics.

  				
Posted 3 years ago
DepressedChimpanzee34

Sounds good to me. DepressedChimpanzee34, any chance you can add a GitHub feature request, so we do not forget to add it?

  				
Posted 3 years ago
AgitatedDove14

AgitatedDove14 , done
https://github.com/allegroai/clearml/issues/473

  				
Posted 3 years ago
DepressedChimpanzee34
