Hey, I had a problem with https://github.com/allegroai/trains/tree/master/examples/optimization/hyper-parameter-optimization from the repo. I ran the template experiment from base_template_keras_simple.py and everything went well. Then I ran the following code:
```python
import logging

from trains import Task
from trains.automation import (
    DiscreteParameterRange, HyperParameterOptimizer, RandomSearch,
    UniformIntegerParameterRange)


def job_complete_callback(
        job_id,                 # type: str
        objective_value,        # type: float
        objective_iteration,    # type: int
        job_parameters,         # type: dict
        top_performance_job_id  # type: str
):
    print('Job completed!', job_id, objective_value, objective_iteration, job_parameters)
    if job_id == top_performance_job_id:
        print('WOOT WOOT we broke the record! Objective reached {}'.format(objective_value))


# Connecting TRAINS
task = Task.init(project_name='noam_hyperopt_optimization',
                 task_name='Automatic Hyper-Parameter Optimization',
                 task_type=Task.TaskTypes.optimizer,
                 reuse_last_task_id=False)

# experiment template to optimize in the hyper-parameter optimization
args = {
    'template_task_id': None,
    'run_as_service': False,
}
args = task.connect(args)

# Get the template task experiment that we want to optimize
if not args['template_task_id']:
    args['template_task_id'] = Task.get_task(
        project_name='noam_hyperopt_optimization', task_name='Keras HP optimization base').id

an_optimizer = HyperParameterOptimizer(
    base_task_id=args['template_task_id'],
    hyper_parameters=[
        UniformIntegerParameterRange('layer_1', min_value=128, max_value=512, step_size=128),
        UniformIntegerParameterRange('layer_2', min_value=128, max_value=512, step_size=128),
        DiscreteParameterRange('batch_size', values=[96, 128, 160]),
        DiscreteParameterRange('epochs', values=[30]),
    ],
    objective_metric_title='epoch_accuracy',
    objective_metric_series='epoch_accuracy',
    objective_metric_sign='max',
    optimizer_class=RandomSearch)

an_optimizer.set_report_period(0.1)
an_optimizer.start(job_complete_callback=job_complete_callback)
an_optimizer.set_time_limit(in_minutes=12.0)
an_optimizer.wait()
top_exp = an_optimizer.get_top_experiments(top_k=3)
print([t.id for t in top_exp])
an_optimizer.stop()
print('Done')
```
It is almost the same as the code in the example, yet somehow all the tasks created by the optimizer are marked as 'pending', and the time limit runs out without any of them ever running (as can be seen in the attached image). Does anyone know what could cause this kind of problem?
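In case it is relevant: my understanding is that the optimizer only clones the template task and enqueues the clones, so a trains-agent has to be listening on the execution queue for them to leave the 'pending' state. Here is a minimal sketch of how I would pin the queue explicitly (the execution_queue argument and the 'default' queue name are my assumptions based on the HyperParameterOptimizer signature, not something taken from the example):

```python
# Sketch only: same optimizer as above, but with the queue named explicitly.
# Assumption: cloned experiments are pushed to 'execution_queue' and stay
# 'pending' until a trains-agent pulls them, e.g. one started with:
#     trains-agent daemon --queue default
an_optimizer = HyperParameterOptimizer(
    base_task_id=args['template_task_id'],
    hyper_parameters=[
        UniformIntegerParameterRange('layer_1', min_value=128, max_value=512, step_size=128),
        UniformIntegerParameterRange('layer_2', min_value=128, max_value=512, step_size=128),
        DiscreteParameterRange('batch_size', values=[96, 128, 160]),
        DiscreteParameterRange('epochs', values=[30]),
    ],
    objective_metric_title='epoch_accuracy',
    objective_metric_series='epoch_accuracy',
    objective_metric_sign='max',
    optimizer_class=RandomSearch,
    execution_queue='default',  # assumed queue name; an agent must listen on it
)
```

If no agent is listening on that queue, I would expect exactly the behavior I am seeing, but I am not sure that is the cause here.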