args = parser.parse_args()
os.environ['TRAINS_PROC_MASTER_ID'] = f"1:{args.trains_id}"
os.environ['OMPI_COMM_WORLD_NODE_RANK'] = str(randint(0, sys.maxsize))
trains_task = trains.Task.init(project_name=f'{ALG_NAME}-inference', task_name='inference')
fails on init
Ok, so it is a master-node setup and not a consumer-producer pattern...
Notice that you have to have the task already started by the Master process.
Logger.current_logger()
Will return the logger for the "main" Task.
The "Main" task is the task of this process, a singleton for the process.
All other instances create a Task object. You can have multiple Task objects and log different things to them, but you can only have a single "main" Task (the one created with Task.init).
All the auto-magic stuff is logged automatically to the "main" task.
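For example, a minimal sketch (the task names and the extra task id are just placeholders):

from trains import Task, Logger

# The first Task.init call in the process creates the "main" Task (a per-process singleton).
task = Task.init(project_name='examples', task_name='main task demo')

# Logger.current_logger() always returns the logger of that "main" Task,
# so anything reported here ends up on the task created above.
Logger.current_logger().report_scalar(title='loss', series='train', value=0.5, iteration=1)

# Other Task objects can be fetched explicitly and logged to separately,
# but they never become the "main" Task.
other_task = Task.get_task(task_id='<some_other_task_id>')  # placeholder id
other_task.get_logger().report_text('logged to a different task')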
Make sense ?
Yes, you are correct. OS environment: TRAINS_PROC_MASTER_ID=1:task_id_here
How can I pass the task_id to Task.init? Should it also be in some env?
Should OMPI_COMM_WORLD_NODE_RANK be a number, or can it be some guid?
os.environ['TRAINS_PROC_MASTER_ID'] = args.trains_id
os.environ['OMPI_COMM_WORLD_NODE_RANK'] = str(randint(0, sys.maxsize))
trains_task = trains.Task.init(project_name=f'{ALG_NAME}-inference', task_name='inference')
print(type(trains_task))
<class 'trains.task.Task.init.<locals>._TaskStub'>
ValueError: Task object can only be updated if created or in_progress
os.environ['TRAINS_PROC_MASTER_ID'] = '1:da0606f2e6fb40f692f5c885f807902a'
os.environ['OMPI_COMM_WORLD_NODE_RANK'] = '1'
task = Task.init(project_name="examples", task_name="Manual reporting")
print(type(task))
Should be: <class 'trains.task.Task'>
Correct, and that also means the code that runs is not auto-magically logged.
so how can I make it run with the "auto magic"?
So if I load the task and don't init it... it is not the main one?
Hmm... then I cannot get the master task params.
So if I plot an image with matplotlib... it would not upload? I need to use the logger.
Correct, if you have no "main" task , no automagic 😞
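In that case the reporting has to go through a Task object's logger explicitly. A rough sketch, assuming a trains version where Logger.report_image accepts a local_path, with a placeholder task id:

import matplotlib.pyplot as plt
from trains import Task

# No Task.init here, so there is no "main" task and no automagic logging.
task = Task.get_task(task_id='<master_task_id>')  # placeholder id
logger = task.get_logger()

# Save the matplotlib figure and report it manually through the logger.
plt.plot([1, 2, 3], [4, 5, 6])
plt.savefig('debug_plot.png')
logger.report_image(title='debug', series='plot', iteration=0, local_path='debug_plot.png')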
Automagic logs a single instance... unless those are subprocesses, in which case, the main task takes care of "copying" itself to the subprocess.
Again what is the use case for multiple machines?
Should work out of the box, as long as the task was started. You can forcefully start the task with: task.mark_started()
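Something along these lines (a sketch; the task id is a placeholder):

from trains import Task

# Fetch the task that was created earlier by the Master process and force it
# into the "started" (in_progress) state so the other nodes can report to it.
task = Task.get_task(task_id='<master_task_id>')  # placeholder id
task.mark_started()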
FranticCormorant35 As far as I understand, what you have going is a multi-node setup that you manage yourself, something like Horovod, Torch distributed, or any MPI setup. Since Trains supports all of the above standard multi-node setups, the easiest way is to do the following:
On the master node set OS environment: OMPI_COMM_WORLD_NODE_RANK=0
Then on any client node: OMPI_COMM_WORLD_NODE_RANK=unique_client_node_number
In all processes you can call Task.init - with all the automagic kicking in. The master node will be the only one registering the execution section of the experiment (i.e. git, arg parser, etc.) while all the rest will be logged as usual (console output, TensorBoard, matplotlib, etc.)
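A rough per-node sketch (how the rank reaches the script is up to your launcher; NODE_RANK here is just illustrative):

import os
from trains import Task

# Master node: OMPI_COMM_WORLD_NODE_RANK=0, client nodes: any unique non-zero value.
# Assumption: the launcher exposes the node rank via a NODE_RANK variable.
os.environ['OMPI_COMM_WORLD_NODE_RANK'] = os.environ.get('NODE_RANK', '0')

# Task.init can be called in every process; only the master (rank 0) registers the
# execution section (git, arg parser, etc.), the rest log console/TensorBoard/matplotlib as usual.
task = Task.init(project_name='examples', task_name='multi node training')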
How does that sound?
os.environ['TRAINS_PROC_MASTER_ID'] = args.trains_id
it should be '1:'+args.trains_id
os.environ['TRAINS_PROC_MASTER_ID'] = '1:{}'.format(args.trains_id)
Also str(randint(1, sys.maxsize))
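Putting both fixes together, the earlier snippet would become something like:

import os
import sys
from random import randint
import trains

# parser and ALG_NAME come from the user's script, as in the original snippet
args = parser.parse_args()
os.environ['TRAINS_PROC_MASTER_ID'] = '1:{}'.format(args.trains_id)  # note the '1:' prefix
os.environ['OMPI_COMM_WORLD_NODE_RANK'] = str(randint(1, sys.maxsize))  # non-zero rank
trains_task = trains.Task.init(project_name=f'{ALG_NAME}-inference', task_name='inference')
print(type(trains_task))  # per the example above, this should now be <class 'trains.task.Task'>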
It has to be alive so all the "child nodes" can report to it...