Is it possible to launch a task from Machine C to the queue that Machine B's agent is listening to?
Yes, that's the idea
Do I have to have anything installed (aside from the trains PIP package) on Machine C to do so?
Nothing, pure magic 🙂
A few implementation / design details:
- When you run code with Trains (and call init) it will record your environment (python packages, git code, uncommitted changes, etc.)
- Everything is stored on the Task object in the trains-server
- When you clone a Task you literally create a copy of the Task object (i.e. a second experiment)
- On the cloned experiment you can edit everything (parameters, git, base docker image, etc.)
- When you enqueue a Task you add its ID to the execution queue list (sketched below)
- A trains-agent listening on the queue pops the Task ID and sets up the environment as written on the Task, then it launches and monitors the code
- Multiple trains-agent workers can listen on the same queue, or on multiple queues in a priority fashion (i.e. first try to pop from the first, then the second, etc.)
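For example, launching from Machine C could look roughly like this; a minimal sketch assuming the trains PIP package is installed and configured against your trains-server, with the task ID and queue name below as placeholders:

```python
from trains import Task

# grab an experiment that was already recorded on the trains-server
template = Task.get_task(task_id='<existing_task_id>')

# create an editable copy and push it onto the queue Machine B's agent is watching
cloned = Task.clone(source_task=template, name='cloned from Machine C')
Task.enqueue(cloned, queue_name='default')
```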
(without having to execute it first on Machine C)
Someone somewhere has to create the definition of the environment...
The easiest way to go about it is to execute it once.
You can add the following line to your code: task.execute_remotely(queue_name='default')
This will cause your code to stop running and enqueue itself on a specific queue.
Quite useful if you want to make sure everything works (e.g. run a single step), then continue on another machine.
Notice that switching between CPU/GPU packages is taken care of by the trains-agent, so you can run on a CPU machine for testing, then enqueue the same code and the trains-agent on the GPU machine will install the correct version of the packages (TF or PyTorch)
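A minimal sketch of that pattern, assuming a queue named 'default' with an agent listening on it (the project and task names are placeholders):

```python
from trains import Task

task = Task.init(project_name='examples', task_name='remote run')

# everything above this call runs locally (enough to record the environment);
# the call below enqueues the task on the 'default' queue and stops the local process
task.execute_remotely(queue_name='default')

# from here on, the code only runs on whichever machine the agent assigns
# ... actual training code goes here ...
```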
The scenario I'm going for is never to run on the dev machine, so all I'll need to do once the server + agents are up is to add task.execute_remotely...
after the Task.init
line, and then when the script is executed on the dev machine, it won't actually run but rather enqueue itself for the agent to run?
That is correct.
Obviously once it is in the system, you can just clone/edit/enqueue it.
Running it once is a means to populate the trains-server.
Make sense?
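Once that single run is in the system, the clone/edit/enqueue step can be done from the UI or from code. A rough sketch of the code route (the project/task names, parameter, docker image and queue are all placeholders):

```python
from trains import Task

# locate the run that was executed once to populate the server
original = Task.get_task(project_name='examples', task_name='remote run')

# clone it, tweak whatever needs changing on the copy, then enqueue it
cloned = Task.clone(source_task=original, name='tuned copy')
cloned.set_parameter('lr', 0.001)
cloned.set_base_docker('nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04')
Task.enqueue(cloned, queue_name='default')
```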
Continuing on this line of thought... Is it possible to call task.execute_remotely on a CPU-only machine (a data scientist's laptop, for example) and have the agent that fetches this task run it using a GPU? I'm asking because it is mentioned that it replicates the running environment of the task creator... which is exactly what I'm not trying to do 🙂
WackyRabbit7 I guess we are discussing this one on a diff thread 🙂 but yes, should totally work, that's the idea