Is it possible to run Trains offline, where there is no HTTP connection between the node running the job and the machine where the web UI runs?

Answered

Hello, is it possible to run Trains offline where there's no HTTP connection between the node running the job and where the web UI runs? I see in your diagram the connection between Training Machine and Trains Server (which contains the web UI) is over HTTP. Is it possible to use a shared disk instead?

  				
Posted 4 years ago by ImpressionableRaven99


Answers 22

Yes, it will always create a new Task.

  				
Posted 4 years ago by AgitatedDove14

Yes.
Obviously, when you import the offline session, you will need to point it at your server with the correct credentials.
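
A minimal sketch of pointing the importing process at a server from code instead of a trains.conf file. It assumes the installed version exposes `Task.set_credentials` with the keyword arguments shown; the helper name `import_session` is hypothetical, and the host URLs and keys you pass in are your own.

```python
def import_session(archive_path, api_host, web_host, files_host, key, secret):
    """Import an offline session into the server described by the arguments."""
    # Imported lazily so this module can be loaded where trains is not installed.
    from trains import Task

    # Assumption: credentials must be set before any call that contacts the server.
    Task.set_credentials(
        api_host=api_host,
        web_host=web_host,
        files_host=files_host,
        key=key,
        secret=secret,
    )
    # Per this thread, every call creates a new Task on the server.
    return Task.import_offline_session(archive_path)
```

Run this on a machine that can reach the server, for example from a Python console on the node hosting it.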

  				
Posted 4 years ago by AgitatedDove14

The import process actually creates a new Task on every import. That said, if you take a look here:
https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/trains/task.py#L1733
you can pass a pre-existing Task ID to "import_task": https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/trains/task.py#L1653

  				
Posted 4 years ago by AgitatedDove14

Yes, that will work 🙂

  				
Posted 4 years ago by AgitatedDove14

Just to clarify, where do I run the second command?

Anywhere. Just open a Python console and import the offline task:
from trains import Task
Task.import_offline_session('./my_task_aaa.zip')

Related, how do I specify in my code the cache_dir where the zip is saved?

This is the Trains cache folder. You can set it in the trains.conf file:
https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/docs/trains.conf#L24
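
If the zips end up on a shared disk, the import step could be scripted along these lines. This is only a sketch: `import_new_sessions`, the JSON ledger file, and the directory layout are hypothetical and not part of Trains. The ledger only prevents re-importing an archive whose content has not changed; an archive that has changed is imported again and, as noted in this thread, becomes a new Task.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def import_new_sessions(shared_dir, ledger_path, import_fn=None):
    """Import every *.zip in shared_dir whose content was not imported before."""
    if import_fn is None:
        # Imported lazily so the helper can be exercised without trains installed.
        from trains import Task
        import_fn = Task.import_offline_session

    ledger_file = Path(ledger_path)
    seen = set(json.loads(ledger_file.read_text())) if ledger_file.exists() else set()

    imported = []
    for archive in sorted(Path(shared_dir).glob("*.zip")):
        digest = file_digest(archive)
        if digest in seen:
            continue  # identical archive was already imported
        import_fn(str(archive))
        seen.add(digest)
        imported.append(archive.name)

    # Persist the digests so the next run can skip unchanged archives.
    ledger_file.write_text(json.dumps(sorted(seen)))
    return imported
```

The `import_fn` parameter exists only so the helper can be tried with a stand-in function; left at its default it calls `Task.import_offline_session`.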

  				
Posted 4 years ago by AgitatedDove14

Because all my config will be in one place (my training script). I'd like the zipped runs to be at a specific place on the shared disk. Otherwise I'll have to manually copy/paste them so they are visible from the other node (the one running the server)

  				
Posted 4 years ago by ImpressionableRaven99

Hi ImpressionableRaven99
Yes, it is 🙂
Call this before Task.init and it will run offline (at the end of the execution, you will get a link to the local zip file of the execution):
Task.set_offline(True)
Then later you can import it into the system with:
Task.import_offline_session('./my_task_aaa.zip')
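
Put together, the training side might look like the sketch below. The function name `train_offline`, the project and task names, and the dummy loss values are placeholders for illustration; only `Task.set_offline`, `Task.init`, the logger call, and `get_offline_mode_folder` come from Trains.

```python
def train_offline(project_name, task_name, iterations=3):
    """Run a toy training loop in offline mode and return the offline folder."""
    # Imported lazily so this module can be loaded where trains is not installed.
    from trains import Task

    # Must be called before Task.init, as described above.
    Task.set_offline(True)
    task = Task.init(project_name=project_name, task_name=task_name)

    logger = task.get_logger()
    for iteration in range(iterations):
        # Placeholder metric; a real script would report its own values.
        loss = 1.0 / (iteration + 1)
        logger.report_scalar("loss", "train", value=loss, iteration=iteration)

    # Folder holding the same content that ends up in the zip.
    return task.get_offline_mode_folder()
```

When the script finishes, the path of the local zip is printed and can be passed to `Task.import_offline_session` on a machine that can reach the server.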

  				
Posted 4 years ago by AgitatedDove14

Just to clarify, where do I run the second command?

  				
Posted 4 years ago by ImpressionableRaven99

Copy the trains.conf from any machine; it just needs the definition of the trains-server address.
Specifically, if you run in offline mode, there is no need for a machine-specific trains.conf, and you can just copy the one on GitHub.

  				
Posted 4 years ago by AgitatedDove14

Thanks. On the second one, can I specify it in my code? (I'd prefer to avoid a separate trains.conf file if possible)

  				
Posted 4 years ago by ImpressionableRaven99

Thanks, Martin.

  				
Posted 4 years ago by ImpressionableRaven99

Seems to work. However, if I import the same run at two different stages (because I want to see how it's doing while training, not only at the end), it doesn't recognise that it's the same task/run, just with more iterations. It just creates a new task...

  				
Posted 4 years ago by ImpressionableRaven99

Not really 😞
Why would you want to set it up manually? It makes sense to have it in the cache folder, no?

  				
Posted 4 years ago by AgitatedDove14

E.g. this one https://github.com/allegroai/trains/blob/master/docs/trains.conf

  				
Posted 4 years ago by ImpressionableRaven99

I see that to create a trains.conf file I need to run trains-init which requires a browser. What if I can't open a browser on that machine?

  				
Posted 4 years ago by ImpressionableRaven99

So copy trains.conf from elsewhere to ~/trains.conf?

  				
Posted 4 years ago by ImpressionableRaven99

BTW: you can quite easily add an option to set the offline folder, check here:
https://github.com/allegroai/trains/blob/10ec4d56fb4a1f933128b35d68c727189310aae8/trains/config/init.py#L31
PRs are always appreciated :)

  				
Posted 4 years ago by AgitatedDove14

I see.
You can get the offline folder programmatically, then copy the folder content (it's the same as the zip, and you can also pass a folder instead of a zip to the import function):
task.get_offline_mode_folder()
You can also create a soft link to the offline folder (if you are working on a Linux machine):
ln -s myoffline_folder ~/.trains/cache/offline
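
As a rough sketch of the copy approach, the helper below mirrors an offline folder onto a shared disk using only the standard library. The name `publish_offline_folder` and the shared path in the usage note are hypothetical, and `dirs_exist_ok` requires Python 3.8 or later.

```python
import shutil
from pathlib import Path


def publish_offline_folder(offline_folder, shared_root):
    """Copy an offline session folder under shared_root and return the new path."""
    source = Path(offline_folder)
    target = Path(shared_root) / source.name
    # dirs_exist_ok lets a later copy of the same run overwrite an earlier one.
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

From the training script this could be called as `publish_offline_folder(task.get_offline_mode_folder(), "/mnt/shared/trains_offline")`, and the resulting folder can then be passed to the import function on the other node.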

  				
Posted 4 years ago by AgitatedDove14

Is there a way to automatically deduplicate? I'm guessing tasks have some unique id in the zip (maybe even the file name)

  				
Posted 4 years ago by ImpressionableRaven99

Thanks. Let me try it...

  				
Posted 4 years ago by ImpressionableRaven99

Related, how do I specify in my code the cache_dir where the zip is saved?

  				
Posted 4 years ago by ImpressionableRaven99

But this will require some code changes...

  				
Posted 4 years ago by AgitatedDove14
