Hi there, I'm training a PyTorch model and saving it every epoch. It seems like the model weights are overwritten and I can't choose the best model after the experiment ends. Is this feature missing, or am I not using the library correctly?

Answered

Hi there,
I'm training a PyTorch model and saving it every epoch. It seems like the model weights are overwritten and I can't choose the best model after the experiment ends.
Is this feature missing, or am I not using the library correctly?

  				
Posted 4 years ago by PompousBeetle71


Answers 9

AgitatedDove14 Yes.

  				
Posted 4 years ago by PompousBeetle71

SuccessfulKoala55 please post here once the code is available in your pytorch_ignite 🙂

  				
Posted 4 years ago by AgitatedDove14

PompousBeetle71 just making sure: did changing the name solve it?

  				
Posted 4 years ago by AgitatedDove14

PompousBeetle71, check the n_saved parameter when creating the ModelCheckpoint.

  				
Posted 4 years ago by SteadyFox10

SteadyFox10 AgitatedDove14 Thanks, I really did change the name.

  				
Posted 4 years ago by PompousBeetle71

Well, I use ignite and trains-server with logging similar to ignite.contrib.handlers, so I would be very happy to test this integration.

  				
Posted 4 years ago by SteadyFox10

Oh sorry, I was thinking about ignite (I don't know why), not trains. The only way I know is to use a different name each time you save. I personally use f"{file_name}_{epoch}_{iteration}".
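
A minimal sketch of this naming scheme, using only the standard library. The helper name `checkpoint_path`, the `checkpoints` directory, the `.pt` suffix, and the iteration counts are assumptions made up for illustration, not part of any library. In a real training loop the resulting path would be handed to whatever save call is in use, for example `torch.save(model.state_dict(), path)`.

```python
from pathlib import Path


def checkpoint_path(out_dir, file_name, epoch, iteration):
    """Return a checkpoint path that is unique per (epoch, iteration)."""
    # Hypothetical helper: embedding epoch and iteration in the file name
    # means a later save never lands on the same file as an earlier one.
    return Path(out_dir) / f"{file_name}_{epoch}_{iteration}.pt"


# Simulate three epochs of 100 iterations each (made-up numbers).
paths = [
    checkpoint_path("checkpoints", "model", epoch, (epoch + 1) * 100)
    for epoch in range(3)
]

for path in paths:
    print(path.name)
```

Because every epoch gets its own file, each checkpoint stays on disk and can be compared after the run, instead of only the last one surviving.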

  				
Posted 4 years ago by SteadyFox10

SteadyFox10 I don't think ModelCheckpoint is for PyTorch; I couldn't find anything like it.

  				
Posted 4 years ago by PompousBeetle71

Hi PompousBeetle71, I'm with SteadyFox10 on this one. Unless you choose a file name based on epoch or step, you are literally overwriting the model file, and Trains will reflect that. If you include the epoch in the filename, you will end up with all your models logged by Trains. BTW, we are actively working on integration with PyTorch Ignite, so if you have any suggestions, now is the time :)

  				
Posted 4 years ago by AgitatedDove14
