I am experiencing performance issues using ClearML together with the PyTorch Lightning CLI for experiment tracking. Essentially, we fetch the logger object through task.get_logger() and then call its reporting methods. However, this adds a large time overhead to our model training.
Attached is a picture comparing time-per-epoch for training with ClearML logging enabled versus disabled.
I'm assuming this is because the logging is synchronous and blocks the training thread. Is there any way to configure the logger to run in a background thread, or to batch a number of messages before sending them, etc.?
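To make the request concrete, here is a minimal stdlib sketch of the behaviour we are asking for: reporting calls only enqueue a message (so the training loop is never blocked on I/O), while a background worker drains the queue and sends messages in batches. The class and parameter names here are hypothetical illustrations, not ClearML API.

```python
import queue
import threading

class BackgroundBatchLogger:
    """Hypothetical sketch: non-blocking reporting with background batching."""

    def __init__(self, send_fn, batch_size=32, flush_secs=1.0):
        self._send = send_fn          # stands in for the actual network call
        self._batch_size = batch_size
        self._flush_secs = flush_secs
        self._queue = queue.Queue()
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def report_scalar(self, title, series, value, iteration):
        # Cheap: just enqueue; no network I/O on the caller's (training) thread.
        self._queue.put((title, series, value, iteration))

    def _run(self):
        batch = []
        # Keep draining until stop is requested AND the queue is empty.
        while not (self._stop.is_set() and self._queue.empty()):
            try:
                batch.append(self._queue.get(timeout=self._flush_secs))
            except queue.Empty:
                pass  # timed out with nothing new -> fall through and flush
            if batch and (len(batch) >= self._batch_size or self._queue.empty()):
                self._send(batch)
                batch = []
        if batch:  # final flush on shutdown
            self._send(batch)

    def close(self):
        self._stop.set()
        self._worker.join()
```

Something along these lines, wired into the ClearML logger (or exposed as a configuration option), would keep per-step reporting cost near zero during training.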