ReassuredTiger98 why don't you take 5 minutes and check out the source code? https://github.com/allegroai/clearml/blob/701fca9f395c05324dc6a5d8c61ba20e363190cf/clearml/backend_interface/task/log.py
this is pretty obvious, it replaces the oldest task with the new one when the buffer is full
I don't know for sure, but this is what I understand from the code. In any case you would need 100 experiments running at the same time, so unless you have access to 100 GPUs you should be fine
MortifiedDove27 Sure did, but I do not understand it very well, otherwise I would not be asking here for an intuitive explanation 🙂 Maybe you can explain it to me?
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()
writer.add_figure('name', figure=fig)
where fig is a matplotlib figure
Thank you, I will try
you can also increase the limit here:
https://github.com/allegroai/clearml/blob/2e95881c76119964944eaa0289549617e8afeee9/docs/clearml.conf#L32
Ohh sorry, task_log_buffer_capacity is actually an internal buffer for the console output: how many log lines it stores before flushing them to the server.
To be honest, I can't think of a reason to expose / modify it...
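If you really want to change it, it lives in clearml.conf under the sdk.log section, something like this (a sketch from memory, the exact default value may differ between versions):

sdk {
  log {
    # number of console-log records buffered locally before they are flushed to the server
    task_log_buffer_capacity: 66
  }
}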
Guys I think I lost context here 🙂 What are we talking about? Can I help in any way?
Right, if this is the case, then just use 'title/name 001'
it should be enough (I think this is how TB separates title/series, aka metric/variant)
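For example, a rough sketch with TensorBoard (illustrative names; this assumes ClearML's TB binding splits the tag on '/' into title/series):

import matplotlib.pyplot as plt
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()
fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
# the tag "validation/first_batch 001" should show up as title "validation", series "first_batch 001"
writer.add_figure('validation/first_batch 001', figure=fig, global_step=0)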
Hi MortifiedDove27
Looks like there is a limit of 100 images per experiment,
The limit is 100 images per unique title/series combination.
This means that changing the title or the series name gives you another 100 images (note that the 100 limit is over previous iterations)
You mean I can do Epoch001/ and Epoch002/ to split them into groups and make 100 limit per group?
yes, then the 100 limit is per "Epoch001" and there is another 100 limit for "Epoch002", etc. 🙂
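For example, with the Logger it would look something like this (a sketch, names are made up; assumes a Task was already initialized):

import numpy as np
from clearml import Logger

logger = Logger.current_logger()  # assumes Task.init(...) was called earlier
img = (np.random.rand(32, 32, 3) * 255).astype(np.uint8)
# two different titles -> each title/series combination gets its own 100-image history
logger.report_image(title='Epoch001', series='first_batch_train', iteration=0, image=img)
logger.report_image(title='Epoch002', series='first_batch_train', iteration=0, image=img)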
What is the difference to file_history_size?
The number of unique files per title/series combination (i.e. how many images to store in the history when the iteration is constantly increasing)
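If you need more history, it can be raised in clearml.conf, roughly like this (a sketch; the section and default value are from memory and may differ between versions):

sdk {
  metrics {
    # history size for debug files per title/series (how many past images are kept per combination)
    file_history_size: 100
  }
}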
Hi ReassuredTiger98
So let's assume we call: logger.report_image(title='training', series='sample_1', iteration=1, ...)
And we report every iteration (keeping the same title/series names). Then in the UI we can iterate back over the last 100 images (back in time) for this title/series.
We could also report a second image with: logger.report_image(title='training', series='sample_2', iteration=1, ...)
which means that for each one we will have 100 past images to review (i.e. same title/series, only a different iteration, i.e. progress over time).
Make sense?
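Here is a minimal runnable sketch of that pattern (project/task names are just placeholders):

import numpy as np
from clearml import Task, Logger

task = Task.init(project_name='examples', task_name='debug images demo')
logger = Logger.current_logger()

for iteration in range(300):
    img = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
    # same title/series every iteration -> the UI keeps roughly the last 100 iterations of it
    logger.report_image(title='training', series='sample_1', iteration=iteration, image=img)
    # a second series gets its own 100-image history
    logger.report_image(title='training', series='sample_2', iteration=iteration, image=img)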
Thanks for answering. I don't quite get your explanation. You mean if I have 100 experiments and I start up another one (experiment "101"), then experiment "0" logs will get replaced?
AgitatedDove14 I think Tim wanted to know what task_log_buffer_capacity is and what functionality it provides
Thanks, that makes sense. Can you also explain what task_log_buffer_capacity does?
Hi Martin, thank you for your reply.
Could you please show an example about image title/series?
Mine have names like Epoch_001_first_batch_train, Epoch_001_first_batch_val,
Epoch_001_first_batch_val_balanced,
Epoch_002_first_batch_train, and so on
I have the problem that "debug samples" are not shown anymore after running many iterations.
ReassuredTiger98 could you expand on it? What do you mean by "not shown anymore"?
Can you see other reports?
But would this not have to be a server parameter instead of a clearml.conf parameter then? Maybe someone from clearml can confirm MortifiedDove27's explanation?
Thanks for answering, but I still do not get it. file_history_size decides how many past files are shown? So if file_history_size=100 and I have 1 image/iteration and ran 1000 iterations, I will see images for iterations 900-1000?
I did my best to explain it.
You have a buffer of tasks, for example 100. When you add task #101, the task at slot #1 is replaced with the new one, and you now keep tasks #2 to #101.
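Just to illustrate the behaviour with plain Python (this is not ClearML's actual implementation, only a toy model of a bounded buffer):

from collections import deque

buffer = deque(maxlen=100)      # bounded buffer: adding past capacity drops the oldest entry
for task_id in range(1, 102):   # add tasks #1 .. #101
    buffer.append(task_id)

print(buffer[0], buffer[-1])    # -> 2 101, i.e. task #1 was dropped, #2..#101 remain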
Because I already have more than 100 saved experiments (and it works fine), I don't think anyone should bother to change it, unless you are running more than 100 experiments at the same time
How do you currently report images, with the Logger, TensorBoard, or Matplotlib?
AgitatedDove14 I have the problem that "debug samples" are not shown anymore after running many iterations. What's appropriate to use here? A colleague told me increasing task_log_buffer_capacity worked. Is this the right way? What is the difference to file_history_size?