Hi MortifiedDove27
Looks like there is a limit of 100 images per experiment,
The limit is 100 images per unique title/series combination.
This means that changing the title or the series name will add 100 more images (note that the 100 limit applies to previous iterations)
I have the problem that "debug samples" are not shown anymore after running many iterations.
ReassuredTiger98 could you expand on it? What do you mean by "not shown anymore" ?
Can you see other reports ?
Ohh sorry. task_log_buffer_capacity is actually an internal buffer for the console output; it sets how many lines are stored before they are flushed to the server.
To be honest, I can't think of a reason to expose / modify it...
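As a rough picture of what "store lines before flushing" means, here is a toy line buffer in plain Python. It is not ClearML's implementation: LineBuffer, its capacity argument and the send callback are hypothetical names, used only to illustrate buffering console lines and sending them in batches.

```python
class LineBuffer:
    """Toy console-line buffer that flushes once `capacity` lines are stored."""

    def __init__(self, capacity, send):
        self.capacity = capacity
        self.send = send  # hypothetical callable that receives a list of lines
        self._lines = []

    def write(self, line):
        self._lines.append(line)
        if len(self._lines) >= self.capacity:
            self.flush()

    def flush(self):
        # Hand the stored lines over in one batch, then start a fresh buffer.
        if self._lines:
            self.send(self._lines)
            self._lines = []


batches = []
buffer = LineBuffer(capacity=3, send=batches.append)
for i in range(7):
    buffer.write(f"line {i}")
buffer.flush()  # send whatever is left at the end

print([len(batch) for batch in batches])
```

In this toy model the capacity only changes how many lines go out per batch; no line is dropped.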
You mean I can do Epoch001/ and Epoch002/ to split them into groups and make 100 limit per group?
yes then the 100 limit is per "Epoch001" and another 100 limit for "Epoch002" etc. 🙂
Thank you, I will try
MortifiedDove27 Sure did, but I do not understand it very well. Else I would not be asking here for an intuitive explanation 🙂 Maybe you can explain it to me?
Thanks, that makes sense. Can you also explain what task_log_buffer_capacity does?
Guys, I think I lost context here 🙂 What are we talking about? Can I help in any way?
ReassuredTiger98 why don't you take 5 minutes and check out the source code? https://github.com/allegroai/clearml/blob/701fca9f395c05324dc6a5d8c61ba20e363190cf/clearml/backend_interface/task/log.py
This is pretty obvious: it replaces the oldest task with the new task when the buffer is full
Hi ReassuredTiger98
So let's assume we call: logger.report_image(title='training', series='sample_1', iteration=1, ...)
And we report every iteration (keeping the same title/series names). Then in the UI we can step back through the last 100 images (back in time) for this title/series.
We could also report a second image with: logger.report_image(title='training', series='sample_2', iteration=1, ...)
which means that for each one we will have 100 past images to review (i.e. same title/series, only a different iteration, i.e. progress in time).
Make sense ?
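To make the per title/series history concrete, here is a toy model in plain Python. It does not call ClearML: the class DebugSampleHistory and its methods are hypothetical, and the history size of 100 is taken from the discussion above. It only illustrates the behaviour described, where each title/series pair keeps its own window of the most recent iterations.

```python
from collections import defaultdict, deque


class DebugSampleHistory:
    """Toy model: keep the last `history_size` iterations per (title, series)."""

    def __init__(self, history_size=100):
        self.history_size = history_size
        self._history = defaultdict(lambda: deque(maxlen=history_size))

    def report_image(self, title, series, iteration):
        # Once the window is full, the oldest iteration is dropped.
        self._history[(title, series)].append(iteration)

    def iterations(self, title, series):
        return list(self._history[(title, series)])


history = DebugSampleHistory(history_size=100)
for it in range(1, 1001):
    history.report_image("training", "sample_1", it)
    history.report_image("training", "sample_2", it)

kept = history.iterations("training", "sample_1")
print(kept[0], kept[-1], len(kept))
```

In this model, both 'sample_1' and 'sample_2' end up with their own 100 most recent iterations, independent of each other.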
What is the difference to file_history_size?
Number of unique files per title/series combination (aka how many images to store in the history, when the iteration is constantly increasing)
Thanks for answering. I don't quite get your explanation. You mean if I have 100 experiments and I start up another one (experiment "101"), then experiment "0" logs will get replaced?
AgitatedDove14 I have the problem that "debug samples" are not shown anymore after running many iterations. What's appropriate to use here? A colleague told me increasing task_log_buffer_capacity worked. Is this the right way? What is the difference to file_history_size?
Hi Martin, thank you for your reply.
Could you please show an example about image title/series?
My images have names like Epoch_001_first_batch_train, Epoch_001_first_batch_val,
Epoch_001_first_batch_val_balanced,
Epoch_002_first_batch_train, and so on
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()
writer.add_figure('name', figure=fig)
where fig is a matplotlib figure
I don't know for sure, but this is what I understand from the code. But you need to have 100 experiments running at the same time, so unless you have access to 100 GPUs you should be fine
AgitatedDove14 I think Tim wanted to know what task_log_buffer_capacity is and what functionality it provides
I did my best to explain.
You have a buffer of tasks, for example 100. When you add task #101, the task under #1 is replaced with the new one and you now keep tasks #2 to #101.
Because I have > 100 saved experiments, I don't think that anyone should bother to change it, unless you are running more than 100 experiments at the same time
you can also increase the limit here:
https://github.com/allegroai/clearml/blob/2e95881c76119964944eaa0289549617e8afeee9/docs/clearml.conf#L32
Right, if this is the case, then just use 'title/name 001'
it should be enough (I think this is how TB separates title/series or metric/variant )
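If the existing names look like Epoch_001_first_batch_train, a small helper can turn them into title/series style tags before they are passed to writer.add_figure. This is only a sketch: split_tag is a hypothetical helper, and it assumes, as suggested above, that the part before the first "/" is treated as the title and the rest as the series.

```python
import re

# Assumed naming scheme from this thread: "Epoch_<number>_<description>".
_EPOCH_TAG = re.compile(r"^(Epoch_\d+)_(.+)$")


def split_tag(name):
    """Turn 'Epoch_001_first_batch_train' into 'Epoch_001/first_batch_train'."""
    match = _EPOCH_TAG.match(name)
    if match is None:
        return name  # leave names that do not follow the scheme untouched
    title, series = match.groups()
    return f"{title}/{series}"


names = [
    "Epoch_001_first_batch_train",
    "Epoch_001_first_batch_val_balanced",
    "Epoch_002_first_batch_train",
]
for name in names:
    print(split_tag(name))
```

The result could then be used as the tag, e.g. writer.add_figure(split_tag(name), figure=fig).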
Thanks for answering, but I still do not get it. file_history_size decides how many past files are shown? So if file_history_size=100 and I have 1 image/iteration and ran 1000 iterations, I will see images for iteration 900-1000?
How do you currently report images, with the Logger or Tensorboard or Matplotlib ?
But would this not have to be a server parameter instead of a clearml.conf parameter then? Maybe someone from clearml can confirm MortifiedDove27's explanation?