ReassuredTiger98 why don't you take 5 minutes and check out the source code? https://github.com/allegroai/clearml/blob/701fca9f395c05324dc6a5d8c61ba20e363190cf/clearml/backend_interface/task/log.py
this is pretty obvious, it replaces the oldest task with the new one when the buffer is full
I don't know for sure, but this is what I understand from the code. In any case you would need 100 experiments running at the same time, so unless you have access to 100 GPUs you should be fine
MortifiedDove27 Sure did, but I do not understand it very well, otherwise I would not be asking here for an intuitive explanation 🙂 Maybe you can explain it to me?
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()
writer.add_figure('name', figure=fig)
where fig is a matplotlib figure
Thank you, I will try
you can also increase the limit here:
https://github.com/allegroai/clearml/blob/2e95881c76119964944eaa0289549617e8afeee9/docs/clearml.conf#L32
Ohh sorry, task_log_buffer_capacity is actually an internal buffer for the console output: how many log lines it stores before flushing them to the server.
To be honest, I can't think of a reason to expose / modify it...
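If you really want to change it, it lives in clearml.conf under the sdk.log section, something like this (a sketch from memory, the exact default value may differ between versions):

sdk {
  log {
    # number of console-log records buffered locally before they are flushed to the server
    task_log_buffer_capacity: 66
  }
}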
Guys I think I lost context here 🙂 What are we talking about? Can I help in any way?
Right, if this is the case, then just use 'title/name 001'
it should be enough (I think this is how TB separates title/series, aka metric/variant)
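For example, a rough sketch with TensorBoard (illustrative names; this assumes ClearML's TB binding splits the tag on '/' into title/series):

import matplotlib.pyplot as plt
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter()
fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
# the tag "validation/first_batch 001" should show up as title "validation", series "first_batch 001"
writer.add_figure('validation/first_batch 001', figure=fig, global_step=0)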
Hi MortifiedDove27
Looks like there is a limit of 100 images per experiment,
The limit is 100 images per unique title/series combination.
This means that changing the title or the series name gives you another 100 images (note that the 100 limit is over previous iterations)
You mean I can do Epoch001/ and Epoch002/ to split them into groups and make 100 limit per group?
yes, then the 100 limit is per "Epoch001" and there is another 100 limit for "Epoch002", etc. 🙂
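For example, with the Logger it would look something like this (a sketch, names are made up; assumes a Task was already initialized):

import numpy as np
from clearml import Logger

logger = Logger.current_logger()  # assumes Task.init(...) was called earlier
img = (np.random.rand(32, 32, 3) * 255).astype(np.uint8)
# two different titles -> each title/series combination gets its own 100-image history
logger.report_image(title='Epoch001', series='first_batch_train', iteration=0, image=img)
logger.report_image(title='Epoch002', series='first_batch_train', iteration=0, image=img)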
What is the difference to file_history_size?
The number of unique files per title/series combination (i.e. how many images to store in the history when the iteration is constantly increasing)
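If you need more history, it can be raised in clearml.conf, roughly like this (a sketch; the section and default value are from memory and may differ between versions):

sdk {
  metrics {
    # history size for debug files per title/series (how many past images are kept per combination)
    file_history_size: 100
  }
}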
Hi ReassuredTiger98
So let's assume we call: logger.report_image(title='training', series='sample_1', iteration=1, ...)
And we report every iteration (keeping the same title/series names). Then in the UI we can iterate back over the last 100 images (back in time) for this title/series.
We could also report a second image with: logger.report_image(title='training', series='sample_2', iteration=1, ...)
which means that for each one we will have 100 past images to review (i.e. same title/series, only a different iteration, i.e. progress over time).
Make sense?
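Here is a minimal runnable sketch of that pattern (project/task names are just placeholders):

import numpy as np
from clearml import Task, Logger

task = Task.init(project_name='examples', task_name='debug images demo')
logger = Logger.current_logger()

for iteration in range(300):
    img = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)
    # same title/series every iteration -> the UI keeps roughly the last 100 iterations of it
    logger.report_image(title='training', series='sample_1', iteration=iteration, image=img)
    # a second series gets its own 100-image history
    logger.report_image(title='training', series='sample_2', iteration=iteration, image=img)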
Thanks for answering. I don't quite get your explanation. You mean if I have 100 experiments and I start up another one (experiment "101"), then experiment "0" logs will get replaced?
AgitatedDove14 I think Tim wanted to know what task_log_buffer_capacity is and what functionality it provides
Thanks, that makes sense. Can you also explain what task_log_buffer_capacity does?
Hi Martin, thank you for your reply.
Could you please show an example about image title/series?
Mine have names like Epoch_001_first_batch_train, Epoch_001_first_batch_val,
Epoch_001_first_batch_val_balanced,
Epoch_002_first_batch_train, and so on
I have the problem that "debug samples" are not shown anymore after running many iterations.
ReassuredTiger98 could you expand on it? What do you mean by "not shown anymore"?
Can you see other reports?
But would this not have to be a server parameter instead of a clearml.conf parameter then? Maybe someone from clearml can confirm MortifiedDove27's explanation?
Thanks for answering, but I still do not get it. file_history_size decides how many past files are shown? So if file_history_size=100 and I have 1 image/iteration and ran 1000 iterations, I will see images for iterations 900-1000?
I did my best to explain it.
You have a buffer of tasks, for example 100. When you add task #101, the task at slot #1 is replaced with the new one, and you now keep tasks #2 to #101.
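Just to illustrate the behaviour with plain Python (this is not ClearML's actual implementation, only a toy model of a bounded buffer):

from collections import deque

buffer = deque(maxlen=100)      # bounded buffer: adding past capacity drops the oldest entry
for task_id in range(1, 102):   # add tasks #1 .. #101
    buffer.append(task_id)

print(buffer[0], buffer[-1])    # -> 2 101, i.e. task #1 was dropped, #2..#101 remain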
Because I already have more than 100 saved experiments (and it works fine), I don't think anyone should bother to change it, unless you are running more than 100 experiments at the same time
How do you currently report images, with the Logger, TensorBoard, or Matplotlib?
AgitatedDove14 I have the problem that "debug samples" are not shown anymore after running many iterations. What's appropriate to use here? A colleague told me increasing task_log_buffer_capacity worked. Is this the right way? What is the difference to file_history_size?