Answered
Dear Clearml Community, I Am Looking For A Way To Properly Resume A Training In A Way That Initial Scalars Get Reused And Expanded. Clearml Feature For Reusing The Same Task Works Fine (When Using

Dear ClearML Community,
I am looking for a way to properly resume a training run so that the initial scalars are reused and extended. The ClearML feature for reusing the same Task works fine (when using continue_last_task = True and reuse_last_task_id = <my-clearml-task-id>), and my training orchestrator automatically retrieves my latest checkpoint, so that part is all right!
However, I systematically notice a jump of some number of "ghost iterations" when resuming my trainings...
As you can see in the picture below, I stopped my training at 4610 iterations using the ClearML "Abort" button. You can't see it from the scalar, but my checkpoint was saved at iteration 3354.
What I observe is that, strangely, the number of iterations it took to reach my checkpoint (i.e., 3354) matches the number of "ghost iterations" added before the scalar plot continues when resuming the training.
Has any of you ever encountered such a skip in the number of iterations when resuming a training run that reuses the same preceding Task? 🤔
Thank you so much in advance for your support! 🙏
image

  
  
Posted 9 months ago

Answers 10


Do you think such a feature exists in ClearML?

Currently this is "fixed" to iterations (which is actually just a monotonic integer value) or the timestamp.
But I cannot see any reason why we could not allow users to control the x-axis title and set it in code. I'm assuming this is what you have in mind?

  
  
Posted 9 months ago

Yeah I think this kind of makes sense to me, any chance you can open a GH issue on this feature request?

  
  
Posted 9 months ago

Hi @AgitatedDove14, this is it, yes!
A solution for freely choosing the x-axis title in the UI depending on the scalar (e.g., as in the screenshot above, but with "Epochs" instead of "Iterations" for the plot on the left 😉 ).
Do you think such a feature exists in ClearML?

  
  
Posted 9 months ago

Hi @AgitatedDove14, that's a very good point!
I found the issue "Possibility to choose any scalar for horizontal x-axis #1186", opened one month ago, which is pretty close to what I suggest. I will complement it with my graph screenshots to illustrate the issue!
Thank you for your recommendation 🙇

  
  
Posted 9 months ago

Oh I see, basically a UI feature.
I'm assuming this is not just changing the x-axis in the UI, but somehow storing the x-axis as part of the reported scalars?

  
  
Posted 9 months ago

Yes, I mean in the UI, just for the title of the x-axis.
For instance, in the graphs below, I am reporting the "mIoU" metric by epochs. It's OK to leave the x-axis title as "Iterations" for "time", for instance, but for "mIoU", I was wondering if it would be possible to change "Iterations" to "Epochs" for clarity 🙄.
Thank you again for your responsiveness and support! 🙏
image
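For the per-epoch mIoU reporting described above, here is a minimal sketch of passing the epoch index as the `iteration` argument of `report_scalar`. `RecordingLogger` and `report_miou_per_epoch` are hypothetical names used only for a dry run without a ClearML server, and the mIoU values are made up for illustration. With ClearML itself, the logger would presumably come from `Logger.current_logger()` or `task.get_logger()`.

```python
from typing import List, Sequence, Tuple


class RecordingLogger:
    """Hypothetical stand-in that only records calls; not part of ClearML."""

    def __init__(self) -> None:
        self.points: List[Tuple[str, str, float, int]] = []

    def report_scalar(self, title: str, series: str, value: float, iteration: int) -> None:
        # Same keyword names as ClearML's Logger.report_scalar.
        self.points.append((title, series, value, iteration))


def report_miou_per_epoch(logger, miou_by_epoch: Sequence[float]) -> None:
    """Report one mIoU point per epoch, using the epoch index as the x value."""
    for epoch, miou in enumerate(miou_by_epoch):
        logger.report_scalar(title="mIoU", series="validation", value=miou, iteration=epoch)


# Dry run with illustrative values (not real results).
logger = RecordingLogger()
report_miou_per_epoch(logger, [0.41, 0.52, 0.58])
for point in logger.points:
    print(point)
```

This only changes which number is used as the x value. As discussed in this thread, the x-axis title shown in the UI would still read "Iterations".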

  
  
Posted 9 months ago

Hi @CrookedSeal85

However, I systematically notice a jump of some number of "ghost iterations" when resuming my trainings...

Try the following:

task = Task.init(..., continue_last_task=0)

From the Task.init docstring (note that this value can be either a boolean or an integer):

        :param bool continue_last_task: Continue the execution of a 
...
          - An integer - Specify initial iteration offset (override the auto automatic last_iteration_offset). Pass 0, to disable the automatic last_iteration_offset or specify a different initial offset. You can specify a Task ID to be used with `reuse_last_task_id='task_id_here'`
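To illustrate why passing 0 helps, here is a small arithmetic sketch using the numbers from the question. It assumes the initial offset is simply added to whatever iteration the training code reports, and that the automatic offset equals the last reported iteration (4610); the real automatic offset may differ by one. `displayed_iteration` is a hypothetical helper, not a ClearML function.

```python
def displayed_iteration(reported_iteration: int, initial_offset: int) -> int:
    """x-axis position of a scalar, assuming the offset is simply added."""
    return initial_offset + reported_iteration


LAST_REPORTED = 4610  # last iteration before "Abort" (from the question)
CHECKPOINT = 3354     # iteration stored in the checkpoint (from the question)

# Automatic offset (assumed to be the last reported iteration): the training
# code resumes counting at the checkpoint iteration, so both values add up.
with_auto_offset = displayed_iteration(CHECKPOINT, LAST_REPORTED)
ghost_gap = with_auto_offset - LAST_REPORTED

# continue_last_task=0 disables the automatic offset, so the checkpoint
# iteration is plotted as-is.
with_zero_offset = displayed_iteration(CHECKPOINT, 0)

print(with_auto_offset, ghost_gap, with_zero_offset)  # 7964 3354 3354
```

Under these assumptions the gap equals the checkpoint iteration, which matches the number of "ghost iterations" described in the question.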
  
  
Posted 9 months ago

Hi @AgitatedDove14,

That was exactly it! Thank you for the hint! ✅ My scalars now continue as they should when resuming a training from the latest checkpoint! 🤩

Just one more question: do you have any idea how I could change the x-axis label from "Iterations" to "Epochs" for some specific scalars only? I saw in "ClearML Doc > ClearML Fundamentals > Logger > Types of Logged Results" that this should actually be possible:

Scalars - Time series data. X-axis is always a sequential number, usually iterations but can be epochs or others.

I have checked the ClearML code, including the Reporter and Logger classes, but I can't find it there.

Thank you very much in advance for your help again! 🙏

  
  
Posted 9 months ago

Yes, that is it indeed 😉, to be able to freely choose the x-axis title depending on whether we intend to log data by iterations or by epochs 😃.

  
  
Posted 9 months ago

Just one more question, do you have any idea about how I could change the x-axis label from "Iterations" to "Epochs"

You mean in the UI (i.e., just the title)? Or are you actually reporting iterations instead of epochs? And if so, is this auto-connected to TensorBoard or is it reported manually?

  
  
Posted 9 months ago