So, here's a question. Does ClearML automatically save everything necessary to continue training a PyTorch language model? Specifically, I've been looking at the checkpoint folders created when I'm training a Hugging Face RobertaForMaskedLM. I checked what
Could I use "register artifact"?
I think this is somewhat deprecated and we should probably replace it with something similar to what you mentioned (i.e. watch a file change).
Right now the easiest way would be to manually upload the trainer_state.json at every checkpoint: `Task.current_task().upload_artifact(name='state', artifact_object='trainer_state.json')`
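To do that at every checkpoint without editing the training loop, one option is a Hugging Face `TrainerCallback` that fires on save. This is only a rough sketch: the helper names `upload_trainer_state` and `make_state_upload_callback` are made up for illustration, and it assumes the Trainer writes each checkpoint to `<output_dir>/checkpoint-<global_step>` (the default layout, which may differ for hyperparameter-search runs).

```python
import os


def upload_trainer_state(task, checkpoint_dir, artifact_name="state"):
    """Upload trainer_state.json from checkpoint_dir as a ClearML artifact.

    `task` is expected to be a clearml.Task (anything exposing
    upload_artifact(name=..., artifact_object=...) works).
    Returns the uploaded path, or None if the file is not there.
    """
    state_path = os.path.join(checkpoint_dir, "trainer_state.json")
    if not os.path.isfile(state_path):
        return None
    # Passing a local file path as artifact_object uploads that file.
    task.upload_artifact(name=artifact_name, artifact_object=state_path)
    return state_path


def make_state_upload_callback(task):
    """Build a Trainer callback that uploads the state after each save.

    Hypothetical helper; transformers is imported lazily so the function
    above stays usable without it.
    """
    from transformers import TrainerCallback

    class UploadTrainerStateCallback(TrainerCallback):
        def on_save(self, args, state, control, **kwargs):
            # Assumption: default checkpoint folder naming.
            checkpoint_dir = os.path.join(
                args.output_dir, f"checkpoint-{state.global_step}"
            )
            upload_trainer_state(task, checkpoint_dir)

    return UploadTrainerStateCallback()
```

It would be wired in with something like `Trainer(..., callbacks=[make_state_upload_callback(Task.current_task())])`.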