Hi all,
I'm trying to save my model checkpoints during runtime but am running into a confusing snag.
I'm using the Hugging Face architecture for a transformer, with their training module controlling training. In the training args, I have the
console output:
clearml.Task - INFO - Completed model upload to file_server/training.e5f99149b9b/models/optimizer.pt
clearml.Task - INFO - Completed model upload to file_server/training.e5f99149b9b/models/scheduler.pt
clearml.Task - INFO - Completed model upload to file_server/training.e5f99149b9b/models/rng_state.pth
save_model
somemodel/checkpoint-198
clearml.Task - INFO - Completed model upload to file_server/training.e5f99149b9b/models/training_args.bin
2 months ago