AgitatedDove14 I'm making some progress on this. My training run saved all of these files, and Task.get_task(param['TaskA']).models['output'][-1] gets me just one of them, training_args.bin. Then index -2 gets me another, rng_state.pth.
If I just get Task.get_task(param['TaskA']).models['output'], I end up with a huge list like:
[<clearml.model.Model object at 0x7fec2841c880>, <clearml.model.Model object at 0x7fec2841c8e0>, <clearml.model.Model object at 0x7fec2841c820>...
So I think I have a solution here, which is to just loop backwards through the list until I find the right file I want to load.
But I just noticed that for some reason pytorch_model.bin isn't there. I'm not sure why that wasn't saved. huh
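Roughly what I have in mind for the backwards loop, as a sketch rather than anything tested against my real task. find_latest_model is just a helper name I made up, and I'm assuming each Model's url ends with the saved file name, which I haven't confirmed for every storage backend. It returns None when nothing matches, which is what I'd expect for pytorch_model.bin right now.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def find_latest_model(models, filename):
    """Return the last model in `models` whose URL ends in `filename`, or None.

    Hypothetical helper: it only relies on each item having a `url` attribute.
    """
    # Walk from the end of the list so the most recently registered match wins.
    for model in reversed(models):
        url = getattr(model, "url", None) or ""
        # Compare only the final path component of the URL to the file name.
        if PurePosixPath(urlparse(url).path).name == filename:
            return model
    return None


# Intended usage with ClearML (not run here; `param` comes from my own script):
# from clearml import Task
# models = Task.get_task(param['TaskA']).models['output']
# model = find_latest_model(models, 'rng_state.pth')
# if model is not None:
#     local_path = model.get_local_copy()
```

Matching on the URL rather than the model name is a guess on my part, since I'm not sure how the names get generated for auto-logged files.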