I get the URL to the checkpoint/weights. Can I use this to download the weights?
I mean what is the actual link?
file:// is a path to a file.
If your machine cannot access that path you get an error.
For example, file:///home/user/file.bin translates to /home/user/file.bin
If you do not have the file /home/user/file.bin on your machine you get an error.
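A quick way to check this from a script is sketched below. It only uses the Python standard library; the helper names file_url_to_path and weights_exist_locally are made up for illustration and are not part of trains / clearml.

```python
import os
from urllib.parse import urlparse
from urllib.request import url2pathname


def file_url_to_path(url):
    """Convert a file:// URL into a local filesystem path."""
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file:// URL: {url}")
    # url2pathname also decodes percent-escapes such as %20
    return url2pathname(parsed.path)


def weights_exist_locally(url):
    """Return True if the file behind a file:// URL exists on this machine."""
    return os.path.isfile(file_url_to_path(url))


print(file_url_to_path("file:///home/user/file.bin"))
print(weights_exist_locally("file:///home/user/file.bin"))
```

If the second line prints False on the machine running the test script, the weights were only ever stored on the machine that trained the model.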
GrievingTurkey78 does that make sense?
Note that by default trains / clearml will not upload your weights file anywhere; it does that only if you set "output_uri" to a specific location.
Makes sense! Then where would I have to add output_uri to save the weights?
Yes, that sounds like the issue. Is the file actually there?
How can I check that, Martin?
I just want to retrieve the weights on a script that tests models I have trained in the past
With pleasure 🙂
For option 2 do I have to configure it on all agents or on the server?
On the server through the command line?
Thanks so much AgitatedDove14 !
On all Agents
1. In your code:
Task.init(..., output_uri='s3://.../')
2. Configure a default output_uri to be used by all tasks: https://github.com/allegroai/clearml/blob/64042f6c4fdaaf15b6c5f816f2fbf50f89c313e2/docs/clearml.conf#L156
3. In the UI, after you clone a Task, under the Execution tab, "output" "destination"
In all cases output_uri can be:
/mnt/share/folder (if you have a shared folder between all machines)
http://trains-server:8081/
gs://bucket
azure://bucket/
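Setting output_uri in code could look roughly like this. It is a minimal sketch: the wrapper name init_task_with_upload and the project and task names are placeholders, and on the older trains package the import would be from trains instead of clearml.

```python
def init_task_with_upload(output_uri, project_name="examples", task_name="train model"):
    """Start a task whose model checkpoints are uploaded to output_uri."""
    # Imported inside the function so this file still loads where clearml is not installed.
    from clearml import Task

    return Task.init(
        project_name=project_name,  # placeholder project name
        task_name=task_name,        # placeholder task name
        output_uri=output_uri,      # e.g. a shared folder or a bucket URI
    )


# Usage (needs clearml installed and configured):
#   task = init_task_with_upload("/mnt/share/folder")
```

With a shared folder as the destination, every machine that mounts the same path should be able to read the stored weights back.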
task.models['output'][-1] should return the last stored model.
What do you have under
get_weights(True) I get
ValueError: Could not retrieve a local copy of model weights <ID>, failed downloading <URL>
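For a script that tests previously trained models, the retrieval could be wrapped like the sketch below. The helper name last_output_weights is made up for illustration; it assumes the task object exposes models['output'] and that each model has url and get_weights(raise_on_error=...), as used earlier in this thread.

```python
def last_output_weights(task):
    """Return a local path to the last stored output model, or None if unavailable."""
    output_models = task.models["output"]
    if not output_models:
        print("task has no output models")
        return None

    model = output_models[-1]
    # Printing the URL shows whether the weights live on a file:// path or remote storage.
    print("weights stored at:", model.url)
    try:
        return model.get_weights(raise_on_error=True)
    except ValueError as err:
        # Raised when the stored URL cannot be reached from this machine.
        print("download failed:", err)
        return None


# Usage (needs clearml installed and a reachable server; <ID> is the task id):
#   from clearml import Task
#   task = Task.get_task(task_id="<ID>")
#   weights_path = last_output_weights(task)
```

If the printed URL starts with file://, the download can only succeed on a machine that has that exact path, which is the situation output_uri is meant to avoid.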