Hi! I Am Using The Modelcheckpoint Callback From Tensorflow To Save The Best Model. When The Experiment Finishes If I Go On The Server To Experiment > Artifacts > Output Model I Can See The Model And Subsequently By Clicking On It The Weights. How Can I

Hi! I am using the ModelCheckpoint callback from TensorFlow to save the best model. When the experiment finishes, if I go on the server to Experiment > Artifacts > Output Model, I can see the model and, by clicking on it, the weights. How can I retrieve the stored weights? I tried task.models['output'][0].get_local_copy() and task.models['output'][0].get_weights(), but both return None instead of a path.

  
  
Posted 3 years ago

Answers 17


Hi GrievingTurkey78
task.models['output'][-1] should return the last stored model.
What do you have under task.models['output'][-1].url?

Documentation:
https://allegro.ai/clearml/docs/rst/references/clearml_python_ref/model_module/model_outputmodel.html?highlight=model#model-outputmodel

  
  
Posted 3 years ago

I get the URL to the checkpoint/weights. Can I use this to download the weights?

  
  
Posted 3 years ago

Using get_weights(True) I get: ValueError: Could not retrieve a local copy of model weights <ID>, failed downloading <URL>

  
  
Posted 3 years ago

I get the URL to the checkpoint/weights

Is it a valid URL?
GrievingTurkey78, does it start with http:// or with file://?

  
  
Posted 3 years ago

It’s file://

  
  
Posted 3 years ago

Yes, that sounds like the issue. Is the file actually there?

  
  
Posted 3 years ago

How can I check that, Martin?

  
  
Posted 3 years ago

On the server through the command line?

  
  
Posted 3 years ago

I mean, what is the actual link?
file:// is a path to a file.
If your machine cannot access that path, you get an error.
For example:
file:///home/user/file.bin
translates to /home/user/file.bin
If you do not have the file /home/user/file.bin on your machine, you get an error.
GrievingTurkey78, does that make sense?
Note that by default trains / clearml will not upload your weights file anywhere. It only does that if you set "output_uri" to a specific location.
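To check this from the machine where you want to load the weights, something like the sketch below could help. It only uses the Python standard library; the helper names file_url_to_path and weights_reachable are made up for illustration and are not part of trains / clearml. It assumes a POSIX-style path like the one in the example above.

```python
import os
from urllib.parse import urlparse
from urllib.request import url2pathname


def file_url_to_path(url: str) -> str:
    """Translate a file:// URL into a local filesystem path.

    Hypothetical helper for illustration, not a clearml function.
    """
    parsed = urlparse(url)
    if parsed.scheme != "file":
        raise ValueError(f"not a file:// URL: {url}")
    # url2pathname also decodes percent-escapes such as %20
    return url2pathname(parsed.path)


def weights_reachable(url: str) -> bool:
    """Return True only if the file behind the URL exists on this machine."""
    return os.path.isfile(file_url_to_path(url))
```

You would pass in the value of task.models['output'][-1].url. If weights_reachable returns False, the file was written on another machine (or has since been removed) and cannot be fetched through a file:// link from here.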

  
  
Posted 3 years ago

Makes sense! Then where would I have to add output_uri to save the weights?

  
  
Posted 3 years ago

I just want to retrieve the weights on a script that tests models I have trained in the past

  
  
Posted 3 years ago

Three options:
1. In your code: Task.init(..., output_uri='s3://.../')
2. Configure a default output_uri to be used by all tasks: https://github.com/allegroai/clearml/blob/64042f6c4fdaaf15b6c5f816f2fbf50f89c313e2/docs/clearml.conf#L156
3. In the UI, after you clone a Task, under the Execution tab: "output" "destination"

In all cases output_uri can be:
/mnt/share/folder (if you have a shared folder between all machines)
http://trains-server:8081/
gs://bucket
azure://bucket/
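A rough sketch of how the first option and the later retrieval could fit together, assuming the clearml package is installed and configured for your server. The project name, task name and function names below are placeholders chosen for illustration; the output_uri value and the task ID are whatever applies to your setup.

```python
def start_training_task(output_uri: str):
    """Start a task whose checkpoints are uploaded to output_uri.

    output_uri can be any of the destination forms listed above.
    """
    # Imported here so the sketch can be loaded without clearml installed.
    from clearml import Task

    return Task.init(
        project_name="examples",          # placeholder project name
        task_name="train with uploads",   # placeholder task name
        output_uri=output_uri,
    )


def fetch_latest_weights(task_id: str) -> str:
    """Download the last stored output model of a finished task."""
    from clearml import Task

    task = Task.get_task(task_id=task_id)
    output_models = task.models["output"]
    if not output_models:
        raise RuntimeError(f"task {task_id} has no output models")

    # [-1] is the last stored model, as suggested earlier in the thread.
    local_path = output_models[-1].get_local_copy()
    if local_path is None:
        raise RuntimeError("could not download weights, check the model URL")
    return local_path
```

The training script would call start_training_task once before the ModelCheckpoint callback runs. The test script would call fetch_latest_weights with the ID of the training task and load the returned path. Models saved before output_uri was set still point at file:// paths, so they would only resolve on the machine that wrote them.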

  
  
Posted 3 years ago

Thanks so much AgitatedDove14 !

  
  
Posted 3 years ago

With pleasure 🙂

  
  
Posted 3 years ago

For option 2 do I have to configure it on all agents or on the server?

  
  
Posted 3 years ago

On all Agents

  
  
Posted 3 years ago

👌 Great

  
  
Posted 3 years ago