Answered

Hi, I have a question about the Model Registry. Here's my situation: I'm using k8s_example and struggling with uploading a model. Should models be uploaded to the Fileserver, or should I create another S3 bucket as mentioned in the documentation?
sdk.development.default_output_uri = ...
Currently, models are being saved locally in the pod and are deleted when the pod is terminated, and I can't find the reason why.

  
  
Posted 3 months ago

Answers 15


@<1523701070390366208:profile|CostlyOstrich36> Yes, I read this in the documentation and tried it. But when I use "True", the path changes from " None ...." to " None ...", which is very strange behavior.

  
  
Posted 3 months ago

Hi @<1523701070390366208:profile|CostlyOstrich36> , I tried this, but it doesn't work. Should it be the fileserver URL?

  
  
Posted 3 months ago

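OK, guys, I got it working by uploading the model manually.
import os
from tempfile import gettempdir

import torch
from clearml import OutputModel, Task

task = Task.init(project_name='test', task_name='PyTorch MNIST train filserver dataset')
# Register an output model on the task and set where its weights are uploaded
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri=" None ")
# Save the weights to a temporary file, then upload that file
tmp_dir = os.path.join(gettempdir(), "mnist_cnn.pt")
torch.save(model.state_dict(), tmp_dir)
output_model.update_weights(weights_filename=tmp_dir)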

  
  
Posted 3 months ago

Hi @<1742355077231808512:profile|DisturbedLizard6> , you can use the output_uri parameter of Task.init() to specify where to upload models.
None
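As a minimal sketch of this suggestion (not an official recipe): the helper below only builds the keyword arguments for Task.init(), falling back to output_uri=True, which asks ClearML to upload to the configured files server. The helper names task_init_kwargs and run_example, the task name, and the project name are hypothetical and used only for illustration.

```python
from typing import Any, Dict, Optional


def task_init_kwargs(project_name: str, task_name: str,
                     destination: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for Task.init() (hypothetical helper).

    With no explicit destination, output_uri=True is used so that models
    go to the files server configured for the SDK. A string is passed
    through as an explicit storage URI (file server, S3 bucket, ...).
    """
    return {
        "project_name": project_name,
        "task_name": task_name,
        "output_uri": destination if destination else True,
    }


def run_example():
    """Start a task with the kwargs above.

    Needs the clearml package and a reachable, configured server, so the
    import is kept inside the function.
    """
    from clearml import Task

    return Task.init(**task_init_kwargs("test", "output_uri sketch"))
```

Calling run_example() inside the pod should then register any model the script saves against the chosen destination, assuming the files server is reachable from that pod.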

  
  
Posted 3 months ago

Hi @<1742355077231808512:profile|DisturbedLizard6> , I'm not sure I follow. Did you use torch.save (like in here) or some other command to save the models? When running with the clearml-agent, all the configuration values are printed at the beginning of the log. Can you verify that the values match what you configured?

Additionally, which versions of clearml, clearml-agent and torch are you using?
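One quick way to collect the versions asked about here is to query the installed package metadata from inside the pod. This is a small sketch using only the standard library; the function name installed_versions is hypothetical, and the package names are taken from the question above.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence


def installed_versions(
    packages: Sequence[str] = ("clearml", "clearml-agent", "torch"),
) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so the result can be pasted into the thread.
for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Running it both locally and in the agent-created pod makes it easy to spot a version mismatch between the two environments.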

  
  
Posted 3 months ago

How were you saving the model with pytorch?

  
  
Posted 3 months ago

I'm currently unsure about the correct approach. Would you kindly review my attempts and point out where I might have made a mistake? Here's what I've tried:

  1. I've added the default URL in the agent Helm chart:
    clearml:
      ...
      clearmlConfig: |-
        sdk {
          development {
            default_output_uri: ""
          }
        }
  2. I've added the URL in the agent section:
    agentk8sglue:
      ...
      fileServerUrlReference: 
  3. In the Python file, when using Task.init, I've tried the 'output_uri' keyword argument with both 'True' and the file server URL ' None '.
  
  
Posted 3 months ago

'True' should point to the files server

  
  
Posted 3 months ago

Are you sure the files server is correctly configured on the pods?

  
  
Posted 3 months ago

So when you do torch.save() it doesn't save the model?

  
  
Posted 3 months ago

I didn't save it in any way. I relied on the auto-save from ClearML.
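For context, ClearML's automatic model logging works by hooking the framework's own save call (torch.save for PyTorch), so a script that never saves the model gives it nothing to pick up. The sketch below shows one way this could look, assuming the model is a torch.nn.Module; the function names checkpoint_path and train_and_save and the task name are hypothetical, and this is not necessarily the cause of the problem in this thread.

```python
import os
from tempfile import gettempdir


def checkpoint_path(filename: str = "mnist_cnn.pt") -> str:
    """Return a path in the temp directory for the checkpoint file."""
    return os.path.join(gettempdir(), filename)


def train_and_save(model):
    """Save a model so that ClearML's auto-logging can register it.

    Assumes `model` is a torch.nn.Module. Needs torch, clearml and a
    configured server, so the imports are kept inside the function.
    """
    import torch
    from clearml import Task

    # Task.init() must run before torch.save() for the save to be captured;
    # output_uri=True requests an upload to the configured files server.
    Task.init(project_name="test", task_name="autolog sketch", output_uri=True)

    # ... training loop would go here ...

    path = checkpoint_path()
    torch.save(model.state_dict(), path)  # the call that auto-logging hooks
    return path
```

If the model still does not appear under the task's artifacts after an explicit torch.save, registering it through OutputModel by hand is a fallback.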

  
  
Posted 3 months ago

The pod can easily download the dataset and upload logs to the fileserver, but it can't upload the model 😀

  
  
Posted 3 months ago

OK, maybe someone knows: how does a pod created by the K8s agent know the model registry URL? When I add the output_uri parameter to the Task, like output_uri=" None ", nothing is shown anymore. Previously, without this parameter, it showed a path like " None ...." in WebUI->Experiments->Artifacts.

  
  
Posted 3 months ago

OK, I found out that with scikit-learn the model is uploaded, but with PyTorch it isn't.

  
  
Posted 3 months ago

I ran the code manually from a pod created by the agent and the model was uploaded. But when the task is started by the agent's own command, it doesn't upload. Magic.

  
  
Posted 3 months ago