Answered

Hi, I have a question about the Model Registry. Here's my situation: I'm using k8s_example and struggling with uploading a model. Should models be uploaded to the Fileserver, or should I create another S3 bucket as mentioned in the documentation?
sdk.development.default_output_uri = ...
Currently, models are being saved locally in the pod and are deleted when the pod is terminated, and I can't find the reason why.

  
  
Posted 14 days ago

Answers 15


Hi @<1742355077231808512:profile|DisturbedLizard6> , you can use the output_uri parameter of Task.init() to specify where to upload models.
None
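
For example, a minimal sketch (the project/task names and destinations below are placeholders, not values from this thread):

from clearml import Task

# output_uri=True uploads models to the configured files server;
# a storage URI (e.g. an S3 bucket) can be passed instead
task = Task.init(project_name='examples', task_name='upload example', output_uri=True)
# task = Task.init(project_name='examples', task_name='upload example', output_uri='s3://my-bucket/models')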

  
  
Posted 14 days ago

OK, maybe someone knows: how does a pod created by the K8s agent know the model registry URL? When I added the output_uri parameter to the Task, like output_uri=" None ", it doesn't show anything now. Previously, without this parameter, it showed a path like " None ...." in WebUI -> Experiments -> Artifacts.

  
  
Posted 14 days ago

Hi @<1523701070390366208:profile|CostlyOstrich36> , I tried this, but it doesn't work. Should it be the fileserver URL?

  
  
Posted 14 days ago

Passing 'True' should point uploads to the files server

  
  
Posted 14 days ago

@<1523701070390366208:profile|CostlyOstrich36> Yes, I read this in the documentation and tried it. But when I use "True" it changes the path from " None ...." to " None ...". It's very strange behavior.

  
  
Posted 14 days ago

The pod can easily download the dataset and upload logs to the fileserver, but it can't upload the model 😀

  
  
Posted 14 days ago

Are you sure the files server is correctly configured on the pods?

  
  
Posted 14 days ago

I'm currently unsure about the correct approach. Would you kindly review my attempts and point out where I might have made a mistake? Here's what I've tried:

  1. I've added the default URL in the agent Helm chart:
    clearml:
      ...
      clearmlConfig: |-
        sdk {
          development {
            default_output_uri: " "
          }
        }
  2. I've added the URL in the agent section:
    agentk8sglue:
      ...
      fileServerUrlReference: 
  3. In the Python file, when using Task.init, I've tried the output_uri keyword argument with both True and the file server URL ' None ' (roughly as in the sketch right after this list).
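
A sketch of that third attempt (the real fileserver URL isn't shown above, so the address below is only a placeholder):

from clearml import Task

# variant a: let ClearML resolve the configured files server
task = Task.init(project_name='test', task_name='k8s model upload', output_uri=True)
# variant b: pass the fileserver URL explicitly (placeholder address)
# task = Task.init(project_name='test', task_name='k8s model upload', output_uri='http://files.example.com:8081')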
  
  
Posted 14 days ago

I ran the code directly from a pod created by the agent and the model was uploaded. But when the task was started by the agent command, it doesn't upload. Magic.

  
  
Posted 13 days ago

OK, I found out that with scikit-learn the model is uploaded, but with PyTorch it isn't.

  
  
Posted 13 days ago

OK guys, I got it working by manually uploading the model:
import os
from tempfile import gettempdir
import torch
from clearml import Task, OutputModel

task = Task.init(project_name='test', task_name='PyTorch MNIST train fileserver dataset')
output_model = OutputModel(task=task, framework="PyTorch")
output_model.set_upload_destination(uri=" None ")  # upload destination (the files server URL)
# save the trained torch model (built earlier in the training script) to a temp file, then register it as the task's output model
tmp_path = os.path.join(gettempdir(), "mnist_cnn.pt")
torch.save(model.state_dict(), tmp_path)
output_model.update_weights(weights_filename=tmp_path)

  
  
Posted 13 days ago

How were you saving the model with PyTorch?

  
  
Posted 13 days ago

I didn't save it explicitly; I relied on ClearML's auto-save.

  
  
Posted 8 days ago

So when you do torch.save() it doesn't save the model?

  
  
Posted 7 days ago

Hi @<1742355077231808512:profile|DisturbedLizard6> , I'm not sure I follow. Did you use torch.save (like in here ) or some other command to save the models? When running with the clearml-agent, there is a printout of all the configuration at the beginning of the log; can you verify your values are as you configured them?
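
For reference, a minimal sketch of the pattern being asked about (project/task names and the model are placeholders), where a plain torch.save call is what ClearML's automatic framework logging would be expected to pick up:

import torch
from clearml import Task

task = Task.init(project_name='test', task_name='pytorch auto-log example', output_uri=True)
model = torch.nn.Linear(4, 2)  # stand-in for the real network
torch.save(model.state_dict(), 'mnist_cnn.pt')  # with auto-logging enabled, ClearML should register this as an output model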

Additionally, which versions of clearml, clearml-agent, and torch are you using?

  
  
Posted 7 days ago