Um, Is There A Way To Delete An Artifact From A Task That Is Running?

Answered

Um, is there a way to delete an artifact from a task that is running?

  				
Posted 3 years ago by VexedCat68


Answers 25

AgitatedDove14 CostlyOstrich36 I think that is the approach that'll work for me. I just need to be able to remove checkpoints I don't need given I know their name, from the UI and Storage.

  				
Posted 3 years ago by VexedCat68

I ran some training code from a GitHub repo. It saves a checkpoint every 2000 iterations. The only problem is that I'm training for 3200 epochs and there are more than 37000 iterations in each epoch, so the checkpoints just added up. I've stopped the training for now. I need to delete all of those checkpoints before I start training again.
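To put a number on "added up", here is a back-of-the-envelope sketch using the figures from the post. It assumes the 2000-iteration counter runs across the whole job rather than resetting each epoch, and since the post says "more than 37000", the result is only a lower bound.

```python
# Rough lower bound on how many checkpoints the run described above would write.
# Assumption: the "every 2000 iterations" counter runs across the whole job
# and is not reset at the start of each epoch.
ITERATIONS_PER_EPOCH = 37_000  # "more than 37000" in the post, so a lower bound
EPOCHS = 3_200
CHECKPOINT_EVERY = 2_000

total_iterations = ITERATIONS_PER_EPOCH * EPOCHS
total_checkpoints = total_iterations // CHECKPOINT_EVERY

print(f"{total_iterations:,} iterations -> at least {total_checkpoints:,} checkpoints")
```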

  				
Posted 3 years ago by VexedCat68

Shouldn't checkpoints be uploaded immediately? That's the purpose of checkpointing, isn't it?

  				
Posted 3 years ago by VexedCat68

basically don't want the storage to be filled up on the ClearML Server machine.

  				
Posted 3 years ago by VexedCat68

VexedCat68

"... So the checkpoints just added up. I've stopped the training for now. I need to delete all of those checkpoints before I start training again."

Are you uploading the checkpoints manually as artifacts, or are they auto-logged and uploaded?
Also, why not reuse and overwrite older checkpoints?

  				
Posted 3 years ago by AgitatedDove14

the storage is basically the machine the clearml server is on, not using s3 or anything

  				
Posted 3 years ago by VexedCat68

VexedCat68
Do you want to delete the uploaded file, or the artifact from the Task?

  				
Posted 3 years ago by AgitatedDove14

Since that is an immediate concern for me as well.

  				
Posted 3 years ago by VexedCat68

Given a situation where I want to delete an uploaded artifact from both the UI and the storage, how would I go about doing that?

  				
Posted 3 years ago by VexedCat68

I think it depends on your implementation. How are you currently implementing top X checkpoints logic?

  				
Posted 3 years ago by CostlyOstrich36

I need to both remove the artifact from the UI and the storage.

  				
Posted 3 years ago by VexedCat68

can you point me to where I should look?

  				
Posted 3 years ago by VexedCat68

AgitatedDove14 Alright I think I understand, changes made in storage will be visible in the front end directly.

Will using Model.remove, completely delete from storage as well?

  				
Posted 3 years ago by VexedCat68

Is there a difference? I mean my use case is pretty simple. I have a training run and it basically creates a lot of checkpoints. I just want to keep the N best checkpoints, and whenever there are more than N checkpoints, I'll delete the worst performing one, both locally and from the task artifacts.

  				
Posted 3 years ago by VexedCat68

And given that I have artifacts = task.get_registered_artifacts()

  				
Posted 3 years ago by VexedCat68

I think these are the relevant methods 🙂
https://clear.ml/docs/latest/docs/references/sdk/task#register_artifact
https://clear.ml/docs/latest/docs/references/sdk/task#unregister_artifact
And later you can use
https://clear.ml/docs/latest/docs/references/sdk/task#upload_artifact
When you have a finalized version of what you want

  				
Posted 3 years ago by CostlyOstrich36

I plan to append the checkpoint to a list, when the len(list) > N, I'll just pop out the one with the highest loss, and delete that file from clearml and storage. That's how I plan to work with it.
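The plan above could be sketched roughly as follows. This is an illustration, not code from the thread: the class name `BestCheckpoints` and the `on_evict` callback are hypothetical, and the callback is where the actual file and ClearML deletion would be plugged in.

```python
from typing import Callable, List, Optional, Tuple


class BestCheckpoints:
    """Keep the N lowest-loss checkpoints; evict the worst when the list overflows."""

    def __init__(self, keep: int, on_evict: Optional[Callable[[str], None]] = None):
        if keep < 1:
            raise ValueError("keep must be at least 1")
        self.keep = keep
        self.on_evict = on_evict  # hypothetical hook: delete the file / artifact here
        self.entries: List[Tuple[float, str]] = []  # (loss, checkpoint name)

    def add(self, loss: float, name: str) -> Optional[str]:
        """Record a checkpoint; return the evicted name, or None if nothing was evicted."""
        self.entries.append((loss, name))
        if len(self.entries) <= self.keep:
            return None
        worst = max(self.entries, key=lambda entry: entry[0])  # highest loss
        self.entries.remove(worst)
        if self.on_evict is not None:
            self.on_evict(worst[1])
        return worst[1]


evicted: List[str] = []
tracker = BestCheckpoints(keep=2, on_evict=evicted.append)
tracker.add(0.90, "ckpt_2000.pt")
tracker.add(0.50, "ckpt_4000.pt")
tracker.add(0.70, "ckpt_6000.pt")  # evicts ckpt_2000.pt, the highest loss so far
tracker.add(0.95, "ckpt_8000.pt")  # the new checkpoint is itself the worst, so it goes
print(sorted(tracker.entries))
```

Note that a freshly saved checkpoint can be the one that gets evicted, so the deletion hook should tolerate being called for a file that was only just written.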

  				
Posted 3 years ago by VexedCat68

VexedCat68, I was about to mention it myself. Maybe keeping only the last few or the best few checkpoints would be best in this case. I think the SDK also supports this quite well 🙂

  				
Posted 3 years ago by CostlyOstrich36

"Will using Model.remove, completely delete from storage as well?"

Correct, see the argument delete_weights_file=True
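A minimal sketch of how that argument might be used to prune old output models. The helper name `prune_output_models` and its `remove_fn` parameter are hypothetical, and it assumes `task.models['output']` is ordered oldest to newest, which the thread does not confirm.

```python
def prune_output_models(task, keep_last, remove_fn=None):
    """Remove all but the newest `keep_last` output models of `task`.

    `remove_fn` defaults to clearml's Model.remove; it is a parameter only so
    the logic can be exercised without a ClearML server.
    """
    if remove_fn is None:
        from clearml import Model  # imported lazily; requires the clearml package
        remove_fn = Model.remove
    # Assumption: the list is ordered oldest -> newest.
    outputs = list(task.models["output"])
    stale = outputs[:-keep_last] if keep_last > 0 else outputs
    for model in stale:
        # delete_weights_file=True asks ClearML to delete the stored weights too
        remove_fn(model, delete_weights_file=True)
    return stale
```

It would be called as `prune_output_models(task, keep_last=5)` on a task whose checkpoints were logged as models.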

  				
Posted 3 years ago by AgitatedDove14

Also I need to modify the code to only keep the N best checkpoints as artifacts and remove others.

  				
Posted 3 years ago by VexedCat68

Hmmmm I couldn't find anything in the SDK, however, you can use the API to do it

  				
Posted 3 years ago by CostlyOstrich36

Hmm, you can delete the artifact with:
task._delete_artifacts(artifact_names=['my_artifact'])
However, this will not delete the file itself.
To delete the file I would do:
from clearml.storage import StorageHelper
remote_file = task.artifacts['delete_me'].url
h = StorageHelper.get(remote_file)
h.delete(remote_file)
task._delete_artifacts(artifact_names=['delete_me'])
Maybe we should have a proper interface for that? wdyt? What's the actual use case?

  				
Posted 3 years ago by AgitatedDove14

VexedCat68 the remote checkpoints (i.e. Models) mirror the local storage, so if you overwrite the files locally, the same happens in the backend. So the following should work (and store the last 5 checkpoints):
epochs += 1
torch.save(model, "model_{}.pt".format(epochs % 5))  # model: the object being checkpointed
Regarding deleting / getting models:
Model.remove(task.models['output'][-1])
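The overwrite trick relies only on the file name cycling through a fixed set of slots. A small sketch of just that naming logic (the helper `checkpoint_path` is hypothetical, and the actual `torch.save` call is left out so it runs without PyTorch):

```python
KEEP_LAST = 5


def checkpoint_path(epoch: int, keep_last: int = KEEP_LAST) -> str:
    """Map an epoch number onto one of `keep_last` reusable file names."""
    return "model_{}.pt".format(epoch % keep_last)


# Twelve epochs still only ever touch five distinct files.
paths = [checkpoint_path(epoch) for epoch in range(1, 13)]
print(sorted(set(paths)))
```

Because epoch 6 maps to the same name as epoch 1, saving to that path overwrites the older checkpoint instead of adding a new one.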

  				
Posted 3 years ago by AgitatedDove14

How do I go about uploading those registered artifacts, would I just pass artifacts[i] and the name for the artifact?
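One possible shape for this, as a sketch only: it assumes `task.get_registered_artifacts()` returns a name-to-object mapping and that each object can be passed to `task.upload_artifact` as `artifact_object`. The helper name and the `final_` prefix are made up for illustration.

```python
def upload_registered_artifacts(task, prefix="final_"):
    """Upload a static snapshot of each registered artifact under a new name."""
    uploaded = []
    for name, frame in task.get_registered_artifacts().items():
        snapshot_name = prefix + name  # hypothetical naming scheme
        task.upload_artifact(name=snapshot_name, artifact_object=frame)
        uploaded.append(snapshot_name)
    return uploaded
```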

  				
Posted 3 years ago by VexedCat68

Currently every 2000 iterations, a checkpoint is saved, that's just part of the code. Since output_uri = True, it gets uploaded to the ClearML server.

  				
Posted 3 years ago by VexedCat68
