Answered

Does artifact tracking work on a per-file basis? If only some files are updated, does it know to upload only the new files? Also, I wonder what the best way is to set up storage for teams to share? (We'd prefer not to use the cloud, as network costs can be significant since we don't use cloud VMs for model training.) 🙏

  
  
Posted 4 years ago

Answers 8


EnviousStarfish54 regarding the file server, you have one built into the trains-server, and this will be the default location to store all artifacts. You can also use external solutions like S3, GS, Azure, etc.
Regarding the models, any model store/load is automatically logged as long as you are using one of the supported frameworks (TF, Keras, PyTorch, scikit-learn).
If you want your model to be automatically uploaded, just add output_uri:
task = Task.init('examples', 'model', output_uri='http://trains-server:8081/')
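For example (a minimal sketch, assuming the trains Python package and a PyTorch model; the server address is just the default trains-server port, adjust to your setup):

from trains import Task
import torch
import torch.nn as nn

# output_uri tells trains where to upload model snapshots
task = Task.init(project_name='examples', task_name='model',
                 output_uri='http://trains-server:8081/')

model = nn.Linear(10, 2)
# torch.save is one of the automatically bound framework calls, so this
# snapshot is logged and uploaded to output_uri without extra code
torch.save(model.state_dict(), 'model.pt')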

  
  
Posted 4 years ago

Hi EnviousStarfish54
Artifacts are stored per experiment, which means that, storage-wise, every experiment uploading an artifact (even if it has the same file content as a previous execution) will create a new file on the central storage (the default being the trains-server).
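To make the per-experiment behavior concrete (a minimal sketch; 'data.csv' and the names are just placeholders):

from trains import Task

task = Task.init(project_name='examples', task_name='artifact upload')
# Each run uploads the file again in full, even if its content did not
# change since the previous execution
task.upload_artifact(name='training data', artifact_object='data.csv')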
As for the preferred way to share data / artifacts: where do you have your trains server? Is it local? Cloud? Where do you access it from? Home? VPN?

  
  
Posted 4 years ago

Sorry for the late reply. You mentioned there will be a built-in way to version data. May I ask if there is a release date for it?

  
  
Posted 4 years ago

Also, I am unclear about the difference between StorageManager and StorageHelper. Is there an example that integrates them with model training?

I went through the docs and it seems they don't mention downloading from an artifact (programmatically)?

  
  
Posted 4 years ago

StorageManager is what you need if you want to download/upload files to/from any server (it is a utility class that takes care of the download/upload and adds caching). StorageHelper is used internally.
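Something along these lines (a sketch; the URLs are placeholders, and it assumes a trains version that exposes StorageManager at the package top level):

from trains import StorageManager

# Download a remote object into the local cache and get back a local path
local_csv = StorageManager.get_local_copy(
    remote_url='http://trains-server:8081/files/data.csv')

# Upload a local file to any supported storage target (s3:// gs:// azure:// ...)
StorageManager.upload_file(local_file='data.csv',
                           remote_url='s3://my-bucket/datasets/data.csv')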

  
  
Posted 4 years ago

EnviousStarfish54

Oh, this is a bit different from my expectation. I thought I could use artifacts for dataset or model version control.

You totally can use artifacts as a way to version data (actually, we will have this built in in the next versions).

Getting an artifact programmatically:
Task.get_task(task_id='aabb').artifacts['artifactname'].get()
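A slightly fuller sketch ('aabb' and 'artifactname' are placeholders for your own task ID and artifact name):

from trains import Task

task = Task.get_task(task_id='aabb')
# get() deserializes the artifact object itself;
# get_local_copy() downloads the stored file and returns its local path
obj = task.artifacts['artifactname'].get()
local_file = task.artifacts['artifactname'].get_local_copy()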

Models are logged automatically. No need to log manually

  
  
Posted 4 years ago

Oh, this is a bit different from my expectation. I thought I could use artifacts for dataset or model version control.

  
  
Posted 4 years ago

We will have a dedicated VM to hold the trains-related Docker containers. Do I need to set up some file server? (I saw an earlier thread mention MinIO.)

  
  
Posted 4 years ago