EnviousStarfish54
Oh, this is a bit different from what I expected. I thought I could use artifacts for dataset or model version control.
You can absolutely use artifacts as a way to version data (in fact, we will have this built in in upcoming versions).
Getting an artifact programmatically:
Task.get_task(task_id='aabb').artifacts['artifactname'].get()
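A slightly fuller sketch of the same lookup, wrapped in a function. `load_artifact` is a hypothetical helper name, and 'aabb' / 'artifactname' are just the placeholder values from the line above. It assumes the trains package is installed and a server is reachable at the time the function is called.

```python
def load_artifact(task_id, artifact_name):
    """Return the object stored as an artifact on an existing task."""
    # Imported inside the function so that defining this helper does not
    # require a configured trains installation (an illustrative choice).
    from trains import Task

    source_task = Task.get_task(task_id=task_id)
    artifact = source_task.artifacts[artifact_name]
    # .get() returns the stored object itself; this is the same call chain
    # as the one-liner in the thread, only split into steps.
    return artifact.get()
```

Calling `load_artifact('aabb', 'artifactname')` from any script or notebook that can reach the server should return the same object as the one-liner.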
Models are logged automatically, so there is no need to log them manually.
Sorry for the late reply. You mentioned there will be a built-in way to version data. May I ask whether there is a release date for it?
StorageManager is what you need if you want to download/upload files to any server (it is a utility class that takes care of the download/upload and adds caching). StorageHelper is used internally.
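A minimal sketch of what that looks like in practice, assuming a trains version that exports StorageManager at the package level. `fetch_cached` and `push_file` are hypothetical wrapper names, and any URL passed in is up to you (file server, S3, etc.).

```python
def fetch_cached(remote_url):
    """Download remote_url (or reuse the cached copy) and return a local path."""
    # Lazy import: an illustrative choice so the helpers can be defined
    # without a configured trains installation.
    from trains import StorageManager

    return StorageManager.get_local_copy(remote_url=remote_url)


def push_file(local_path, remote_url):
    """Upload a local file to remote_url through StorageManager."""
    from trains import StorageManager

    return StorageManager.upload_file(local_file=local_path, remote_url=remote_url)
```

Inside a training script, the path returned by `fetch_cached(...)` can be handed straight to whatever loads your dataset.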
We will have a dedicated VM to host the trains-related Docker containers. Do I need to set up a file server? (I saw an earlier thread mention MinIO.)
Also, I am unclear about the difference between StorageManager and StorageHelper. Is there an example that integrates them with model training?
I went through the docs, and they don't seem to mention downloading an artifact programmatically?
EnviousStarfish54 regarding the file server, there is one built into the trains-server, and it is the default location for storing all artifacts. You can also use external solutions such as S3, GS, Azure, etc.
Regarding the models, any model store / load is automatically logged as long as you are using one of the supported frameworks (TF, Keras, PyTorch, scikit-learn).
If you want your model to be automatically uploaded, just add output_uri:
task = Task.init('examples', 'model', output_uri='http://trains-server:8081/')
Hi EnviousStarfish54
Artifacts are stored per experiment. Storage-wise, that means every experiment that uploads an artifact creates a new file on the central storage (the trains-server by default), even if the file content is identical to that of a previous execution.
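If you want to notice when two runs are about to store identical content, one option is to compare file hashes yourself before uploading. The sketch below is plain Python and not part of trains; `file_sha256` and `same_content` are hypothetical helper names, and this thread does not say whether trains does any such check on its own.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large dataset files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files have the same SHA-256 digest."""
    return file_sha256(path_a) == file_sha256(path_b)
```

Storing the digest next to the artifact name would let a later run decide whether a new upload is worth the extra storage.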
As for the preferred way to share data / artifacts: where is your trains server? Is it local? In the cloud? Where do you access it from: home? VPN?