I would recommend reading this blog post; it should give you a glimpse of what can be built 🙂
https://medium.com/pytorch/how-trigo-built-a-scalable-ai-development-deployment-pipeline-for-frictionless-retail-b583d25d0dd
Hi CleanWhale17, at least for the moment the code, although open ( https://github.com/allegroai/trains-web ), has no external theme/customization interface.
That said, we do have some thoughts on it. What did you have in mind?
I'm a full-stack developer at Core, and I'd be looking to extend the TRAINS frontend and backend APIs to suit my needs: on-prem data storage integration and lots of other customization, e.g. a job scheduler (CRON), dataset augmentation, a custom annotation tool, etc.
That is awesome! Feel free to post a specific question here, and I'll try to direct you to the right place 🙂
Can you point me to a tutorial that teaches how to customize the backend/frontend, with an example?
You mean like pipelines / automation etc.?
If that is the case, take a look at the examples folder:
https://github.com/allegroai/trains/tree/master/examples
Mostly the automation and services subfolders.
And also at the trains-agent examples (including the AWS autoscaler example, which will soon be rewritten so it is easier to extend):
https://github.com/allegroai/trains-agent/tree/master/examples
Glad to know it.
Thanks for the reference, Martin. I'll soon be starting with TRAINS and will be in touch on the progress.
Hi CleanWhale17 let me see if I can address them all
Email Alert for finished Job (I'm not sure if it's already there).
Slack integration will be public by the end of the weekend 🙂
It is fully customizable / extendable, and I'll be happy to help.
DVC
Full dataset tracking is supported using artifacts and the ability to integrate with any central storage (shared folders / S3 / GS / Azure etc.).
From my experience, it is easier to work with artifacts from data-processing Tasks, as Trains offers full caching and flexible storage options. I always have the feeling that a git-like commit/pull workflow for datasets is the wrong approach; that said, there is nothing stopping you from integrating DVC into your pipeline.
If you are doing computer-vision DL, annotations are usually JSON files plus pointers to the actual image files. In that case it makes a lot of sense to keep the annotations as a single JSON file on a data-processing Task (fully versioned, of course). A training Task then pulls the JSON (caching is supported) and, from the JSON, accesses the actual image files either via direct file sharing or using the Trains StorageManager, which does all the heavy lifting for you: it can pull data from S3/GS/Azure etc., with caching built in.
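To make the pull-with-caching pattern above concrete, here is a minimal stdlib sketch. The function name and cache layout are illustrative, not the actual Trains API (the real `StorageManager.get_local_copy` also handles `s3://` / `gs://` / `azure://` URLs); this only shows the shape of the workflow: pull the versioned annotation JSON once, then reuse the cached copy.

```python
import hashlib
import json
import pathlib
import shutil
import tempfile

def get_local_copy(remote_path: str, cache_dir: str) -> pathlib.Path:
    """Hypothetical helper: return a cached local copy of a remote file.

    Mimics the caching behaviour described above; a real implementation
    would download from S3/GS/Azure instead of copying a local file.
    """
    cache = pathlib.Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    # Key the cache entry on the remote path so repeated pulls are free.
    key = hashlib.sha1(remote_path.encode()).hexdigest()
    local = cache / (key + "_" + pathlib.Path(remote_path).name)
    if not local.exists():
        shutil.copy(remote_path, local)  # stand-in for the actual download
    return local

# Usage: a training Task pulls the versioned annotation JSON, then resolves
# each image reference through the same cache.
work = pathlib.Path(tempfile.mkdtemp())
annotations_file = work / "annotations.json"
annotations_file.write_text(json.dumps({"img_001.jpg": {"label": "cat"}}))
cache_dir = str(work / "cache")

local = get_local_copy(str(annotations_file), cache_dir)
annotations = json.loads(local.read_text())
```

The second pull of the same path hits the cache and returns the same local file, which is what makes repeated training runs on the same dataset version cheap.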
Apache AirFlow
If you have a K8s cluster and want production-grade orchestration, by all means consider AirFlow or KubeFlow. That said, for R&D with constantly changing repositories/requirements, Trains offers the ability to reuse containers (so you do not end up with a container per experiment and thousands of unused containers afterwards) and also the ability to build a fully standalone container from any experiment (i.e. package an experiment/Task in a container for later use with any orchestration solution).
Last thing, K8s is great for managing resources, not so much for scheduling.
You can use trains-agent as a bare-metal agent to run containers on any machine (setup is a pip install, it is that easy). Or you can integrate with K8s; there are a few examples and documentation on the Nvidia NGC cloud (we are the leading supported platform for managing experiments on Nvidia K8s clusters).
Thanks for the detailed comparison. I'll have to look more into these tools to come to any conclusion based on my needs.
Here's what I'm looking at:
An automated ML Pipeline
CleanWhale17 per your request :)
An automated ML Pipeline 🙂
Automated Data Source Integration 🙂
Data Pooling and Web Interface for Manual Annotation of Images (Seg. / Classif.) [Allegro Enterprise] or users integrate with open-source
Storage of Annotation output files (versioned JSON) 🙂
Online-Training Support (for Dataset Shifts) [Not sure what you mean]
Data Pre-processing (filter/augment) [Allegro Enterprise] or users integrate with open-source
Data-set visualization (stats of Dataset) [Allegro Enterprise] or users integrate with open-source
Experiment Management (which is why I liked TRAINS) 🙂
Jupyter Integration (for Test Management) 🙂
Training Progress Visualization (TensorBoard like) 🙂
Inferencing and Visualization of Results 🙂
Reproducibility of Training Results 🙂
online-training:
Re-training the model to update its weights for any new dataset introduced after the previous deployment. Based on a certain threshold, we can decide when to re-train the model.
It's mainly applicable to scenarios involving streaming/sequential datasets that become available over time, e.g. facial recognition, or retail use cases for new fashion segments.
CleanWhale17 nice ... 🙂
So the answer is: Trains supports the pipeline/automation side, but lacks built-in dataset integration (that is basically up to you to manage, with either artifacts or any other method).
Allegro Enterprise allows you to rerun the code on a new version of the dataset from the UI (or automation) without changing a single line of code 🙂
CleanWhale17 what is "Online-Training Support (for Dataset Shifts)"?
Automated Data Source Integration
Data Pooling and Web Interface for Manual Annotation of Images (Seg. / Classif.)
Storage of Annotation output files (versioned JSON)
Online-Training Support (for Dataset Shifts)
Data Pre-processing (filter/augment)
Data-set visualization (stats of Dataset)
Experiment Management (which is why I liked TRAINS)
Jupyter Integration (for Test Management)
Training Progress Visualization (TensorBoard like)
Inferencing and Visualization of Results
Reproducibility of Training Results
Thanks for sharing the case-study link. Please let me know whether each of the above requirements is already in TRAINS, planned, or can be covered by integrating external tools.
I work on VisionAI, so I would need integration with my existing data pipeline (including the annotation tools - LabelMe, VGG etc.) and also features like an email alert for a finished job (I'm not sure if it's already there).
Other doubts that I have:
How does it compare to Apache AirFlow or DVC for data management (if I'm not going for the paid version)?