I would let the trains team answer this in detail, but as a user moving from MLflow to trains, I can share the following insights:
MLflow and trains overlap when it comes to providing a system with a nice web UI to log and compare experiments, models, and metrics. But MLflow lacks a crucial feature IMO, which is ML/DevOps: with MLflow, you have to take care of the whole maintenance of your machines, design the interactions between them, etc. This is where trains shines: it provides these features out of the box.
MLflow was released a couple of months before trains, at a time when the whole community of AI researchers was looking for such a tool, which might explain why it got so much traction. Since then its development has slowed down, and it is still missing features like auto DevOps.
Trains arrived later on, and I agree that its name is not search-engine friendly, which might explain to some extent why the user base is not growing as fast as one would expect for such a nice library. But that will change; all trains needs is a bit more attention/communication.
DefeatedCrab47 Happy you're finding Trains useful 🙂
but it definitely has its advantages if TRAINS would support it (early stage Data Science infrastructure).
No doubt, and I definitely see such usable example in the cards for Trains' upcoming versions...
DefeatedCrab47 For the most part, mlflow can serve basic ML models using scikit-learn. In contrast, Trains was designed with more general purpose ML/DL workflows in mind, for which there's no "generic" way to serve models, as different scenarios can use different input encodings, model results can be represented in a variety of forms, etc.
Consider also, that creating an HTTP endpoint for model inference is quite a breeze: there are multiple examples of Flask on top of any DL/ML framework which you can add your work on top of.
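As a rough illustration of how small such an endpoint can be, here is a minimal sketch using only the Python standard library (a plain WSGI callable, the same interface a Flask app exposes) instead of Flask itself. The `predict` function, the `/predict` route, the `features` field, and the `call_app` helper are all hypothetical names chosen for this example; a real service would call into whatever ML/DL framework holds the model.

```python
import io
import json


def predict(features):
    # Placeholder "model": it just sums the inputs.
    # Swap in a real framework call here (assumption for illustration).
    return sum(features)


def app(environ, start_response):
    """Minimal WSGI app exposing POST /predict with a JSON body."""
    json_header = [("Content-Type", "application/json")]

    if environ.get("REQUEST_METHOD") != "POST" or environ.get("PATH_INFO") != "/predict":
        start_response("404 Not Found", json_header)
        return [json.dumps({"error": "not found"}).encode("utf-8")]

    length = int(environ.get("CONTENT_LENGTH") or 0)
    try:
        payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
        result = predict(payload["features"])
    except (ValueError, KeyError, TypeError):
        # Covers invalid JSON, a missing "features" key, and non-numeric input.
        start_response("400 Bad Request", json_header)
        return [json.dumps({"error": "expected a JSON body with a 'features' list"}).encode("utf-8")]

    start_response("200 OK", json_header)
    return [json.dumps({"prediction": result}).encode("utf-8")]


def call_app(method, path, body=b""):
    """Try the app in-process, without opening a socket."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }
    chunks = app(environ, start_response)
    return captured["status"], captured["headers"], json.loads(b"".join(chunks))
```

To actually listen on a port, the same `app` can be handed to `wsgiref.simple_server.make_server` from the standard library, or to any production WSGI server. The input encoding and output format are exactly the parts that differ per model, which is why this layer tends to be written by hand.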
If you're considering serving your model at (even a small) scale, my best recommendation would be to set up your serving code, test it on a single machine, then package it in a Docker container and have that container deployed with k8s, Airflow, AWS Elastic Beanstalk, etc.
Trains was built to support this same approach: just write your model serving code, import trains (you have a full API to get any model you need, either by ID or with search capabilities), and it will download and cache the model for you from wherever you actually store it (S3, GS, etc.). Then the trains-agent can build a Docker container for you, ready to be deployed.
The documentation is indeed somewhat lighter than would be ideal in some areas, especially for advanced stuff like model deployment. That's why we have additional communication channels :)
JitteryCoyote63
I agree that its name is not search-engine friendly,
LOL 😄
It was an internal joke the guys decided to call it "trains" cause you know it trains...
It was unstoppable, we should probably do a line of merch with AI 🚆 😉
Anyhow, this one definitely backfired...
Thank you for your impression! I get a bit more of an Airflow feel for running many tasks to train models with different parameters, which is a good thing.
I'm still skimming through the documents, but the TRAINS documentation on how models are stored is a bit vague to me. The https://allegro.ai/docs/examples/examples_models/ only quickly mentions that you can set an output location, which is a bit shallow compared with the https://mlflow.org/docs/latest/model-registry.html . Any good resource that talks about TRAINS model management?
It seems that with https://github.com/mlflow/mlflow/#saving-and-serving-models .
I cannot find anything about serving models in TRAINS?
FrothyDog40 Thank you for your reply. I agree that MLflow's serving solution is not going to be of much help for real deployment. However, to me the advantage of quickly setting up an API access point with just 1 line of code helps with some internal trying out. To a colleague: "Hey, this new model seems to do well, want to give it a try?".
I've set up my own Docker container with Sanic (like Flask) and indeed it's not too difficult. However, you'll still hit issues like the Access-Control-Allow-Origin header ( https://stackoverflow.com/questions/10636611/how-does-access-control-allow-origin-header-work ), where the browser throws a network security error if the server is not properly configured.
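For reference, the header issue above can be handled with a thin wrapper around the serving app. The sketch below is not Sanic-specific: it assumes a plain WSGI app from the standard-library world, and `add_cors`, `hello_app`, `call`, and the `http://localhost:3000` origin are hypothetical names picked for the example. The same three headers apply whatever the web framework is.

```python
def add_cors(app, allowed_origin):
    """Wrap a WSGI app so its responses carry CORS headers for one origin."""
    cors_headers = [
        ("Access-Control-Allow-Origin", allowed_origin),
        ("Access-Control-Allow-Methods", "POST, OPTIONS"),
        ("Access-Control-Allow-Headers", "Content-Type"),
    ]

    def wrapped(environ, start_response):
        # Browsers send an OPTIONS "preflight" before a cross-origin JSON POST;
        # answer it directly without touching the wrapped app.
        if environ.get("REQUEST_METHOD") == "OPTIONS":
            start_response("204 No Content", cors_headers)
            return [b""]

        def start_with_cors(status, headers, exc_info=None):
            # Append the CORS headers to whatever the wrapped app sends.
            return start_response(status, list(headers) + cors_headers, exc_info)

        return app(environ, start_with_cors)

    return wrapped


def hello_app(environ, start_response):
    # Stand-in for the real serving app (assumption for illustration).
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def call(app, method):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"], seen["headers"] = status, dict(headers)

    body = b"".join(app({"REQUEST_METHOD": method, "PATH_INFO": "/"}, start_response))
    return seen["status"], seen["headers"], body
```

Allowing a single known origin rather than `*` keeps the endpoint usable from an internal front end without opening it to every page a colleague's browser happens to load.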
And even after turning 1 model into an API, it still won't work automatically for any other model. So you would have to spend time writing serving code for each one.
mlflow can serve basic ML models using scikit-learn. In contrast, Trains was designed with more general purpose ML/DL workflows in mind
The GitHub README only seems to indicate scikit-learn indeed, but their https://mlflow.org/docs/latest/models.html#deploy-mlflow-models seems to indicate all supported models.
MLflow supports ( https://mlflow.org/docs/latest/models.html#built-in-model-flavors ) models from: Python Function (python_function), R Function (crate), H2O (h2o), Keras (keras), MLeap (mleap), PyTorch (pytorch), Scikit-learn (sklearn), Spark MLlib (spark), TensorFlow (tensorflow), ONNX (onnx), MXNet Gluon (gluon), XGBoost (xgboost), LightGBM (lightgbm)
Those are all frameworks I know about and more, so what would be more general than supporting these?
Since it's possible to deploy a model stored with TRAINS, it's not a limitation, but it definitely has its advantages if TRAINS would support it (early stage Data Science infrastructure).
Please don't get me wrong. TRAINS seems amazing to me so far, but I have to convince my other colleagues.
AgitatedDove14 the funniest thing is that a train service called Allegro exists:
https://en.wikipedia.org/wiki/Allegro_(train)
Anytime I google - first result :D