Federated learning is about sending code to where data exists, training local models and aggregating them in a central place.
Can the existing design support this, or do extensions need to be built?
Hi LazyLeopard18,
So long story short, yes it does.
Longer version: to really accomplish full federated learning, with control over the data at the "compute points", you need some data abstraction layer. Without a data abstraction layer, federated learning is just averaging derivatives (gradients) from different locations, which can easily be done with any distributed learning framework, such as Horovod, PyTorch distributed, or TF distributed.
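To make the "just averaging" part concrete, here is a minimal sketch in plain Python. It is not Trains code and involves no data abstraction layer: each "site" takes one gradient step on its own data, and a central loop averages the resulting weights, weighted by sample count. The function names (local_update, federated_average, run_rounds) and the toy data are made up for illustration.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Dataset = Sequence[Tuple[float, float]]  # (x, y) pairs held by one site


def local_update(weights: Vector, data: Dataset, lr: float = 0.1) -> Vector:
    """One gradient-descent step for y = w0 + w1 * x on a single site's data."""
    n = len(data)
    # Gradient of the mean squared error with respect to w0 and w1.
    g0 = sum(2 * (weights[0] + weights[1] * x - y) for x, y in data) / n
    g1 = sum(2 * (weights[0] + weights[1] * x - y) * x for x, y in data) / n
    return [weights[0] - lr * g0, weights[1] - lr * g1]


def federated_average(updates: Sequence[Vector], sizes: Sequence[int]) -> Vector:
    """Average the site models, weighting each one by its number of samples."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total for i in range(dim)]


def run_rounds(sites: Sequence[Dataset], rounds: int = 300) -> Vector:
    """Central loop: send weights out, collect local updates, aggregate."""
    weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(weights, data) for data in sites]
        weights = federated_average(updates, [len(data) for data in sites])
    return weights


# Hypothetical toy data: two sites, both sampled from y = 1 + 2x.
sites = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
final_weights = run_rounds(sites)
```

In this sketch only the weights travel between the sites and the central loop. What it does not give you is any control over what the remote code is allowed to do with the data, which is the part that needs the data abstraction layer.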
If what you are after is (1) launching multiple experiments with the same code on remote machines with Trains, the answer is yes: this is exactly how trains-agent works, and it is very easy to set up on bare metal (basically pip install). If you want (2) full data abstraction, that is missing from the open-source version of Trains and is only available in the paid tier. I'm assuming that, as a first step, you would like to achieve (1)?
Is there documentation for (2) available for evaluation?
Would be nice to have a reference implementation
Both are fully implemented in the enterprise version. I remember a few medical use cases, and I think they are working on publishing a blog post about it, but I'm not sure. Anyhow, I suggest you contact the sales people; I'm sure they will gladly set up a call/demo/PoC.
https://allegro.ai/enterprise/#contact
LazyLeopard18 could you explain some more on the specific use case you have in mind?
For now I am trying to achieve (1). But the goal is (2)