Its built in 🙂 and Its for... "Services"
Yeah but I don't get what it is for - for now I have 2 agents, each listening to some queues. I actually ignore the "services" queue until now
I don't get the difference between how I'm using my agents now, just starting them on machines, and making them listen to queues, to using the "services" mode
WackyRabbit7 It is conceptually different than actually training, etc.
The service agent is mostly one without a gpu, runs several tasks each on their own container, for example: autoscaler, the orchestrators for our hyperparameter opt and/or pipelines. I think it even uses the same hardware (by default?) of the trains-server.
Also, if I'm not mistaken some people are using it (planning to?) to push models to production.
I wonder if anyone else can share their view since this is a relatively new feature (AHEM)
regular trains-agent modus operandi is one job at a time (i.e. until the Task is done, no other Tasks will be pulled from the queue).
When adding --services-mode, it is Not 1-1 but 1-N, meaning a single trains-agent will launch as many Tasks as it can.
The trains-agent pulls a job from the queue and spins a docker (only dockers are supported for the time being) and lets the job run in the background (the job itself will be registered as another "worker" in the system). Then the trains-agent will pull the next job from the queue.