It can also work by running on multiple known nodes.
Horovod sits on top of OpenMPI, which needs SSH to launch processes on multiple nodes. I'm not sure how one would connect it without passing the SSH keys from one node to the other and making sure the nodes can communicate directly. (Not saying it is not possible, just that there are a few things to configure before it works; the enterprise edition removes the need for a direct SSH connection between the nodes.)
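For reference, a rough sketch of what "running on multiple known nodes" looks like once SSH between the nodes is already set up: Horovod's `horovodrun` launcher takes the total process count (`-np`) and a host list (`-H host:slots,...`). The helper name, host names, and `train.py` below are made up for illustration only.

```python
import shlex


def build_horovodrun_command(hosts, script, python="python"):
    """Build a horovodrun command line for a fixed set of known nodes.

    hosts: dict mapping hostname -> number of worker slots on that host
           (hypothetical names; passwordless SSH between them is assumed).
    """
    if not hosts:
        raise ValueError("at least one host is required")
    # -np is the total number of processes across all hosts
    total_slots = sum(hosts.values())
    # -H expects a comma-separated list of host:slots pairs
    host_arg = ",".join(f"{name}:{slots}" for name, slots in hosts.items())
    return ["horovodrun", "-np", str(total_slots), "-H", host_arg, python, script]


cmd = build_horovodrun_command({"node1": 2, "node2": 2}, "train.py")
print(shlex.join(cmd))
```

This only builds the command; actually running it still requires the SSH key exchange and direct connectivity mentioned above.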
How would I add a glue for multi-node?
Basically, spin up another glue service (i.e. run it in parallel to the current one), have the new glue pull Tasks from a new queue (let's say for X nodes), and make sure the YAML it uses spins up X pods (i.e. k8s creates the X pods, say 4; the glue takes care of the pod definition itself, since the pods are replicas of one another). Make sense?
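To make the "X replica pods" idea concrete, here is a minimal sketch in plain Python of stamping out X pod definitions from one template. This is not the glue's actual code: the function name, the label keys, the pod naming scheme, and the image are all assumptions made up for illustration.

```python
import copy


def replicate_pod_spec(base_pod, task_id, num_nodes):
    """Return num_nodes copies of base_pod, each with a unique name and rank label.

    base_pod is a plain dict mirroring a Kubernetes Pod manifest.
    The naming scheme and label keys are hypothetical.
    """
    pods = []
    for rank in range(num_nodes):
        pod = copy.deepcopy(base_pod)  # leave the template untouched
        metadata = pod.setdefault("metadata", {})
        metadata["name"] = f"task-{task_id}-node-{rank}"
        labels = metadata.setdefault("labels", {})
        labels["task-id"] = task_id
        labels["node-rank"] = str(rank)  # label values must be strings
        pods.append(pod)
    return pods


# Hypothetical template; in practice this would come from the glue's YAML
base = {
    "apiVersion": "v1",
    "kind": "Pod",
    "spec": {"containers": [{"name": "worker", "image": "example/image:latest"}]},
}
pods = replicate_pod_spec(base, "abc123", 4)
print([p["metadata"]["name"] for p in pods])
```

Each Task pulled from the multi-node queue would then map to X such pods that differ only in name and rank, which is what lets them be treated as replicas of one another.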