Oh hooray! So docker-compose manages the restarting of crashed containers? I didn't know that, and that is great 😄
Hi @<1541954607595393024:profile|BattyCrocodile47>, docker-compose deployment is widely used in production for ClearML and works perfectly 🙂
Regarding containers crashing, we've seen very little of that. In any case, ClearML clients (SDK/Agent) are very resilient and are designed to handle server unavailability and network issues (both periodic and even long-term). On top of that, the core ClearML server (i.e. the apiserver) is very quick to start up (usually a matter of several seconds). The result is a system that is very robust to container restarts 🙂
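
To illustrate the general idea of a client riding out a short server restart, here is a minimal retry-with-backoff sketch. It is not ClearML's actual implementation: `call_with_retry`, its parameters, and the use of `ConnectionError` as the failure signal are all hypothetical names chosen for this example.

```python
import time


def call_with_retry(request, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call request() until it succeeds, backing off exponentially between failures.

    Hypothetical helper for illustration only; not part of the ClearML SDK.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except ConnectionError:
            # Out of attempts: surface the error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Wait 0.5s, 1s, 2s, ... so a restarting server has time to come back.
            sleep(base_delay * 2 ** attempt)
```

With a server that comes back within a few seconds, a loop like this only ever sees a handful of failed attempts before the call goes through, which is why a fast-starting apiserver and a patient client combine well.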