Thanks for clearing that up @<1523701087100473344:profile|SuccessfulKoala55>
Brilliant, thanks a lot for the answer Jake, much appreciated and clearer!
@<1529271085315395584:profile|AmusedCat74> @<1548115177340145664:profile|HungryHorse70> here we have the answer :)
Hi, does anyone have any idea about that one, please? Or could someone point me to the right place or the right person to find out? Thanks for any help!
Sure, docs are in https://github.com/allegroai/clearml-docs
Is the doc on GitHub so we can copy that into a PR?
Hi @<1546665634195050496:profile|SolidGoose91> , sorry, missed this
The Regular Instance Rollback Timeout controls when the autoscaler falls back to starting a regular instance instead of a spot instance after failing to start a spot. It attempts to start a spot, then waits and retries again and again; once the time it has waited exceeds the Regular Instance Rollback Timeout, it tries to start a regular instance instead. This applies to a specific attempt, where starting a spot fails and an alternative instance needs to be started.
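To make the rollback behaviour concrete, here is a rough Python sketch of the retry loop as described. It is not the autoscaler's actual code: `start_with_rollback`, `try_start_spot`, and `retry_interval_s` are hypothetical names made up for illustration, and the fixed retry interval is an assumption.

```python
import time
from typing import Callable


def start_with_rollback(
    try_start_spot: Callable[[], bool],
    rollback_timeout_s: float,
    retry_interval_s: float = 5.0,
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return "spot" if a spot start succeeds before the rollback timeout,
    otherwise "regular".

    Hypothetical illustration only; try_start_spot stands in for whatever
    call requests a spot instance and reports success or failure.
    """
    started_waiting = now()
    while True:
        if try_start_spot():
            return "spot"
        # The spot request failed: give up on spot for this attempt once the
        # total time waited has reached the rollback timeout.
        if now() - started_waiting >= rollback_timeout_s:
            return "regular"
        sleep(retry_interval_s)
```

The clock and sleep functions are passed in as parameters so the loop can be exercised without real waiting.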
The Spot Instance Blackout Period specifies a blackout period that begins after an attempt to start a spot fails. It applies to future attempts: after a failure to start a spot, all requests to start additional spot instances are converted to attempts to start regular instances until the period ends. This is a way of "easing" the spot request load on the cloud provider and avoiding a "DoS" situation in the cloud account, which might cause the provider to refuse to create spots for a longer period.
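The blackout behaviour can be sketched the same way. Again, this is only an illustration of the description above, not the real implementation: the `SpotBlackout` class and its method names are hypothetical, and timestamps are plain seconds passed in by the caller.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpotBlackout:
    """Tracks the last spot failure and converts spot requests to regular
    ones while the blackout period is still running (hypothetical sketch)."""

    blackout_period_s: float
    last_spot_failure: Optional[float] = None

    def record_spot_failure(self, now: float) -> None:
        # Each failed spot start (re)starts the blackout window.
        self.last_spot_failure = now

    def instance_type_for_request(self, now: float) -> str:
        in_blackout = (
            self.last_spot_failure is not None
            and now - self.last_spot_failure < self.blackout_period_s
        )
        return "regular" if in_blackout else "spot"
```

The difference from the rollback timeout is the scope: the rollback timeout decides the outcome of one attempt, while the blackout state is consulted for every later request.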