I am aware this is the current behavior, but could it be changed to something more intelligent? 😇
This is part of a more advanced set of scheduler features, but it is only available in the enterprise edition 🙂
If you spin up two agents on the same GPU, they are not aware of one another... So this is expected behavior...
Makes sense?
When you say I can still get race/starvation cases, you mean in the enterprise or regular version?
you mean in the enterprise
Enterprise with the smarter GPU scheduler. This is an inherent problem of sharing resources, and there is no perfect solution: you either have fairness, but then you get idle GPUs, or you have races, where you can get starvation
BTW: you can still get race/starvation cases... but at least no crash
I see, will keep that in mind. Thanks Martin!
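For anyone reading this thread later, here is a toy sketch of the fairness vs. starvation trade-off described above. It is not how the actual scheduler works: the `simulate` function, the two queues "A" and "B", and the "fair"/"greedy" policy names are all made up for illustration, and it assumes every job takes exactly one GPU for one time step.

```python
from collections import deque


def simulate(policy, jobs_a, jobs_b, num_gpus=2, steps=6):
    """Toy model (hypothetical): each job occupies one GPU for one time step.

    Returns (jobs served per queue, number of idle GPU slots).
    """
    queues = {"A": deque(range(jobs_a)), "B": deque(range(jobs_b))}
    served = {"A": 0, "B": 0}
    idle_slots = 0

    for _ in range(steps):
        free = num_gpus
        if policy == "fair":
            # Fairness: each queue owns a fixed share of the GPUs and
            # cannot borrow the other queue's share, even if it is unused.
            share = num_gpus // 2
            for name in ("A", "B"):
                for _ in range(share):
                    if queues[name]:
                        queues[name].popleft()
                        served[name] += 1
                        free -= 1
        elif policy == "greedy":
            # Race: whoever asks first takes every free GPU.
            # Queue A always asks first in this toy model.
            for name in ("A", "B"):
                while free and queues[name]:
                    queues[name].popleft()
                    served[name] += 1
                    free -= 1
        else:
            raise ValueError(f"unknown policy: {policy}")
        idle_slots += free

    return served, idle_slots


# Fair shares with an empty queue B: half of the GPU slots sit idle.
print(simulate("fair", jobs_a=12, jobs_b=0))
# Greedy with a busy queue A: no idle slots, but queue B never runs.
print(simulate("greedy", jobs_a=12, jobs_b=3, steps=3))
```

With fixed shares, the GPU reserved for queue B stays idle whenever B has nothing to run; with the greedy policy, nothing is idle but queue B can wait indefinitely while A keeps submitting.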