I'm using this feature. In this case I would create 2 agents, one with a CPU-only queue and the other with a GPU queue, and then decide at the code level which queue to send to.
That's a cool idea. Then you pass the tolerations definition through a different pod template?
Yup.
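For anyone reading later, here is a rough sketch of the setup discussed above: the code picks a queue, and each queue maps to its own pod template with different tolerations. The queue names, the `POD_TEMPLATES` mapping, the `pick_queue` helper, and the taint key are all assumptions for illustration, not taken from any particular agent's API, so adjust them to your own cluster and tooling.

```python
# Hypothetical queue names; use whatever names your two agents listen on.
CPU_QUEUE = "cpu_only"
GPU_QUEUE = "gpu"

# Hypothetical per-queue pod template fragments (Kubernetes pod spec shape).
# The taint key and effect are assumptions; match them to the taints that
# are actually set on your GPU nodes.
POD_TEMPLATES = {
    CPU_QUEUE: {"spec": {"tolerations": []}},
    GPU_QUEUE: {
        "spec": {
            "tolerations": [
                {
                    "key": "nvidia.com/gpu",
                    "operator": "Exists",
                    "effect": "NoSchedule",
                }
            ]
        }
    },
}


def pick_queue(needs_gpu: bool) -> str:
    """Return the name of the queue a task should be sent to."""
    return GPU_QUEUE if needs_gpu else CPU_QUEUE


if __name__ == "__main__":
    # Show which queue and how many tolerations each kind of task would get.
    for needs_gpu in (False, True):
        queue = pick_queue(needs_gpu)
        tolerations = POD_TEMPLATES[queue]["spec"]["tolerations"]
        print(queue, len(tolerations))
```

The agent serving each queue would be started with the matching template, so the only decision left in the task code is the call to `pick_queue`.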