That makes sense, I'll add that as a future addition
This is something you can do in the GCP console, so one would imagine it can also be done using their Python library.
I think the limitation is that you can only pass a relative subnet path in the GCP Autoscaler console. Then, by the looks of the error message, the ClearML Autoscaler constructs the full path under the hood: /project/<project_id>/subnet/<subnet_id>.
I'd like the option to specify the full path myself in the Autoscaler which would then allow me to use a shared subnet.
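A minimal sketch of the behaviour requested above, not ClearML's actual code: the helper name resolve_subnet and its parameters are hypothetical. It assumes GCP's fully qualified subnetwork format, projects/<project>/regions/<region>/subnetworks/<name>, rather than the abbreviated path quoted from the error message. A value that is already a full path is passed through unchanged, so it could point at a subnet in a different (shared VPC host) project.

```python
def resolve_subnet(subnet: str, project_id: str, region: str) -> str:
    """Return a fully qualified subnetwork path.

    Hypothetical helper for illustration only. A value that already starts
    with "projects/" is treated as a full path and returned unchanged; a
    bare subnet name is expanded under the project the VMs are launched in.
    """
    subnet = subnet.strip().lstrip("/")
    if subnet.startswith("projects/"):
        # Full path supplied by the user, e.g. a subnet in a shared VPC host project.
        return subnet
    # Relative name: assume the subnet lives in the launching project.
    return f"projects/{project_id}/regions/{region}/subnetworks/{subnet}"
```

With this shape, a relative name keeps the current behaviour, while a full path lets the subnet belong to another project.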
I'm not sure this is supported in the Google machine spec
👍 Thanks for getting back to me.
Another issue I found was that I could only use vpc subnets from the google project I am launching the VMs in.
I cannot use shared vpc subnets from another project. This would be a useful feature to implement as GCP recommends segmenting the cloud estate so that the vpc and VMs are in different projects.
Hi @<1529271085315395584:profile|AmusedCat74> , sorry I missed this, this looks like an obvious bug, I'll try to fix it for the next release
@<1523701087100473344:profile|SuccessfulKoala55> Just following up as I figured out what was happening here, and it could be useful for the future.
The prefilled value for 'Number of GPUs' in the GCP Autoscaler is 1.
When one ticks 'Run in CPU mode (no gpus)', it hides the 'GPU Type' and 'Number of GPUs' fields. However, the values that were in these fields are still submitted in the API request (I'm guessing here) when the Autoscaler is launched.
Hence, to get past this, you need to explicitly set 'Number of GPUs' to 0 before ticking 'Run in CPU mode (no gpus)', which does not seem like the correct behaviour and is likely a bug.
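To illustrate the behaviour one would expect instead, here is a rough sketch. It is not taken from the ClearML codebase: the function name build_resource_payload and the keys cpu_only, gpu_count and gpu_type are made up for the example. The idea is only that hidden GPU fields should be reset when CPU mode is selected, rather than submitted with their prefilled values.

```python
def build_resource_payload(form: dict) -> dict:
    """Build the resource settings submitted when the autoscaler is launched.

    Hypothetical sketch: all key names here are assumptions, not the real
    ClearML API fields.
    """
    payload = dict(form)  # copy so the caller's form state is not mutated
    if payload.get("cpu_only"):
        # The GPU fields are hidden in CPU mode, so their prefilled values
        # must not leak into the request.
        payload["gpu_count"] = 0
        payload["gpu_type"] = None
    return payload
```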
Thanks Jake. Do you know how I set the GPU count to 0?
@<1529271085315395584:profile|AmusedCat74> the error seems to indicate you've selected a GPU count larger than 0 for that specific resource
@<1537605940121964544:profile|EnthusiasticShrimp49> How do I specify to not attach a gpu? I thought ticking 'Run in CPU Mode' would be sufficient. Is there something else I'm missing?
Hey @<1529271085315395584:profile|AmusedCat74> , I may be wrong, but I think you can’t attach a GPU to an e2 instance; it should be at least an n1, no?
Hi @<1529271085315395584:profile|AmusedCat74> , can you please provide the full log of the autoscaler?