Update: a newer version of the autoscaler has been deployed.
I will try to create a docker image.
What options do I have for uploading the image so the autoscaler can use it? Do I have to use Docker Hub?
Is it possible to relaunch my old autoscaler as it was, at least until support for the no-docker configuration is back? I don't mind if you do it @<1574207105437536256:profile|HungryCat90>
You can always add the relevant configurations to the docker image itself as well. From my understanding, a new version should be released towards the end of the month, and with it the ability to run the autoscaler without requiring a docker image.
I doubt that would be possible because it looks like the autoscaler versions are global
As a quick workaround you can launch the open source autoscaler until the no-docker capability is available again.
Hi @<1708653001188577280:profile|QuaintOwl32> , you can set some default image to use. My default for most jobs is nvcr.io/nvidia/pytorch:23.03-py3
This is an urgent issue for me, as it broke my training flow.
Yes, this will cause the code to run inside the container.
If so, it won't work, as my environment is on the host Linux.
Not sure I understand this part, can you please elaborate?
Hi @<1708653001188577280:profile|QuaintOwl32> , the support for this option was temporarily removed, but will be added back soon - we'll update here
My AWS image is configured to support my training. Since docker is isolated from the host system, my training will not work inside it.
Of course, but in my case it's very complicated to create this image.
Will my code run inside this docker container? If so, it won't work, as my environment is on the host Linux.