My AWS image is configured to support my training. Since Docker is isolated from the host system, my training will not work inside it.
This is an urgent issue for me, as it broke my training flow.
I doubt that would be possible, because it looks like the autoscaler versions are global.
As a quick workaround you can launch the open source autoscaler until the no-docker capability is available again.
Yes, this will cause the code to run inside the container.
If so, it won't work, as my environment is in the host Linux.
Not sure I understand this part, can you please elaborate?
Of course, but in my case it's very complicated to create this image.
Is it possible to relaunch my old autoscaler as it was, at least until support for the no-docker configuration is back? I don't mind if you do it, @<1574207105437536256:profile|HungryCat90>.
Update: a newer version of the autoscaler has been deployed.
Hi @<1708653001188577280:profile|QuaintOwl32>, you can set a default image to use. My default for most jobs is nvcr.io/nvidia/pytorch:23.03-py3
You can always add the relevant configurations to the Docker image itself as well. From my understanding, a new version should be released towards the end of the month, and with it the ability to run on the autoscaler without requiring a Docker image.
Will my code run inside this Docker container? If so, it won't work, as my environment is in the host Linux.
Hi @<1708653001188577280:profile|QuaintOwl32> , the support for this option was temporarily removed, but will be added back soon - we'll update here
I will try to create a docker image.
What options do I have for uploading the image to be used by the autoscaler? Do I have to use Docker Hub?