I'll just take a screenshot from my company's daily standup of data scientists and software developers..... that'll be enough!
Yes 🙂 https://discuss.pytorch.org/t/shm-error-in-docker/22755
add either "--ipc=host" or "--shm-size=8g" to the docker args (on the Task, or globally in the clearml.conf extra_docker_args)
note that the 8g depends on the GPU
Basically "--ipc=host" gives the container direct access to the host's IPC namespace, which is why it is considered less safe (the same idea applies at other levels as well, like the network)
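For reference, a rough sketch of what that could look like in clearml.conf (key name taken from the message above; the exact section/key spelling may differ in your ClearML version, so check your own config):

```
agent {
    # grant the container the host IPC namespace (less safe, see above)
    extra_docker_args: ["--ipc=host"]

    # OR: keep IPC isolated and just enlarge /dev/shm instead
    # extra_docker_args: ["--shm-size=8g"]
}
```

Either one should make the PyTorch DataLoader shm errors go away; "--shm-size" is the more conservative choice.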
If I did that, I am pretty sure that's the last thing I'd ever do...... 🤣
This appears to confirm it as well.
https://github.com/pytorch/pytorch/issues/1158
Thanks AgitatedDove14 , you're very helpful.
Hmm good question, I'm actually not sure if you can pass 24GB (this is not a limit on the GPU memory, this affects the memblock size, I think)
LOL I see a meme waiting for GrumpyPenguin23 😉
In my case it's a Tesla P40, which has 24 GB VRAM.
Oh, so this applies to VRAM, not RAM?
Does "--ipc=host" make it a dynamic allocation then?
I believe the default shared-memory allocation for a Docker container is 64 MB, which is obviously not enough for training deep-learning image-classification networks, but I'm unsure of the best way to fix the problem.
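To make the 64 MB point concrete, here's a quick back-of-envelope check (the batch size and image shape are illustrative assumptions, not from this thread):

```python
# Why Docker's default 64 MB /dev/shm is too small for image training:
# PyTorch DataLoader workers hand batches to the main process through
# shared memory, so a single batch already has to fit in /dev/shm.
DEFAULT_SHM = 64 * 1024**2  # Docker's default /dev/shm size, in bytes

def batch_bytes(batch_size, channels, height, width, dtype_bytes=4):
    """Size in bytes of one float32 image batch."""
    return batch_size * channels * height * width * dtype_bytes

# A typical ImageNet-style batch (assumed shape, for illustration only)
one_batch = batch_bytes(256, 3, 224, 224)
print(f"one batch: {one_batch / 1024**2:.0f} MB vs default shm: 64 MB")
print("fits in default shm?", one_batch < DEFAULT_SHM)
```

So even one fairly ordinary batch overflows the default allocation, before counting prefetched batches from multiple workers.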
Pffff security.
Data scientist be like....... 😀
Network infrastructure person be like ...... 😱