I guess I would need to put this in the extra_vm_bash_script param of the auto-scaler, but it will reboot in a loop, right? Isn’t there an easier way to achieve that?
You can edit the extra_vm_bash_script, which means the next time the instance is booted the bash script will be executed.
In the meantime, you can ssh to the running instance and change the ulimit manually, wdyt?
I will try adding sudo sh -c "echo '\n* soft nofile 65535\n* hard nofile 65535' >> /etc/security/limits.conf" to the extra_vm_bash_script, maybe that’s enough actually
mmmh it fails, but if I connect to the instance and execute ulimit -n, I do see 65535
while the tasks I send to this agent fail with: OSError: [Errno 24] Too many open files: '/root/.commons/images/aserfgh.png'
and from the task itself, I run:
import subprocess
print(subprocess.check_output("ulimit -n", shell=True))
Which gives me in the logs of the task: b'1024'
So nofile is still 1024 (the default value) inside the task, but not when I ssh, damn. Maybe rebooting would work
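For reference, the same check can be done in-process instead of through a shell. This is only a minimal sketch using Python's standard resource module (Unix only); nofile_limits is a hypothetical helper name, not something from ClearML:

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-files limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


# The soft limit is the one that triggers "Too many open files";
# the hard limit is the ceiling the soft limit may be raised to.
soft, hard = nofile_limits()
print(f"nofile soft={soft} hard={hard}")
```

Run from inside the task, this reports the limits the task process actually inherited, which may differ from what an ssh session shows.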
I think you cannot change it for a running process, do you want me to check for you if this can be done?
yes please, I think indeed that’s the problem
Set it on the PID of the agent process itself (i.e. the clearml-agent python process)
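A rough sketch of what setting the limit on a PID could look like from Python, assuming Linux and Python 3.4+ (resource.prlimit is Linux-only). The helper names are hypothetical, and os.getpid() is only a stand-in for the agent's real PID, which you would have to look up yourself. Changing another process's limits, or raising a hard limit, generally needs root:

```python
import os
import resource


def get_nofile_limit(pid: int) -> tuple:
    """Return the current (soft, hard) open-files limits of process `pid`."""
    return resource.prlimit(pid, resource.RLIMIT_NOFILE)


def set_nofile_limit(pid: int, soft: int, hard: int) -> tuple:
    """Set the open-files limits of process `pid`.

    Returns the previous (soft, hard) pair, as resource.prlimit does.
    """
    return resource.prlimit(pid, resource.RLIMIT_NOFILE, (soft, hard))


if __name__ == "__main__":
    # Stand-in PID: replace with the PID of the clearml-agent python process.
    pid = os.getpid()
    print("current limits:", get_nofile_limit(pid))
```

The new limit applies to that process and to children it spawns afterwards; children that are already running keep their old limit.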
now how to adapt this to do it from extra_vm_bash_script?
it actually looks like I don’t need such a high number of files opened at the same time
because at some point it introduces too much overhead I guess
So actually I don’t need to play with this limit, I am OK with the default for now
thanks for your help anyway AgitatedDove14 !
BTW: for future reference, if you set the ulimit in the bash script, all processes started from that shell after that should inherit the new ulimit
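The inheritance behaviour can be illustrated in Python rather than bash. This is a small sketch, assuming a Unix system; set_soft_nofile and child_nofile_limit are hypothetical helper names. It changes the soft limit of the current process and then asks a freshly spawned shell what it sees:

```python
import resource
import subprocess


def set_soft_nofile(new_soft: int) -> None:
    """Change the soft open-files limit of this process, keeping the hard limit."""
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))


def child_nofile_limit() -> str:
    """Return what `ulimit -n` reports in a newly spawned shell."""
    out = subprocess.check_output("ulimit -n", shell=True)
    return out.decode().strip()


if __name__ == "__main__":
    print("before:", child_nofile_limit())
    set_soft_nofile(256)  # lowering the soft limit never needs privileges
    print("after: ", child_nofile_limit())
```

Processes started before the change are not affected, which matches the 1024 seen from a task whose agent was launched before the limit was raised.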