I think you cannot change it for a running process; do you want me to check whether this can be done?
thanks for your help anyway AgitatedDove14!
Set it on the PID of the agent process itself (i.e. the clearml-agent python process)
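For example, something along these lines might do it (a minimal sketch, assuming Linux, Python 3.4+, and that you have permission over the agent process; the PID below is a placeholder):
import resource

agent_pid = 12345  # placeholder: PID of the clearml-agent python process
# read the current nofile limits of that process
soft, hard = resource.prlimit(agent_pid, resource.RLIMIT_NOFILE)
print(soft, hard)
# raise both the soft and hard limits for that process (needs sufficient privileges)
resource.prlimit(agent_pid, resource.RLIMIT_NOFILE, (65535, 65535))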
mmmh it fails, but if I connect to the instance and execute ulimit -n, I do see 65535
while the tasks I send to this agent fail with: OSError: [Errno 24] Too many open files: '/root/.commons/images/aserfgh.png'
and from the task itself, I run:
import subprocess
print(subprocess.check_output("ulimit -n", shell=True))
Which gives me in the logs of the task: b'1024'
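A minimal way to check this from inside the task without spawning a shell, assuming the task runs on Linux, is the resource module:
import resource

# (soft, hard) nofile limits of the current process
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(soft, hard)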
So nofile is still 1024, the default value, but not when I ssh, damn. Maybe rebooting would work.
I guess I would need to put this in the extra_vm_bash_script param of the auto-scaler, but it will reboot in a loop, right? Isn’t there an easier way to achieve that?
You can edit the extra_vm_bash_script, which means the next time the instance is booted the bash script will be executed.
In the meantime, you can ssh to the running instance and change the ulimit manually, wdyt?
Now, how do I adapt this to do it from extra_vm_bash_script?
it actually looks like I don’t need such a high number of files opened at the same time
So actually I don’t need to play with this limit, I am OK with the default for now
because at some point it introduces too much overhead I guess
BTW, for future reference: if you set the ulimit in the bash script, all processes created after that should have the new ulimit.
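If it ever comes up again, another option is to raise the soft limit from inside the task process itself (a sketch, assuming Linux and that the hard limit is already high enough, e.g. 65535):
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
# raise the soft limit up to the existing hard limit; exceeding the hard limit needs privileges
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))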
yes please, I think indeed that’s the problem
I will try adding
sudo sh -c "echo '\n* soft nofile 65535\n* hard nofile 65535' >> /etc/security/limits.conf"
to the extra_vm_bash_script, maybe that’s enough actually