Since we use a custom CUDA image, we do not want this running on user login, as it produces ugly error messages about missing symlinks.
You can customize the startup bash script (running inside any container) here:
https://github.com/allegroai/clearml-agent/blob/bf07b7f76d3236c1118b81730c6d9718705a795a/docs/clearml.conf#L145
LackadaisicalOtter14 Would that help?
Hey,
Sorry for the delay in replying.
The line causing problems is line 484 in the interactive_session_task: 'echo "ldconfig" >> /etc/profile && '
Because a custom CUDA/torch version is being used, when a user logs in they are greeted with:
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-cfg.so.470.63.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-compiler.so.470.63.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-ml.so.470.63.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-opencl.so.470.63.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libcuda.so.470.63.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.470.63.01 is empty, not checked.
/sbin/ldconfig.real: File /lib/x86_64-linux-gnu/libnvidia-allocator.so.470.63.01 is empty, not checked.
/sbin/ldconfig.real: Can't create temporary cache file /etc/ld.so.cache~: Permission denied
Is it possible to remove this line to stop it from being executed?
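For anyone hitting the same messages: ldconfig is reporting that those library files are zero bytes long. A minimal sketch for listing such files is below. The helper name list_empty_libs and the scratch directory are made up for illustration; on a real image you would point it at /lib/x86_64-linux-gnu instead.

```shell
# Hypothetical helper (not part of ClearML): list zero-byte shared-library
# files directly inside the directory given as the first argument.
list_empty_libs() {
    find "$1" -maxdepth 1 -type f -name '*.so*' -empty
}

# Demo on a scratch directory standing in for /lib/x86_64-linux-gnu.
demo_dir="$(mktemp -d)"
: > "$demo_dir/libcuda.so.470.63.01"    # empty file, like the ones ldconfig skips
printf 'x' > "$demo_dir/libfoo.so.1"    # non-empty file, should not be listed

list_empty_libs "$demo_dir"
```

Only the empty file is printed, which makes it easy to compare against the list in the login output.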
Hey, thanks for informing me of this.
However, this doesn't help. I need to remove ldconfig from /etc/profile, which is put there by the interactive_session_task 😕
That would be great! Might have to use 2>/dev/null in some of my bash scripts
Feel free to test and PR :)
One other question regarding connecting. We have set up sshd inside the docker image we are using.
Actually, the remote session opens port 10022 on the host machine (so it does not collide with the default ssh port).
It runs an additional sshd inside the docker, setting its port.
And the clearml-session will ssh directly into the container sshd (port 10022). Make sense?
Is there a way to disable this behaviour, and let the container run isolated from the host?
What do you mean by that?
I have opened a PR.
Thank you!!! 🎉 We will merge shortly 🙂
That would be great! Might have to use 2>/dev/null in some of my bash scripts 😊
One other question regarding connecting. We have set up sshd inside the docker image we are using. I see that when we try to connect over port 22, it forwards to the host machine. I believe this is due to mounting ports on the host, which is possible because the spun-up container has the capabilities: '--cap-add=net_admin', '--cap-add=sys_module'
Is there a way to disable this behaviour, and let the container run isolated from the host?
We use WireGuard to tunnel into port 22 of a container from the same image when it is not instantiated with ClearML.
Thank you!
Hi LackadaisicalOtter14
Is it possible to remove this line to stop it from being executed?
Everything is possible 🙂 I think the main question is why it is there (which, to the best of my understanding, is to account for any CUDA drivers and installed packages, meaning anything that is installed at runtime).
I think we can suppress the error, wdyt? 'echo "ldconfig" 2>/dev/null >> /etc/profile && '
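A side note on where the redirection sits, as a small sketch rather than a statement about what the merged change looks like. With 2>/dev/null outside the quotes, it applies to the echo command itself, so only the word ldconfig is appended. With it inside the quotes, the redirection becomes part of the line that runs at login. The scratch file below is a stand-in for /etc/profile and is purely illustrative.

```shell
# Scratch file standing in for /etc/profile (assumption, for illustration only).
profile="$(mktemp)"

# Variant A: redirection outside the quotes. It silences echo's own stderr;
# the line appended to the file is just "ldconfig".
echo "ldconfig" 2>/dev/null >> "$profile"

# Variant B: redirection inside the quotes. The appended line carries the
# redirection, so ldconfig's stderr is discarded when the profile is sourced.
echo "ldconfig 2>/dev/null" >> "$profile"

cat "$profile"
```

The first line written is plain ldconfig and the second includes the redirection, so it is worth checking which form ends up in /etc/profile inside the container.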
Hey,
Do not worry about the SSH problem I mentioned, I understand now, thank you!
Regarding the ldconfig warning suppression, I tested it and it works as expected!
I have opened a PR.
Thanks for your help AgitatedDove14 😊
ldconfig from /etc/profile which is put there by the interactive_session_task
LackadaisicalOtter14 are you sure? Maybe this is done as part of the installation the interactive session runs?
Could that be the issue? apt-get update && apt-get install -y openssh-server
Hi LackadaisicalOtter14
However, whenever we spin up a session,
 always gets run and overwrites our configs
What do you mean by that?
Which configs are being overwritten? (Generally speaking, it just adds the OS environment it needs for the setup process.)