Thanks for your reply.
Environment of host system:
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3090        Off | 00000000:01:00.0 Off |                  N/A |
| 33%   52C    P0             115W / 350W |      4MiB / 24576MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 3090        Off | 00000000:03:00.0 Off |                  N/A |
| 30%   49C    P0             108W / 350W |      4MiB / 24576MiB |      5%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic
I installed cuda-toolkit-12.3 in a fractional GPU container, following this url. After the setup, I ran
nsys profile nvidia-smi (or other cuda program)
It says:
Segmentation fault (core dumped)
Let me know if you need any other information. Thanks again.
What command did you run? What were you trying to do? What was the setup?
- Use the Docker image that ClearML provides (clearml/fractional-gpu:u22-cu12.3-4gb)
- Run
docker run -it --gpus all --ipc=host --pid=host clearml/fractional-gpu:u22-cu12.3-8gb bash
- Install the cuda toolkit
wget
mv cuda-ubuntu2204.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget
dpkg -i cuda-repo-ubuntu2204-12-3-local_12.3.0-545.23.06-1_amd64.deb
cp /var/cuda-repo-ubuntu2204-12-3-local/cuda-*-keyring.gpg /usr/share/keyrings/
apt-get update
apt-get -y install cuda-toolkit-12-3
- Run
nsys profile nvidia-smi
Additionally, my container was clearml/fractional-gpu:u22-cu12.3-4gb (other containers based on cu12.3 show the same error).
Can you provide the log, though? Where did you get the error?
That "Segmentation fault (core dumped)" message was all I got.
I ran the nsys profiler inside the container.
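Since the only visible output was the one-line crash message, it may help to attach a log file and the exit status along with it. The following is a minimal sketch, not an nsys feature: run_and_report is a hypothetical helper name and the log file name is an assumption. It relies on the common shell convention that a process killed by a signal is reported with exit status 128 plus the signal number, so a segmentation fault (SIGSEGV, signal 11) normally shows up as 139.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of nsys): run a command, save its combined
# stdout/stderr to a log file, and print how the command exited.
run_and_report() {
    local log="$1"
    shift
    "$@" >"$log" 2>&1
    local status=$?
    if [ "$status" -gt 128 ]; then
        # By shell convention, 128 + N usually means "killed by signal N"
        # (for example 139 = 128 + 11 = SIGSEGV). A program can also return
        # such a value on its own, so treat this as a hint, not proof.
        echo "command: $* | exit status: $status | likely signal $((status - 128)) ($(kill -l "$status"))"
    else
        echo "command: $* | exit status: $status"
    fi
    return "$status"
}

# Example call inside the container (assumes nsys is on PATH):
#   run_and_report nsys.log nsys profile nvidia-smi
```

The printed summary line and the contents of the log file can then be pasted into the thread together with the crash message.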