Installing the NVIDIA Driver, CUDA, and cuDNN on Debian/CentOS
1. Introduction
1.1 CUDA
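CUDA is NVIDIA's parallel computing platform for its own GPUs. It therefore runs only on NVIDIA GPUs, and it pays off only when the problem being solved can be split into a large amount of parallel work.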
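Since CUDA needs an NVIDIA GPU, it can be worth confirming that one is present before installing anything. The following is a minimal sketch, assuming the `pciutils` package provides `lspci`; the helper name `has_nvidia_gpu` is made up for illustration and simply filters `lspci`-style text.

```shell
# Hypothetical helper: succeeds if the lspci-style text on stdin lists
# an NVIDIA device that is a VGA or 3D controller.
has_nvidia_gpu() {
    grep -i 'nvidia' | grep -Eiq 'vga|3d controller'
}

# Usage on a real machine (assumes lspci is installed):
#   lspci | has_nvidia_gpu && echo "NVIDIA GPU found"
```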
1.2 cuDNN
cuDNN (CUDA Deep Neural Network library) is NVIDIA's GPU-accelerated library for deep neural networks. It is not required for training models on a GPU, but it is commonly used.
2. Preparation
2.1 Stop the desktop display manager
$ sudo service lightdm stop
# or
$ sudo service sddm stop
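If it is unclear which display manager is in use, Debian records the default one in `/etc/X11/default-display-manager`. The sketch below assumes that file exists and holds a path such as `/usr/sbin/lightdm`; the helper name `display_manager_name` is hypothetical.

```shell
# Hypothetical helper: print the display manager name recorded in file $1.
# On Debian the file is /etc/X11/default-display-manager and contains a
# path such as /usr/sbin/lightdm or /usr/bin/sddm.
display_manager_name() {
    basename "$(cat "$1")"
}

# Usage on a real machine:
#   dm=$(display_manager_name /etc/X11/default-display-manager)
#   sudo service "$dm" stop
```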
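2.2 Remove third-party drivers
# Check whether the open-source nouveau driver is loaded
$ lsmod | grep nouveau
# Remove previously installed NVIDIA packages (quoted so the shell does not expand the glob)
$ sudo apt-get remove --purge 'nvidia*'
$ sudo apt-get autoremove
# Blacklist nouveau by adding the following two lines to this file
$ sudo vi /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0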
$ sudo update-initramfs -u
$ sudo reboot
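The blacklist step above can also be done without an interactive editor. This is a rough sketch under the assumption that the file content is exactly the two lines shown above; the helper names `write_nouveau_blacklist` and `nouveau_loaded` are invented for illustration.

```shell
# Hypothetical helper: write the two nouveau blacklist lines to file $1.
write_nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$1"
}

# Hypothetical helper: succeeds if the lsmod-style text on stdin shows
# that the nouveau module itself is loaded.
nouveau_loaded() {
    grep -q '^nouveau[[:space:]]'
}

# Usage on a real machine:
#   write_nouveau_blacklist /tmp/blacklist-nouveau.conf
#   sudo cp /tmp/blacklist-nouveau.conf /etc/modprobe.d/blacklist-nouveau.conf
#   sudo update-initramfs -u && sudo reboot
#   # after the reboot:
#   lsmod | nouveau_loaded || echo "nouveau is disabled"
```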
CentOS 7
NVIDIA GeForce GT 610 GPU installed in this system is supported through the NVIDIA 390.xx legacy Linux graphics drivers
My graphics card is too old, so only older versions can be installed.
Before installing the CUDA Toolkit on Linux, please ensure that you have the latest NVIDIA driver R390 installed. The latest NVIDIA R390 driver is available at: www.nvidia.com/drivers
Download the R390 driver: NVIDIA-Linux-x86_64-390.141.run
Download the CUDA installer cuda_9.1.85_387.26_linux.run from https://developer.nvidia.com/zh-cn/cuda-downloads (choose a legacy release).
$ sudo rpm -i cuda-repo-rhel7-9-1-local-9.1.85-1.x86_64.rpm
$ sudo yum clean all
$ sudo yum install cuda
$ sudo yum install kmod-nvidia-390xx*
The following uses the runfile installation method instead. In my case it failed during the kernel compilation step.
$ lsmod | grep nouveau
$ sudo vi /etc/modprobe.d/nvidia-installer-disable-nouveau.conf
blacklist nouveau
options nouveau modeset=0
$ sudo mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r)-nouveau.img
$ sudo dracut /boot/initramfs-$(uname -r).img $(uname -r)
$ sudo reboot
$ sudo ./cuda_9.1.85_387.26_linux.run
Handling the installation failure
Installing the NVIDIA display driver...
The driver installation is unable to locate the kernel source. Please make sure that the kernel source packages are installed and set up correctly.
If you know that the kernel source packages are installed and set up correctly, you may pass the location of the kernel source with the '--kernel-source-path' flag.
===========
= Summary =
===========
Driver: Installation Failed
Toolkit: Installation skipped
Samples: Installation skipped
Logfile is /tmp/cuda_install_27244.log.
# Cause: the kernel source package is missing
# Fix
$ sudo yum --enablerepo=elrepo-kernel install kernel-ml-devel
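# I was not running the default kernel version, so the kernel source path has to be given explicitly.
# Also watch the gcc version: the kernel compilation failed under gcc 10.
# In the end I restored both the kernel and gcc to their original versions.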
$ sudo ./cuda_9.1.85_387.26_linux.run --kernel-source-path=/usr/src/kernels/3.10.0-1160.21.1.el7.x86_64
After rebooting, run step 1.1 again.
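The two checks described above (a kernel source directory that matches the running kernel, and a gcc older than 10) can be sketched as follows. This assumes the kernel development package places its files under `/usr/src/kernels/<release>`, as in the path used above; the helper names `kernel_source_path` and `gcc_major` are hypothetical.

```shell
# Hypothetical helper: print the kernel source directory for kernel
# release $1, following the /usr/src/kernels/<release> layout used above.
kernel_source_path() {
    printf '/usr/src/kernels/%s\n' "$1"
}

# Hypothetical helper: print the major number of a gcc version string
# such as "4.8.5" or "10.2.1".
gcc_major() {
    printf '%s\n' "${1%%.*}"
}

# Usage on a real machine:
#   src=$(kernel_source_path "$(uname -r)")
#   [ -d "$src" ] || echo "missing $src: install the matching kernel devel package"
#   [ "$(gcc_major "$(gcc -dumpversion)")" -lt 10 ] || echo "gcc 10 or newer: the build may fail"
#   sudo ./cuda_9.1.85_387.26_linux.run --kernel-source-path="$src"
```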
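3. Installation
Download the driver and library installers from the NVIDIA website according to your GPU model. You can also download only the CUDA installer, since it bundles the NVIDIA driver.
File | Description |
---|---|
NVIDIA-Linux-x86_64-375.20.run | NVIDIA driver installer |
cuda_11.1.0_455.23.05_linux.run | CUDA toolkit installer |
cudnn-11.1-linux-x64-v8.0.4.30.tgz | cuDNN library archive |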
3.1 Install with the CUDA installer
$ chmod a+x cuda_11.1.0_455.23.05_linux.run
# Installing the bundled driver requires root privileges
$ sudo ./cuda_11.1.0_455.23.05_linux.run
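After the runfile installer finishes, the toolkit's `bin` and `lib64` directories usually still need to be added to `PATH` and `LD_LIBRARY_PATH`. The sketch below assumes the default install prefix `/usr/local/cuda-11.1`, which may differ on your system; `prepend_path` is a hypothetical helper that avoids adding the same entry twice.

```shell
# Hypothetical helper: print the colon-separated list $1 with $2 put in
# front, unless $2 is already one of its entries.
prepend_path() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        *)        printf '%s\n' "$2${1:+:$1}" ;;
    esac
}

# Usage, assuming the default prefix /usr/local/cuda-11.1:
#   export PATH=$(prepend_path "$PATH" /usr/local/cuda-11.1/bin)
#   export LD_LIBRARY_PATH=$(prepend_path "$LD_LIBRARY_PATH" /usr/local/cuda-11.1/lib64)
#   nvcc --version
```

Putting the two `export` lines in `~/.bashrc` would make the setting persistent for that user.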
3.2 Start the desktop display manager
$ sudo service lightdm start
# or
$ sudo service sddm start
3.3 Verification
$ nvidia-smi
Thu Mar 11 09:10:21 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce MX150       Off  | 00000000:03:00.0 Off |                  N/A |
| N/A   39C    P0    N/A /  N/A |      0MiB /  2002MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
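For scripting, the key fields of the table above can also be requested in CSV form through the `--query-gpu` and `--format` options of `nvidia-smi`. The following is only a sketch; the helper name `driver_version_from_csv` is hypothetical and assumes the fields are requested in the order name, driver_version, memory.total.

```shell
# Hypothetical helper: read one CSV line "name, driver_version, memory.total"
# from stdin and print only the driver version field.
driver_version_from_csv() {
    awk -F', ' '{ print $2 }'
}

# Usage on a machine with the driver loaded:
#   nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader \
#       | driver_version_from_csv
```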
~/NVIDIA_CUDA-11.1_Samples/1_Utilities/deviceQuery$ ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "GeForce MX150"
CUDA Driver Version / Runtime Version 11.1 / 11.1
CUDA Capability Major/Minor version number: 6.1
Total amount of global memory: 2003 MBytes (2099904512 bytes)
( 3) Multiprocessors, (128) CUDA Cores/MP: 384 CUDA Cores
GPU Max Clock rate: 1532 MHz (1.53 GHz)
Memory Clock rate: 3004 Mhz
Memory Bus Width: 64-bit
L2 Cache Size: 524288 bytes
Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 49152 bytes
Total shared memory per multiprocessor: 98304 bytes
Total number of registers available per block: 65536
Warp size: 32
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
Maximum memory pitch: 2147483647 bytes
Texture alignment: 512 bytes
Concurrent copy and kernel execution: Yes with 2 copy engine(s)
Run time limit on kernels: No
Integrated GPU sharing Host Memory: No
Support host page-locked memory mapping: Yes
Alignment requirement for Surfaces: Yes
Device has ECC support: Disabled
Device supports Unified Addressing (UVA): Yes
Device supports Managed Memory: Yes
Device supports Compute Preemption: Yes
Supports Cooperative Kernel Launch: Yes
Supports MultiDevice Co-op Kernel Launch: Yes
Device PCI Domain ID / Bus ID / location ID: 0 / 3 / 0
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.1, CUDA Runtime Version = 11.1, NumDevs = 1
Result = PASS
( 3) Multiprocessors, (128) CUDA Cores/MP: 384 CUDA Cores
This line means there are 3 streaming multiprocessors (SMs), each containing 128 stream processors, for a total of 3 × 128 = 384 CUDA cores.
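The same arithmetic can be done directly from the deviceQuery output. This is a minimal sketch that assumes the line keeps the format shown above; `total_cuda_cores` is a hypothetical helper name.

```shell
# Hypothetical helper: read deviceQuery output on stdin, take the first
# "Multiprocessors" line, and print SM count times CUDA cores per SM.
total_cuda_cores() {
    sed -n 's/^ *( *\([0-9][0-9]*\)) Multiprocessors, ( *\([0-9][0-9]*\)) CUDA Cores\/MP:.*/\1 \2/p' |
        head -n 1 |
        { read -r sms per_sm && echo $((sms * per_sm)); }
}

# Usage on a real machine:
#   ./deviceQuery | total_cuda_cores
```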
4. Samples
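4.1 CUDA Samples directories
Directory | Description |
---|---|
Simple Reference | Basic CUDA samples for beginners that illustrate key concepts of using CUDA and the CUDA runtime APIs. |
Utilities Reference | Utility samples that show how to query device capabilities and measure GPU/CPU bandwidth. |
Graphics Reference | Graphics samples that demonstrate interoperability between CUDA and OpenGL or DirectX. |
Imaging Reference | Image processing, compression, and data analysis. |
Finance Reference | Parallel processing for financial computations. |
Simulations Reference | Simulation algorithms implemented with CUDA. |
Advanced Reference | Advanced algorithms implemented with CUDA. |
Cudalibraries Reference | Samples that show how to use the CUDA libraries (NPP, CUBLAS, CUFFT, CUSPARSE, and CURAND). |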
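On disk, these categories appear as numbered directories such as `1_Utilities` under the samples root used earlier (`~/NVIDIA_CUDA-11.1_Samples`). The sketch below lists them and shows how one sample might be built; `list_sample_categories` is a hypothetical helper, and the `bandwidthTest` sample path is an assumption about what your samples tree contains.

```shell
# Hypothetical helper: print the numbered category directories
# (for example 0_Simple, 1_Utilities) found directly under samples root $1.
list_sample_categories() {
    for d in "$1"/[0-9]_*/; do
        [ -d "$d" ] && basename "$d"
    done
}

# Usage on a real machine (paths are assumptions):
#   list_sample_categories ~/NVIDIA_CUDA-11.1_Samples
#   make -C ~/NVIDIA_CUDA-11.1_Samples/1_Utilities/bandwidthTest
#   ~/NVIDIA_CUDA-11.1_Samples/1_Utilities/bandwidthTest/bandwidthTest
```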