GPU Driver Installation

NVIDIA Driver Installation

lsmod | grep nouveau
yum list | grep kernel-devel
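The nouveau check can also be scripted. The following is a minimal sketch, assuming a Linux host where loaded modules are listed in /proc/modules (the same data that lsmod displays). The function name is_module_loaded is illustrative and not part of any tool. Unlike grep, it matches the module name exactly.

```python
from pathlib import Path


def is_module_loaded(name, modules_text):
    """Return True if `name` is a module listed in /proc/modules-style text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first field of each line is the module name.
        if fields and fields[0] == name:
            return True
    return False


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    text = proc_modules.read_text() if proc_modules.exists() else ""
    if is_module_loaded("nouveau", text):
        print("nouveau is loaded; it usually has to be disabled before installing the NVIDIA driver")
    else:
        print("nouveau is not loaded")
```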

Setting Up the Virtual Environment

Configure the pip mirror

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

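Install Miniconda

Anaconda is the distribution that bundles a set of commonly used packages (commonly used in general, not necessarily by you); Miniconda is the slimmed-down version.

Installing and using Miniconda on CentOS
Miniconda download

Reload bash so the new configuration takes effect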

source ~/.bashrc

conda create -n OCR python=3.9
conda activate OCR
sudo yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo
sudo yum clean all
sudo yum -y install nvidia-driver-latest-dkms cuda
sudo yum -y install cuda-drivers

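Install cudatoolkit

Reference: "CUDA, CUDA Toolkit, cuDNN, and NVCC: their relationship explained in one article"

  • Graphics card: the GPU hardware

  • Graphics driver: the NVIDIA Driver software

  • CUDA: short for Compute Unified Device Architecture, the general-purpose parallel computing platform and architecture from GPU vendor NVIDIA that lets GPUs solve complex computational problems

  • cuDNN: a software library designed specifically for deep learning computation; it provides many specialized compute functions

  • CUDA Toolkit: installing CUDA means installing the CUDA Toolkit

  • nvcc: the CUDA compiler

CUDA Toolkit Archive: 11.6 download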

conda install cudatoolkit=11.6 
conda install cudnn=8.4

Default install: 6

updatedb
locate libcublas.so
locate libcudnn.so
cd /usr/lib
ll /usr/lib |grep libcu 
ln -s /data1/miniconda/pkgs/cudatoolkit-11.6.0-hecad31d_10/lib/libcublas.so libcublas.so 
ln -s /data1/tools/cudnn-linux-x86_64-8.5.0.96_cuda11-archive/lib/libcudnn.so libcudnn.so 
export LD_LIBRARY_PATH=/data1/miniconda/envs/OCR/lib
export PATH=$PATH:$LD_LIBRARY_PATH
source /etc/profile
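To confirm that the exported path actually contains the libraries, the lookup can be sketched in Python. This is a minimal sketch, assuming only that LD_LIBRARY_PATH is a colon-separated list of directories. The helper find_libs is hypothetical, and it checks only LD_LIBRARY_PATH, not the system default directories or the ldconfig cache that the dynamic loader also consults.

```python
import glob
import os


def find_libs(pattern, search_path):
    """Return files matching `pattern` in each directory of a colon-separated path."""
    matches = []
    for directory in search_path.split(os.pathsep):
        if directory:  # skip empty entries such as a trailing ':'
            matches.extend(sorted(glob.glob(os.path.join(directory, pattern))))
    return matches


if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    for pattern in ("libcudnn.so*", "libcublas.so*"):
        found = find_libs(pattern, path)
        print(pattern, "->", found if found else "not found in LD_LIBRARY_PATH")
```

An empty result for libcudnn.so* here would be consistent with a "Cannot load cudnn shared library" error at import time.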

Install PyTorch

Failed:

conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge

Succeeded:

pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
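A quick way to check whether the installed PyTorch build can see the GPU is sketched below. It assumes nothing beyond the standard torch attributes torch.__version__, torch.version.cuda, and torch.cuda.is_available(). The wrapper torch_cuda_report is a hypothetical helper that returns None when torch is not installed.

```python
import importlib.util


def torch_cuda_report():
    """Return a dict describing the PyTorch install, or None if torch is absent."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_report())
```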

Install Paddle (GPU)

See the official site: https://www.paddlepaddle.org.cn/

python -m pip install paddlepaddle-gpu==2.3.2.post116 -f https://www.paddlepaddle.org.cn/whl/linux/mkl/avx/stable.html

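Errors

ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found

Cause: the system libstdc++ (the GCC C++ runtime, not glibc) is too old to provide the required GLIBCXX version.

Fix, following:
ImportError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.21' not found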

# assumes a newer libstdc++.so.6.0.26 has already been placed in /lib64
mv /lib64/libstdc++.so.6 /lib64/libstdc++.so.6.bak
rm -f /lib64/libstdc++.so.6
ln -s /lib64/libstdc++.so.6.0.26 /lib64/libstdc++.so.6
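Before and after swapping the symlink, it helps to see which GLIBCXX versions a given libstdc++ actually provides. The sketch below does roughly what `strings /lib64/libstdc++.so.6 | grep GLIBCXX` does, scanning the file for version tags. The function name glibcxx_versions is illustrative, and the path in the usage section is the one from the error message.

```python
import os
import re


def glibcxx_versions(lib_path):
    """Return the sorted GLIBCXX_x.y.z version tags embedded in a shared library."""
    with open(lib_path, "rb") as f:
        data = f.read()
    tags = set(re.findall(rb"GLIBCXX_(\d+(?:\.\d+)*)", data))
    # Sort numerically so that 3.4.20 comes after 3.4.9.
    return sorted(
        (t.decode("ascii") for t in tags),
        key=lambda v: tuple(int(part) for part in v.split(".")),
    )


if __name__ == "__main__":
    path = "/lib64/libstdc++.so.6"
    if os.path.exists(path):
        print(glibcxx_versions(path))
```

If the version named in the ImportError is missing from the printed list, the library that the symlink points to is still too old.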

RuntimeError: (PreconditionNotMet) Cannot load cudnn shared library. Cannot invoke method cudnnGetVersion.

[Hint: cudnn_dso_handle should not be null.] (at /paddle/paddle/phi/backends/dynload/cudnn:59)

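The LD_LIBRARY_PATH environment variable is not set. (LD_LIBRARY_PATH lists extra directories, beyond the system defaults, that the dynamic loader searches for shared libraries when a program is loaded and run.)

vim /etc/profile   # add the following two lines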
export LD_LIBRARY_PATH=/data1/miniconda/envs/OCR/lib 
export PATH=$PATH:$LD_LIBRARY_PATH

Verify the Paddle GPU installation

 
import paddle
paddle.fluid.is_compiled_with_cuda()
paddle.fluid.install_check.run_check()    

OpenCV

OpenCV error: ImportError: libXext.so.6: cannot open shared object file: No such file or directory

Reference: ImportError: libX11.so.6: cannot open shared object file: No such file or directory

pip install opencv-python pandas paddleocr
sudo yum install libX11
sudo yum install libXext

Other commands

Transfer:

scp  xxxx root@192.168.xx.xxx:path

Extract:

xz -d cudnn-linux-x86_64-8.5.0.96_cuda11-archive.tar.xz
tar -xvf cudnn-linux-x86_64-8.5.0.96_cuda11-archive.tar

Inspect

nvidia-smi
pip install numba
numba -s

Fix the time zone

timedatectl 
timedatectl set-timezone Asia/Shanghai 

Configure conda mirrors

conda config --show channels
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/msys2/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/bioconda/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/menpo/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/conda-forge/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
