GPU Linux Virtual Host (Type GN7) Installation and Configuration Guide (Final)

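The virtual host I currently run has no GPU, so algorithms that need one cannot be demonstrated online, which is a real gap. After searching around, I found that Tencent Cloud is running a promotion that makes a small experiment affordable; the other vendors are too expensive for an individual. So I bought an instance on Tencent Cloud for a one-month trial to carry out the configuration and testing.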

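I have never used Ubuntu before, so I will probably need to reinstall many times before everything works. Progress is hard to estimate, so I will start with one month and see. That is why every step is documented in detail, so that I can reinstall at any time. I will install TensorFlow 2.6, PyTorch 1.11.0, and HanLP 2.1, whose versions do not conflict with one another. They go with CUDA 11.2 and cuDNN 8.5 (cuDNN 8.5 supports CUDA 11.x; in the end I went back to cuDNN 8.1) and Python 3.9. I will then call them from RStudio through the reticulate, tensorflow, and keras packages. If time permits, I will also test the torch package for R, which offers PyTorch-like functionality by calling libtorch directly.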

Tencent Cloud GPU compute instance GN7 with an NVIDIA T4 GPU: 8 CPU cores + 32 GB RAM + 100 GB SSD + 1 T4, 5 Mbps bandwidth, ¥80 for a one-month trial. See the trial plan (GPU Lab) and the getting-started tutorial.

I. Install the operating system from an image.

Each GPU driver version offers a different set of CUDA versions; choose driver version 460.106.00.

Public image: Ubuntu Server 18.04.1 LTS 64-bit

Install the GPU driver automatically in the background

GPU driver version: 460.106.00

CUDA version: 11.2.2

cuDNN version: 8.2.1

User name: ubuntu

Address: 172.16.XX.XX (private), 106.52.XX.XX (public)

Once the installation finishes, connect with SecureCRT or PuTTY. The SSH server uses newer key-exchange algorithms, so SecureCRT must be upgraded to version 9.0 or later.

1. After logging in, first enable the root account (see the linked reference). Set the root password:

$sudo passwd root

Switch accounts:

$su root
#su ubuntu

To allow root to log in over SSH, see references 1 and 2.

# vi /etc/ssh/sshd_config

Find this section:

# Authentication:
#LoginGraceTime 2m
#PermitRootLogin prohibit-password
#StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

Change it to:

# Authentication:
#LoginGraceTime 2m
#PermitRootLogin prohibit-password
PermitRootLogin yes
StrictModes yes
#MaxAuthTries 6
#MaxSessions 10

Restart the SSH server:

# systemctl restart sshd.service
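The same edit can be scripted when the machine has to be rebuilt repeatedly. The following is a minimal sketch rather than part of the original procedure: the helper name `enable_root_login` is hypothetical, it only handles the two directives changed above, and it works on the file's text instead of touching /etc/ssh/sshd_config directly.

```python
import re


def enable_root_login(config_text: str) -> str:
    """Return sshd_config text with PermitRootLogin and StrictModes set to yes.

    A commented-out default is kept and the active directive is written
    right after it; an existing active directive is replaced.
    """
    wanted = {"PermitRootLogin": "yes", "StrictModes": "yes"}
    out = []
    done = set()
    for line in config_text.splitlines():
        match = re.match(r"^\s*(#?)\s*(\w+)\b", line)
        key = match.group(2) if match else None
        if key in wanted and key not in done:
            if match.group(1):              # commented default: keep it
                out.append(line)
            out.append(f"{key} {wanted[key]}")
            done.add(key)
        elif key in wanted and not match.group(1):
            continue                        # drop a duplicate active directive
        else:
            out.append(line)
    for key in wanted:                      # directive never seen: append it
        if key not in done:
            out.append(f"{key} {wanted[key]}")
    return "\n".join(out) + "\n"
```

Writing the result back to the file and restarting sshd are deliberately left out, so the output can be reviewed before it replaces the live configuration.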

To make later software installation easier, lift sudo's PATH restriction by commenting out secure_path (see the linked reference). Save with wq!:

# vi /etc/sudoers
Defaults        env_reset
Defaults        mail_badpass
# Defaults      secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin"

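Running sudo with the -E option then inherits the current user's environment variables, so software can be installed without logging in as root. For example, when running conda to manage Python packages later on: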

(gpu) ubuntu@VM-0-14-ubuntu:~$ sudo -E conda list hanlp
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
hanlp                     2.1.0b42                 pypi_0    pypi
hanlp-common              0.0.18                   pypi_0    pypi
hanlp-downloader          0.0.25                   pypi_0    pypi
hanlp-trie                0.0.5                    pypi_0    pypi

2. The background installation takes about 10 to 15 minutes. Use the following command to check for installer processes:

root@VM-0-14-ubuntu:~# ps aux | grep -i install
root      8158  0.0  0.0  13776  1156 pts/0    S+   08:50   0:00 grep --color=auto -i install

As shown above, neither nv_driver_install.sh nor nv_cuda_install.sh appears in the list, which means the driver installation has finished.
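The check above can be automated. This is a small sketch under the assumption that the two script names given in the text are the only installers to watch for; `installers_running` is a hypothetical helper that inspects `ps aux` output passed in as a string.

```python
INSTALLERS = ("nv_driver_install.sh", "nv_cuda_install.sh")


def installers_running(ps_output: str, names=INSTALLERS) -> list:
    """Return the installer script names that still appear in `ps aux` output."""
    found = []
    for line in ps_output.splitlines():
        if "grep" in line:                  # ignore the grep process itself
            continue
        for name in names:
            if name in line and name not in found:
                found.append(name)
    return found
```

An empty list means both installers have exited. The input could come from `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`.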

3. Verify that the GPU driver is installed.

root@VM-0-14-ubuntu:~# nvidia-smi
Sat Oct 29 08:52:11 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.106.00   Driver Version: 460.106.00   CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:00:08.0 Off |                    0 |
| N/A   28C    P8     8W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
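For a scripted check, the versions can be pulled out of the banner line. The sketch below assumes the banner keeps the "Driver Version: ... CUDA Version: ..." wording shown above; `parse_smi_header` is a hypothetical name. Note that the CUDA version in this banner is the highest one the driver supports, not necessarily the toolkit that is installed.

```python
import re


def parse_smi_header(smi_output: str) -> dict:
    """Extract the driver and CUDA versions from the nvidia-smi banner line."""
    match = re.search(
        r"Driver Version:\s*([\d.]+)\s+CUDA Version:\s*([\d.]+)", smi_output
    )
    if match is None:
        raise ValueError("nvidia-smi banner line not found")
    return {"driver": match.group(1), "cuda": match.group(2)}
```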

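4. Verify that CUDA is installed. The check given in the getting-started tutorial does not apply to this configuration: there is no version.txt, and /usr/local/cuda is a symlink to /usr/local/cuda-11.2.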

root@VM-0-14-ubuntu:~# cat  /usr/local/cuda/version.txt
cat: /usr/local/cuda/version.txt: No such file or directory
root@VM-0-14-ubuntu:~# find / -name cuda
/usr/local/cuda-11.2/targets/x86_64-linux/include/cuda
/usr/local/cuda-11.2/targets/x86_64-linux/include/thrust/system/cuda
/usr/local/cuda
root@VM-0-14-ubuntu:~# cd /usr/local/cuda
root@VM-0-14-ubuntu:/usr/local/cuda# ls
bin                DOCS      extras   lib64    nsight-compute-2020.3.1  nsight-systems-2020.4.3  nvvm       README   share  targets  version.json
compute-sanitizer  EULA.txt  include  libnvvp  nsightee_plugins         nvml                     nvvm-prev  samples  src    tools
root@VM-0-14-ubuntu:/usr/local/cuda# cd bin
root@VM-0-14-ubuntu:/usr/local/cuda/bin# ./nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
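Since version.txt is absent in this layout, the toolkit version can be read from the `nvcc -V` text instead. A minimal sketch, assuming the "release X.Y, VX.Y.Z" line format shown above; `parse_nvcc_release` is a hypothetical helper.

```python
import re


def parse_nvcc_release(nvcc_output: str) -> tuple:
    """Return (release, build) parsed from the output of `nvcc -V`."""
    match = re.search(r"release\s+([\d.]+),\s*V([\d.]+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y, VX.Y.Z' line in nvcc output")
    return match.group(1), match.group(2)
```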

5. Verify the cuDNN installation. The tutorial's check does not apply here either: the image did not install cuDNN successfully.

root@VM-0-14-ubuntu:/usr/local/cuda/bin# cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
cat: /usr/include/cudnn_version.h: No such file or directory

II. Install cuDNN manually (see the linked reference).

Downloading cuDNN requires logging in to the NVIDIA website, so the following command does not work:

wget https://developer.nvidia.com/compute/cudnn/secure/8.5.0/local_installers/11.7/cudnn-linux-x86_64-8.5.0.96_cuda11-archive.tar.xz

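1. Download the archive on a laptop, transfer it to the server with SecureFX over the SSH port, then extract and install it. For the CUDA and cuDNN combinations verified on Linux, see the linked reference.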

Figure: TensorFlow / CUDA / cuDNN / Python version compatibility table
# tar -xvf cudnn-linux-x86_64-8.5.0.96_cuda11-archive.tar.xz
# cd cudnn-linux-x86_64-8.5.0.96_cuda11-archive
# cp lib/* /usr/local/cuda/lib64/
# cp include/* /usr/local/cuda/include/
# chmod a+r /usr/local/cuda/lib64/*
# chmod a+r /usr/local/cuda/include/*

2. Add the CUDA directories to the global environment variables:

# vi /etc/profile
export PATH=/usr/local/cuda-11.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.2/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda-11.2

3. Run source /etc/profile to apply the change (or log out and log in again), then verify the cuDNN installation:

root@VM-0-14-ubuntu:/usr/local/cuda/bin# source /etc/profile
root@VM-0-14-ubuntu:/usr/local/cuda/bin# echo $PATH
/usr/local/cuda-11.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
root@VM-0-14-ubuntu:/usr/local/cuda/bin# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
root@VM-0-14-ubuntu:/usr/local/cuda/bin# cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 5
#define CUDNN_PATCHLEVEL 0
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)

#endif /* CUDNN_VERSION_H */
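The three defines above are what matters when comparing against a compatibility table. As an illustration only (the helper name `parse_cudnn_version` is hypothetical), the version can be read from the header text like this:

```python
import re


def parse_cudnn_version(header_text: str) -> tuple:
    """Return (major, minor, patch) from the text of cudnn_version.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)\s*$", header_text, re.M)
        if match is None:
            raise ValueError(f"{name} not defined in header text")
        parts.append(int(match.group(1)))
    return tuple(parts)
```

The header's own formula, CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL, then gives the single integer form of the same version.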

III. Install Anaconda

1. Download Anaconda and install it into /usr/local/anaconda3.

$ wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2022.10-Linux-x86_64.sh
$ sudo bash Anaconda3-2022.10-Linux-x86_64.sh

When the installation finishes, choose to run conda init:

done
installation finished.
Do you wish the installer to initialize Anaconda3
by running conda init? [yes|no]
[no] >>> yes
modified      /usr/local/anaconda3/condabin/conda
modified      /usr/local/anaconda3/bin/conda
modified      /usr/local/anaconda3/bin/conda-env
no change     /usr/local/anaconda3/bin/activate
no change     /usr/local/anaconda3/bin/deactivate
no change     /usr/local/anaconda3/etc/profile.d/conda.sh
no change     /usr/local/anaconda3/etc/fish/conf.d/conda.fish
no change     /usr/local/anaconda3/shell/condabin/Conda.psm1
no change     /usr/local/anaconda3/shell/condabin/conda-hook.ps1
no change     /usr/local/anaconda3/lib/python3.9/site-packages/xontrib/conda.xsh
no change     /usr/local/anaconda3/etc/profile.d/conda.csh
modified      /root/.bashrc

==> For changes to take effect, close and re-open your current shell. <==

If you'd prefer that conda's base environment not be activated on startup, 
   set the auto_activate_base parameter to false: 

conda config --set auto_activate_base false

Thank you for installing Anaconda3!

===========================================================================

Working with Python and Jupyter is a breeze in DataSpell. It is an IDE
designed for exploratory data analysis and ML. Get better data insights
with DataSpell.

DataSpell for Anaconda is available at: https://www.anaconda.com/dataspell

Edit the global profile script and append the conda initialization block to the end, so that conda is available to all users.

# vi /etc/profile
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/usr/local/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/usr/local/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/usr/local/anaconda3/etc/profile.d/conda.sh"
    else
        export PATH="/usr/local/anaconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

Source ~/.bashrc to activate the conda base environment, or log out and log in again.

# source ~/.bashrc

2. As root, install tensorflow-gpu 2.6.

# conda create --name gpu python=3.9
# conda activate gpu
# pip install ipykernel
# python -m ipykernel install --user --name gpu
# pip install tensorflow-gpu==2.6

3. Test the installation as the ubuntu user.

(base) ubuntu@VM-0-14-ubuntu:~$ conda activate gpu
(gpu) ubuntu@VM-0-14-ubuntu:~$ python
Python 3.9.13 (main, Oct 13 2022, 21:15:33) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.test.is_built_with_cuda() 
True
>>> a = tf.constant(1.)
2022-10-29 18:14:29.577429: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.585025: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.585898: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.587034: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-29 18:14:29.587744: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.588624: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:29.589442: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:30.245462: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:30.246301: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:30.247122: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-10-29 18:14:30.247901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13803 MB memory:  -> device: 0, name: Tesla T4, pci bus id: 0000:00:08.0, compute capability: 7.5
>>> b = tf.constant(2.)
>>> print(a+b)
tf.Tensor(3.0, shape=(), dtype=float32)
>>> 
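The interactive check above can be collected into one function for reuse after each reinstall. This is a sketch, not the author's procedure: `tensorflow_gpu_report` is a hypothetical name, and it returns a placeholder instead of failing when TensorFlow is not importable in the active environment.

```python
def tensorflow_gpu_report() -> dict:
    """Summarise what TensorFlow sees; {'tensorflow': None} if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return {"tensorflow": None}
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [gpu.name for gpu in gpus],   # empty list means no visible GPU
    }


print(tensorflow_gpu_report())
```

Run it inside the gpu environment; an empty "gpus" list with "built_with_cuda" set to True usually points at a driver or library path problem rather than at the TensorFlow build.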

IV. Configure Jupyter Notebook

Jupyter Notebook is simpler to install and configure, so set it up first and use it to verify the GPU environment (see the linked reference).

1. Installing Anaconda3 already put Jupyter Notebook into the base environment, but the gpu virtual environment created above does not have it. Activate the environment with conda activate, then install it.

(base) root@VM-0-14-ubuntu:~# conda activate gpu
(gpu) root@VM-0-14-ubuntu:~# conda list jupyter
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
(gpu) root@VM-0-14-ubuntu:~# conda install  jupyter notebook
Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /usr/local/anaconda3/envs/gpu

  added / updated specs:
    - jupyter
    - notebook


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    asttokens-2.0.5            |     pyhd3eb1b0_0          20 KB
......
Proceed ([y]/n)? y


Downloading and Extracting Packages
soupsieve-2.3.2.post | 65 KB     | ################################################################################################################################################## | 100% 
......
asttokens-2.0.5      | 20 KB     | ################################################################################################################################################## | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Retrieving notices: ...working... done
 

2. Configure Jupyter Notebook for the ubuntu user.

1) Generate the configuration file.

(base) ubuntu@VM-0-14-ubuntu:~$ jupyter notebook --generate-config
Writing default config to: /home/ubuntu/.jupyter/jupyter_notebook_config.py

2) Generate the hash of the login password.

(base) ubuntu@VM-0-14-ubuntu:~$ python
Python 3.9.13 (main, Aug 25 2022, 23:26:10) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from notebook.auth import passwd
>>> passwd()
Enter password: 
Verify password: 
'argon2:$argon2id$v=19$m=10240,t=10,p=xxxxxxxxxxxxxxxxxxx'
>>> 

3. Edit the configuration file and copy the password hash from above into it.

$ vi ~/.jupyter/jupyter_notebook_config.py
c.NotebookApp.ip='*'                     # listen on all IP addresses
c.NotebookApp.password = 'argon2:$argon2id$v=19$m=10240,t=10,p=xxxxxxxxxxxxxxxxxxx'  # the argon2 hash copied from above
c.NotebookApp.open_browser = False       # do not open a browser automatically
c.NotebookApp.port =8888                 # port
c.NotebookApp.notebook_dir = '/home/ubuntu/jupyternotebook'  # directory the Notebook starts in

4. Start Jupyter Notebook. Be sure to activate the gpu environment first, since that is the one being used.

(base) ubuntu@VM-0-14-ubuntu:~$ conda activate gpu
(gpu) ubuntu@VM-0-14-ubuntu:~$ conda list jupyter
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
jupyter                   1.0.0            py39h06a4308_8  
jupyter_client            7.3.5            py39h06a4308_0  
jupyter_console           6.4.3              pyhd3eb1b0_0  
jupyter_core              4.11.1           py39h06a4308_0  
jupyter_server            1.18.1           py39h06a4308_0  
jupyterlab                3.4.4            py39h06a4308_0  
jupyterlab_pygments       0.1.2                      py_0  
jupyterlab_server         2.15.2           py39h06a4308_0  
jupyterlab_widgets        1.0.0              pyhd3eb1b0_1  
(gpu) ubuntu@VM-0-14-ubuntu:~$ jupyter notebook &
[1] 16510
(gpu) ubuntu@VM-0-14-ubuntu:~$ [W 07:53:21.094 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[W 2022-10-30 07:53:21.326 LabApp] 'ip' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'password' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'password' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'port' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'notebook_dir' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[W 2022-10-30 07:53:21.326 LabApp] 'notebook_dir' has moved from NotebookApp to ServerApp. This config will be passed to ServerApp. Be sure to update your config before our next release.
[I 2022-10-30 07:53:21.333 LabApp] JupyterLab extension loaded from /usr/local/anaconda3/envs/gpu/lib/python3.9/site-packages/jupyterlab
[I 2022-10-30 07:53:21.333 LabApp] JupyterLab application directory is /usr/local/anaconda3/envs/gpu/share/jupyter/lab
[I 07:53:21.337 NotebookApp] Serving notebooks from local directory: /home/ubuntu/jupyternotebook
[I 07:53:21.337 NotebookApp] Jupyter Notebook 6.4.12 is running at:
[I 07:53:21.337 NotebookApp] http://VM-0-14-ubuntu:8888/
[I 07:53:21.337 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

5. Open the server in a browser, log in with the password set above, then create a test notebook to check the GPU environment.

import tensorflow as tf
tf.test.is_built_with_cuda() 
a = tf.constant(1.)
b = tf.constant(2.)
print(a+b)
Figure: testing the tensorflow-gpu installation in Jupyter Notebook

6. Create a test notebook for Keras and cuDNN.

import os
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers,optimizers, datasets
from tensorflow.keras.models import load_model
from matplotlib import pyplot as plt
import numpy as np

# 1. Prepare the dataset

# Load the MNIST data
(x_train_raw, y_train_raw),(x_test_raw,y_test_raw) = datasets.mnist.load_data()
print(y_train_raw[0])                                         # 5
print(x_train_raw.shape, y_train_raw.shape)                   # (60000, 28, 28): 60,000 training images
print(x_test_raw.shape, y_test_raw.shape)                     # (10000, 28, 28): 10,000 test images

num_classes = 10
y_train= keras.utils.to_categorical(y_train_raw,num_classes)  # convert class labels to one-hot vectors
y_test = keras.utils.to_categorical(y_test_raw,num_classes)
print(y_train[0])                                             # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
 
# Visualize the data: look at a few sample images
plt.figure()
for i in range(9):
    plt.subplot(3,3,i+1)
    plt.imshow(x_train_raw[i])
    plt.axis('off')
plt.show()

# 2. Build the fully connected neural network

# Flatten and normalize the inputs
x_train = x_train_raw.reshape(60000,784)                     # flatten each 28*28 image into a 784-element vector
x_test = x_test_raw.reshape(10000,784)
x_train= x_train.astype('float32')/255                       # scale pixel values to the range 0~1
x_test = x_test.astype('float32')/255                        
    
model = keras.Sequential([                                   # build the model: four fully connected layers, three of them with ReLU
    layers.Dense(512,activation='relu', input_dim = 784),    # 784 inputs reduced to 512 units
    layers.Dense(256,activation='relu'),
    layers.Dense(124,activation='relu'),
    layers.Dense(num_classes,activation='softmax')
])

# 3. Train the network

Optimizer = optimizers.Adam(0.001)
model.compile(loss=keras.losses.categorical_crossentropy,
    optimizer=Optimizer,                                     # Adam optimizer
    metrics=['accuracy']
)
model.fit(x_train,y_train,                                   # training data and labels
    batch_size=128,                                          # batch size
    epochs=10,                                               # number of training epochs
    verbose=1)                                               # print progress logs

# 4. Evaluate the model

score = model.evaluate(x_test,y_test,verbose=0)
print('Test loss:', score[0])                                # loss: 0.0853068439
print('Test accuracy:', score[1])                             # accuracy: 0.9767

test_loss,test_acc = model.evaluate(x=x_test,y=y_test)
print("Test Accuracy %.2f"%test_acc)                         # accuracy, printed to two decimal places

# 5. Save the model

model.save('./final_DNN_mode1.h5')                 # save the DNN model

# 6. Load the saved model
new_model = load_model('./final_DNN_mode1.h5')
new_model.summary()

# 7. CNN model test -----------------------------------------------------------------------------------------------------

# Add a channel dimension to the data for the CNN model
X_train=x_train.reshape(60000,28,28,1)
X_test=x_test.reshape(10000,28,28,1)

# Define the convolutional neural network model
model=keras.Sequential([                                   # create the sequential model
    layers.Conv2D(filters=32,kernel_size = 5,strides = (1,1), padding ='same',activation = tf.nn.relu,input_shape = (28,28,1)),
                                                             # first convolution layer above, followed by its pooling layer
    layers.MaxPool2D(pool_size=(2,2),strides = (2,2),padding = 'valid'),
                                                             # second convolution layer and pooling layer
    layers.Conv2D(filters=64, kernel_size = 3, strides=(1, 1),padding='same', activation = tf.nn.relu),
    layers.MaxPool2D(pool_size=(2,2),strides = (2,2),padding = 'valid'),
                                                             # dropout layer to reduce overfitting
    layers.Dropout(0.25),                     # fraction of units dropped at random
    layers.Flatten(),
                                                             # two fully connected layers
    layers.Dense(units=128,activation = tf.nn.relu),
    layers.Dropout(0.5),
    layers.Dense(units=10,activation = tf.nn.softmax)
])  

# Compile and train the model
Optimizer = optimizers.Adam(0.001)
model.compile(Optimizer,loss="categorical_crossentropy",metrics=['accuracy'])
model.fit(x=X_train,y=y_train,epochs=5,batch_size=128)       # 5 epochs

# Save the CNN model
model.save('./final_CNN_model.h5')                  
# Load the saved model
new_model = load_model('./final_CNN_model.h5')

# 8. Visualize predictions on the test data

# %matplotlib inline
def res_Visual(n):
    # See https://blog.csdn.net/yiyihuazi/article/details/122323349
    # Keras 2.6 removed the predict_classes() function
    # final_opt_a=new_model.predict_classes(X_test[0:n])        # predict the test set with the model
    # Use the following statements instead
    predicts = new_model.predict(X_test[0:n])
    final_opt_a = np.argmax(predicts, axis=1)
    
    fig, ax = plt.subplots(nrows=int(n/5), ncols=5)
    ax = ax.flatten()
    print('Predictions for the first {} images:'.format(n))
    for i in range(n): 
        print(final_opt_a[i],end='.')
        if int((i+1)%5)==0:
            print('\t')

        # Display the image
        img = X_test[i].reshape((28,28))                       # one sample as a 28x28 ndarray
        plt.axis("off")
        ax[i].imshow(img,cmap='Greys',interpolation='nearest') # draw it
        ax[i].axis("off")
    print('The first {} test images:'.format(n))
    
    
res_Visual(20) 
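The notebook relies on two conversions: to_categorical turns a class index into a one-hot vector, and np.argmax turns a row of predicted probabilities back into a class index. The sketch below shows the same two steps in plain Python, purely as an illustration; `one_hot` and `argmax` are hypothetical stand-ins, not the Keras or NumPy implementations.

```python
def one_hot(label: int, num_classes: int) -> list:
    """Encode a class index as a one-hot list of floats."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def argmax(row) -> int:
    """Return the index of the largest value (the first one wins on ties)."""
    return max(range(len(row)), key=lambda i: row[i])


encoded = one_hot(5, 10)
print(encoded)          # [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(argmax(encoded))  # 5
```

This is why replacing predict_classes() with predict() followed by an argmax over axis 1 gives the predicted digit for each image.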

Keras must be downgraded to 2.6.0, otherwise the following error occurs (see the linked reference).

ImportError: cannot import name 'dtensor' from 'tensorflow.compat.v2.experimental' 

(gpu) root@VM-0-14-ubuntu:~# conda list keras
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
keras                     2.10.0                   pypi_0    pypi
keras-preprocessing       1.1.2                    pypi_0    pypi
(gpu) root@VM-0-14-ubuntu:~# pip install keras==2.6

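In the test program, the DNN (fully connected network) part passed, but the CNN (convolutional network) part, which uses cuDNN, did not. cuDNN 8.5 is probably too new (see the linked reference), so it has to be rolled back to version 8.1, which is the tested and confirmed one. The error was: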

OP_REQUIRES failed at conv_ops.cc:1276 : Not found: No algorithm worked!

7. Downgrade cuDNN to 8.1. Download the archive on a laptop, transfer it to the server with SecureFX over SSH, and copy its files over the cuDNN 8.5 ones.

# tar -xvf cudnn-11.2-linux-x64-v8.1.1.33.tgz
# cd cuda
# cp -f lib64/* /usr/local/cuda/lib64/
# cp -f include/* /usr/local/cuda/include/
# chmod a+r /usr/local/cuda/lib64/*
# chmod a+r /usr/local/cuda/include/*

Add the following setting to the global environment variables; otherwise the CNN test may fail with an error saying that too much memory was requested:

# vi /etc/profile
export TF_GPU_ALLOCATOR=cuda_malloc_async
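The same setting can also be applied from Python, which is convenient inside a notebook. This is a sketch under the assumption that the variable only needs to be in the process environment before TensorFlow sets up the GPU; setting it before the import is the simplest way to be sure. `use_async_allocator` is a hypothetical helper.

```python
import os


def use_async_allocator(env=None) -> str:
    """Set TF_GPU_ALLOCATOR unless a value has already been chosen."""
    env = os.environ if env is None else env
    return env.setdefault("TF_GPU_ALLOCATOR", "cuda_malloc_async")


# Call this before `import tensorflow` in the notebook or script.
use_async_allocator()
```

Using setdefault keeps any value that was already exported, so the profile setting and the script do not fight each other.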

Refresh the dynamic linker cache, otherwise the libraries are linked incorrectly, then reboot:

# ldconfig -X
# reboot now

Log in as the ubuntu user, activate the gpu environment, and start Jupyter Notebook:

$ conda activate gpu
$ jupyter notebook &

8. Rerun the earlier notebook to test the GPU environment. It now passes.

1. Loading the TensorFlow handwritten-digit example dataset
2. Building and compiling the DNN
3. Training the network
4. Evaluating the model
5. CNN model test
6. Visualizing the test data

V. Install PyTorch and HanLP

I install TensorFlow, PyTorch, and HanLP in the same gpu virtual environment because I want to use HanLP 2.1, which supports both backends.

1. Install PyTorch.

(gpu) root@VM-0-14-ubuntu:~# conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /usr/local/anaconda3/envs/gpu

  added / updated specs:
    - cudatoolkit=11.3
    - pytorch==1.11.0
    - torchaudio==0.11.0
    - torchvision==0.12.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    cudatoolkit-11.3.1         |       h2bc3f7f_2       549.3 MB
......
  torchvision        pytorch/linux-64::torchvision-0.12.0-py39_cu113 None


Proceed ([y]/n)? y


Downloading and Extracting Packages
lame-3.100           | 323 KB    | ################################################################################################################################################## | 100% 
......
######################################################################## | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: | By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA): https://docs.nvidia.com/cuda/eula/index.html

done
Retrieving notices: ...working... done

2. Install HanLP.

(gpu) root@VM-0-14-ubuntu:~# pip install hanlp
Looking in indexes: http://mirrors.tencentyun.com/pypi/simple
Collecting hanlp
......
Successfully built hanlp-common hanlp-trie hanlp-downloader phrasetree
Installing collected packages: toposort, tokenizers, phrasetree, tqdm, regex, pyyaml, pynvml, hanlp-common, filelock, huggingface-hub, hanlp-trie, hanlp-downloader, transformers, hanlp
Successfully installed filelock-3.8.0 hanlp-2.1.0b42 hanlp-common-0.0.18 hanlp-downloader-0.0.25 hanlp-trie-0.0.5 huggingface-hub-0.10.1 phrasetree-0.0.8 pynvml-11.4.1 pyyaml-6.0 regex-2022.9.13 tokenizers-0.11.6 toposort-1.5 tqdm-4.64.1 transformers-4.23.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Install fasttext, which some of HanLP's TensorFlow pretrained models require:

(gpu) root@VM-0-14-ubuntu:~# pip install fasttext
Looking in indexes: http://mirrors.tencentyun.com/pypi/simple
Collecting fasttext
  Downloading http://mirrors.tencentyun.com/pypi/packages/f8/85/e2b368ab6d3528827b147fdb814f8189acc981a4bc2f99ab894650e05c40/fasttext-0.9.2.tar.gz (68 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 68.8/68.8 kB 332.3 kB/s eta 0:00:00
  Preparing metadata (setup.py) ... done
Collecting pybind11>=2.2
  Using cached http://mirrors.tencentyun.com/pypi/packages/1d/53/e6b27f3596278f9dd1d28ef1ddb344fd0cd5db98ef2179d69a2044e11897/pybind11-2.10.1-py3-none-any.whl (216 kB)
Requirement already satisfied: setuptools>=0.7.0 in /usr/local/anaconda3/envs/gpu/lib/python3.9/site-packages (from fasttext) (65.5.0)
Requirement already satisfied: numpy in /usr/local/anaconda3/envs/gpu/lib/python3.9/site-packages (from fasttext) (1.23.3)
Building wheels for collected packages: fasttext
  Building wheel for fasttext (setup.py) ... done
  Created wheel for fasttext: filename=fasttext-0.9.2-cp39-cp39-linux_x86_64.whl size=299146 sha256=4dee6f6dc5fb53404fb5cbb69c2cc3a2faef7f3af0500567ad49dc01f26d89d7
  Stored in directory: /root/.cache/pip/wheels/ca/08/ee/d0dd871c6c089c4c3971722067bd577f8827c9b4d5d6f2477a
Successfully built fasttext
Installing collected packages: pybind11, fasttext

3. Test PyTorch and HanLP.

A quick test for now; more testing will follow later.

import torch

print(torch.__version__)
print(torch.cuda.is_available())
Figure: PyTorch detects the GPU
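A slightly fuller version of the same check is sketched below. It is an illustration, not the author's test: `torch_gpu_report` is a hypothetical name, and it returns a placeholder when PyTorch is not importable. Because the conda command above pulled in cudatoolkit=11.3, the CUDA version PyTorch reports for its own build can differ from the system toolkit under /usr/local/cuda.

```python
def torch_gpu_report() -> dict:
    """Summarise PyTorch's view of CUDA; {'torch': None} if it is not installed."""
    try:
        import torch
    except ImportError:
        return {"torch": None}
    report = {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,          # CUDA version PyTorch was built with
        "cuda_available": torch.cuda.is_available(),
    }
    if report["cuda_available"]:
        report["device"] = torch.cuda.get_device_name(0)
    return report


print(torch_gpu_report())
```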
# This succeeds if a TensorFlow model is run first and the PyTorch models afterwards; if a PyTorch model has already been run, it fails here.
import hanlp
tokenizer = hanlp.load(hanlp.pretrained.tok.LARGE_ALBERT_BASE)
text = 'NLP统计模型没有加规则,聪明人知道自己加。英文、数字、自定义词典统统都是规则。'
print(tokenizer(text))

# The following tests do not depend on the order in which they are run

import hanlp
HanLP = hanlp.load(hanlp.pretrained.mtl.CLOSE_TOK_POS_NER_SRL_DEP_SDP_CON_ELECTRA_SMALL_ZH) # trained on the world's largest Chinese corpus
HanLP(['2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。', '阿婆主来到北京立方庭参观自然语义科技公司。'])

import hanlp
HanLP = hanlp.pipeline() \
    .append(hanlp.utils.rules.split_sentence, output_key='sentences') \
    .append(hanlp.load('FINE_ELECTRA_SMALL_ZH'), output_key='tok') \
    .append(hanlp.load('CTB9_POS_ELECTRA_SMALL'), output_key='pos') \
    .append(hanlp.load('MSRA_NER_ELECTRA_SMALL_ZH'), output_key='ner', input_key='tok') \
    .append(hanlp.load('CTB9_DEP_ELECTRA_SMALL', conll=0), output_key='dep', input_key='tok')\
    .append(hanlp.load('CTB9_CON_ELECTRA_SMALL'), output_key='con', input_key='tok')
HanLP('2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。阿婆主来到北京立方庭参观自然语义科技公司。')

HanLP('2021年HanLPv2.1为生产环境带来次世代最先进的多语种NLP技术。').pretty_print()

import hanlp

tok = hanlp.load(hanlp.pretrained.tok.COARSE_ELECTRA_SMALL_ZH)
tok(['商品和服务。', '阿婆主来到北京立方庭参观自然语义科技公司。'])

tok_fine = hanlp.load(hanlp.pretrained.tok.FINE_ELECTRA_SMALL_ZH)
tok_fine('阿婆主来到北京立方庭参观自然语义科技公司')

pos = hanlp.load(hanlp.pretrained.pos.CTB9_POS_ELECTRA_SMALL)
pos(["我", "的", "希望", "是", "希望", "张晚霞", "的", "背影", "被", "晚霞", "映红", "。"])
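Figure: word segmentation and part-of-speech tagging
Figure: pipeline operation
Figure: printing the syntax tree
Figure: word segmentation with various pretrained models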

VI. Install and configure JupyterHub

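A Linux GPU host used for research, development, testing, or production naturally has multiple users. Jupyter Notebook is single-user, whereas JupyterHub adds a multi-user proxy layer through which everyone can log in and use their own Jupyter Notebook or JupyterLab; the latter is the next-generation version of the former.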

Figure: JupyterHub proxies each user's JupyterLab, providing a multi-user service

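According to the linked post, if Jupyter Notebook has ever been run, its configuration files under $HOME/.jupyter conflict with the per-user JupyterLab or Jupyter Notebook server that JupyterHub tries to start, so the server process fails to start and proxy forwarding fails. This may be a bug. So if Jupyter Notebook has been run before, as it was above, delete that directory first. This problem cost me two days and nearly drove me to despair; Stack Overflow came to the rescue.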

See references 1, 2, 3, and 4.
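Removing $HOME/.jupyter can be done per user before the first JupyterHub login. The sketch below moves the directory aside instead of deleting it, which is my own precaution rather than something the linked post prescribes; `stash_jupyter_config` and the `.jupyter.bak` naming are hypothetical.

```python
from pathlib import Path
from typing import Optional


def stash_jupyter_config(home: Path) -> Optional[Path]:
    """Rename <home>/.jupyter to an unused .jupyter.bak[N] name and return it."""
    source = home / ".jupyter"
    if not source.is_dir():
        return None                          # nothing to move
    target = home / ".jupyter.bak"
    counter = 1
    while target.exists():                   # never overwrite an earlier backup
        target = home / f".jupyter.bak{counter}"
        counter += 1
    source.rename(target)
    return target
```

For the ubuntu user this would be called as `stash_jupyter_config(Path("/home/ubuntu"))`; the old password hash and settings stay available in the backup directory.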

1. Install and upgrade Node.js and npm.

# # Fetch the latest package lists from the software sources and upgrade system packages
# apt-get update 
# apt-get upgrade
# # Install dependencies
# apt install -y npm nodejs

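Upgrade Node.js, but do not install the latest version 18: it has compatibility problems and raises errors (see the linked reference). JupyterHub requires version 10 or later, while Ubuntu 18 installs version 8.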

##----- Clear the npm cache first
# npm cache clean -f 
##----- Install the n module
# npm install -g n

Upgrade Node.js:

root@VM-0-14-ubuntu:~# n 16.18.0    # pin version 16.18.0
  installing : node-v16.18.0
       mkdir : /usr/local/n/versions/node/16.18.0
       fetch : https://nodejs.org/dist/v16.18.0/node-v16.18.0-linux-x64.tar.xz
     copying : node/16.18.0
   installed : v16.18.0 (with npm 8.19.2)

Note: the node command changed location and the old location may be remembered in your current shell.
         old : /usr/bin/node
         new : /usr/local/bin/node
If "node --version" shows the old version then start a new shell, or reset the location hash with:
hash -r  (for bash, zsh, ash, dash, and ksh)
rehash   (for csh and tcsh)

root@VM-0-14-ubuntu:~# hash -r
root@VM-0-14-ubuntu:~# node -v
v16.18.0
root@VM-0-14-ubuntu:~# npm -v
8.19.2
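The version constraint described above (at least 10, but not 18) can be checked against the `node -v` string. A minimal sketch: `node_version_ok` is a hypothetical helper, and the upper bound of 18 reflects only the compatibility problem reported in this text, not an official limit.

```python
def node_version_ok(version: str, minimum: int = 10, below: int = 18) -> bool:
    """True if the major version of a `node -v` string lies in [minimum, below)."""
    major = int(version.strip().lstrip("v").split(".")[0])
    return minimum <= major < below
```

With the versions mentioned here, "v16.18.0" passes, while the stock "v8" series and "v18" do not.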

2. Install configurable-http-proxy.

It can be installed with npm:

npm install -g configurable-http-proxy

不過推薦用conda裝胚泌,會把其它依賴包一起裝上,它也會安裝一個node.js版本11肃弟,也可以用玷室,注意要切換并安裝到相應(yīng)的虛擬環(huán)境中,這里是"gpu"笤受。

(gpu) root@VM-0-14-ubuntu:~# conda install configurable-http-proxy
(gpu) root@VM-0-14-ubuntu:~# conda list configurable-http-proxy
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
configurable-http-proxy   4.0.1                   node6_0  
(gpu) root@VM-0-14-ubuntu:~# configurable-http-proxy -V
4.0.1
(gpu) root@VM-0-14-ubuntu:~# 

3. Install JupyterHub and related packages in the virtual environment.

(gpu) root@VM-0-14-ubuntu:~# conda install jupyter jupyterlab jupyterhub
(gpu) root@VM-0-14-ubuntu:~# conda list jupyter
# packages in environment at /usr/local/anaconda3/envs/gpu:
#
# Name                    Version                   Build  Channel
jupyter                   1.0.0            py39h06a4308_8  
jupyter_client            7.3.5            py39h06a4308_0  
jupyter_console           6.4.3              pyhd3eb1b0_0  
jupyter_core              4.11.1           py39h06a4308_0  
jupyter_server            1.18.1           py39h06a4308_0  
jupyter_telemetry         0.1.0                      py_0  
jupyterhub                2.0.0              pyhd3eb1b0_0  
jupyterlab                3.4.4            py39h06a4308_0  
jupyterlab_pygments       0.1.2                      py_0  
jupyterlab_server         2.15.2           py39h06a4308_0  
jupyterlab_widgets        1.0.0              pyhd3eb1b0_1  

4. Configure JupyterHub.

Create the directory /etc/jupyterhub, generate a configuration file in it, and edit the file.

(gpu) root@VM-0-14-ubuntu:~#  mkdir /etc/jupyterhub
(gpu) root@VM-0-14-ubuntu:~# cd /etc/jupyterhub
(gpu) root@VM-0-14-ubuntu:/etc/jupyterhub# jupyterhub --generate-config
Writing default config to: jupyterhub_config.py
(gpu) root@VM-0-14-ubuntu:/etc/jupyterhub# vi  jupyterhub_config.py

Contents:

# Added by Jean 2022/10/31
c.Authenticator.allowed_users = {'ubuntu'}   # Users allowed to use JupyterHub, comma-separated ('whitelist' is the deprecated older name)
c.Authenticator.admin_users = {'ubuntu'}  # JupyterHub administrator users
c.Spawner.notebook_dir = '/home/{username}'  # After logging in from the browser, start in the user's home directory
c.Spawner.default_url = '/lab'    # Use JupyterLab rather than Notebook
c.JupyterHub.extra_log_file = '/var/log/jupyterhub.log'
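Because logins use accounts that already exist on the Linux system, a name in the allow list that has no system account can never log in. The sketch below, with a hypothetical helper name, reports such names using only the standard pwd module; it is a sanity check one might run by hand, not part of JupyterHub itself.

```python
import pwd


def missing_system_users(usernames):
    """Return the names (sorted) that have no entry in the system user database."""
    missing = []
    for name in sorted(usernames):
        try:
            pwd.getpwnam(name)
        except KeyError:
            missing.append(name)
    return missing
```

For the configuration above, missing_system_users({'ubuntu'}) should return an empty list on this host.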

5. Start JupyterHub in the background as root.

(gpu) root@VM-0-14-ubuntu:/etc/jupyterhub# jupyterhub  -f /etc/jupyterhub/jupyterhub_config.py  &

6. Open it in a browser and log in with a user name that already exists on the Linux system. The URL is http://ip:8000; SSL encryption is configured later.

[Figure: Jupyter Lab running in JupyterHub]
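If the page does not load, it helps to first confirm that something is listening on port 8000 at all (a cloud security group may also need to allow the port, which is an assumption about the hosting setup rather than something stated here). A minimal reachability probe, with an illustrative function name:

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running port_is_open("127.0.0.1", 8000) on the server itself separates "JupyterHub is not running" from "the network path is blocked".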

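JupyterHub can open a terminal window for running arbitrary commands under the identity of the logged-in user. If the SSH port is blocked, this effectively provides a tunnel over the HTTP port. Running su then gives you root.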

(base) ubuntu@VM-0-14-ubuntu:~$ su --help
Usage: su [options] [LOGIN]

Options:
  -c, --command COMMAND         pass COMMAND to the invoked shell
  -h, --help                    display this help message and exit
  -, -l, --login                make the shell a login shell
  -m, -p,
  --preserve-environment        do not reset environment variables, and
                                keep the same shell
  -s, --shell SHELL             use SHELL instead of the default in passwd

(base) ubuntu@VM-0-14-ubuntu:~$ su --preserve-environment
Password: 
(base) root@VM-0-14-ubuntu:~# 
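[Figure: A terminal window opened in JupyterHub]

7. Configure SSL encryption.

This screenshot shows a login over the SSL-encrypted connection after configuration; clicking the lock icon in front of the URL shows the certificate chain. As the earlier screenshots show, an unencrypted connection displays a "Not secure" warning in front of the URL. The certificate here is issued to the IP address, because this virtual host has no domain name yet.

[Figure: An SSL-encrypted channel for JupyterHub using a self-signed certificate]

1) JupyterHub configuration first. Adding two lines to the configuration file that point to the server key file and the certificate file is enough; creating a CA with openssl and issuing the certificate is covered afterwards. Because the files are used by root, server.key was created without a passphrase.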

# Added by Jean for SSL 2022/03/19
c.JupyterHub.ssl_key = '/root/cert/server.key'
c.JupyterHub.ssl_cert = '/root/cert/server.crt'
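Before restarting JupyterHub with these two settings, it can be worth confirming that the key and the certificate actually belong together. The sketch below relies on the standard library's ssl.SSLContext.load_cert_chain, which raises an error when the private key does not match the certificate; the function name is made up for illustration, and it assumes the key has no passphrase, as in this document.

```python
import ssl


def key_matches_cert(certfile, keyfile):
    """Return True if keyfile is the private key for certfile and both are readable."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    except OSError:  # ssl.SSLError and FileNotFoundError are both OSError subclasses
        return False
    return True
```

With the paths above, the call would be key_matches_cert('/root/cert/server.crt', '/root/cert/server.key').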

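After restarting JupyterHub, copy out the root certificate of the self-built CA and import it into the browser (described later), then visit https://ip:8000, as shown in the figure above.

2) Issue the server certificate with a self-built CA.

See the reference.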

(gpu) root@VM-0-14-ubuntu:~# cd /root
(gpu) root@VM-0-14-ubuntu:~# mkdir cert
(gpu) root@VM-0-14-ubuntu:~# cd cert
(gpu) root@VM-0-14-ubuntu:~/cert# mkdir demoCA && cd demoCA
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# mkdir private newcerts
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# touch index.txt
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# echo '01' > serial
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# cd private
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA/private# openssl genrsa -out cakey.pem 2048
Generating RSA private key, 2048 bit long modulus (2 primes)
...............................................................................+++++
....................+++++
e is 65537 (0x010001)
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA/private# openssl req -sha256 -new -x509 -days 3650 -key cakey.pem -out cacert.pem \
>              -subj "/C=CN/ST=GD/L=ZhuHai/O=Jean/OU=Study/CN=RootCA"
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA/private# ls
cacert.pem  cakey.pem
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA/private# cd .. && mv ./private/cacert.pem ./
(gpu) root@VM-0-14-ubuntu:~/cert/demoCA# ls
cacert.pem  index.txt  newcerts  private  serial

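The commands above perform a series of operations:

A. Create the directory /root/cert under /root, the home directory of root.

B. Create the self-built CA's directory structure ./demoCA beneath it, because the default openssl configuration file expects the CA in ./demoCA under the current directory.

C. Generate the CA key cakey.pem.

D. Issue the CA's self-signed certificate cacert.pem and move it into ./demoCA. When the CA later signs the server certificate, openssl looks for the CA root certificate there; this is the openssl default configuration.

E. Finally, list the demoCA directory structure.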
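Since the machine may be reinstalled several times, the directory scaffolding from steps A and B can be reproduced with a short script. This is only a sketch of the layout (directories, empty index.txt, serial starting at 01); generating the key and certificate still uses the openssl commands shown above, and the function name is hypothetical.

```python
from pathlib import Path


def create_ca_layout(parent):
    """Create the demoCA skeleton that the default openssl.cnf expects under parent."""
    ca_dir = Path(parent) / "demoCA"
    (ca_dir / "private").mkdir(parents=True, exist_ok=True)  # will hold cakey.pem
    (ca_dir / "newcerts").mkdir(exist_ok=True)               # issued certificates
    (ca_dir / "index.txt").touch()                           # certificate database
    serial = ca_dir / "serial"
    if not serial.exists():
        serial.write_text("01\n")                            # next serial number
    return ca_dir
```

create_ca_layout("/root/cert") would rebuild the same structure as the transcript; the serial file is left alone if it already exists so that reruns do not reset the counter.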

You can locate the default openssl configuration file and look at it; the self-built CA lives in ./demoCA under the current directory:

(gpu) root@VM-0-14-ubuntu:~# find / -name openssl.cnf
/usr/lib/ssl/openssl.cnf
/usr/local/anaconda3/pkgs/openssl-1.1.1q-h7f8727e_0/ssl/openssl.cnf
/usr/local/anaconda3/ssl/openssl.cnf
/usr/local/anaconda3/envs/gpu/ssl/openssl.cnf
/usr/local/anaconda3/envs/hub/ssl/openssl.cnf
/etc/ssl/openssl.cnf
(gpu) root@VM-0-14-ubuntu:~# vi /usr/lib/ssl/openssl.cnf
####################################################################
[ ca ]
default_ca      = CA_default            # The default ca section

####################################################################
[ CA_default ]

dir             = ./demoCA              # Where everything is kept
certs           = $dir/certs            # Where the issued certs are kept
crl_dir         = $dir/crl              # Where the issued crl are kept
database        = $dir/index.txt        # database index file.
#unique_subject = no                    # Set to 'no' to allow creation of
                                        # several certs with same subject.
new_certs_dir   = $dir/newcerts         # default place for new certs.

certificate     = $dir/cacert.pem       # The CA certificate
serial          = $dir/serial           # The current serial number
crlnumber       = $dir/crlnumber        # the current crl number
                                        # must be commented out to leave a V1 CRL
crl             = $dir/crl.pem          # The current CRL
private_key     = $dir/private/cakey.pem# The private key
RANDFILE        = $dir/private/.rand    # private random number file

x509_extensions = usr_cert              # The extensions to add to the cert
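To make the `$dir` references in this excerpt concrete, the following sketch resolves the entries of the [ CA_default ] section into actual relative paths. It is a deliberately simplified reader written for this excerpt only (it ignores `${var}` syntax, includes, and every other section), not a general openssl.cnf parser.

```python
def ca_default_paths(cnf_text):
    """Resolve $dir in the [ CA_default ] section of openssl.cnf-style text."""
    values = {}
    section = None
    for raw in cnf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments, even without a space before '#'
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "CA_default" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            values[key] = value
    base = values.get("dir", ".")
    return {key: value.replace("$dir", base) for key, value in values.items()}
```

Applied to the excerpt above, private_key resolves to ./demoCA/private/cakey.pem and certificate to ./demoCA/cacert.pem, which is why cacert.pem had to be moved out of the private directory.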

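F. Generate the key and the certificate signing request for the server certificate.

According to post 1 and post 2, first run the command below to create /root/.rnd; otherwise the command that generates the server key fails.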

openssl rand -out /root/.rnd -hex 256

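Change to /root/cert, the parent directory of ./demoCA, then run the commands below to generate the server key and the certificate signing request. The request uses the configuration file /usr/lib/ssl/openssl.cnf with an extra subject alternative name (SAN) appended; Chrome matches the URL against the certificate's SAN. Because the site is accessed as https://ip, the SAN here is IP.1:106.52.33.185, meaning the first IP address the certificate covers; IP.2 and so on can follow. To certify domain names, use DNS.1 = jeanye.cn and so on in the same way. The output is the request file server.csr.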

(gpu) root@VM-0-14-ubuntu:~/cert# openssl genrsa -out server.key 2048
(gpu) root@VM-0-14-ubuntu:~/cert# openssl req -new \
>     -sha256 \
>     -key server.key \
>     -subj "/C=CN/ST=GD/L=ZhuHai/O=Jean/OU=Study/CN=106.52.33.185" \
>     -reqexts SAN \
>     -config <(cat /usr/lib/ssl/openssl.cnf \
>         <(printf "[SAN]\nsubjectAltName=IP.1:106.52.33.185")) \
>     -out server.csr
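The `[SAN]` fragment that the printf above appends to openssl.cnf can be generated for any number of addresses. The helper below is an illustrative sketch: its name is hypothetical, and the comma-separated form for several entries is an assumption based on openssl's inline list syntax, since the transcript only demonstrates a single IP.1 entry.

```python
import ipaddress


def build_san_section(ips=(), dns_names=()):
    """Build the text of a [SAN] section with numbered IP and DNS entries."""
    entries = []
    for index, ip in enumerate(ips, start=1):
        ipaddress.ip_address(ip)  # raises ValueError for a malformed address
        entries.append(f"IP.{index}:{ip}")
    for index, name in enumerate(dns_names, start=1):
        entries.append(f"DNS.{index}:{name}")
    if not entries:
        raise ValueError("at least one IP address or DNS name is required")
    return "[SAN]\nsubjectAltName=" + ",".join(entries)
```

The returned string contains a real newline, so it can be appended directly to a copy of openssl.cnf and passed with -config, in place of the process substitution used in the shell command.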

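G. Sign the server certificate.

openssl finds cacert.pem in the default subdirectory ./demoCA and cakey.pem in ./demoCA/private, then signs the certificate as requested in server.csr, using the configuration file /usr/lib/ssl/openssl.cnf and the same certificate extension (the SAN) as the request. The result is written to server.crt.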

(gpu) root@VM-0-14-ubuntu:~/cert# openssl ca -in server.csr \
>         -md sha256 \
>     -extensions SAN \
>     -config <(cat /usr/lib/ssl/openssl.cnf \
>         <(printf "[SAN]\nsubjectAltName=IP.1:106.52.33.185")) \
>      -out server.crt
Using configuration from /dev/fd/63
Check that the request matches the signature
Signature ok
Certificate Details:
        Serial Number: 1 (0x1)
        Validity
            Not Before: Nov  2 09:47:58 2022 GMT
            Not After : Nov  2 09:47:58 2023 GMT
        Subject:
            countryName               = CN
            stateOrProvinceName       = GD
            organizationName          = Jean
            organizationalUnitName    = Study
            commonName                = 106.52.33.185
        X509v3 extensions:
            X509v3 Subject Alternative Name: 
                IP Address:106.52.33.185
Certificate is to be certified until Nov  2 09:47:58 2023 GMT (365 days)
Sign the certificate? [y/n]:y


1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated
(gpu) root@VM-0-14-ubuntu:~/cert# ls
demoCA  server.crt  server.csr  server.key

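H. Import the self-built CA root certificate into the browser.

Download the self-built CA's root certificate /root/cert/demoCA/cacert.pem to the client (for example Windows 10) and import it in the browser (for example Chrome) into Trusted Root Certification Authorities.

Google Chrome:

Settings -> Privacy and security -> Security -> Advanced -> Manage certificates -> Trusted Root Certification Authorities -> Import -> Next -> Browse -> All Files (*.*)

[Figure: Importing the self-built CA root certificate into the browser's Trusted Root Certification Authorities list]

I. Enter https://106.52.33.185:8000 in the browser and log in with a user name and password.

[Figure: Logging in with a user name and password starts your own Jupyter Lab instance]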
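The browser's verdict can be reproduced without a browser. This sketch performs a TLS handshake that trusts only the given CA file and checks the server name (Python matches IP addresses against the certificate's SAN entries), so it should succeed only if server.crt chains to cacert.pem and lists the address being visited. The function name is illustrative.

```python
import socket
import ssl


def tls_handshake_ok(host, port, cafile, timeout=5.0):
    """Return True if host:port presents a certificate valid for host under cafile."""
    try:
        context = ssl.create_default_context(cafile=cafile)  # trust only this CA
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except (OSError, ValueError):  # connection, file, and certificate errors
        return False
```

From a client that has a copy of cacert.pem, tls_handshake_ok("106.52.33.185", 8000, "cacert.pem") mirrors the check Chrome performs after the import.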

8. Configure JupyterHub as a service that starts at boot.

1) Create the service unit file.

First look at the PATH of the conda virtual environment "gpu":

(gpu) root@VM-0-14-ubuntu:~# echo $PATH
/usr/local/anaconda3/envs/gpu/bin:/usr/local/anaconda3/condabin:/usr/local/cuda-11.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
(gpu) root@VM-0-14-ubuntu:~# 

Then create a unit file for the system daemon:

(gpu) root@VM-0-14-ubuntu:~# vi /etc/systemd/system/jupyterhub.service

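The contents are below; a few key points.

A. Run as root.

B. Set PATH explicitly. A process started at boot goes through no login, so /etc/profile and similar files that set environment variables are never executed; copy the PATH shown above into the unit.

C. Invoke jupyterhub by its full path.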

[Unit]
Description=Jupyterhub service
After=syslog.target network.target

[Service]
User=root
Environment="PATH=/usr/local/anaconda3/envs/gpu/bin:/usr/local/anaconda3/condabin:/usr/local/cuda-11.2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
ExecStart=/usr/local/anaconda3/envs/gpu/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py

[Install]
WantedBy=multi-user.target
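Since the unit file has to be recreated after every reinstall, it can also be rendered from its three variable parts. The sketch below reproduces the layout of the unit above; the function name and parameters are hypothetical, and the absolute-path check simply encodes point C.

```python
import os


def render_unit(env_path, jupyterhub_bin, config_file, user="root"):
    """Render a jupyterhub.service unit with an explicit PATH and absolute ExecStart."""
    if not os.path.isabs(jupyterhub_bin):
        raise ValueError("jupyterhub must be referenced by its full path")
    lines = [
        "[Unit]",
        "Description=Jupyterhub service",
        "After=syslog.target network.target",
        "",
        "[Service]",
        f"User={user}",
        f'Environment="PATH={env_path}"',
        f"ExecStart={jupyterhub_bin} -f {config_file}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"
```

The result would be written to /etc/systemd/system/jupyterhub.service, passing the PATH printed by echo $PATH and the configuration file generated in step 4.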

Then enable the service so that the unit file takes effect at boot:

(gpu) root@VM-0-14-ubuntu:~# systemctl enable jupyterhub.service

The service can then be managed with the following commands:

# systemctl status jupyterhub.service
# systemctl start jupyterhub.service
# systemctl stop jupyterhub.service

View the service log with:

(gpu) root@VM-0-14-ubuntu:~# journalctl -u jupyterhub.service -f

The JupyterHub configuration file above also writes the log to the following file:

c.JupyterHub.extra_log_file = '/var/log/jupyterhub.log'

So the log file can be opened directly as well.
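For a quick look at the end of that file from a notebook cell, where journalctl is less convenient, a few lines of standard-library Python are enough. This is a minimal sketch with an illustrative function name; it reads the whole file sequentially, which is fine for a log of modest size.

```python
from collections import deque


def tail(path, count=20):
    """Return the last count lines of a text file, without trailing newlines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

For example, tail('/var/log/jupyterhub.log', 50) returns the most recent 50 log lines, provided the current user may read the file.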

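With this, JupyterHub starts automatically every time the server reboots.

This article ends here. The parts of the Linux GPU virtual host related to the GPU and the Python deep learning runtime and development environment are now configured; Rstudio, Shiny, and the rest are covered in separate articles.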
