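Installing TensorFlow
Anaconda
Download: the Python 3.6 version
Install: sudo sh Anaconda3-5.2.0-Linux-x86_64.sh
TensorFlow
Download the prebuilt tensorflow_gpu wheel from PyPI.
Some of the Python packages that tensorflow_gpu depends on can be downloaded from python uci package; the rest still have to be downloaded from PyPI.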
tensorflow_gpu-1.10.1-cp36-cp36m-manylinux1_x86_64.whl
absl_py-0.4.0-py2.py3-none-any.whl
astor-0.7.1-py2.py3-none-any.whl
gast-0.2.0-py2.py3-none-any.whl
grpcio-1.14.1-cp36-cp36m-manylinux1_x86_64.whl
msgpack-0.5.6-cp36-cp36m-manylinux1_x86_64.whl
numpy-1.14.5-cp36-cp36m-manylinux1_x86_64.whl (version constraint!)
protobuf-3.6.1-cp36-cp36m-manylinux1_x86_64.whl
termcolor-1.1.0-py2.py3-none-any.whl
tensorboard-1.10.0-py3-none-any.whl
Markdown-2.6.11-py2.py3-none-any.whl
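Each wheel filename in the list above encodes the Python version, ABI, and platform it targets. The sketch below splits a filename into those fields, following the PEP 427 naming convention, and does a rough check that a wheel fits the Python 3.6 interpreter installed by Anaconda. The names `parse_wheel_filename` and `fits_python36` are illustrative helpers, not part of pip or TensorFlow, and the check is deliberately simplified compared with pip's own tag matching.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[:-len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform.
    # The last three fields are always the compatibility tags.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    return WheelTags(parts[0], parts[1], parts[-3], parts[-2], parts[-1])


def fits_python36(filename: str) -> bool:
    """Rough check: does the wheel target CPython 3.6 or generic Python 3?"""
    python_tags = parse_wheel_filename(filename).python_tag.split(".")
    return any(tag in ("cp36", "py3") for tag in python_tags)


if __name__ == "__main__":
    for wheel in (
        "tensorflow_gpu-1.10.1-cp36-cp36m-manylinux1_x86_64.whl",
        "absl_py-0.4.0-py2.py3-none-any.whl",
    ):
        print(wheel, fits_python36(wheel))
```

Running a check like this over the download directory before installing can catch a wheel that was fetched for the wrong interpreter.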
Install: pip install XXX.whl
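Installing the wheels one by one works, but pip can also resolve the dependencies from a local directory. The following is a minimal sketch, assuming all of the wheels listed above were saved into a single directory; the function name and the directory path are hypothetical. It only builds the command, using pip's `--no-index` and `--find-links` options so that nothing is fetched from the network.

```python
import sys
from typing import List


def build_offline_install_command(
    wheel_dir: str, package: str = "tensorflow_gpu==1.10.1"
) -> List[str]:
    """Build a pip command that installs `package` using only local wheels."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",               # do not contact PyPI
        "--find-links", wheel_dir,  # look for wheels in this directory instead
        package,
    ]


if __name__ == "__main__":
    # "/path/to/wheels" is a placeholder; pass the command to
    # subprocess.run(cmd, check=True) to actually perform the install.
    cmd = build_offline_install_command("/path/to/wheels")
    print(" ".join(cmd))
```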
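Installing CUDA
Download: CUDA 9.0
Install:
sudo sh cuda_9.0.176_384.81_linux.run
- When prompted to install OpenGL, choose no if the machine has two graphics cards and the primary display adapter is not an NVIDIA GPU.
- Problems during CUDA installation: the CUDA installer modifies the graphics driver, so the graphical interface must be shut down first.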
# error:
It appears that an X server is running.
Please exit X before installation.
If you're sure that X is not running,
but are getting this error,
please delete any X lock files in /tmp.
# solution:
/etc/init.d/lightdm stop
# then reboot
reboot
# refs: https://www.cnblogs.com/liyuanhong/articles/4919755.html
# restart the graphical interface
sudo service lightdm start
or: sudo service lightdm restart
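The installer message also mentions stale X lock files in /tmp. As a small sketch, the helper below lists such files (they are named like `/tmp/.X0-lock`) so they can be inspected before anything is deleted. The function name is illustrative, and it only reports files; it does not remove them.

```python
import glob
import os
from typing import List


def find_x_lock_files(tmp_dir: str = "/tmp") -> List[str]:
    """Return X server lock files such as /tmp/.X0-lock, sorted by name."""
    return sorted(glob.glob(os.path.join(tmp_dir, ".X*-lock")))


if __name__ == "__main__":
    locks = find_x_lock_files()
    if locks:
        print("X lock files found:", ", ".join(locks))
    else:
        print("No X lock files in /tmp")
```

If a lock file is listed while lightdm is stopped, it is probably left over from a previous session, which matches the situation the installer message describes.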
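- Verification
  - Run ls /dev/nvidia*
  - If the output is /dev/nvidia0 /dev/nvidiactl /dev/nvidia-uvm, the installation succeeded.
  - If the output is ls: cannot access /dev/nvidia*: No such file or directory, or only one or two of the three device files above are listed (for example /dev/nvidia0 /dev/nvidiactl), edit the startup file with sudo vi /etc/rc.local:
    - Remove -e from the first line, #!/bin/sh -e.
    - Insert the script below before exit 0.
    - After adding the script, run sudo sh /etc/rc.local to apply it. (Sourcing the file instead would close the current shell when it reaches exit 0.)
    - Run ls /dev/nvidia* again; if all three files are listed, the installation succeeded.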
/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
    # Count the number of NVIDIA controllers found.
    NVDEVS=`lspci | grep -i NVIDIA`
    N3D=`echo "$NVDEVS" | grep "3D controller" | wc -l`
    NVGA=`echo "$NVDEVS" | grep "VGA compatible controller" | wc -l`
    N=`expr $N3D + $NVGA - 1`
    for i in `seq 0 $N`; do
        mknod -m 666 /dev/nvidia$i c 195 $i
    done
    mknod -m 666 /dev/nvidiactl c 195 255
else
    exit 1
fi
/sbin/modprobe nvidia-uvm
if [ "$?" -eq 0 ]; then
    # Find out the major device number used by the nvidia-uvm driver
    D=`grep nvidia-uvm /proc/devices | awk '{print $1}'`
    mknod -m 666 /dev/nvidia-uvm c $D 0
else
    exit 1
fi
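The verification step above boils down to checking that three device nodes exist. The sketch below expresses that check in Python; `EXPECTED_NODES` assumes a single-GPU machine (so only `/dev/nvidia0`), and the function name is illustrative.

```python
import os
from typing import Iterable, List

# Device nodes the verification step expects on a single-GPU machine.
EXPECTED_NODES = ("/dev/nvidia0", "/dev/nvidiactl", "/dev/nvidia-uvm")


def missing_nvidia_nodes(existing: Iterable[str]) -> List[str]:
    """Return the expected device nodes that are not in `existing`."""
    present = set(existing)
    return [node for node in EXPECTED_NODES if node not in present]


if __name__ == "__main__":
    found = [node for node in EXPECTED_NODES if os.path.exists(node)]
    missing = missing_nvidia_nodes(found)
    if missing:
        print("Missing device nodes:", ", ".join(missing))
    else:
        print("All NVIDIA device nodes are present")
```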
Configuring environment variables
- Run sudo gedit /etc/profile, append the lines below to the end of the file, then run source /etc/profile to apply them.
export CUDA_HOME=/usr/local/cuda-9.0
export PATH=${PATH}:${CUDA_HOME}/bin
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:${CUDA_HOME}/lib64
- After the installation finishes, run nvidia-smi to test it.
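A quick way to confirm that the /etc/profile changes took effect in the current shell is to inspect the three variables. The sketch below assumes the CUDA 9.0 install location used above (`/usr/local/cuda-9.0`); the function name is hypothetical, and it takes the environment as an argument so the logic can be tried without touching the real one.

```python
import os
from typing import List, Mapping


def missing_cuda_settings(
    env: Mapping[str, str], cuda_home: str = "/usr/local/cuda-9.0"
) -> List[str]:
    """Return the names of variables that lack the expected CUDA entries."""
    problems = []
    if env.get("CUDA_HOME") != cuda_home:
        problems.append("CUDA_HOME")
    # PATH and LD_LIBRARY_PATH are colon-separated lists on Linux.
    if cuda_home + "/bin" not in env.get("PATH", "").split(":"):
        problems.append("PATH")
    if cuda_home + "/lib64" not in env.get("LD_LIBRARY_PATH", "").split(":"):
        problems.append("LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    problems = missing_cuda_settings(os.environ)
    print("Not configured:", ", ".join(problems) if problems else "nothing")
```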
Installing cuDNN (this requires registering an NVIDIA account)
Download: cuDNN 7.2.1
- Install (the first package is enough):
sudo dpkg -i libcudnn7_7.2.1.38-1+cuda9.0_amd64.deb
Problem during installation
# error:
/sbin/ldconfig.real: /usr/lib/nvidia-375/libEGL.so.1 is not a symbolic link
# solution:
https://askubuntu.com/questions/900285/libegl-so-1-is-not-a-symbolic-link
The second script on that page works.
Testing after installation
- Run
python
import tensorflow
print(tensorflow.__version__)
- Runtime problem:
2018-08-31 16:11:56.214798: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-31 16:11:56.217297: E tensorflow/stream_executor/cuda/cuda_driver.cc:397] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2018-08-31 16:11:56.217342: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:157] no NVIDIA GPU device is present: /dev/nvidia0 does not exist
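- Running nvidia-smi reports:
Failed to initialize NVML: Driver/library version mismatch
- Cause analysis:
  - The CUDA installation failed: reinstall CUDA, following the "CUDA installation" reference.
  - The installed CUDA version is wrong: install the matching version. Alternatively, the CUDA version was updated automatically, in which case rebooting the machine resolves it.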
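The failure messages above can be matched against the remedies described in these notes. The sketch below is only an illustration of that lookup: the table of markers and hints is assembled from this page, not from TensorFlow or NVIDIA documentation, and the names are hypothetical.

```python
from typing import List, Tuple

# (marker text, hint) pairs; the hints paraphrase the notes on this page.
KNOWN_FAILURES: List[Tuple[str, str]] = [
    ("CUDA_ERROR_NO_DEVICE",
     "No GPU device visible: check `ls /dev/nvidia*` and the rc.local script."),
    ("/dev/nvidia0 does not exist",
     "Device node missing: create it via the rc.local script, or reinstall CUDA."),
    ("Driver/library version mismatch",
     "Versions out of sync: reboot, or install the matching CUDA version."),
]


def diagnose(output: str) -> List[str]:
    """Return a hint for every known failure marker found in `output`."""
    return [hint for marker, hint in KNOWN_FAILURES if marker in output]


if __name__ == "__main__":
    sample = "Failed to initialize NVML: Driver/library version mismatch"
    for hint in diagnose(sample):
        print(hint)
```

The output of `nvidia-smi` or the TensorFlow startup log can be pasted into `diagnose` as a single string.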
References
TensorFlow official installation guide
CUDA installation
cuDNN installation
Miscellaneous
Internet access through a proxy
- On Windows, install CCProxy
- On Ubuntu, set the environment variables
# proxy environment variables
MY_PROXY_URL=http://XXX.XX.XX.XX:808/
export ftp_proxy=${MY_PROXY_URL}
export http_proxy=${MY_PROXY_URL}
export https_proxy=${MY_PROXY_URL}
# or pass the proxy to apt-get directly
sudo apt-get -o Acquire::http::proxy="http://XXX.XX.XX.XX:808/" update
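When downloads are driven from a Python script rather than an interactive shell, the same proxy variables can be passed to a child process. This is a minimal sketch; `with_proxy` is an illustrative helper, and `http://proxy.example:808/` is a placeholder standing in for the masked address above.

```python
import os
from typing import Dict, Mapping, Optional


def with_proxy(
    proxy_url: str, base_env: Optional[Mapping[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment with the proxy variables set."""
    env = dict(os.environ if base_env is None else base_env)
    for name in ("ftp_proxy", "http_proxy", "https_proxy"):
        env[name] = proxy_url
    return env


if __name__ == "__main__":
    # Placeholder URL; substitute the real proxy address.
    env = with_proxy("http://proxy.example:808/")
    # The result can be passed as subprocess.run([...], env=env).
    print(env["http_proxy"])
```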