1.
2. Problems
2.1 Help: TensorFlow is told to use GPUs 2 and 3 but automatically occupies GPU 0
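Problem: setting os.environ["CUDA_VISIBLE_DEVICES"] = "1" to select GPU 1 has no effect, and the program always uses GPU 0. The cause is the line print("GPU status:", tf.test.is_gpu_available()). No TensorFlow function may be called before the environment variables that select the GPU are set, because the first such call initializes the GPUs using whatever settings are in effect at that moment.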
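import os
import tensorflow as tf
# Path of the checkpoint and log files saved after training
train_log_file = 'miniImageNet-better-cs.ckpt'
# Problem: this TensorFlow call runs before the GPU is selected below
print("GPU status:", tf.test.is_gpu_available())
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" # takes effect, but the run result is wrong
# gpu_flag is assumed to be a GPU id string such as "1", defined elsewhere in the script
os.environ["CUDA_VISIBLE_DEVICES"] = gpu_flag
#os.environ["CUDA_VISIBLE_DEVICES"] = "1"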
Run log before the change:
2020-07-27 12:18:22.763620: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0, 1, 2, 3, 4, 5, 6, 7
2020-07-27 12:18:22.772153: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-27 12:18:22.772171: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0 1 2 3 4 5 6 7
2020-07-27 12:18:22.772181: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N N N N N N N N
2020-07-27 12:18:22.772188: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 1: N N N N N N N N
2020-07-27 12:18:22.772195: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 2: N N N N N N N N
2020-07-27 12:18:22.772202: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 3: N N N N N N N N
2020-07-27 12:18:22.772209: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 4: N N N N N N N N
2020-07-27 12:18:22.772216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 5: N N N N N N N N
2020-07-27 12:18:22.772222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 6: N N N N N N N N
The fix is to move the tf.test.is_gpu_available() statement so that it runs after the environment variables are set:
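import os
import tensorflow as tf
# Path of the checkpoint and log files saved after training
train_log_file = 'miniImageNet-better-cs.ckpt'
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
# gpu_flag is assumed to be a GPU id string such as "1", defined elsewhere in the script
os.environ["CUDA_VISIBLE_DEVICES"] = gpu_flag
#os.environ["CUDA_VISIBLE_DEVICES"] = "1"
# Query the GPU only after the visible devices have been selected
print("GPU status:", tf.test.is_gpu_available())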
Run log after the change:
2020-07-27 12:25:40.070721: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2020-07-27 12:25:40.070792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-27 12:25:40.070801: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2020-07-27 12:25:40.070810: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2020-07-27 12:25:40.070915: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10238 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:88:00.0, compute capability: 7.5)
2020-07-27 12:25:42.117883: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2020-07-27 12:25:42.117947: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-07-27 12:25:42.117956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2020-07-27 12:25:42.117964: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2020-07-27 12:25:42.118029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/device:GPU:0 with 10238 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:88:00.0, compute capability: 7.5)
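The ordering rule above (select the GPU first, call TensorFlow afterwards) can be wrapped in a small helper. The following is only a sketch: select_gpus and its strict parameter are hypothetical names that do not appear in the original script, and the check on sys.modules is a deliberately conservative assumption, since importing TensorFlow by itself does not necessarily initialize the GPUs.

```python
import os
import sys


def select_gpus(gpu_ids, strict=True):
    """Restrict CUDA to the given physical GPU ids (hypothetical helper).

    Call this before any TensorFlow function that touches the GPU.
    Returns the value written to CUDA_VISIBLE_DEVICES.
    """
    # Conservative guard (an assumption of this sketch): if TensorFlow is
    # already imported, it may already have initialized the GPUs, and the
    # variables set below would then be ignored.
    if strict and "tensorflow" in sys.modules:
        raise RuntimeError("call select_gpus() before importing tensorflow")

    # Number the devices by PCI bus id so the ids match nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return os.environ["CUDA_VISIBLE_DEVICES"]


# Intended usage, with TensorFlow imported only after the selection:
#   select_gpus([2, 3])
#   import tensorflow as tf
#   print("GPU status:", tf.test.is_gpu_available())
```

Inside the process the selected GPUs are renumbered from 0, which is why the log after the change lists only device 0 even though a different physical GPU was chosen.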
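References
[1] Is TensorFlow running on the GPU or the CPU
[2] Different ways to specify GPU or CPU for TensorFlow training
[3] Specifying which GPU TensorFlow uses to run a program
[4] Help: TensorFlow is told to use GPUs 2 and 3 but automatically occupies GPU 0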