After successfully setting up KittiSeg on Windows 10, I downloaded a random image from the web to test it, and it turned out not to work.
Result on the random image:
Result on KITTI data:
My guess is that the KITTI dataset is too small and the model overfits, so it generalizes poorly.
I plan to train my own model on Baidu's road_seg dataset, using the seg_3 data.
First, modify the data to turn the labels into a binary classification problem:
import numpy as np
import cv2

# test on one image first
# img = cv2.imread('Label/Record057/Camera 6/171206_042426425_Camera_6_bin.png')
# print(img.shape)
# img[np.where((img != [49, 49, 49]).all(axis=2))] = [255, 0, 0]
# img[np.where((img == [49, 49, 49]).all(axis=2))] = [255, 0, 255]
# cv2.imwrite('a.png', img)
# image = scipy.misc.imresize(img, 0.6)
# print(image.shape)

files = [line for line in open('all2.txt')]
file = files[1000:]
num = 0
for path in file:
    path = path[:-1]  # strip the trailing '\n'; it counts as a character, otherwise imread returns None
    img = cv2.imread(path)
    print(img.shape)
    # everything that is not the road color [49, 49, 49] becomes background, road becomes [255, 0, 255]
    img[np.where((img != [49, 49, 49]).all(axis=2))] = [255, 0, 0]
    img[np.where((img == [49, 49, 49]).all(axis=2))] = [255, 0, 255]
    cv2.imwrite(path, img)
    print(num)
    num = num + 1
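Since the labels are rewritten in place, it is worth verifying afterwards that a converted file really contains only the two target colors. A minimal numpy-only check sketch (the function name is my own; pair it with cv2.imread as above):

```python
import numpy as np

def is_binary_label(img, colors=((255, 0, 0), (255, 0, 255))):
    """Check that every pixel of an HxWx3 array is one of the allowed colors."""
    allowed = np.array(colors)
    # compare every pixel against every allowed color; a pixel passes
    # if it matches one of them exactly on all three channels
    matches = (img[:, :, None, :] == allowed[None, None, :, :]).all(axis=-1)
    return bool(matches.any(axis=-1).all())
```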
Create and split train.txt / val.txt
a. First create all.txt
Read the image paths and the label paths into two separate txt files (all1.txt and all2.txt), then merge them.
import os
import glob as gb
# import cv2

all_file = 'all2.txt'
with open(all_file, 'w') as a:
    # for root, dirs, files in os.walk('ColorImage'):
    for root, dirs, files in os.walk('Label'):
        if len(dirs) == 0:
            for i in range(len(files)):
                if files[i][-3:] == 'png':
                    file_path = root + '/' + files[i]
                    a.writelines(file_path + '\n')
When merging the two txt files I found 30331 images but 30350 labels. The counts do not match, so the two lists have to be compared and the unmatched entries dropped.
# this first version compares line by line, i.e. only entries on the same line, which is not what I want
Each line ends with a '\n' added earlier; it also counts as a character, so it has to be stripped.
#with open('all.txt', 'w') as a:
#    with open('all1.txt', 'r') as a1, open('all2.txt') as a2:
#        for line1, line2 in zip(a1, a2):
#            #print(line1[10:-5], line2[5:-9])
#            if line1[10:-5] == line2[5:-9]:
#                print('o')
#                a.writelines(line1[:-1] + ' ' + line2[:-1] + '\n')
with open('all.txt', 'w') as a:
    with open('all1.txt', 'r') as a1, open('all2.txt') as a2:
        lines1 = [line for line in a1]
        lines2 = [line for line in a2]
        for line1 in lines1:
            #print(line1 + 'a')
            for line2 in lines2:
                # the slices strip the differing prefix/suffix so the shared part of the paths can be compared
                if line1[10:-5] == line2[5:-9]:
                    a.writelines(line1[:-1] + ' ' + line2[:-1] + '\n')
                    #print(line1[10:-5], line2[5:-9])
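The nested loop above compares every image line against every label line, which is O(n²) over ~30000 entries, and the hard-coded slices `line1[10:-5]` / `line2[5:-9]` break if the directory names change. A sketch of an alternative that keys both lists on the shared basename (assuming, as in the paths in this dataset, that a label filename is the image basename plus `_bin`; the helper names are my own):

```python
import os

def pair_key(path):
    """Basename without extension and without a trailing '_bin'."""
    stem = os.path.splitext(os.path.basename(path.strip()))[0]
    return stem[:-4] if stem.endswith('_bin') else stem

def match_pairs(image_lines, label_lines):
    """Pair image and label paths by shared key, dropping unmatched entries."""
    labels = {pair_key(line): line.strip() for line in label_lines}
    return [(line.strip(), labels[pair_key(line)])
            for line in image_lines if pair_key(line) in labels]
```

Building the dict once makes each lookup O(1), so the whole merge is linear in the number of lines.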
Randomly split into a training set and a validation set. Since the full dataset is large and I first want to sanity-check the algorithm, I start with 1000 samples: 900 for training, 100 for validation.
from random import shuffle

def make_val_split():
    """
    Splits the images into train and val.
    Assumes a file all.txt in data_folder.
    """
    all_file = "alls.txt"
    train_file = "trains.txt"
    test_file = "vals.txt"
    test_num = 100
    files = [line for line in open(all_file)]
    shuffle(files)
    train = files[:-test_num]
    test = files[-test_num:]
    #train_file = os.path.join(data_folder, train_file)
    #test_file = os.path.join(data_folder, test_file)
    with open(train_file, 'w') as file:
        for label in train:
            file.write(label)
    with open(test_file, 'w') as file:
        for label in test:
            file.write(label)

def main():
    make_val_split()

if __name__ == '__main__':
    main()
Problems encountered while running train.py:
Error 1
image_file, gt_image_file = file.split(" ")
ValueError: too many values to unpack (expected 2)
Looking at my trains.txt, a line such as ColorImage\Record008\Camera 5/171206_030617006_Camera_5.jpg Label\Record008\Camera 5/171206_030617006_Camera_5_bin.png really does contain three spaces per line, because of the space inside "Camera 5".
So the code needs to be modified.
Modify line 114 of kitti_seg_input.py; after the change it reads:
#image_file, gt_image_file = file.split(" ")
image_file1, image_file2, gt_image_file1, gt_image_file2 = file.split(" ")
image_file = image_file1 + " " + image_file2
gt_image_file = gt_image_file1 + " " + gt_image_file2
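This fix hard-codes exactly one extra space per path. A slightly more general sketch splits at the image extension instead (assuming, as in this dataset, that the image path ends in `.jpg` and the label path follows after a single space; `split_pair` is my own name, not KittiSeg's):

```python
def split_pair(line):
    """Split 'image.jpg label.png' even when the paths themselves contain spaces."""
    sep = line.find('.jpg ')  # boundary between image path and label path
    if sep == -1:
        raise ValueError('no image path ending in .jpg found: %r' % line)
    image_file = line[:sep + 4]          # up to and including '.jpg'
    gt_image_file = line[sep + 5:].rstrip('\n')  # everything after the space
    return image_file, gt_image_file
```

This keeps working no matter how many spaces the directory names contain.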
Error 2:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1,64,2701,3367] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: conv1_2/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](conv1_1/Relu, conv1_2/filter/read)]]
Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.19GiB. Current allocation summary follows.
According to https://github.com/tensorflow/models/issues/3393 this is most likely a batch-size problem, but my batch size is already 1. When GPU memory is short, the options are to resize the images or to switch to a simpler network.
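The size the allocator asked for can be sanity-checked from the tensor shape in the log: a single conv1_2 activation at full resolution is already over 2 GiB (the bfc allocator rounds allocations up, which would explain the slightly larger 2.19 GiB in the message):

```python
# activation tensor from the OOM message: shape [1, 64, 2701, 3367], float32
n, c, h, w = 1, 64, 2701, 3367
bytes_per_elem = 4  # float32
size_gib = n * c * h * w * bytes_per_elem / 2**30
print(round(size_gib, 2))  # → 2.17
```

Scaling both sides of the image by 0.6 shrinks every such activation to 0.36 of this size, which is why resizing helps.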
Since my images are large, (2710, 3384), I resize them first.
Following https://docs.scipy.org/doc/scipy/reference/generated/scipy.misc.imresize.html I use scipy.misc.imresize(img, 0.8) to shrink the images to 0.8 of the original size; 0.8 still does not fit, so I change it to 0.6.
img = scipy.misc.imread(image_file, mode='RGB')
image = scipy.misc.imresize(img, 0.6)  # 0.8 still ran out of memory, so 0.6
# Please update scipy if mode='RGB' is not available
gt_img = scp.misc.imread(gt_image_file, mode='RGB')
gt_image = scipy.misc.imresize(gt_img, 0.6)
Error 3:
The error occurs during the evaluation step in the middle of training:
hypes, sess, tv_graph['image_pl'], tv_graph['inf_out'])
File "D:\work\KittiSeg-hotfix-python3.5_support\hypes\../evals/kitti_eval.py", line 71, in evaluate
image_file, gt_file = datum.split(" ")
ValueError: too many values to unpack (expected 2)
Modify kitti_eval.py in the same way as the input code above, and it works.
With that, the training run starts.
I tested yesterday's trained model on my own data and the mask came out all white, i.e. every pixel was predicted as road, so I set out to find the cause. Feeding in validation images gave the same result, and so did training images. That means the fit itself already went wrong, so the annotations must be the problem.
Results on the validation and training sets:
Check the .json file.
What I specified is
"road_color" : [255,0,255],
"background_color" : [255,0,0],
unchanged, the same as KITTI's. But my processed labels come out blue while KITTI's are red, so something is clearly off.
Looking back at the code I used to modify the data:
It does follow KITTI's colors. Checking KITTI's labels and mine with a color picker, the blue and red channel values turn out to be swapped: KITTI has the red channel at 255, mine has the blue channel at 255. The reason is that OpenCV reads and writes images in BGR order, so the B and R channels are reversed. I changed background_color in the .json file to [0,0,255].
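The channel swap is easy to demonstrate: OpenCV stores pixels in BGR order, while scipy.misc.imread with mode='RGB' (and the colors in the .json) use RGB, so the same array means different colors depending on the convention:

```python
import numpy as np

# the background value [255, 0, 0] written through cv2.imwrite is stored
# in BGR order, i.e. it is a blue pixel when interpreted as RGB
bgr = np.array([[[255, 0, 0]]], dtype=np.uint8)  # blue to OpenCV
rgb = bgr[:, :, ::-1]                            # reverse channels: BGR -> RGB
print(tuple(int(v) for v in rgb[0, 0]))          # → (0, 0, 255)
```

This is exactly why background_color has to become [0,0,255] in the .json: it must describe the colors as they actually ended up in the label files.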
Restart training.
Next time I should double-check everything before training; a bug like this wasted a whole day.