Deep learning has been a hot technology in recent years, and many people have poured into the field. For newcomers, getting started is easy: anyone can train a simple model and tune a few parameters. As long as a company has high-quality data, almost any graduate or undergraduate student can reproduce a state-of-the-art model and deploy it straight into an application. The hard part is finding the right application direction, and the future of deep learning will surely be a race to apply these techniques to products quickly and well. I have close to a year of experience in this area. I started with defect detection, working mostly with classification models, and later moved on to segmentation, detection, and so on. Along the way I stepped into quite a few pitfalls and learned a lot, going from chasing accuracy on a server to deploying models directly on mobile with both efficiency and accuracy. Using traffic sign classification as an example, this article walks newcomers through the full pipeline of a deep learning application: finding data, training a model, and deploying it to a production environment.
Outline
Data sources
Model training
Model conversion
Model deployment
Data sources
For those of us building applications, the most important thing is data. Data is often one of an algorithm company's main assets. So how do you obtain data for your own problem? The short answer: a large public dataset > transfer learning > labeling your own.
If a large public dataset exists for your problem, use it: it saves you the trouble of hunting for data and lets you focus on model selection, parameter tuning, and so on. Here are a few places to search for CV datasets:
- Google's recently launched dataset search engine: https://toolbox.google.com/datasetsearch
- Kaggle, whose competitions often come with public data: https://www.kaggle.com
- Plain Google search
The last option mainly tests your search skills (and, from China, inevitably means getting past the firewall).
When using public datasets, also follow the corresponding licenses and check whether commercial use is allowed. Frankly, Chinese companies tend not to pay much attention to this.
If no large public dataset exists, look for a smaller one, and then do transfer learning on top of model weights pretrained on a large dataset.
When I first joined my current company, I built a portrait segmentation model for them, essentially the same technique that powers the custom sticker feature in the latest WeChat. We only had about 2,000 images, so we first trained an object segmentation model on Pascal data and then fine-tuned it into a portrait model with our own portrait data, which worked quite well. See my other article:
Tensorflow Mobile Model Conversion
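The fine-tuning recipe described above can be sketched roughly as follows. This is a minimal, hypothetical classification example, not the portrait-segmentation model itself; `weights=None` keeps the sketch self-contained, whereas in practice you would load pretrained weights and fine-tune from them.

```python
# Hypothetical sketch: freeze a pretrained backbone, train only a new head.
from keras.applications.mobilenet import MobileNet
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

# weights=None keeps the example self-contained; in practice use pretrained weights.
base = MobileNet(input_shape=(112, 112, 3), include_top=False, weights=None)
for layer in base.layers:
    layer.trainable = False  # freeze the backbone; only the new head is trained

x = GlobalAveragePooling2D()(base.output)
out = Dense(62, activation='softmax')(x)  # e.g. 62 traffic-sign classes

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```

With so little data, freezing most of the backbone and training only the last layers usually generalizes far better than training from scratch.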
If all else fails, you can label data yourself, but that is costly and raises accuracy concerns. This mainly applies to small companies; for large companies, data is one of their moats.
In this article we use a traffic sign dataset.
Download the training and test data:
BelgiumTSC_Training (171.3MBytes)
BelgiumTSC_Testing (76.5MBytes)
After downloading, name the folders train and val and put them under a traffic_sign directory:
.
├── data
│ └── traffic_sign
│ ├── train
│ └── val
Note that we use this dataset for learning purposes only.
Once you have the data, it is best to skim through all of it: check the data quality and count the samples per class, so you have a rough picture of the dataset.
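That per-class count can be sketched as follows (paths are hypothetical), assuming the flow_from_directory-style layout shown above:

```python
# Count how many .ppm images each class folder under train/ contains.
import os
import glob
from collections import Counter

def count_per_class(root):
    """Return {class_name: num_images} for a flow_from_directory-style layout."""
    counts = Counter()
    for cls in sorted(os.listdir(root)):
        cls_dir = os.path.join(root, cls)
        if os.path.isdir(cls_dir):
            counts[cls] = len(glob.glob(os.path.join(cls_dir, '*.ppm')))
    return dict(counts)

# Example (path is hypothetical):
# print(count_per_class('./data/traffic_sign/train'))
```

Sorting the result by count immediately shows how imbalanced the classes are, which motivates the class weighting used later during training.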
Model training
Here we train with the Keras framework on a Tensorflow backend. For a typical company building mobile applications, I find it quite convenient to train in Keras and then convert to a mobile inference framework.
I won't introduce Keras usage here; if you need it, my other articles cover it extensively. Below is an easy-to-use interface for training networks and reading data that saves a lot of repetitive work:
train.py
"""
Easy to use train script for different kinds of networkds and dataset...
@author: Vincent
"""
import os
import glob
from collections import Counter
import numpy as np
import keras
from keras.optimizers import SGD
import keras.backend as K
from keras.models import load_model
from keras.preprocessing.image import ImageDataGenerator
from keras import callbacks
import argparse
from simplenet import SimpleNet
from learning_rate import create_lr_schedule
if __name__ == "__main__":
ap = argparse.ArgumentParser()
ap.add_argument(
'--dataset',
type=str,
default='traffic_sign',
help='directory name of dataset, which should have structure ./train ./val and according classes to suit flow from directory'
)
ap.add_argument(
'--batch_size',
type=int,
default=16,
help='training batch size'
)
ap.add_argument(
'--input_shape',
type=list,
default=(112,112,3),
help='input image shape',
)
ap.add_argument(
'--epochs',
type=int,
default=100,
help='training epochs'
)
ap.add_argument(
'--class_weight_balance_mode',
type=bool,
default=True,
help='whether to enable class weights mode to deal with classs unbalance'
)
ap.add_argument(
'--model',
type=str,
default="SimpleNet",
help="which model to use to train"
)
args = vars(ap.parse_args())
num_classes = len([f for f in os.listdir(os.path.join('/Users/yuhua.cheng/Opt/temp/traffic_sign/data/{0}'.format(args['dataset']),'train'))
if os.path.isdir(os.path.join('/Users/yuhua.cheng/Opt/temp/traffic_sign/data/{0}/train/'.format(args['dataset']),f))])
print("num_classes:", num_classes)
num_train_samples = len(glob.glob('/Users/yuhua.cheng/Opt/temp/traffic_sign/data/{0}/train/*/*.ppm'.format(args['dataset'])))
num_val_samples = len(glob.glob('/Users/yuhua.cheng/Opt/temp/traffic_sign/data/{0}/val/*/*.ppm'.format(args['dataset'])))
if args['class_weight_balance_mode']:
trained_model_path = './models/{0}_with_class_weights.h5'.format(args['dataset'])
else:
trained_model_path = './models/{0}_without_class_weights.h5'.format(args['dataset'])
train_gen = ImageDataGenerator(
rescale = 1/255.,
samplewise_center=True,
samplewise_std_normalization=True,
rotation_range=15,
zoom_range=0.15,
width_shift_range=0.1,
height_shift_range=0.1,
horizontal_flip=True,
)
val_gen = ImageDataGenerator(
rescale = 1/255.,
samplewise_center=True,
samplewise_std_normalization=True
)
train_iter = train_gen.flow_from_directory('/Users/yuhua.cheng/Opt/temp/traffic_sign/data/{0}/train'.format(args['dataset']),
target_size=args['input_shape'][0:2],
batch_size=args['batch_size'],
# color_mode='grayscale',
# save_to_dir='./aug_train',
class_mode='categorical',
interpolation='bicubic')
val_iter = train_gen.flow_from_directory('/Users/yuhua.cheng/Opt/temp/traffic_sign/data/{0}/val'.format(args['dataset']),
target_size=args['input_shape'][0:2],
batch_size=args['batch_size'],
# color_mode='grayscale',
# save_to_dir='./aug_val',
class_mode='categorical',
interpolation='bicubic')
# 針對樣本不均衡問題進(jìn)行weight balance
class_weight = {}
counter = Counter(train_iter.classes)
max_val = float(max(counter.values()))
class_weights = {class_id:max_val/num_images for class_id, num_images in counter.items()}
print("class_weights for samples:", class_weights)
#
model = locals()[args['model']](input_shape=args['input_shape'], num_classes=num_classes)
# sgd = SGD(lr=1e-1, decay=1e-6, momentum=0.9, nesterov=True)
sgd = keras.optimizers.Adadelta()
# create callbacks
tensorboard = callbacks.TensorBoard(log_dir='./logs', write_graph=False)
learning_rate = callbacks.LearningRateScheduler(create_lr_schedule(epochs=args['epochs'], lr_base=0.01, mode='progressive_drops'))
callbacks = [tensorboard, learning_rate]
# compile the model
model.compile(optimizer=sgd, loss='categorical_crossentropy', metrics=['accuracy'])
# train the model
if args['class_weight_balance_mode']:
history = model.fit_generator(
generator = train_iter,
steps_per_epoch = num_train_samples // args['batch_size'],
epochs=args['epochs'],
validation_data = val_iter,
validation_steps = num_val_samples // args['batch_size'],
class_weight = class_weights,
verbose = 1,
callbacks = callbacks)
else:
history = model.fit_generator(
generator = train_iter,
steps_per_epoch = num_train_samples // args['batch_size'],
epochs = args['epochs'],
validation_data = val_iter,
validation_steps = num_val_samples // args['batch_size'],
verbose = 1,
callbacks = callbacks)
model.save(trained_model_path)
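The max/count class weighting in the script is easiest to see on a toy label distribution: the most frequent class gets weight 1.0, and rarer classes get proportionally larger weights so their errors count more in the loss.

```python
from collections import Counter

# Toy label distribution: class 0 has 4 samples, class 1 has 2, class 2 has 1.
labels = [0, 0, 0, 0, 1, 1, 2]
counter = Counter(labels)
max_val = float(max(counter.values()))
class_weights = {class_id: max_val / n for class_id, n in counter.items()}
print(class_weights)  # {0: 1.0, 1: 2.0, 2: 4.0}
```

Passed to fit_generator via class_weight, this makes a mistake on the rarest class cost four times as much as one on the most common class.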
My network architecture:
simplenet.py
"""
my simplenet for experiments
"""
import keras
from keras.models import load_model, Model
from keras import regularizers, optimizers
from keras.layers import Input, Conv2D, Activation, Dense, Flatten
from keras.layers import BatchNormalization, Dropout
from keras.layers import MaxPooling2D, GlobalMaxPooling2D, GlobalAveragePooling2D
from keras.datasets import cifar10
def conv2d_bn_drop(x, filters, kernel_size=3, strides=1, padding='same', activation='relu', use_bias=False, dropout_rate=0, name=None):
"""Utility fucntion to apply conv + BN + dropout
# Arguments:
# Returns:
Output tensor after applying 'Conv2D' and 'BatchNormalization' and "DropOut'
"""
if name is not None:
conv_name = name + '_conv'
bn_name = name + '_bn'
drop_name = name + '_dropout'
ac_name = name + '_' + activation
else:
conv_name = None
bn_name = None
drop_name = name + '_dropout'
x = Conv2D(filters, kernel_size, strides=strides, padding=padding, use_bias=use_bias, name=conv_name)(x)
x = BatchNormalization(axis=-1, scale=False, name=bn_name)(x)
x = Activation(activation, name=ac_name)(x)
x = Dropout(rate=dropout_rate, name=drop_name)(x)
return x
def conv2d_bn_pooling_drop(x, filters, kernel_size=3, strides=1, padding='same', activation='relu', use_bias=False, pooling="max", dropout_rate=0, name=None):
"""Utility fucntion to apply conv + BN + dropout
# Arguments:
# Returns:
Output tensor after applying 'Conv2D' and 'BatchNormalization' and "DropOut'
"""
if name is not None:
conv_name = name + '_conv'
bn_name = name + '_bn'
drop_name = name + '_dropout'
ac_name = name + '_' + activation
else:
conv_name = None
bn_name = None
drop_name = name + '_dropout'
x = Conv2D(filters, kernel_size, padding=padding, use_bias=use_bias, name=conv_name)(x)
x = BatchNormalization(axis=-1, scale=False, name=bn_name)(x)
if pooling == 'max':
x = MaxPooling2D(pool_size=(2,2), strides=2, padding='valid')(x)
else:
x = AveragePooling2D(pool_size=(2,2), strides=2, padding='valid')(x)
x = Activation(activation, name=ac_name)(x)
x = Dropout(rate=dropout_rate, name=drop_name)(x)
return x
def conv2d_pooling_bn_drop(x, filters, kernel_size=3, strides=1, padding='same', activation='relu', use_bias=False, pooling="max", dropout_rate=0, name=None):
"""Utility fucntion to apply conv + BN + dropout
# Arguments:
# Returns:
Output tensor after applying 'Conv2D' and 'BatchNormalization' and "DropOut'
"""
if name is not None:
conv_name = name + '_conv'
bn_name = name + '_bn'
drop_name = name + '_dropout'
ac_name = name + '_' + activation
else:
conv_name = None
bn_name = None
drop_name = name + '_dropout'
x = Conv2D(filters, kernel_size, padding=padding, use_bias=use_bias, name=conv_name)(x)
if pooling == 'max':
x = MaxPooling2D(pool_size=(2,2), strides=2, padding='valid')(x)
else:
x = AveragePooling2D(pool_size=(2,2), strides=2, padding='valid')(x)
x = BatchNormalization(axis=-1, scale=False, name=bn_name)(x)
x = Activation(activation, name=ac_name)(x)
x = Dropout(rate=dropout_rate, name=drop_name)(x)
return x
def SimpleNet(input_tensor=None, stride=2, weight_decay=1e-2, pooling="Max", act='relu',
input_shape=(227,227,3), num_classes=10):
s = stride
act = 'relu'
if input_tensor is None:
input_tensor = Input(shape=input_shape)
x = conv2d_bn_drop(input_tensor, 64, (7,7), strides=2, padding='same', activation='relu', name="block1_0")
x = conv2d_bn_drop(x, 64, (3,3), padding='same', activation='relu', name="block1_1")
x = conv2d_bn_drop(x, 96, (3,3), padding='same', activation='relu', name="block2_0")
x = conv2d_bn_pooling_drop(x, 96, (3,3), padding='same', activation='relu', name="block2_1")
x = conv2d_bn_drop(x, 96, (3,3), padding='same', activation='relu', name="block2_2")
x = conv2d_bn_drop(x, 128, (3,3), padding='same', activation='relu', name="block3_0")
x = conv2d_pooling_bn_drop(x, 128, (3,3), padding='same', activation='relu', name="block4_0")
x = conv2d_bn_drop(x, 160, (3,3), padding='same', activation='relu', name="block4_1")
x = conv2d_bn_pooling_drop(x, 160, (3,3), padding='same', activation='relu', dropout_rate=0.3, name="block4_2")
x = Conv2D(filters=256, kernel_size=(3,3), strides=1, padding="same", activation='relu', name='block5_0_conv')(x)
x = Conv2D(filters=512, kernel_size=(3,3), strides=1, padding="same", activation='relu', name='cccp5')(x)
x = MaxPooling2D(pool_size=(2,2), strides=2, padding='valid', name='poolcp5')(x)
x = Conv2D(filters=512, kernel_size=(3,3), strides=2, padding="same", activation='relu', name='cccp6')(x)
x = GlobalAveragePooling2D()(x)
x = Dense(num_classes)(x)
x = Activation('softmax')(x)
model = Model(inputs=input_tensor, outputs=x)
model.summary()
return model
if __name__ == '__main__':
input_tensor = Input(shape=(227, 227,3))
model = SimpleNet(input_tensor)
With the data and network definition ready, pass the appropriate arguments to the train.py script and start training.
After 100 epochs we reach an accuracy of about 0.945-0.95, so the model works reasonably well.
After training, you generally want to test the model in a realistic setting:
Test script:
import cv2
import os
import glob
import numpy as np
from matplotlib import pyplot as plt
from keras.models import load_model
from imageio import imread

image_files = [f for f in os.listdir('./data/traffic_sign/test') if not f.startswith('.')]
classes = sorted(os.listdir('./data/traffic_sign/val'))
model = load_model('./models/traffic_sign_with_class_weights.h5')
model.summary()
for image_file in image_files:
    img = imread(os.path.join('./data/traffic_sign/test', image_file))
    plt.subplot(1, 2, 1)
    plt.imshow(img)
    plt.title("img")
    img = cv2.resize(img, (112, 112))
    img = img.astype("float32")
    img = (img - np.mean(img)) / np.std(img)  # samplewise standardization, as at training time
    img = np.expand_dims(img, 0)
    label = np.argmax(model.predict(img))
    # show the first training image of the predicted class next to the input
    label_image = imread(glob.glob('./data/traffic_sign/train/{0}/*.ppm'.format(classes[label]))[0])
    plt.subplot(1, 2, 2)
    plt.imshow(label_image)
    plt.title("predicted img")
    plt.show()
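The hand-rolled preprocessing in this script is meant to mirror the ImageDataGenerator settings used at training time (samplewise_center plus samplewise_std_normalization). A quick numpy sanity check:

```python
import numpy as np

# Per-image standardization: subtract the image's own mean, divide by its std.
# (The training generator also rescales by 1/255 first, but a constant scale
# factor cancels out after standardization.)
rng = np.random.RandomState(0)
img = rng.randint(0, 256, (112, 112, 3)).astype('float32')
img = (img - np.mean(img)) / np.std(img)
print(np.mean(img), np.std(img))  # approximately 0 and 1
```

Matching the training-time preprocessing exactly at inference time is one of the most common silent failure points when porting a model, so a check like this is worth the two minutes.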
Looks decent.
Many write-ups stop here: tune the parameters, reach a certain accuracy, look at a few test samples, and call it done. But there is still some way to go from having a model to using it in production. The rest of this article shows how to port the trained model to mobile and build a genuinely real-time usable app.
Model conversion
This section shows how to convert the model for a mobile inference framework.
There are many mobile inference frameworks: CoreML, Tensorflow Lite, Caffe2, and so on. For an overview, see my article Tensorflow Mobile Model Conversion. In China, ncnn has a good reputation for speed and a fairly large user base. Here we choose Apple's own CoreML, which is easy to get started with: beyond converting the model to the right format, little configuration is needed. I am not an iOS developer myself and built on others' prior work with some modifications.
The first problem is converting the Keras model into a format CoreML can use. Here is a conversion script (interfaces change between versions; this one uses Python 2, Keras 2.1.6, tensorflow 1.12.0):
import coremltools
import keras
from keras.models import load_model
from keras.utils.generic_utils import CustomObjectScope

class_labels = [str(i) for i in range(62)]

with CustomObjectScope({'relu6': keras.applications.mobilenet.relu6}):
    keras_model = load_model('traffic_sign_with_class_weights.h5')
    coreml_model = coremltools.converters.keras.convert(
        keras_model,
        input_names=['input_1'],
        image_input_names='input_1',
        output_names='activation_1',
        image_scale=2 / 255.0,
        red_bias=-1,
        green_bias=-1,
        blue_bias=-1,
        class_labels=class_labels)
    coreml_model.save('traffic_sign_with_class_weights.mlmodel')
For the meaning of the individual parameters, see my Tensorflow Mobile Model Conversion article.
Once the conversion succeeds, we have a deep learning model usable under CoreML. All that remains is calling it correctly from an iOS project, which takes only a little iOS development knowledge.
If you are interested, check out this github list:
https://github.com/likedan/Awesome-CoreML-Models
It contains many CoreML demos that you can use as a basis for your own development.
The final result; the classification labels follow the order of the class subfolders in the training data:
With that, we have completed a full deep learning application. This is just a starting point; building applications with other functionality follows much the same workflow.
I hope this article helps those getting started with computer vision. Feel free to leave a comment or message me with any questions.
Todo:
Finish an ncnn demo