Face recognition is a biometric technology that identifies a person from facial features. A camera captures face images or a video stream, automatically detects and tracks the faces in it, and then applies face-related processing: face detection, facial landmark detection, face verification, and so on. Alipay's "Paying with Your Face" made MIT Technology Review's 2017 list of ten breakthrough technologies.
Advantages of face recognition: it is non-intrusive (the capture is hard to notice, and face images can be acquired proactively), contactless (the user does not need to touch the device), and concurrent (multiple faces can be detected, tracked, and recognized at once). Before deep learning, face recognition took two steps: extracting high-dimensional hand-crafted features, then reducing their dimensionality. Traditional face recognition works on visible-light images. Deep learning plus big data (massive labeled face datasets) is now the mainstream technical route: a recognition model is trained on a large number of sample images, learns its features by itself during training instead of relying on hand-picked ones, and can reach 99% recognition accuracy.
The face recognition pipeline.
Face image capture and detection. Capture: a camera collects face images — static or dynamic, at different positions, with different expressions. Once the user is within the capture device's shooting range, the device automatically searches for and shoots the face. Face detection is a form of object detection: gather statistics over the target to be detected, derive the features of the object, and build a detection model; the model is then matched against the input image and the matching regions are output. Face detection is the preprocessing step of face recognition — it accurately locates the position and size of faces in an image. Face image patterns are rich in features: histogram features, color features, template features, structural features, Haar-like features. Face detection picks out the useful information and uses these features to find faces. Detection algorithms include template-matching models and the AdaBoost model; AdaBoost has the best combined speed/accuracy performance — slow to train but fast at detection, fast enough for real-time detection on video streams.
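As an aside, the sliding-window, Haar-feature approach described above is easy to try with OpenCV's stock frontal-face cascade; a minimal sketch (the image path is a placeholder, and this is independent of the FaceNet code later in this post):

import cv2

# OpenCV ships a pretrained frontal-face Haar cascade.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
img = cv2.imread('photo.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# detectMultiScale slides windows over the image at multiple scales
# and returns one (x, y, w, h) box per detected face.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('detected.jpg', img)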
Face image preprocessing. Based on the detection result, the image is processed to serve feature extraction. Captured face images are constrained by acquisition conditions and subject to random interference, so they typically need preprocessing: scaling, rotation, stretching, light compensation, gray-level transformation, histogram equalization, normalization, geometric correction, filtering, sharpening, and so on.
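A few of these steps sketched with OpenCV (which steps a real pipeline applies depends on its capture conditions; the file names are placeholders):

import cv2

face = cv2.imread('face.jpg')
face = cv2.resize(face, (160, 160))            # scale to a fixed size
gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)  # gray-level transformation
gray = cv2.equalizeHist(gray)                  # histogram equalization
gray = cv2.GaussianBlur(gray, (3, 3), 0)       # filtering / denoising
cv2.imwrite('face_preprocessed.jpg', gray)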
Face image feature extraction. The face image information is digitized: the image is converted into a string of numbers (a feature vector). For example, from the positions of the left corner of an eye, the right corner of the mouth, the nose, and the chin, feature components such as the Euclidean distances, curvatures, and angles between landmark points are extracted, and the related features are concatenated into a long feature vector.
Face image matching and recognition. The extracted feature data is searched against the face feature templates stored in a database, and identity is judged by similarity against a preset threshold: when the similarity exceeds the threshold, the match is output. Verification is one-to-one (1:1) image comparison — proving "you are you" — used for identity checks in finance and information security. Identification is one-to-many (1:N) image matching — "finding you among N people" — for example on a video stream, where recognition completes as soon as a person walks into range; used in security.
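Both modes reduce to distance comparisons between feature vectors; a minimal NumPy sketch, assuming embeddings have already been extracted (the 1.1 threshold is only illustrative and must be tuned per model):

import numpy as np

def verify(emb1, emb2, threshold=1.1):
    # 1:1 verification: same person if the squared Euclidean distance
    # between the two embeddings falls below the threshold.
    return np.sum(np.square(emb1 - emb2)) < threshold

def identify(probe, gallery):
    # 1:N identification: index of the gallery embedding closest
    # to the probe, together with that distance.
    dists = np.sum(np.square(gallery - probe), axis=1)
    best = int(np.argmin(dists))
    return best, dists[best]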
Categories of face recognition tasks.
Face detection. Detect and locate the faces in a picture and return high-precision face bounding-box coordinates. This is the first step of any face analysis or processing. "Sliding window": take a rectangular region of the image as a sliding window, extract features from the window to describe that region, and judge from the feature description whether the window contains a face, traversing every window that needs to be examined.
Facial landmark detection. Locates and returns the coordinates of the key points of the facial features and contour: face outline, eyes, eyebrows, lips, nose. Face++ provides up to 106 landmark points. A common landmark localization technique is cascaded shape regression (CSR). Face recognition here is based on the DeepID network structure. DeepID is much like an ordinary convolutional neural network, except that the second-to-last layer — the DeepID layer — is connected to both convolutional layer 4 and max-pooling layer 3; since higher convolutional layers have larger receptive fields, the network considers both local and global features. Layer shapes: input 31x39x1; conv1 28x36x20 (4x4x1 kernels); max-pool1 14x18x20 (2x2 filter); conv2 12x16x40 (3x3x20 kernels); max-pool2 6x8x40 (2x2 filter); conv3 4x6x60 (3x3x40 kernels); max-pool3 2x3x60 (2x2 filter); conv4 1x2x80 (2x2x60 kernels); DeepID layer 1x160; fully connected Softmax output. See "Deep Learning Face Representation from Predicting 10,000 Classes", http://mmlab.ie.cuhk.edu.hk/pdf/YiSun_CVPR14.pdf .
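To make the layer list concrete, a DeepID-style sketch against the TensorFlow 1.x tf.layers API (not the paper's original code; the 10,000-way softmax follows the paper's title):

import tensorflow as tf

def deepid(images, n_identities=10000):
    # images: [batch, 31, 39, 1]
    c1 = tf.layers.conv2d(images, 20, 4, activation=tf.nn.relu)  # 28x36x20
    p1 = tf.layers.max_pooling2d(c1, 2, 2)                       # 14x18x20
    c2 = tf.layers.conv2d(p1, 40, 3, activation=tf.nn.relu)      # 12x16x40
    p2 = tf.layers.max_pooling2d(c2, 2, 2)                       # 6x8x40
    c3 = tf.layers.conv2d(p2, 60, 3, activation=tf.nn.relu)      # 4x6x60
    p3 = tf.layers.max_pooling2d(c3, 2, 2)                       # 2x3x60
    c4 = tf.layers.conv2d(p3, 80, 2, activation=tf.nn.relu)      # 1x2x80
    # The DeepID layer connects to BOTH max-pool 3 and conv 4,
    # mixing mid-level (local) and top-level (global) features.
    flat = tf.concat([tf.layers.flatten(p3), tf.layers.flatten(c4)], axis=1)
    deepid_layer = tf.layers.dense(flat, 160, activation=tf.nn.relu)  # 1x160
    logits = tf.layers.dense(deepid_layer, n_identities)  # softmax classifier
    return deepid_layer, logits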
Face verification. Analyzes how likely two faces are to belong to the same person. Input two faces, obtain a confidence score and the corresponding threshold, and evaluate their similarity.
Face attribute detection. Face attribute recognition and facial emotion analysis. https://www.betaface.com/wpa/ offers an online face recognition demo: it reports a person's age, whether they have a beard, emotion (happy, neutral, angry, ...), gender, whether they wear glasses, and skin color.
Face recognition applications: beautification in Meitu apps; the dating site Jiayuan checking the facial similarity of potential partners; "pay with your face" in payments; "face authentication" in security. Face++ and SenseTime both provide face recognition SDKs.
Face recognition in practice: the FaceNet project, https://github.com/davidsandberg/facenet .
Florian Schroff, Dmitry Kalenichenko, and James Philbin's paper "FaceNet: A Unified Embedding for Face Recognition and Clustering", https://arxiv.org/abs/1503.03832 . Validation walkthrough: https://github.com/davidsandberg/facenet/wiki/Validate-on-lfw .
The LFW (Labeled Faces in the Wild) dataset. http://vis-www.cs.umass.edu/lfw/ . Compiled by the computer vision lab at the University of Massachusetts Amherst. 13,233 images of 5,749 people; 4,069 people have only one image and 1,680 have more than one. Every image is 250x250 pixels. The face images are stored in one folder per person's name.
Data preprocessing. Alignment code: https://github.com/davidsandberg/facenet/blob/master/src/align/align_dataset_mtcnn.py .
Align the dataset used for validation so its image size matches the dataset the pretrained model was trained on.
Set the environment variable:
export PYTHONPATH=[...]/facenet/src
Alignment command:
for N in {1..4}; do python src/align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done
Pretrained model: 20170216-091149.zip, https://drive.google.com/file/d/0B5MzpY9kBtDVZ2RpVDYwWmxoSUk .
Training set: the MS-Celeb-1M dataset, https://www.microsoft.com/en-us/research/project/ms-celeb-1m-challenge-recognizing-one-million-celebrities-real-world/ . Microsoft's face recognition database takes the top one million celebrities from a popularity list and collects about 100 face images per celebrity with search engines. The pretrained model's accuracy is 0.993+-0.004.
Validation: python src/validate_on_lfw.py datasets/lfw/lfw_mtcnnpy_160 models
The benchmark comparison uses facenet/data/pairs.txt, officially provided randomly generated data: the names and image numbers of matched and mismatched pairs.
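For orientation, pairs.txt mixes two tab-separated line shapes (the first line of the official file records the number of folds and pairs per fold; the mismatched-pair names below are placeholders):

Abel_Pacheco  1  4              # matched pair: images 1 and 4 of the same person
Person_A  1  Person_B  2        # mismatched pair: one image of each of two people

lfw.read_pairs in the script below parses exactly these shapes.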
Ten-fold cross-validation (10-fold cross validation), an accuracy testing method: split the dataset into 10 parts, take turns using 9 of them as the training set and 1 as the test set, and use the mean of the 10 results as the estimate of algorithm accuracy. Usually several rounds of 10-fold cross-validation are run and their means averaged.
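The threshold-based accuracy computed by lfw.evaluate follows the same pattern; a self-contained scikit-learn sketch, with random numbers standing in for the real pair distances and labels:

import numpy as np
from sklearn.model_selection import KFold

dists = np.random.rand(600)          # distance per face pair (placeholder data)
labels = np.random.rand(600) < 0.5   # True = same person (placeholder data)

accuracies = []
for train_idx, test_idx in KFold(n_splits=10).split(dists):
    # Pick the best distance threshold on the nine training folds...
    thresholds = np.arange(0.0, 2.0, 0.01)
    accs = [np.mean((dists[train_idx] < t) == labels[train_idx]) for t in thresholds]
    best_t = thresholds[np.argmax(accs)]
    # ...and score it on the held-out fold.
    accuracies.append(np.mean((dists[test_idx] < best_t) == labels[test_idx]))
print('accuracy: %.3f +- %.3f' % (np.mean(accuracies), np.std(accuracies)))

The full validation script, src/validate_on_lfw.py: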
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
import argparse
import facenet
import lfw
import os
import sys
import math
from sklearn import metrics
from scipy.optimize import brentq
from scipy import interpolate

def main(args):
    with tf.Graph().as_default():
        with tf.Session() as sess:
            # 1. Read the file containing the pairs used for testing.
            #    Entries look like ['Abel_Pacheco', '1', '4'].
            pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
            # Get the paths of the corresponding images and the
            # matched/mismatched flag for each pair.
            paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs, args.lfw_file_ext)
            # 2. Load the (pretrained) model.
            facenet.load_model(args.model)
            # Get input and output tensors.
            images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")
            #image_size = images_placeholder.get_shape()[1] # For some reason this doesn't work for frozen graphs
            image_size = args.image_size
            embedding_size = embeddings.get_shape()[1]
            # 3. Run a forward pass to calculate the embeddings.
            print('Running forward pass on LFW images')
            batch_size = args.lfw_batch_size
            nrof_images = len(paths)
            nrof_batches = int(math.ceil(1.0 * nrof_images / batch_size))  # total number of batches
            emb_array = np.zeros((nrof_images, embedding_size))
            for i in range(nrof_batches):
                start_index = i * batch_size
                end_index = min((i + 1) * batch_size, nrof_images)
                paths_batch = paths[start_index:end_index]
                images = facenet.load_data(paths_batch, False, False, image_size)
                feed_dict = {images_placeholder: images, phase_train_placeholder: False}
                emb_array[start_index:end_index, :] = sess.run(embeddings, feed_dict=feed_dict)
            # 4. Compute accuracy and validation rate with ten-fold cross-validation.
            tpr, fpr, accuracy, val, val_std, far = lfw.evaluate(emb_array,
                actual_issame, nrof_folds=args.lfw_nrof_folds)
            print('Accuracy: %1.3f+-%1.3f' % (np.mean(accuracy), np.std(accuracy)))
            print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))
            # Area under the ROC curve.
            auc = metrics.auc(fpr, tpr)
            print('Area Under Curve (AUC): %1.3f' % auc)
            # Equal error rate (EER).
            eer = brentq(lambda x: 1. - x - interpolate.interp1d(fpr, tpr)(x), 0., 1.)
            print('Equal Error Rate (EER): %1.3f' % eer)

def parse_arguments(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('lfw_dir', type=str,
        help='Path to the data directory containing aligned LFW face patches.')
    parser.add_argument('--lfw_batch_size', type=int,
        help='Number of images to process in a batch in the LFW test set.', default=100)
    parser.add_argument('model', type=str,
        help='Could be either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file')
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--lfw_pairs', type=str,
        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
    parser.add_argument('--lfw_file_ext', type=str,
        help='The file extension for the LFW dataset.', default='png', choices=['jpg', 'png'])
    parser.add_argument('--lfw_nrof_folds', type=int,
        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
    return parser.parse_args(argv)

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))
Gender and age recognition. https://github.com/dpressel/rude-carnie .
The Adience dataset. http://www.openu.ac.il/home/hassner/Adience/data.html#agegender . 26,580 images of 2,284 subjects, with ages labeled in eight ranges (0-2, 4-6, 8-13, 15-20, 25-32, 38-43, 48-53, 60+), and containing noise, pose, and lighting variation. aligned # cropped and aligned data; faces # original data. fold_0_data.txt through fold_4_data.txt: labels for all of the data. fold_frontal_0_data.txt through fold_frontal_4_data.txt: labels using only faces in near-frontal pose. Record fields: user_id (the subject's Flickr account ID), original_image (image file name), face_id (person identifier), age, gender, x, y, dx, dy (face bounding box), tilt_ang (tilt angle), fiducial_yaw_angle (fiducial yaw angle), fiducial_score (fiducial score). https://www.flickr.com/
Data preprocessing. A script converts the data into TFRecords format: https://github.com/dpressel/rude-carnie/blob/master/preproc.py . The folder https://github.com/GilLevi/AgeGenderDeepLearning/tree/master/Folds already contains the training/test split and labels; the image lists gender_train.txt and gender_val.txt are used to turn the Adience data into TFRecords files. Images are processed into 256x256 JPEG-encoded RGB, and tf.python_io.TFRecordWriter writes the TFRecords data to output_file.
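A minimal sketch of the TFRecords-writing step (the feature keys and file names here are illustrative, not necessarily preproc.py's exact ones):

import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

output_file = 'train-00000-of-00001'
with tf.python_io.TFRecordWriter(output_file) as writer:
    image_data = open('face_256x256.jpg', 'rb').read()  # JPEG-encoded RGB bytes
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': _bytes_feature(image_data),
        'image/class/label': _int64_feature(1),  # e.g. 0 = female, 1 = male
    }))
    writer.write(example.SerializeToString())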
Building the model. The age and gender training models follow Gil Levi and Tal Hassner's paper "Age and Gender Classification Using Convolutional Neural Networks", http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.722.9654&rank=1 . Model code: https://github.com/dpressel/rude-carnie/blob/master/model.py , built on tensorflow.contrib.slim.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
import re
from tensorflow.contrib.layers import *
from tensorflow.contrib.slim.python.slim.nets.inception_v3 import inception_v3_base

TOWER_NAME = 'tower'

def select_model(name):
    # Map a --model_type flag value to a model-building function.
    if name.startswith('inception'):
        print('selected (fine-tuning) inception model')
        return inception_v3
    elif name == 'bn':
        print('selected batch norm model')
        return levi_hassner_bn
    print('selected default model')
    return levi_hassner

def get_checkpoint(checkpoint_path, requested_step=None, basename='checkpoint'):
    if requested_step is not None:
        model_checkpoint_path = '%s/%s-%s' % (checkpoint_path, basename, requested_step)
        if not os.path.exists(model_checkpoint_path):
            print('No checkpoint file found at [%s]' % checkpoint_path)
            exit(-1)
        print(model_checkpoint_path)
        return model_checkpoint_path, requested_step
    ckpt = tf.train.get_checkpoint_state(checkpoint_path)
    if ckpt and ckpt.model_checkpoint_path:
        # Restore the latest checkpoint and extract its global step.
        print(ckpt.model_checkpoint_path)
        global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
        return ckpt.model_checkpoint_path, global_step
    else:
        print('No checkpoint file found at [%s]' % checkpoint_path)
        exit(-1)

def _activation_summary(x):
    tensor_name = re.sub('%s_[0-9]*/' % TOWER_NAME, '', x.op.name)
    tf.summary.histogram(tensor_name + '/activations', x)
    tf.summary.scalar(tensor_name + '/sparsity', tf.nn.zero_fraction(x))

def inception_v3(nlabels, images, pkeep, is_training):
    # Inception V3 trunk with a dropout + linear classification head.
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.00004
    stddev = 0.1
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("InceptionV3", "InceptionV3", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [tf.contrib.slim.conv2d, tf.contrib.slim.fully_connected],
                weights_regularizer=weights_regularizer,
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [tf.contrib.slim.conv2d],
                    weights_initializer=tf.truncated_normal_initializer(stddev=stddev),
                    activation_fn=tf.nn.relu,
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                net, end_points = inception_v3_base(images, scope=scope)
                with tf.variable_scope("logits"):
                    shape = net.get_shape()
                    net = avg_pool2d(net, shape[1:3], padding="VALID", scope="pool")
                    net = tf.nn.dropout(net, pkeep, name='droplast')
                    net = flatten(net, scope="flatten")
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.truncated_normal([2048, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(net, weights), biases, name=scope.name)
        _activation_summary(output)
    return output

def levi_hassner_bn(nlabels, images, pkeep, is_training):
    # Levi-Hassner convnet with batch normalization on the conv layers.
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassnerBN", "LeviHassnerBN", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01),
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                conv2 = convolution2d(pool1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                conv3 = convolution2d(pool2, 384, [3, 3], [1, 1], padding='SAME', biases_initializer=tf.constant_initializer(0.), scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                # can use tf.contrib.layer.flatten
                flat = tf.reshape(pool3, [-1, 384 * 6 * 6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output

def levi_hassner(nlabels, images, pkeep, is_training):
    # Default Levi-Hassner convnet with local response normalization.
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassner", "LeviHassner", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01)):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                norm1 = tf.nn.local_response_normalization(pool1, 5, alpha=0.0001, beta=0.75, name='norm1')
                conv2 = convolution2d(norm1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                norm2 = tf.nn.local_response_normalization(pool2, 5, alpha=0.0001, beta=0.75, name='norm2')
                conv3 = convolution2d(norm2, 384, [3, 3], [1, 1], biases_initializer=tf.constant_initializer(0.), padding='SAME', scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                flat = tf.reshape(pool3, [-1, 384 * 6 * 6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output
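Before training, this is roughly how the pieces of model.py wire into a graph (a sketch; the 227x227 input matches the image_size flag of train.py below, and nlabels=2 corresponds to gender):

import tensorflow as tf
from model import select_model

images = tf.placeholder(tf.float32, [None, 227, 227, 3])
model_fn = select_model('default')       # returns levi_hassner
logits = model_fn(2, images, 0.5, True)  # nlabels=2 (gender), pkeep=0.5, training mode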
Training the model. https://github.com/dpressel/rude-carnie/blob/master/train.py .
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from six.moves import xrange
from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
from model import select_model
import json
import re

LAMBDA = 0.01
MOM = 0.9

tf.app.flags.DEFINE_string('pre_checkpoint_path', '',
                           """If specified, restore this pretrained model """
                           """before beginning any training.""")
tf.app.flags.DEFINE_string('train_dir', '/home/dpressel/dev/work/AgeGenderDeepLearning/Folds/tf/test_fold_is_0',
                           'Training directory')
tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            """Whether to log device placement.""")
tf.app.flags.DEFINE_integer('num_preprocess_threads', 4,
                            'Number of preprocessing threads')
tf.app.flags.DEFINE_string('optim', 'Momentum',
                           'Optimizer')
tf.app.flags.DEFINE_integer('image_size', 227,
                            'Image size')
tf.app.flags.DEFINE_float('eta', 0.01,
                          'Learning rate')
tf.app.flags.DEFINE_float('pdrop', 0.,
                          'Dropout probability')
tf.app.flags.DEFINE_integer('max_steps', 40000,
                            'Number of iterations')
tf.app.flags.DEFINE_integer('steps_per_decay', 10000,
                            'Number of steps before learning rate decay')
tf.app.flags.DEFINE_float('eta_decay_rate', 0.1,
                          'Learning rate decay')
tf.app.flags.DEFINE_integer('epochs', -1,
                            'Number of epochs')
tf.app.flags.DEFINE_integer('batch_size', 128,
                            'Batch size')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                           'Checkpoint name')
tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')
tf.app.flags.DEFINE_string('pre_model',
                           '',  # './inception_v3.ckpt'
                           'checkpoint file')
FLAGS = tf.app.flags.FLAGS

# Every 5k steps cut learning rate in half
def exponential_staircase_decay(at_step=10000, decay_rate=0.1):
    print('decay [%f] every [%d] steps' % (decay_rate, at_step))
    def _decay(lr, global_step):
        return tf.train.exponential_decay(lr, global_step,
                                          at_step, decay_rate, staircase=True)
    return _decay

def optimizer(optim, eta, loss_fn, at_step, decay_rate):
    global_step = tf.Variable(0, trainable=False)
    optz = optim
    if optim == 'Adadelta':
        optz = lambda lr: tf.train.AdadeltaOptimizer(lr, 0.95, 1e-6)
        lr_decay_fn = None
    elif optim == 'Momentum':
        optz = lambda lr: tf.train.MomentumOptimizer(lr, MOM)
        lr_decay_fn = exponential_staircase_decay(at_step, decay_rate)
    return tf.contrib.layers.optimize_loss(loss_fn, global_step, eta, optz, clip_gradients=4., learning_rate_decay_fn=lr_decay_fn)

def loss(logits, labels):
    # Cross-entropy plus L2 regularization, tracked with a moving average.
    labels = tf.cast(labels, tf.int32)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels, name='cross_entropy_per_example')
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)
    losses = tf.get_collection('losses')
    regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    total_loss = cross_entropy_mean + LAMBDA * sum(regularization_losses)
    tf.summary.scalar('tl (raw)', total_loss)
    #total_loss = tf.add_n(losses + regularization_losses, name='total_loss')
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    loss_averages_op = loss_averages.apply(losses + [total_loss])
    for l in losses + [total_loss]:
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))
    with tf.control_dependencies([loss_averages_op]):
        total_loss = tf.identity(total_loss)
    return total_loss

def main(argv=None):
    with tf.Graph().as_default():
        model_fn = select_model(FLAGS.model_type)
        # Open the metadata file md.json (generated during data preprocessing)
        # and figure out nlabels and the size of an epoch.
        input_file = os.path.join(FLAGS.train_dir, 'md.json')
        print(input_file)
        with open(input_file, 'r') as f:
            md = json.load(f)
        images, labels, _ = distorted_inputs(FLAGS.train_dir, FLAGS.batch_size, FLAGS.image_size, FLAGS.num_preprocess_threads)
        logits = model_fn(md['nlabels'], images, 1 - FLAGS.pdrop, True)
        total_loss = loss(logits, labels)
        train_op = optimizer(FLAGS.optim, FLAGS.eta, total_loss, FLAGS.steps_per_decay, FLAGS.eta_decay_rate)
        saver = tf.train.Saver(tf.global_variables())
        summary_op = tf.summary.merge_all()
        sess = tf.Session(config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement))
        tf.global_variables_initializer().run(session=sess)
        # This is total hackland, it only works to fine-tune iv3
        # (a pretrained Inception V3 checkpoint can be supplied via --pre_model).
        if FLAGS.pre_model:
            inception_variables = tf.get_collection(
                tf.GraphKeys.VARIABLES, scope="InceptionV3")
            restorer = tf.train.Saver(inception_variables)
            restorer.restore(sess, FLAGS.pre_model)
        if FLAGS.pre_checkpoint_path:
            if tf.gfile.Exists(FLAGS.pre_checkpoint_path) is True:
                print('Trying to restore checkpoint from %s' % FLAGS.pre_checkpoint_path)
                restorer = tf.train.Saver()
                tf.train.latest_checkpoint(FLAGS.pre_checkpoint_path)
                print('%s: Pre-trained model restored from %s' %
                      (datetime.now(), FLAGS.pre_checkpoint_path))
        # Store the ckpt files in a run-(pid) directory.
        run_dir = '%s/run-%d' % (FLAGS.train_dir, os.getpid())
        checkpoint_path = '%s/%s' % (run_dir, FLAGS.checkpoint)
        if tf.gfile.Exists(run_dir) is False:
            print('Creating %s' % run_dir)
            tf.gfile.MakeDirs(run_dir)
        tf.train.write_graph(sess.graph_def, run_dir, 'model.pb', as_text=True)
        tf.train.start_queue_runners(sess=sess)
        summary_writer = tf.summary.FileWriter(run_dir, sess.graph)
        steps_per_train_epoch = int(md['train_counts'] / FLAGS.batch_size)
        num_steps = FLAGS.max_steps if FLAGS.epochs < 1 else FLAGS.epochs * steps_per_train_epoch
        print('Requested number of steps [%d]' % num_steps)
        for step in xrange(num_steps):
            start_time = time.time()
            _, loss_value = sess.run([train_op, total_loss])
            duration = time.time() - start_time
            assert not np.isnan(loss_value), 'Model diverged with loss = NaN'
            # Print progress every 10 steps; summaries are written every 100
            # steps and a checkpoint saved every 1000 steps (and at the end).
            if step % 10 == 0:
                num_examples_per_step = FLAGS.batch_size
                examples_per_sec = num_examples_per_step / duration
                sec_per_batch = float(duration)
                format_str = ('%s: step %d, loss = %.3f (%.1f examples/sec; %.3f ' 'sec/batch)')
                print(format_str % (datetime.now(), step, loss_value,
                                    examples_per_sec, sec_per_batch))
            # Loss only actually evaluated every 100 steps?
            if step % 100 == 0:
                summary_str = sess.run(summary_op)
                summary_writer.add_summary(summary_str, step)
            if step % 1000 == 0 or (step + 1) == num_steps:
                saver.save(sess, checkpoint_path, global_step=step)

if __name__ == '__main__':
    tf.app.run()
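A typical training invocation with the flags defined above (the path is a placeholder):

python train.py --train_dir /path/to/Folds/tf/test_fold_is_0 \
    --model_type default --max_steps 40000 --batch_size 128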
Validating the model. https://github.com/dpressel/rude-carnie/blob/master/guess.py .
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import math
import time
from data import inputs
import numpy as np
import tensorflow as tf
from model import select_model, get_checkpoint
from utils import *
import os
import json
import csv

RESIZE_FINAL = 227
GENDER_LIST = ['M', 'F']
AGE_LIST = ['(0, 2)', '(4, 6)', '(8, 12)', '(15, 20)', '(25, 32)', '(38, 43)', '(48, 53)', '(60, 100)']
MAX_BATCH_SZ = 128

tf.app.flags.DEFINE_string('model_dir', '',
                           'Model directory (where training data lives)')
tf.app.flags.DEFINE_string('class_type', 'age',
                           'Classification type (age|gender)')
tf.app.flags.DEFINE_string('device_id', '/cpu:0',
                           'What processing unit to execute inference on')
tf.app.flags.DEFINE_string('filename', '',
                           'File (Image) or File list (Text/No header TSV) to process')
tf.app.flags.DEFINE_string('target', '',
                           'CSV file containing the filename processed along with best guess and score')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                           'Checkpoint basename')
tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')
tf.app.flags.DEFINE_string('requested_step', '', 'Within the model directory, a requested step to restore e.g., 9000')
tf.app.flags.DEFINE_boolean('single_look', False, 'single look at the image or multiple crops')
tf.app.flags.DEFINE_string('face_detection_model', '', 'Do frontal face detection with model specified')
tf.app.flags.DEFINE_string('face_detection_type', 'cascade', 'Face detection model type (yolo_tiny|cascade)')
FLAGS = tf.app.flags.FLAGS

def one_of(fname, types):
    return any([fname.endswith('.' + ty) for ty in types])

def resolve_file(fname):
    if os.path.exists(fname): return fname
    for suffix in ('.jpg', '.png', '.JPG', '.PNG', '.jpeg'):
        cand = fname + suffix
        if os.path.exists(cand):
            return cand
    return None

def classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer):
    # Classify a batch of images, one crop per image.
    try:
        num_batches = int(math.ceil(len(image_files) / MAX_BATCH_SZ))
        pg = ProgressBar(num_batches)
        for j in range(num_batches):
            start_offset = j * MAX_BATCH_SZ
            end_offset = min((j + 1) * MAX_BATCH_SZ, len(image_files))
            batch_image_files = image_files[start_offset:end_offset]
            print(start_offset, end_offset, len(batch_image_files))
            image_batch = make_multi_image_batch(batch_image_files, coder)
            batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
            batch_sz = batch_results.shape[0]
            for i in range(batch_sz):
                output_i = batch_results[i]
                best_i = np.argmax(output_i)
                best_choice = (label_list[best_i], output_i[best_i])
                print('Guess @ 1 %s, prob = %.2f' % best_choice)
                if writer is not None:
                    f = batch_image_files[i]
                    writer.writerow((f, best_choice[0], '%.2f' % best_choice[1]))
            pg.update()
        pg.done()
    except Exception as e:
        print(e)
        print('Failed to run all images')

def classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer):
    # Classify one image, averaging the softmax outputs over multiple crops.
    try:
        print('Running file %s' % image_file)
        image_batch = make_multi_crop_batch(image_file, coder)
        batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
        output = batch_results[0]
        batch_sz = batch_results.shape[0]
        for i in range(1, batch_sz):
            output = output + batch_results[i]
        output /= batch_sz
        best = np.argmax(output)  # the most likely class
        best_choice = (label_list[best], output[best])
        print('Guess @ 1 %s, prob = %.2f' % best_choice)
        nlabels = len(label_list)
        if nlabels > 2:
            output[best] = 0
            second_best = np.argmax(output)
            print('Guess @ 2 %s, prob = %.2f' % (label_list[second_best], output[second_best]))
        if writer is not None:
            writer.writerow((image_file, best_choice[0], '%.2f' % best_choice[1]))
    except Exception as e:
        print(e)
        print('Failed to run image %s ' % image_file)

def list_images(srcfile):
    with open(srcfile, 'r') as csvfile:
        delim = ',' if srcfile.endswith('.csv') else '\t'
        reader = csv.reader(csvfile, delimiter=delim)
        if srcfile.endswith('.csv') or srcfile.endswith('.tsv'):
            print('skipping header')
            _ = next(reader)
        return [row[0] for row in reader]

def main(argv=None):  # pylint: disable=unused-argument
    files = []
    if FLAGS.face_detection_model:
        print('Using face detector (%s) %s' % (FLAGS.face_detection_type, FLAGS.face_detection_model))
        face_detect = face_detection_model(FLAGS.face_detection_type, FLAGS.face_detection_model)
        face_files, rectangles = face_detect.run(FLAGS.filename)
        print(face_files)
        files += face_files
    config = tf.ConfigProto(allow_soft_placement=True)
    with tf.Session(config=config) as sess:
        label_list = AGE_LIST if FLAGS.class_type == 'age' else GENDER_LIST
        nlabels = len(label_list)
        print('Executing on %s' % FLAGS.device_id)
        model_fn = select_model(FLAGS.model_type)
        with tf.device(FLAGS.device_id):
            images = tf.placeholder(tf.float32, [None, RESIZE_FINAL, RESIZE_FINAL, 3])
            logits = model_fn(nlabels, images, 1, False)
            init = tf.global_variables_initializer()
            requested_step = FLAGS.requested_step if FLAGS.requested_step else None
            checkpoint_path = '%s' % (FLAGS.model_dir)
            model_checkpoint_path, global_step = get_checkpoint(checkpoint_path, requested_step, FLAGS.checkpoint)
            saver = tf.train.Saver()
            saver.restore(sess, model_checkpoint_path)
            softmax_output = tf.nn.softmax(logits)
            coder = ImageCoder()
            # Support a batch mode if no face detection model
            if len(files) == 0:
                if (os.path.isdir(FLAGS.filename)):
                    for relpath in os.listdir(FLAGS.filename):
                        abspath = os.path.join(FLAGS.filename, relpath)
                        if os.path.isfile(abspath) and any([abspath.endswith('.' + ty) for ty in ('jpg', 'png', 'JPG', 'PNG', 'jpeg')]):
                            print(abspath)
                            files.append(abspath)
                else:
                    files.append(FLAGS.filename)
                    # If it happens to be a list file, read the list and clobber the files
                    if any([FLAGS.filename.endswith('.' + ty) for ty in ('csv', 'tsv', 'txt')]):
                        files = list_images(FLAGS.filename)
            writer = None
            output = None
            if FLAGS.target:
                print('Creating output file %s' % FLAGS.target)
                output = open(FLAGS.target, 'w')
                writer = csv.writer(output)
                writer.writerow(('file', 'label', 'score'))
            image_files = list(filter(lambda x: x is not None, [resolve_file(f) for f in files]))
            print(image_files)
            if FLAGS.single_look:
                classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer)
            else:
                for image_file in image_files:
                    classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer)
            if output is not None:
                output.close()

if __name__ == '__main__':
    tf.app.run()
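A typical inference invocation with the flags defined above (the paths are placeholders); for each image it prints a line like Guess @ 1 (25, 32), prob = 0.85:

python guess.py --class_type age --model_type default \
    --model_dir /path/to/Folds/tf/test_fold_is_0/run-12345 \
    --filename face.jpg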
Microsoft's how-old.net, http://how-old.net/ , recognizes gender and age from face pictures: it guesses age and gender from a photo and can also search for pictures by query.
References:
《TensorFlow技術解析與實戰》 (TensorFlow: Analysis and Practice)
I welcome referrals for machine learning jobs in Shanghai — my WeChat: qingxingfengzi