II. AlexNet
1. conv1 stage DFD (data flow diagram):
These feature maps are then processed by a pooling operation with a 3×3 window and a stride of 2, so the pooled size is (55-3)/2+1 = 27, i.e. the pooled output is 27×27×96. This is followed by normalization over a 5×5 scale; the first convolutional layer therefore ends with a 27×27×96 block of feature maps, one channel for each of the 96 convolution kernels. These 96 feature maps are split into 2 groups of 48, with each group computed on its own GPU.
During backpropagation, each convolution kernel has a single bias value, so the 96 kernels of the first layer correspond to 96 bias values.
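The size arithmetic above follows the standard formula output = (input − kernel + 2·padding)/stride + 1, with fractional results floored. A minimal sketch to check the numbers (the out_size helper is illustrative, not part of the original code):

def out_size(w, k, s, p=0):
    # spatial output size of a convolution or pooling layer
    return (w + 2 * p - k) // s + 1

print(out_size(224, 11, 4, 2))  # 55: conv1, 11x11 kernel, stride 4, padding 2
print(out_size(55, 3, 2))       # 27: 3x3 max-pool, stride 2
print(out_size(27, 3, 2))       # 13: the pool after conv2
print(out_size(13, 3, 2))       # 6:  the pool after conv5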
2. conv2 stage DFD (data flow diagram):
These feature maps are processed by a pooling operation with a 3×3 window and a stride of 2, so the pooled size is (27-3)/2+1 = 13, i.e. two groups of 13×13×128 feature maps. After normalization over a 5×5 scale, the second convolutional layer ends with two groups of 13×13×128 feature maps, produced by two groups of 128 convolution kernels, each group computed on one GPU: 256 kernels in total across 2 GPUs.
During backpropagation, each convolution kernel has a single bias value, so the 256 kernels of the second layer correspond to 256 bias values.
3. conv3 stage DFD (data flow diagram):
4. conv4 stage DFD (data flow diagram):
5. conv5 stage DFD (data flow diagram):
The 2 groups of 13×13×128 feature maps are pooled separately on the 2 GPUs. The pooling window is 3×3 with a stride of 2, so the pooled size is (13-3)/2+1 = 6, i.e. two groups of 6×6×128 feature maps, 6×6×256 in total.
6. fc6 stage DFD (data flow diagram):
In layer six, the filter size (6×6×256) equals the size of the feature map to be processed (6×6×256), so each filter coefficient multiplies exactly one feature-map value. In the other convolutional layers, each filter coefficient is multiplied with values at many positions of the feature maps as the filter slides. For this reason, the sixth layer is called a fully connected layer.
The 6×6×256 output of layer five is fully connected to the 4096 neurons of layer six; the result is passed through relu6 to give 4096 values, then through dropout6, which outputs 4096 values.
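This equivalence is easy to verify in PyTorch: a convolution whose kernel covers the entire input map yields the same number of outputs as a linear layer over the flattened map. A minimal sketch using the paper's fc6 dimensions (illustrative only, not part of the tutorial code):

import torch
import torch.nn as nn

x = torch.randn(1, 256, 6, 6)                  # conv5 output: 6x6x256
as_conv = nn.Conv2d(256, 4096, kernel_size=6)  # kernel covers the entire map
as_fc = nn.Linear(256 * 6 * 6, 4096)           # the equivalent fully connected layer
print(as_conv(x).shape)                        # torch.Size([1, 4096, 1, 1])
print(as_fc(torch.flatten(x, 1)).shape)        # torch.Size([1, 4096])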
7. fc7 stage DFD (data flow diagram):
The 4096 outputs of layer six are fully connected to the 4096 neurons of layer seven, passed through relu7 to give 4096 values, then through dropout7, which outputs 4096 values.
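dropout6/dropout7 behave differently during training and evaluation, which is why train.py below calls net.train() and net.eval(). A minimal standalone demonstration (not from the tutorial code):

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)
drop.train()
print(drop(x))  # about half the entries zeroed; survivors scaled by 1/(1-p) = 2
drop.eval()
print(drop(x))  # identity at evaluation time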
8. fc8 stage DFD (data flow diagram):
The role of each layer in the AlexNet network is summarized in the table below (the original table was an image; this version is reconstructed from the layer descriptions above and the original paper):

Layer | Kernels / neurons | Output size | Notes
conv1 | 96 kernels, 11×11 | 27×27×96    | ReLU, 3×3/2 max-pool, LRN
conv2 | 256 kernels, 5×5  | 13×13×256   | ReLU, 3×3/2 max-pool, LRN
conv3 | 384 kernels, 3×3  | 13×13×384   | ReLU
conv4 | 384 kernels, 3×3  | 13×13×384   | ReLU
conv5 | 256 kernels, 3×3  | 6×6×256     | ReLU, 3×3/2 max-pool
fc6   | 4096 neurons      | 4096        | ReLU, dropout
fc7   | 4096 neurons      | 4096        | ReLU, dropout
fc8   | 1000 neurons      | 1000        | class scores
Implementation code
#model.py
import torch.nn as nn
import torch

class AlexNet(nn.Module):
    def __init__(self, num_classes=1000, init_weights=False):
        super(AlexNet, self).__init__()
        # feature extraction, packed into one Sequential
        self.features = nn.Sequential(
            nn.Conv2d(3, 48, kernel_size=11, stride=4, padding=2),  # input[3, 224, 224] output[48, 55, 55]; fractional sizes are floored
            nn.ReLU(inplace=True),  # inplace=True saves memory, allowing larger models
            nn.MaxPool2d(kernel_size=3, stride=2),          # output[48, 27, 27]; kernel counts are half of the original paper
            nn.Conv2d(48, 128, kernel_size=5, padding=2),   # output[128, 27, 27]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # output[128, 13, 13]
            nn.Conv2d(128, 192, kernel_size=3, padding=1),  # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 192, kernel_size=3, padding=1),  # output[192, 13, 13]
            nn.ReLU(inplace=True),
            nn.Conv2d(192, 128, kernel_size=3, padding=1),  # output[128, 13, 13]
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # output[128, 6, 6]
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            # fully connected layers
            nn.Linear(128 * 6 * 6, 2048),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(inplace=True),
            nn.Linear(2048, num_classes),
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, start_dim=1)  # flatten; view() would also work
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')  # Kaiming (He) initialization
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)  # normal-distribution initialization
                nn.init.constant_(m.bias, 0)
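A quick smoke test of the model, assuming model.py is saved in the working directory (a minimal sketch, not part of the original post):

import torch
from model import AlexNet

net = AlexNet(num_classes=5, init_weights=True)
dummy = torch.randn(1, 3, 224, 224)  # one fake RGB image of the expected size
print(net(dummy).shape)              # torch.Size([1, 5])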
Download the dataset
DATA_URL = 'http://download.tensorflow.org/example_images/flower_photos.tgz'
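A minimal sketch for fetching and unpacking the archive (assuming it extracts to a top-level flower_photos/ directory; downloading by hand with wget/tar works just as well):

import tarfile
import urllib.request

DATA_URL = 'http://download.tensorflow.org/example_images/flower_photos.tgz'
urllib.request.urlretrieve(DATA_URL, 'flower_photos.tgz')  # roughly 220 MB
with tarfile.open('flower_photos.tgz') as tar:
    tar.extractall('flower_data')  # yields flower_data/flower_photos/<class folders>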
After the download finishes, run the script below to split the dataset into training and validation sets:
#spile_data.py
import os
from shutil import copy
import random

def mkfile(file):
    if not os.path.exists(file):
        os.makedirs(file)

file = 'flower_data/flower_photos'
flower_class = [cla for cla in os.listdir(file) if ".txt" not in cla]
mkfile('flower_data/train')
for cla in flower_class:
    mkfile('flower_data/train/' + cla)

mkfile('flower_data/val')
for cla in flower_class:
    mkfile('flower_data/val/' + cla)

split_rate = 0.1
for cla in flower_class:
    cla_path = file + '/' + cla + '/'
    images = os.listdir(cla_path)
    num = len(images)
    eval_index = random.sample(images, k=int(num * split_rate))  # random 10% goes to the validation set
    for index, image in enumerate(images):
        if image in eval_index:
            image_path = cla_path + image
            new_path = 'flower_data/val/' + cla
            copy(image_path, new_path)
        else:
            image_path = cla_path + image
            new_path = 'flower_data/train/' + cla
            copy(image_path, new_path)
        print("\r[{}] processing [{}/{}]".format(cla, index + 1, num), end="")  # progress bar
    print()

print("processing done!")
Afterwards the directory structure should look like this:

flower_data/
├── flower_photos/   (the original dataset)
├── train/           (one sub-folder per flower class)
└── val/             (one sub-folder per flower class)
train.py
import torch
import torch.nn as nn
from torchvision import transforms, datasets, utils
import matplotlib.pyplot as plt
import numpy as np
import torch.optim as optim
from model import AlexNet
import os
import json
import time

# device: GPU or CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

# data transforms
data_transform = {
    "train": transforms.Compose([transforms.RandomResizedCrop(224),
                                 transforms.RandomHorizontalFlip(),
                                 transforms.ToTensor(),
                                 transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]),
    "val": transforms.Compose([transforms.Resize((224, 224)),  # must be (224, 224), not a bare 224
                               transforms.ToTensor(),
                               transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])}

#data_root = os.path.abspath(os.path.join(os.getcwd(), "../.."))  # get data root path
data_root = os.getcwd()
image_path = data_root + "/flower_data/"  # flower data set path
train_dataset = datasets.ImageFolder(root=image_path + "/train",
                                     transform=data_transform["train"])
train_num = len(train_dataset)

# {'daisy':0, 'dandelion':1, 'roses':2, 'sunflower':3, 'tulips':4}
flower_list = train_dataset.class_to_idx
cla_dict = dict((val, key) for key, val in flower_list.items())
# write dict into json file
json_str = json.dumps(cla_dict, indent=4)
with open('class_indices.json', 'w') as json_file:
    json_file.write(json_str)

batch_size = 32
train_loader = torch.utils.data.DataLoader(train_dataset,
                                           batch_size=batch_size, shuffle=True,
                                           num_workers=0)
validate_dataset = datasets.ImageFolder(root=image_path + "/val",
                                        transform=data_transform["val"])
val_num = len(validate_dataset)
validate_loader = torch.utils.data.DataLoader(validate_dataset,
                                              batch_size=batch_size, shuffle=True,
                                              num_workers=0)
test_data_iter = iter(validate_loader)
test_image, test_label = next(test_data_iter)
#print(test_image[0].size(), type(test_image[0]))
#print(test_label[0], test_label[0].item(), type(test_label[0]))

# to display sample images, first change batch_size in validate_loader to 4
# def imshow(img):
#     img = img / 2 + 0.5  # unnormalize
#     npimg = img.numpy()
#     plt.imshow(np.transpose(npimg, (1, 2, 0)))
#     plt.show()
#
# print(' '.join('%5s' % cla_dict[test_label[j].item()] for j in range(4)))
# imshow(utils.make_grid(test_image))

net = AlexNet(num_classes=5, init_weights=True)
net.to(device)
# loss function: cross entropy
loss_function = nn.CrossEntropyLoss()
# optimizer: Adam
optimizer = optim.Adam(net.parameters(), lr=0.0002)
# path for saving the trained weights
save_path = './AlexNet.pth'
# best validation accuracy seen so far
best_acc = 0.0
# train one epoch, then validate, alternating
for epoch in range(10):
    # train
    net.train()  # enable the dropout layers defined in the network
    running_loss = 0.0
    t1 = time.perf_counter()
    for step, data in enumerate(train_loader, start=0):
        images, labels = data
        optimizer.zero_grad()
        outputs = net(images.to(device))
        loss = loss_function(outputs, labels.to(device))
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        # print train progress
        rate = (step + 1) / len(train_loader)
        a = "*" * int(rate * 50)
        b = "." * int((1 - rate) * 50)
        print("\rtrain loss: {:^3.0f}%[{}->{}]{:.3f}".format(int(rate * 100), a, b, loss), end="")
    print()
    print(time.perf_counter() - t1)

    # validate
    net.eval()  # disable dropout during evaluation, using all neurons
    acc = 0.0  # accumulate the number of correct predictions per epoch
    with torch.no_grad():
        for val_data in validate_loader:
            val_images, val_labels = val_data
            outputs = net(val_images.to(device))
            predict_y = torch.max(outputs, dim=1)[1]
            acc += (predict_y == val_labels.to(device)).sum().item()
        val_accurate = acc / val_num
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)
        print('[epoch %d] train_loss: %.3f  test_accuracy: %.3f' %
              (epoch + 1, running_loss / step, val_accurate))

print('Finished Training')
Finally, run the prediction
predict.py
import torch
from model import AlexNet
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt
import json

data_transform = transforms.Compose(
    [transforms.Resize((224, 224)),
     transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# load image
img = Image.open("./sunflower.jpg")  # test with a sunflower image
#img = Image.open("./roses.jpg")  # test with a rose image
plt.imshow(img)
# [N, C, H, W]
img = data_transform(img)
# expand batch dimension
img = torch.unsqueeze(img, dim=0)

# read class_indict
try:
    json_file = open('./class_indices.json', 'r')
    class_indict = json.load(json_file)
except Exception as e:
    print(e)
    exit(-1)

# create model
model = AlexNet(num_classes=5)
# load model weights
model_weight_path = "./AlexNet.pth"
model.load_state_dict(torch.load(model_weight_path))
model.eval()
with torch.no_grad():
    # predict class
    output = torch.squeeze(model(img))
    predict = torch.softmax(output, dim=0)
    predict_cla = torch.argmax(predict).numpy()
print(class_indict[str(predict_cla)], predict[predict_cla].item())
plt.show()
Adapted from
d5224