Recently, while browsing competitions on Kaggle, I came across Digit Recognizer, an introductory image competition. It is aimed at people who have some experience with R or Python and basic machine learning but are new to computer vision, and it is designed to help you get familiar with the field.
Next, I will walk through a fairly complete demo for this competition:
1. Data preparation
On the Data tab of the competition page you can download three files:
1. sample_submission.csv: a sample of the submission file
2. test.csv: the test data
3. train.csv: the training data
Download the data; assuming your current directory is /digit_recognizer, put the files under /digit_recognizer/data/.
2. Inspecting the data
import pandas as pd
train_data = pd.read_csv('/digit_recognizer/data/train.csv')
test_data = pd.read_csv('/digit_recognizer/data/test.csv')
print(f'training data shape: {train_data.shape}')
print(f'test data shape: {test_data.shape}')
[out]:
training data shape: (42000, 785)
test data shape: (28000, 784)
As you can see, the test set has one fewer column than the training set: column 0 of the training data is the class label (0-9).
Each handwritten digit is really a 28*28 pixel matrix, flattened here into a 784-dimensional row.
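Before looking at images, it is worth a quick check of the label distribution (a one-liner, assuming the label column is named label as in the competition data; the ten classes are roughly balanced):
print(train_data['label'].value_counts().sort_index())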
Now let's take one image and look at it:
import matplotlib.pyplot as plt
one_img = test_data.iloc[0, :].values.reshape(28, 28)  # restore the flat row to a 28*28 image
plt.imshow(one_img, cmap="Greys")
plt.show()
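The test rows carry no label. To see a digit together with its label, you can do the same with a training row, remembering that column 0 is the label; a minimal sketch:
lab = train_data.iloc[0, 0]                           # column 0 is the label
img = train_data.iloc[0, 1:].values.reshape(28, 28)   # the remaining 784 columns are pixels
plt.imshow(img, cmap="Greys")
plt.title(f'label: {lab}')
plt.show()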
3. Modeling
What follows is a fairly complete PyTorch modeling workflow. Because the problem is simple, essentially no feature engineering is needed.
3.1 Import the necessary packages (all fairly standard)
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torch.optim.lr_scheduler import StepLR
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from torchvision import transforms
from torch.utils.data import Dataset, DataLoader
3.2 Load the data
# A custom reshape transform
class ReshapeTransform:
    def __init__(self, new_size, minmax=None):
        self.new_size = new_size
        self.minmax = minmax

    def __call__(self, img):
        if self.minmax:
            img = img / self.minmax  # scale to 0-1 here, otherwise transforms.Normalize will complain
        img = torch.from_numpy(img)
        return torch.reshape(img, self.new_size)

# Preprocessing pipeline
transform = transforms.Compose([
    ReshapeTransform((-1, 28, 28), 255),         # flat vector -> 28*28 image, scaled from 0-255 to 0-1
    transforms.Normalize((0.1307,), (0.3081,))   # mean/std normalization; (0.1307,), (0.3081,) are standard empirical values, no need to dwell on them
])

# Dataset class, used together with DataLoader
class myDataset(Dataset):
    def __init__(self, path, transform=None, is_train=True, seed=777):
        """
        :param path: path to the csv file
        :param transform: preprocessing pipeline
        :param is_train: whether this is the training split
        """
        self.data = pd.read_csv(path)  # read the data
        # The labeled data is usually split into training and validation sets; here the ratio is 8:2
        if is_train:
            self.data, _ = train_test_split(self.data, train_size=0.8, random_state=seed)
        else:
            _, self.data = train_test_split(self.data, train_size=0.8, random_state=seed)
        self.transform = transform  # data transform
        self.is_train = is_train

    def __len__(self):
        # number of rows in the data
        return len(self.data)

    def __getitem__(self, idx):
        # return one sample by index: pixels (columns 1..784) and label (column 0)
        data, lab = self.data.iloc[idx, 1:].values, self.data.iloc[idx, 0]
        if self.transform:
            data = self.transform(data)
        return data, lab
# Load the training split
train_data = myDataset('/digit_recognizer/data/train.csv', transform, True)
train = DataLoader(train_data, batch_size=64, shuffle=True, num_workers=4)
# Load the validation split (no need to shuffle it)
val_data = myDataset('/digit_recognizer/data/train.csv', transform, False)
val = DataLoader(val_data, batch_size=64, shuffle=False, num_workers=4)
# Load the test set
test_data = pd.read_csv('/digit_recognizer/data/test.csv')
test_data = transform(test_data.values)  # apply the same preprocessing as the training data
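As a quick sanity check, you can pull one batch from the DataLoader and confirm the shapes are what the network expects (the exact sizes follow from the batch_size and the ReshapeTransform above):
images, labels = next(iter(train))
print(images.shape, labels.shape)  # expected: torch.Size([64, 1, 28, 28]) torch.Size([64])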
At this point the data is ready. In general it is recommended to wrap your data with the Dataset and DataLoader classes together with transforms; custom transform callables can be defined the same way as ReshapeTransform above.
3.3 Define the network
For simplicity, we define a network with two convolutional layers followed by two fully connected layers.
# Initialize weights
def _weight_init(m):
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.constant_(m.bias, 0)
    elif isinstance(m, nn.Conv2d):
        nn.init.xavier_uniform_(m.weight)
    elif isinstance(m, nn.BatchNorm1d):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)
# Build the network
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=3)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=3)
        self.drop2d = nn.Dropout2d(p=0.2)
        self.linr1 = nn.Linear(20*5*5, 32)
        self.linr2 = nn.Linear(32, 10)
        self.apply(_weight_init)  # initialize weights

    # Forward pass
    def forward(self, x):
        x = F.relu(self.drop2d(self.conv1(x)))  # 28*28 -> 26*26
        x = F.max_pool2d(x, 2)                  # 26*26 -> 13*13
        x = F.relu(self.drop2d(self.conv2(x)))  # 13*13 -> 11*11
        x = F.max_pool2d(x, 2)                  # 11*11 -> 5*5
        x = x.view(-1, 20*5*5)  # before the fully connected layers, compute the conv output size and flatten it
        x = self.linr1(x)
        x = F.dropout(x, p=0.5, training=self.training)  # pass training= so dropout is disabled in eval mode
        x = self.linr2(x)
        return x

net = Net()
Here we use TensorBoard to draw the network graph (the figure itself is not reproduced here).
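For reference, a minimal sketch of how such a graph can be logged, assuming the tensorboard package is installed (the run directory name runs/digit_recognizer is arbitrary):
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter('runs/digit_recognizer')   # arbitrary log directory
writer.add_graph(net, torch.zeros(1, 1, 28, 28))  # trace the net once with a dummy input
writer.close()
# then run: tensorboard --logdir runs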
3.4 Optimizer and loss function
# Optimizer and loss function
optimizer = optim.Adam(net.parameters(), lr=0.0005)  # Adam as the optimizer
criterion = nn.CrossEntropyLoss()  # CrossEntropyLoss as the loss function; CrossEntropyLoss() = log_softmax() + NLLLoss()
scheduler = StepLR(optimizer, step_size=10, gamma=0.5)  # StepLR: decay the lr by 50% every 10 scheduler steps (one step per epoch below)
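To make the CrossEntropyLoss() = log_softmax() + NLLLoss() identity concrete, here is a tiny check with arbitrary random logits:
logits = torch.randn(4, 10)           # fake batch: 4 samples, 10 classes
targets = torch.randint(0, 10, (4,))
a = nn.CrossEntropyLoss()(logits, targets)
b = F.nll_loss(F.log_softmax(logits, dim=1), targets)
print(torch.allclose(a, b))  # True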
3.5 Train the model
# Move to the GPU (optional)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
if torch.cuda.is_available():
    net = net.to(device)
    criterion = criterion.to(device)
epochs = 100
loss_history = []
# Training loop
for epoch in range(epochs):
    train_loss = []
    val_loss = []
    with torch.set_grad_enabled(True):
        net.train()
        for batch, (data, target) in enumerate(train):
            data = data.to(device).float()
            target = target.to(device)
            optimizer.zero_grad()
            predict = net(data)
            loss = criterion(predict, target)
            loss.backward()
            optimizer.step()
            train_loss.append(loss.item())
        scheduler.step()  # advance the lr scheduler by one step after each epoch
    with torch.set_grad_enabled(False):
        net.eval()  # the network contains dropout layers, so eval mode is required
        for batch, (data, target) in enumerate(val):
            data = data.to(device).float()
            target = target.to(device)
            predict = net(data)
            loss = criterion(predict, target)
            val_loss.append(loss.item())
    loss_history.append([np.mean(train_loss), np.mean(val_loss)])
    print('epoch:%d train_loss: %.5f val_loss: %.5f' % (epoch+1, np.mean(train_loss), np.mean(val_loss)))
[out]:
epoch:1 train_loss: 0.96523 val_loss: 0.35177
epoch:2 train_loss: 0.37922 val_loss: 0.22583
epoch:3 train_loss: 0.28509 val_loss: 0.18644
epoch:4 train_loss: 0.24072 val_loss: 0.15961
epoch:5 train_loss: 0.20989 val_loss: 0.13630
epoch:6 train_loss: 0.19612 val_loss: 0.12432
epoch:7 train_loss: 0.17479 val_loss: 0.11251
epoch:8 train_loss: 0.16251 val_loss: 0.10917
epoch:9 train_loss: 0.15625 val_loss: 0.10470
...
Now that the model is trained, we can look at the training and validation loss in TensorBoard: blue is the training loss, red is the validation loss.
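The TensorBoard figure is not reproduced here, but the same curves can be plotted directly from the loss_history list collected during training:
history = np.array(loss_history)
plt.plot(history[:, 0], label='train loss')
plt.plot(history[:, 1], label='val loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()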
3.6 Submit the results to Kaggle
net.eval()
with torch.no_grad():  # no gradients needed for inference
    label = net(test_data.to(device).float().unsqueeze(1))  # add the channel dimension: (28000, 1, 28, 28)
label = torch.argmax(label, dim=1)
submission = pd.read_csv('/digit_recognizer/data/sample_submission.csv')
submission.Label = label.cpu().numpy()
submission.to_csv('/digit_recognizer/version.csv', index=False)
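Before actually uploading version.csv, it is also worth checking accuracy on the held-out validation split; a quick sketch using the val loader defined earlier:
net.eval()
correct, total = 0, 0
with torch.no_grad():
    for data, target in val:
        data = data.to(device).float()
        pred = net(data).argmax(dim=1).cpu()
        correct += (pred == target).sum().item()
        total += target.size(0)
print(f'validation accuracy: {correct / total:.4f}')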