Please do not repost without permission, thanks~~
We now know:
- how to define a neural network with PyTorch;
- how to compute the loss;
- how to update the network's weights.
The remaining question is how to get data. Besides converting numpy arrays that hold your data into Tensors, PyTorch also provides loaders for common datasets, packaged in torchvision. This post briefly introduces how to get data, then trains a simple classification network as an introductory example.
Getting the Data
When you want to work with images, text, audio, or video, you can usually load the data into a numpy array with a standard Python package and then convert it into a Tensor.
- For images: Pillow, OpenCV
- For audio: scipy, librosa
- For text: NLTK, SpaCy
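As a minimal sketch of that workflow, a numpy array can be turned into a Tensor with `torch.from_numpy` (the array below is synthetic, just for illustration):

```python
import numpy as np
import torch

# Pretend this came from Pillow/OpenCV: a fake 32x32 RGB image in HWC layout
img = np.random.rand(32, 32, 3).astype(np.float32)

# from_numpy shares memory with the array (no copy)
tensor = torch.from_numpy(img)

# Rearrange HWC -> CHW, the layout PyTorch's conv layers expect
tensor = tensor.permute(2, 0, 1)
print(tensor.shape)  # torch.Size([3, 32, 32])
```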
PyTorch's torchvision package wraps data-loading functions for common datasets such as ImageNet, CIFAR10, and MNIST. On top of that, it provides torchvision.datasets for dataset objects, which plug into torch.utils.data.DataLoader for batched iteration, and torchvision.transforms for image transformations.
Training an Image Classifier
Loading and Preprocessing CIFAR10
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
Note: torchvision datasets return PILImage images with values in [0, 1], so we convert them into tensors normalized to [-1, 1].
If the CIFAR10 data is not already present, the code downloads it automatically; the output looks like this:
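Normalize computes `(input - mean) / std` per channel, so with mean 0.5 and std 0.5 the [0, 1] range maps exactly to [-1, 1]. A quick sanity check on a few sample values:

```python
# Normalize applies (x - mean) / std channel-wise
mean, std = 0.5, 0.5
for x in (0.0, 0.5, 1.0):
    print(x, '->', (x - mean) / std)
# 0.0 -> -1.0, 0.5 -> 0.0, 1.0 -> 1.0
```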
Now let's display some of the training images:
import matplotlib.pyplot as plt
import numpy as np

# functions to show an image
def imshow(img):
    img = img / 2 + 0.5     # unnormalize
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))
    plt.show()

# get some random training images
dataiter = iter(trainloader)
images, labels = next(dataiter)

# show images
imshow(torchvision.utils.make_grid(images))
# print labels
print(' '.join('%5s' % classes[labels[j]] for j in range(4)))
The result looks like this:
Define a Convolutional Neural Network
This part is the same as the network defined earlier, except that CIFAR10 images have three input channels. The code is:
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
Calling print(net) prints the network structure:
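If you are wondering where the 16 * 5 * 5 input size of fc1 comes from, you can trace the spatial dimensions by hand: each 5x5 conv without padding shrinks the side by 4, and each 2x2 max pool halves it:

```python
# Trace the spatial size of a 32x32 CIFAR10 image through the network:
# a 5x5 conv (no padding) shrinks the side by 4, a 2x2 max pool halves it.
size = 32
size = (size - 4) // 2   # conv1 then pool: 32 -> 28 -> 14
size = (size - 4) // 2   # conv2 then pool: 14 -> 10 -> 5
print(size)  # 5, hence nn.Linear(16 * 5 * 5, 120) with conv2's 16 channels
```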
Define a Loss Function and Optimizer
We use cross-entropy as the loss, and SGD with momentum as the optimizer.
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
Train the Network
In PyTorch we don't need to compute gradients by hand: we just keep feeding training data to the network and let the optimizer update the weights.
for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs
        inputs, labels = data

        # wrap them in Variable
        inputs, labels = Variable(inputs), Variable(labels)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.data[0]
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')
This shows the training process. In practice two epochs are not enough; increase the number of epochs to improve accuracy.
Test on the Test Data
The network has now made two passes over the training set, so let's check whether it has actually learned anything.
First, display some images from the test set:
dataiter = iter(testloader)
images, labels = next(dataiter)

# print images
imshow(torchvision.utils.make_grid(images))
print('GroundTruth: ', ' '.join('%5s' % classes[labels[j]] for j in range(4)))
Now let's see what the network predicts for them:
outputs = net(Variable(images))

# transform from scores to labels
_, predicted = torch.max(outputs.data, 1)

print('Predicted: ', ' '.join('%5s' % classes[predicted[j]]
                              for j in range(4)))
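torch.max with dim 1 returns a pair (maximum values, argmax indices); the second element is what serves as the predicted class here. A tiny standalone check, with scores made up for illustration:

```python
import torch

# Fake score matrix: 2 samples, 3 classes
scores = torch.tensor([[0.1, 2.0, 0.3],
                       [1.5, 0.2, 0.4]])

# Max over dim 1: values are the best scores, indices the winning classes
values, predicted = torch.max(scores, 1)
print(predicted)  # tensor([1, 0])
```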
The predictions look fairly accurate.
Now let's look at how the network does on the whole test set:
correct = 0
total = 0
for data in testloader:
    images, labels = data
    outputs = net(Variable(images))
    _, predicted = torch.max(outputs.data, 1)
    total += labels.size(0)
    correct += (predicted == labels).sum()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
The final result is:
The following prints the accuracy for each class:
class_correct = list(0. for i in range(10))
class_total = list(0. for i in range(10))
for data in testloader:
    images, labels = data
    outputs = net(Variable(images))
    _, predicted = torch.max(outputs.data, 1)
    c = (predicted == labels).squeeze()
    for i in range(4):
        label = labels[i]
        class_correct[label] += c[i]
        class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (
        classes[i], 100 * class_correct[i] / class_total[i]))
The results are as follows:
This is already much better than the 10% you would get from random guessing. To push accuracy higher, you can train for more epochs, adjust the learning rate, and so on.
That completes defining and training a simple classifier network; feel free to run it yourself as a first example~~