Convolution layers
1. class torch.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
1D convolution layer. For an input of shape (N, C_in, L_in), the output shape (N, C_out, L_out) is computed as:
L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
Here N is the batch size, C_in equals in_channels (the number of 1D channels per sample), and L_in is the length of each 1D sequence.
Shape:
Input: (N, C_in, L_in)
Output: (N, C_out, L_out)
The output length follows the formula above; working through it is the easiest way to see how in_channels, out_channels, stride and kernel_size relate, as in the sketch below.
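A minimal sketch (sizes chosen only for illustration) that checks the Conv1d output-length formula:

import torch
import torch.nn as nn

m = nn.Conv1d(16, 33, 3, stride=2)       # C_in=16, C_out=33, kernel_size=3
x = torch.randn(20, 16, 50)              # (N, C_in, L_in)
# L_out = floor((50 + 2*0 - 1*(3 - 1) - 1) / 2 + 1) = floor(24.5) = 24
print(m(x).shape)                        # torch.Size([20, 33, 24])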
2. class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
2D convolution layer. For an input of shape (N, C_in, H, W), the output shape (N, C_out, H_out, W_out) is computed as:
H_out = floor((H_in + 2*padding[0] - dilation[0]*(kernel_size[0] - 1) - 1) / stride[0] + 1)
W_out = floor((W_in + 2*padding[1] - dilation[1]*(kernel_size[1] - 1) - 1) / stride[1] + 1)
Shape:
Input: (N, C_in, H_in, W_in)
Output: (N, C_out, H_out, W_out)
First understand PyTorch's padding strategy (symmetric zero-padding, specified per dimension), then the relationship between the parameters; a padding example follows.
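A small sketch (values picked arbitrarily) of how padding enters the formula: with kernel_size=3, stride=1 and padding=1, the spatial size is preserved.

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 3, 28, 28)            # (N, C_in, H, W)
# H_out = floor((28 + 2*1 - 1*(3 - 1) - 1) / 1 + 1) = 28, same for W_out
print(conv(x).shape)                     # torch.Size([1, 8, 28, 28])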
Pooling layers
1. class torch.nn.MaxPool1d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
Applies 1D max pooling over every channel of the input signal.
If the input has size (N, C, L), the output has size (N, C, L_out), where:
L_out = floor((L + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
Parameters:
- kernel_size (int or tuple) - size of the max pooling window
- stride (int or tuple, optional) - stride of the window. Default: kernel_size
- padding (int or tuple, optional) - number of zeros added to each side of the input
- dilation (int or tuple, optional) - controls the spacing between elements within the window
- return_indices - if True, also returns the indices of the maxima, which is useful for later upsampling (e.g. max unpooling)
- ceil_mode - if True, uses ceiling instead of the default floor when computing the output size
Shape:
Input: (N, C, L_in)
Output: (N, C, L_out)
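A quick sketch (toy numbers) of MaxPool1d, showing that the channel count is unchanged while the length shrinks:

import torch
import torch.nn as nn

pool = nn.MaxPool1d(kernel_size=3, stride=2)
x = torch.randn(4, 5, 10)                # (N, C, L)
# L_out = floor((10 + 2*0 - 1*(3 - 1) - 1) / 2 + 1) = floor(4.5) = 4
print(pool(x).shape)                     # torch.Size([4, 5, 4])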
2. class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
Applies 2D max pooling over every channel of the input signal.
If the input has size (N, C, H, W) and the pooling window size is (kH, kW), the output size (N, C, H_out, W_out) is given by:
H_out = floor((H + 2*padding[0] - dilation[0]*(kH - 1) - 1) / stride[0] + 1)
W_out = floor((W + 2*padding[1] - dilation[1]*(kW - 1) - 1) / stride[1] + 1)
Parameters:
- kernel_size (int or tuple) - size of the max pooling window
- stride (int or tuple, optional) - stride of the window. Default: kernel_size
- padding (int or tuple, optional) - number of zeros added to each side of the input
- dilation (int or tuple, optional) - controls the spacing between elements within the window
- return_indices - if True, also returns the indices of the maxima, which is useful for later upsampling (e.g. max unpooling)
- ceil_mode - if True, uses ceiling instead of the default floor when computing the output size
Shape:
Input: (N, C, H_in, W_in)
Output: (N, C, H_out, W_out)
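A minimal sketch (toy sizes) of MaxPool2d with return_indices=True; the returned indices can later be fed to nn.MaxUnpool2d:

import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
x = torch.randn(1, 3, 8, 8)              # (N, C, H, W)
out, indices = pool(x)
# H_out = W_out = floor((8 + 2*0 - 1*(2 - 1) - 1) / 2 + 1) = 4
print(out.shape, indices.shape)          # both torch.Size([1, 3, 4, 4])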
What is the difference between nn and nn.functional?
When the layer needs to hold state (learnable parameters), use the conv classes under nn.
When no state needs to be kept, use the conv functions under nn.functional.
When several places share part of the parameters, nn.functional is the better fit; see the sketch below.
Reference: "What is the difference between nn and nn.functional in PyTorch?"
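A hedged sketch (module name and tensor sizes are my own, not from the original) of sharing one weight tensor across two calls by using nn.functional.conv2d directly; the functional form takes the weight as an argument instead of storing it inside a module:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedConv(nn.Module):
    def __init__(self):
        super().__init__()
        # one explicitly managed weight/bias pair, shared by both branches
        self.weight = nn.Parameter(torch.randn(8, 3, 3, 3))   # (C_out, C_in, kH, kW)
        self.bias = nn.Parameter(torch.zeros(8))

    def forward(self, a, b):
        # both inputs are convolved with the same parameters
        ya = F.conv2d(a, self.weight, self.bias, padding=1)
        yb = F.conv2d(b, self.weight, self.bias, padding=1)
        return ya, yb

ya, yb = SharedConv()(torch.randn(1, 3, 16, 16), torch.randn(1, 3, 16, 16))
print(ya.shape, yb.shape)                # torch.Size([1, 8, 16, 16]) twice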
Next, let's look at how shape shows up in a network.
Define the neural network that has some learnable parameters (or weights)
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 3x3 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 6 * 6, 120)  # 6*6 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square you can only specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features
The above is a typical network definition. Notice that the conv layers are declared with channel counts only; no per-tensor shape appears, and the spatial size of the input does not affect the conv definitions at all. The spatial shape only enters through the flattened size fed to fc1 (16 * 6 * 6), as the quick check below illustrates.
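A quick usage check (a 32x32 input is assumed, as in the PyTorch tutorial this network comes from). Tracing the spatial size: 32 -> conv 3x3 -> 30 -> pool 2 -> 15 -> conv 3x3 -> 13 -> pool 2 -> 6, which is where the 16 * 6 * 6 in fc1 comes from.

net = Net()
x = torch.randn(1, 1, 32, 32)            # (N, C, H, W)
out = net(x)
print(out.shape)                         # torch.Size([1, 10])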
Code
import torch
import torch.nn as nn

m = nn.Conv2d(16, 33, 3, stride=2)
input = torch.randn(20, 16, 10, 10)
output = m(input)
# H_out = floor((10 + 2*0 - 1*(3 - 1) - 1) / 2 + 1) = 4
# W_out = floor((10 + 2*0 - 1*(3 - 1) - 1) / 2 + 1) = 4
print(output.shape)
Output
torch.Size([20, 33, 4, 4])