Convolution layers
1. class torch.nn.Conv1d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
1D convolution layer. For an input of shape (N, C_in, L_in), the output shape (N, C_out, L_out) is computed as:
L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
Here N is the batch size, C_in equals in_channels (the number of 1D channels per sample), and L_in is the length of each 1D sequence.
Shape:
Input: (N, C_in, L_in)
Output: (N, C_out, L_out)
The output length follows the formula above; working through it is the easiest way to see how in_channels, out_channels, stride and kernel_size relate, as in the sketch below.
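A minimal sketch (sizes chosen only for illustration) that checks the Conv1d output-length formula:

import torch
import torch.nn as nn

m = nn.Conv1d(16, 33, 3, stride=2)       # C_in=16, C_out=33, kernel_size=3
x = torch.randn(20, 16, 50)              # (N, C_in, L_in)
# L_out = floor((50 + 2*0 - 1*(3 - 1) - 1) / 2 + 1) = floor(24.5) = 24
print(m(x).shape)                        # torch.Size([20, 33, 24])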
2. class torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
2D convolution layer. For an input of shape (N, C_in, H, W), the output shape (N, C_out, H_out, W_out) is computed as:
H_out = floor((H_in + 2*padding[0] - dilation[0]*(kernel_size[0] - 1) - 1) / stride[0] + 1)
W_out = floor((W_in + 2*padding[1] - dilation[1]*(kernel_size[1] - 1) - 1) / stride[1] + 1)
Shape:
Input: (N, C_in, H_in, W_in)
Output: (N, C_out, H_out, W_out)
First understand PyTorch's padding strategy (symmetric zero-padding, specified per dimension), then the relationship between the parameters; a padding example follows.
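A small sketch (values picked arbitrarily) of how padding enters the formula: with kernel_size=3, stride=1 and padding=1, the spatial size is preserved.

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 3, 28, 28)            # (N, C_in, H, W)
# H_out = floor((28 + 2*1 - 1*(3 - 1) - 1) / 1 + 1) = 28, same for W_out
print(conv(x).shape)                     # torch.Size([1, 8, 28, 28])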
Pooling layers
1. class torch.nn.MaxPool1d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
Applies 1D max pooling over every channel of the input signal.
If the input has size (N, C, L), the output has size (N, C, L_out), where:
L_out = floor((L + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
Parameters:
- kernel_size (int or tuple) - size of the max pooling window
- stride (int or tuple, optional) - stride of the window. Default: kernel_size
- padding (int or tuple, optional) - number of zeros added to each side of the input
- dilation (int or tuple, optional) - controls the spacing between elements within the window
- return_indices - if True, also returns the indices of the maxima, which is useful for later upsampling (e.g. max unpooling)
- ceil_mode - if True, uses ceiling instead of the default floor when computing the output size
Shape:
Input: (N, C, L_in)
Output: (N, C, L_out)
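A quick sketch (toy numbers) of MaxPool1d, showing that the channel count is unchanged while the length shrinks:

import torch
import torch.nn as nn

pool = nn.MaxPool1d(kernel_size=3, stride=2)
x = torch.randn(4, 5, 10)                # (N, C, L)
# L_out = floor((10 + 2*0 - 1*(3 - 1) - 1) / 2 + 1) = floor(4.5) = 4
print(pool(x).shape)                     # torch.Size([4, 5, 4])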
2. class torch.nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
Applies 2D max pooling over every channel of the input signal.
If the input has size (N, C, H, W) and the pooling window size is (kH, kW), the output size (N, C, H_out, W_out) is given by:
H_out = floor((H + 2*padding[0] - dilation[0]*(kH - 1) - 1) / stride[0] + 1)
W_out = floor((W + 2*padding[1] - dilation[1]*(kW - 1) - 1) / stride[1] + 1)
Parameters:
- kernel_size (int or tuple) - size of the max pooling window
- stride (int or tuple, optional) - stride of the window. Default: kernel_size
- padding (int or tuple, optional) - number of zeros added to each side of the input
- dilation (int or tuple, optional) - controls the spacing between elements within the window
- return_indices - if True, also returns the indices of the maxima, which is useful for later upsampling (e.g. max unpooling)
- ceil_mode - if True, uses ceiling instead of the default floor when computing the output size
Shape:
Input: (N, C, H_in, W_in)
Output: (N, C, H_out, W_out)
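A minimal sketch (toy sizes) of MaxPool2d with return_indices=True; the returned indices can later be fed to nn.MaxUnpool2d:

import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
x = torch.randn(1, 3, 8, 8)              # (N, C, H, W)
out, indices = pool(x)
# H_out = W_out = floor((8 + 2*0 - 1*(2 - 1) - 1) / 2 + 1) = 4
print(out.shape, indices.shape)          # both torch.Size([1, 3, 4, 4])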
What is the difference between nn and nn.functional?
When the layer needs to hold state (learnable parameters), use the conv classes under nn.
When no state needs to be kept, use the conv functions under nn.functional.
When several places share part of the parameters, nn.functional is the better fit; see the sketch below.
Reference: "What is the difference between nn and nn.functional in PyTorch?"
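A hedged sketch (module name and tensor sizes are my own, not from the original) of sharing one weight tensor across two calls by using nn.functional.conv2d directly; the functional form takes the weight as an argument instead of storing it inside a module:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedConv(nn.Module):
    def __init__(self):
        super().__init__()
        # one explicitly managed weight/bias pair, shared by both branches
        self.weight = nn.Parameter(torch.randn(8, 3, 3, 3))   # (C_out, C_in, kH, kW)
        self.bias = nn.Parameter(torch.zeros(8))

    def forward(self, a, b):
        # both inputs are convolved with the same parameters
        ya = F.conv2d(a, self.weight, self.bias, padding=1)
        yb = F.conv2d(b, self.weight, self.bias, padding=1)
        return ya, yb

ya, yb = SharedConv()(torch.randn(1, 3, 16, 16), torch.randn(1, 3, 16, 16))
print(ya.shape, yb.shape)                # torch.Size([1, 8, 16, 16]) twice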
Next, let's look at how shape shows up in a network.
Define the neural network that has some learnable parameters (or weights)
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 3x3 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 6 * 6, 120)  # 6*6 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square you can only specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features
The above is a typical network definition. Notice that the conv layers are declared with channel counts only; no per-tensor shape appears, and the spatial size of the input does not affect the conv definitions at all. The spatial shape only enters through the flattened size fed to fc1 (16 * 6 * 6), as the quick check below illustrates.
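A quick usage check (a 32x32 input is assumed, as in the PyTorch tutorial this network comes from). Tracing the spatial size: 32 -> conv 3x3 -> 30 -> pool 2 -> 15 -> conv 3x3 -> 13 -> pool 2 -> 6, which is where the 16 * 6 * 6 in fc1 comes from.

net = Net()
x = torch.randn(1, 1, 32, 32)            # (N, C, H, W)
out = net(x)
print(out.shape)                         # torch.Size([1, 10])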
Code
import torch
import torch.nn as nn

m = nn.Conv2d(16, 33, 3, stride=2)
input = torch.randn(20, 16, 10, 10)
output = m(input)
# H_out = floor((10 + 2*0 - 1*(3 - 1) - 1) / 2 + 1) = 4
# W_out = floor((10 + 2*0 - 1*(3 - 1) - 1) / 2 + 1) = 4
print(output.shape)
Output
torch.Size([20, 33, 4, 4])