7 Activation Functions - Dissecting PyTorch

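PyTorch implements most of the common activation functions, and you can also define your own. The implementations live in torch.nn.functional. Every activation function also has a corresponding module class, but that class ultimately calls into torch.nn.functional. Once you have read the definitions, you will be able to write a custom activation function too. Let's start with the earliest activation functions.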
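As a rough illustration of the "module class wraps a functional" pattern described above, here is a minimal sketch of a custom activation. The names `scaled_tanh` and `ScaledTanh` and the `scale` argument are made up for this example and are not part of PyTorch.

```python
import torch
import torch.nn as nn


def scaled_tanh(input, scale=1.0):
    # Hypothetical functional form: scale * tanh(input), applied element-wise.
    return scale * torch.tanh(input)


class ScaledTanh(nn.Module):
    # Hypothetical module class: it only stores its settings and
    # delegates to the functional version in forward().
    def __init__(self, scale=1.0):
        super().__init__()
        self.scale = scale

    def forward(self, input):
        return scaled_tanh(input, self.scale)


x = torch.tensor([-2.0, 0.0, 2.0])
y = ScaledTanh(scale=2.0)(x)
print(y)
```

Because the module has no parameters of its own, it can be dropped into an `nn.Sequential` like any built-in activation.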

sigmoid

def sigmoid(input):
    r"""sigmoid(input) -> Tensor

    Applies the element-wise function :math:`\text{Sigmoid}(x) = \frac{1}{1 + \exp(-x)}`

    See :class:`~torch.nn.Sigmoid` for more details.
    """
    warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
    return input.sigmoid()

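Figure: Sigmoid

The source shows that this activation function simply calls the tensor's own sigmoid method. Its output lies between 0 and 1, so it squashes every value in the data into that interval, which makes it a good fit for mapping to probabilities. As an activation function, however, it has the following drawbacks: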

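Neurons saturate easily: once the input falls outside [-5, 5] the gradient is practically 0, so the weights update very slowly
The output is not centered on 0; it is always positive, which amounts to throwing away the negative half
It is a little expensive to compute, since every call evaluates an exp, and you should always be a miser with memory and compute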
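The saturation point can be checked numerically. The sketch below uses plain Python rather than PyTorch, and the helper names `sigmoid` and `sigmoid_grad` are chosen for this example only; it prints the output and the gradient at a few inputs so you can see how quickly the gradient shrinks.

```python
import math


def sigmoid(x):
    # Logistic function 1 / (1 + exp(-x)).
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s), which peaks at 0.25 when x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid={sigmoid(x):.6f}  grad={sigmoid_grad(x):.6f}")
```

Even in the best case the gradient is only 0.25, so stacking many sigmoid layers multiplies several small factors together during backpropagation.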

tanh

def tanh(input):
    r"""tanh(input) -> Tensor

    Applies element-wise,
    :math:`\text{Tanh}(x) = \tanh(x) = \frac{\exp(x) - \exp(-x)}{\exp(x) + \exp(-x)}`

    See :class:`~torch.nn.Tanh` for more details.
    """
    warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
    return input.tanh()

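Figure: tanh

This function has a sane range, [-1, 1], centered on 0, which avoids sigmoid's off-center output. It still suffers from neuron saturation and vanishing gradients, though, and it is even more expensive to compute!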
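Both properties, the zero-centered output and the remaining saturation, can be seen in a small plain-Python sketch. The helper `tanh_grad` and the sample inputs are assumptions made for this illustration.

```python
import math


def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2, which equals 1 at x = 0.
    t = math.tanh(x)
    return 1.0 - t * t


xs = (-5.0, -1.0, 0.0, 1.0, 5.0)
outputs = [math.tanh(x) for x in xs]
grads = [tanh_grad(x) for x in xs]

# Symmetric inputs give outputs that cancel out, so the mean is 0 here.
mean_output = sum(outputs) / len(outputs)
print(outputs, grads, mean_output)
```

The gradients at the two ends of the list are tiny, which is the same saturation behavior sigmoid shows.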

relu

def relu(input, inplace=False):
    if inplace:
        return torch.relu_(input)
    return torch.relu(input)

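Figure: ReLU

ReLU is defined as max(0, x). It does not saturate for positive inputs, which addresses the vanishing-gradient problem there, and it is cheap to compute because it is linear on the positive side. It is generally reported to be about 6 times faster than sigmoid/tanh. Some sources also suggest that it closely resembles the way biological neurons activate. It does introduce a new problem, though: negative inputs can easily cause neurons to die, because the function wipes out the negative part every time.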
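The dying-neuron problem can be sketched in a few lines of plain Python. The weight, bias, and inputs below are made-up values for a single hypothetical neuron, and `relu_grad` returns 0 at x = 0 as a convention for this sketch.

```python
def relu(x):
    # max(0, x): identity for positive inputs, 0 otherwise.
    return max(0.0, x)


def relu_grad(x):
    # Slope is 1 for x > 0 and 0 for x < 0 (0 is also used at x = 0 here).
    return 1.0 if x > 0 else 0.0


# Hypothetical neuron: pre-activation = w * x + b with a very negative bias.
w, b = 0.5, -10.0
inputs = [0.0, 1.0, 2.0, 3.0]
pre_activations = [w * x + b for x in inputs]
grads = [relu_grad(z) for z in pre_activations]

# Every gradient is 0, so no update signal reaches w or b through this neuron.
print(grads)
```

Once a neuron ends up in this state for all of its inputs, gradient descent alone has no way to bring it back.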

Leaky Relu

def leaky_relu(input, negative_slope=0.01, inplace=False):
    r"""
    leaky_relu(input, negative_slope=0.01, inplace=False) -> Tensor

    Applies element-wise,
    :math:`\text{LeakyReLU}(x) = \max(0, x) + \text{negative\_slope} * \min(0, x)`

    See :class:`~torch.nn.LeakyReLU` for more details.
    """
    if inplace:
        return torch._C._nn.leaky_relu_(input, negative_slope)
    return torch._C._nn.leaky_relu(input, negative_slope)

Figure: Leaky ReLU

To deal with negative values, a variant of ReLU appeared. Its function is max(0.01*x, x). It avoids neuron saturation, is cheap to compute, and the neurons no longer die.
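The docstring formula and the max(0.01*x, x) form describe the same function. A minimal plain-Python sketch, with helper names chosen for this example, checks that and shows why the gradient never drops to 0.

```python
def leaky_relu(x, negative_slope=0.01):
    # Docstring form: max(0, x) + negative_slope * min(0, x).
    return max(0.0, x) + negative_slope * min(0.0, x)


def leaky_relu_max_form(x, negative_slope=0.01):
    # The max(0.01 * x, x) form; it matches the docstring form
    # whenever negative_slope is at most 1.
    return max(negative_slope * x, x)


def leaky_relu_grad(x, negative_slope=0.01):
    # The slope on the negative side is small but never zero.
    return 1.0 if x > 0 else negative_slope


for x in (-100.0, -1.0, 0.0, 1.0, 100.0):
    assert abs(leaky_relu(x) - leaky_relu_max_form(x)) < 1e-12
```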

PRelu

def prelu(input, weight):
    r"""prelu(input, weight) -> Tensor

    Applies element-wise the function
    :math:`\text{PReLU}(x) = \max(0,x) + \text{weight} * \min(0,x)` where weight is a
    learnable parameter.

    See :class:`~torch.nn.PReLU` for more details.
    """
    return torch.prelu(input, weight)

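Figure: PReLU

This function is defined as max(ax, x), where a is a parameter that is learned during training instead of being fixed in advance. The max form agrees with the docstring formula as long as a is at most 1.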
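To make "learnable" concrete, here is a hand-written gradient step on a in plain Python. The input, target, learning rate, and squared-error loss are illustrative assumptions; in PyTorch the optimizer would do this for you.

```python
def prelu(x, a):
    # Docstring form: max(0, x) + a * min(0, x).
    return max(0.0, x) + a * min(0.0, x)


def prelu_grad_a(x):
    # d(prelu)/da = min(0, x): only negative inputs contribute to learning a.
    return min(0.0, x)


# One gradient-descent step on a for a made-up target and squared-error loss.
a, x, target, lr = 0.25, -2.0, -1.0, 0.1
error = prelu(x, a) - target          # -0.5 - (-1.0) = 0.5
a_new = a - lr * 2.0 * error * prelu_grad_a(x)
print(a, "->", a_new)
```

After the step the output for this input is closer to the target, which is all "learnable slope" means here.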

ELU (Exponential Linear Unit)

def elu(input, alpha=1., inplace=False):
    r"""Applies element-wise,
    :math:`\text{ELU}(x) = \max(0,x) + \min(0, \alpha * (\exp(x) - 1))`.

    See :class:`~torch.nn.ELU` for more details.
    """
    if inplace:
        return torch._C._nn.elu_(input, alpha)
    return torch._C._nn.elu(input, alpha)

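Figure: ELU

This function is x for x > 0 and a*(exp(x)-1) for x <= 0. It inherits all the advantages of ReLU, but costs a bit more. Its outputs have a mean closer to 0, and with the default a = 1 its first derivative is continuous everywhere, so it even looks smooth, haha. Negative values are handled well and it is robust, nice! After we have covered batch normalization we will show a small example in which it actually beat batch normalization.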
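The smoothness claim can be checked with a short plain-Python sketch built on the docstring formula. The helper names `elu` and `elu_grad` are assumptions for this example.

```python
import math


def elu(x, alpha=1.0):
    # Docstring form: max(0, x) + min(0, alpha * (exp(x) - 1)).
    return max(0.0, x) + min(0.0, alpha * (math.exp(x) - 1.0))


def elu_grad(x, alpha=1.0):
    # Slope is 1 for x > 0 and alpha * exp(x) for x <= 0.
    return 1.0 if x > 0 else alpha * math.exp(x)


# With alpha = 1 the two one-sided slopes agree at x = 0 ...
left, right = elu_grad(-1e-9), elu_grad(1e-9)
# ... and for very negative inputs the output levels off near -alpha.
floor = elu(-50.0)
print(left, right, floor)
```

With any other alpha the two slopes at 0 differ, so the kink-free behavior is specific to alpha = 1.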
So other variants soon followed.

SELU

def selu(input, inplace=False):
    r"""selu(input, inplace=False) -> Tensor

    Applies element-wise,
    :math:`\text{SELU}(x) = scale * (\max(0,x) + \min(0, \alpha * (\exp(x) - 1)))`,
    with :math:`\alpha=1.6732632423543772848170429916717` and
    :math:`scale=1.0507009873554804934193349852946`.

    See :class:`~torch.nn.SELU` for more details.
    """
    if inplace:
        return torch.selu_(input)
    return torch.selu(input)

Figure: SELU
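The two long constants in the docstring are usually described as being chosen so that standard-normal inputs come out with mean close to 0 and variance close to 1. The plain-Python sketch below is an empirical sanity check of that description, not a proof. The sample size and seed are arbitrary choices.

```python
import math
import random

ALPHA = 1.6732632423543772848170429916717
SCALE = 1.0507009873554804934193349852946


def selu(x):
    # scale * (max(0, x) + min(0, alpha * (exp(x) - 1))), as in the docstring.
    return SCALE * (max(0.0, x) + min(0.0, ALPHA * (math.exp(x) - 1.0)))


# Feed standard-normal samples through SELU and measure the output statistics.
rng = random.Random(0)
outputs = [selu(rng.gauss(0.0, 1.0)) for _ in range(200_000)]
mean = sum(outputs) / len(outputs)
variance = sum((y - mean) ** 2 for y in outputs) / len(outputs)
print(f"mean={mean:.3f}  variance={variance:.3f}")
```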

There are further variants as well, such as relu6, celu, and so on.

Here are some rules of thumb for these activation functions:

Start with ReLU, then slowly tune the learning rate
You can try Leaky ReLU / ELU
Give tanh a try, but don't expect too much
Don't try sigmoid
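One way to follow the list above is to make the activation a swappable part of the model. The sketch below assumes a hypothetical helper `make_mlp` and arbitrary layer sizes. It only builds the candidates and runs a forward pass, whereas a real comparison would train each one and compare validation metrics.

```python
import torch
import torch.nn as nn

# Candidate activations in the order suggested by the list above.
CANDIDATES = [
    ("relu", nn.ReLU),
    ("leaky_relu", nn.LeakyReLU),
    ("elu", nn.ELU),
    ("tanh", nn.Tanh),
]


def make_mlp(activation_cls, in_features=4, hidden=8, out_features=2):
    # Hypothetical helper: a tiny two-layer network whose only
    # variable part is the activation class passed in.
    return nn.Sequential(
        nn.Linear(in_features, hidden),
        activation_cls(),
        nn.Linear(hidden, out_features),
    )


x = torch.randn(3, 4)
shapes = {name: tuple(make_mlp(cls)(x).shape) for name, cls in CANDIDATES}
print(shapes)
```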

Last edited: 2018.11.04 14:17:24
