[Introduction to Machine Learning] Hung-yi Lee Machine Learning Notes 10 (Tips for Deep Learning)
Recipe of Deep Learning
Performance on the training data is not good
Deeper usually does not imply better
Vanishing Gradient Problem
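One way to see the problem in a single schematic equation (a sketch assuming sigmoid activations and a one-neuron-per-layer chain for simplicity, not a formula from the slides): each layer multiplies the backpropagated gradient by the local derivative of the sigmoid, which is at most 1/4, so the gradient reaching the early layers shrinks roughly geometrically with depth.

```latex
% Backprop through L sigmoid layers (schematic, one neuron per layer):
% every factor \sigma'(z) \le 1/4, so the early-layer gradient shrinks with depth.
\frac{\partial C}{\partial w^{(1)}}
  \;\propto\; \prod_{l=1}^{L} \sigma'\!\big(z^{(l)}\big)\, w^{(l)},
\qquad
\sigma'(z) = \sigma(z)\big(1-\sigma(z)\big) \le \tfrac{1}{4}.
```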
ReLU (Rectified Linear Unit)
ReLU - variants
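A minimal sketch of ReLU and two common variants mentioned in the lecture, Leaky ReLU and Parametric ReLU (the function names and the default alpha value are illustrative, not from the notes):

```python
import numpy as np

def relu(z):
    # ReLU: pass positive inputs through, zero out negative ones.
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Leaky ReLU: a small fixed slope alpha for z < 0 keeps some gradient alive.
    return np.where(z > 0, z, alpha * z)

def parametric_relu(z, alpha):
    # Parametric ReLU: same form, but alpha is a parameter learned from the training data.
    return np.where(z > 0, z, alpha * z)
```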
Besides ReLU, are there any other activation functions?
So we use Maxout, which learns the activation function automatically from the training data.
Maxout
ReLU is a special case of Maxout
More than ReLU
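To see why ReLU is a special case, here is the standard argument written out (a sketch, not a transcript of the slide): a Maxout unit takes the max over a group of linear functions of the input, and pinning one of those functions to zero recovers ReLU, while leaving both pieces learnable lets the unit represent other convex piecewise-linear activation functions.

```latex
% Maxout unit with two linear pieces; fixing one piece at zero gives ReLU.
a = \max\big(w_1^{\top}x + b_1,\; w_2^{\top}x + b_2\big)
\;\xrightarrow{\;w_2 = 0,\; b_2 = 0\;}\;
a = \max\big(w_1^{\top}x + b_1,\; 0\big)
```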
Maxout - Training
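A rough sketch of what a Maxout layer computes, assuming the usual grouping of linear pieces (the function name and array shapes are illustrative): for each unit, only the piece that wins the max carries gradient for that example, so training reduces to ordinary backpropagation through a thin, input-dependent linear network.

```python
import numpy as np

def maxout_forward(x, W, b):
    """Maxout layer forward pass (illustrative sketch).

    x: input vector, shape (d_in,)
    W: weights, shape (n_units, pieces, d_in)
    b: biases,  shape (n_units, pieces)
    Returns each unit's max over its linear pieces, plus the index of the
    winning piece (only that piece receives gradient in backprop).
    """
    z = np.einsum('upd,d->up', W, x) + b   # all linear pieces, shape (n_units, pieces)
    active = z.argmax(axis=1)              # which piece wins for this input
    a = z.max(axis=1)                      # Maxout activation per unit
    return a, active
```

Different training examples activate different pieces, so over the whole dataset every linear piece still gets updated.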
Adaptive Learning Rate
RMSProp
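A minimal sketch of one RMSProp step as usually formulated (the hyperparameter values and the eps term for numerical stability are illustrative assumptions): the learning rate is divided by a decaying root mean square of past gradients, so directions with consistently large gradients take smaller steps.

```python
import numpy as np

def rmsprop_update(w, grad, sigma, lr=0.001, alpha=0.9, eps=1e-8):
    # sigma is the running RMS of past gradients (can be initialized to |grad| on the first step).
    sigma = np.sqrt(alpha * sigma**2 + (1 - alpha) * grad**2)
    w = w - lr * grad / (sigma + eps)
    return w, sigma
```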
Hard to find optimal network parameters
Momentum (adding inertia to gradient descent)
So, after adding momentum:
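A sketch of the standard momentum update (written from the common formulation, with λ as the momentum factor and η as the learning rate): the movement at each step is the previous movement, damped by λ, minus the current gradient step.

```latex
% Movement accumulates past gradients instead of following only the current one.
v^{t} = \lambda\, v^{t-1} - \eta\, \nabla L\!\big(\theta^{t-1}\big),
\qquad
\theta^{t} = \theta^{t-1} + v^{t}, \qquad v^{0} = 0.
```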