Learning-Rate Scheduling in PyTorch
There are two ways to adjust the learning rate in PyTorch:
- modify the lr field of the optimizer's parameter groups directly;
- use one of the decay schedulers provided by torch.optim.lr_scheduler.
1. Modifying lr in the optimizer:
import torch
import torch.nn as nn
import matplotlib.pyplot as plt
%matplotlib inline
from torch.optim import Adam, lr_scheduler

class net(nn.Module):
    def __init__(self):
        super(net, self).__init__()
        self.fc = nn.Linear(1, 10)

    def forward(self, x):
        return self.fc(x)

model = net()
LR = 0.01
optimizer = Adam(model.parameters(), lr=LR)
lr_list = []
for epoch in range(100):
    # every 5 epochs, shrink the lr of every parameter group by 10%
    if epoch % 5 == 0:
        for p in optimizer.param_groups:
            p['lr'] *= 0.9
    lr_list.append(optimizer.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list, color='r')
2. lr_scheduler
2.1 torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda, last_epoch=-1)
lr_lambda receives an int argument, the epoch, and returns a multiplier from which the lr is computed (lr = initial_lr * lr_lambda(epoch)). If a list of lambda functions is passed instead, each one is applied to a different param_group of the optimizer.
import numpy as np

lr_list = []
model = net()
LR = 0.01
optimizer = Adam(model.parameters(), lr=LR)
# guard against division by zero: the scheduler evaluates the lambda
# at epoch 0 when it is constructed
lambda1 = lambda epoch: np.sin(epoch) / epoch if epoch > 0 else 1.0
scheduler = lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda1)
for epoch in range(100):
    # these demo loops only advance the scheduler to plot the lr;
    # in a real training loop, call optimizer.step() before scheduler.step()
    scheduler.step()
    lr_list.append(optimizer.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list, color='r')
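The note above says that a list of lambdas maps one-to-one onto the optimizer's param_groups. A minimal sketch of that (the two-group split into weight and bias is illustrative, not from the original):

```python
import torch.nn as nn
from torch.optim import Adam, lr_scheduler

# two parameter groups: one for the linear layer's weight, one for its bias
layer = nn.Linear(1, 10)
opt = Adam([{'params': layer.weight}, {'params': layer.bias}], lr=0.01)

# one lambda per group: the first halves the lr every 10 epochs,
# the second keeps it constant
scheduler = lr_scheduler.LambdaLR(
    opt,
    lr_lambda=[lambda e: 0.5 ** (e // 10), lambda e: 1.0])

for epoch in range(20):
    scheduler.step()
print(opt.param_groups[0]['lr'])  # 0.0025 – halved twice
print(opt.param_groups[1]['lr'])  # 0.01 – unchanged
```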
2.2 torch.optim.lr_scheduler.StepLR(optimizer, step_size, gamma=0.1, last_epoch=-1)
Every step_size epochs, the lr is automatically multiplied by gamma.
lr_list = []
model = net()
LR = 0.01
optimizer = Adam(model.parameters(), lr=LR)
scheduler = lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.8)
for epoch in range(100):
    scheduler.step()
    lr_list.append(optimizer.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list, color='r')
2.3 torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones, gamma=0.1, last_epoch=-1)
Piecewise-constant lr: each time the epoch count reaches one of the milestones (here, epochs 20 and 80), the lr is multiplied by gamma, giving a three-segment schedule.
This is the decay schedule most commonly seen in papers, and manual tuning usually follows the same pattern.
lr_list = []
model = net()
LR = 0.01
optimizer = Adam(model.parameters(), lr=LR)
scheduler = lr_scheduler.MultiStepLR(optimizer, milestones=[20, 80], gamma=0.9)
for epoch in range(100):
    scheduler.step()
    lr_list.append(optimizer.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list, color='r')
2.4 torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma, last_epoch=-1)
The lr is multiplied by gamma every epoch.
lr_list = []
model = net()
LR = 0.01
optimizer = Adam(model.parameters(), lr=LR)
scheduler = lr_scheduler.ExponentialLR(optimizer, gamma=0.9)
for epoch in range(100):
    scheduler.step()
    lr_list.append(optimizer.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list, color='r')
2.5 torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max, eta_min=0, last_epoch=-1)
T_max is the number of epochs in half a cosine period (the lr descends from its initial value to eta_min over T_max epochs, then climbs back up).
eta_min is the minimum lr, 0 by default.
lr_list = []
model = net()
LR = 0.01
optimizer = Adam(model.parameters(), lr=LR)
scheduler = lr_scheduler.CosineAnnealingLR(optimizer, T_max=20)
for epoch in range(100):
    scheduler.step()
    lr_list.append(optimizer.state_dict()['param_groups'][0]['lr'])
plt.plot(range(100), lr_list, color='r')
2.6 torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=10, verbose=False, threshold=0.0001, threshold_mode='rel', cooldown=0, min_lr=0, eps=1e-08)
Reduces the learning rate once the loss stops decreasing (or the accuracy stops improving). The parameters mean the following:
mode: 'min' checks whether the metric has stopped decreasing; 'max' checks whether it has stopped increasing;
factor: once triggered, lr *= factor;
patience: the number of consecutive epochs without improvement to tolerate before reducing;
verbose: print a message when a reduction is triggered;
threshold: only changes larger than this threshold count as significant;
threshold_mode: how the threshold is applied, 'rel' or 'abs'. rel: in max mode a value above best*(1+threshold) is significant, in min mode a value below best*(1-threshold) is significant. abs: in max mode a value above best+threshold is significant, in min mode a value below best-threshold is significant;
cooldown: after a reduction, wait this many epochs before resuming checks, to keep the lr from dropping too fast;
min_lr: the lowest lr allowed;
eps: if the difference between the new and old lr is smaller than eps, the update is skipped.
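Unlike the schedulers above, ReduceLROnPlateau is stepped with the monitored metric itself. A minimal sketch; the loss values here are synthetic stand-ins for a validation loss, chosen only to produce a plateau:

```python
import torch.nn as nn
from torch.optim import Adam, lr_scheduler

model = nn.Linear(1, 10)
optimizer = Adam(model.parameters(), lr=0.01)
# halve the lr once the loss has failed to improve for more than 2 epochs
scheduler = lr_scheduler.ReduceLROnPlateau(optimizer, mode='min',
                                           factor=0.5, patience=2)

fake_losses = [1.0, 0.8, 0.6, 0.6, 0.6, 0.6, 0.6]  # plateaus after epoch 2
for loss in fake_losses:
    # in a real loop: train, compute the validation loss, then step with it
    scheduler.step(loss)
print(optimizer.param_groups[0]['lr'])  # 0.005 – reduced once on the plateau
```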