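The earlier post "Deep Learning & PyTorch: DNN Regression" implemented regression on the HR dataset, but that dataset has only one variable. Here we repeat the exercise with multiple variables.
The workflow is the same as before: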
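graph TD
A[Data import] --> B[Data split]
B[Data split] --> C[Tensor conversion]
C[Tensor conversion] --> D[Data restructuring]
D[Data restructuring] --> E[Model definition]
E[Model definition] --> F[Model training]
F[Model training] --> G[Results display]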
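1.1 Data import
We use the Boston house price dataset. It is an openly available dataset, so the example is more broadly applicable.
import pandas as pd

data = pd.read_csv('./boston_house_prices.csv')
data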
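Before going further it can help to confirm that the loaded frame has the 13 feature columns and the MEDV target used in the rest of this post. The sketch below is only an illustration: `check_columns` is a hypothetical helper, and the two inline rows stand in for the real CSV file so the snippet runs without it.

```python
import io
import pandas as pd

FEATURES = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
            'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']
TARGET = 'MEDV'

def check_columns(df):
    """Return the expected column names that are missing from df (hypothetical helper)."""
    return [c for c in FEATURES + [TARGET] if c not in df.columns]

# Two illustrative rows in the dataset's layout, standing in for the real file.
sample_csv = io.StringIO(
    "CRIM,ZN,INDUS,CHAS,NOX,RM,AGE,DIS,RAD,TAX,PTRATIO,B,LSTAT,MEDV\n"
    "0.00632,18.0,2.31,0,0.538,6.575,65.2,4.09,1,296,15.3,396.9,4.98,24.0\n"
    "0.02731,0.0,7.07,0,0.469,6.421,78.9,4.9671,2,242,17.8,396.9,9.14,21.6\n"
)
sample = pd.read_csv(sample_csv)

missing = check_columns(sample)                    # empty list when all columns exist
n_missing_values = int(sample.isna().sum().sum())  # total count of NaN cells
print(sample.shape, missing, n_missing_values)
```

Running the same two checks on `data` would show whether the file on disk matches the column names assumed in the following steps.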
1.2 Data split
from sklearn.model_selection import train_test_split
train,test = train_test_split(data, train_size=0.7)
train_x = train[['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']].values
test_x = test[['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT']].values
train_y = train.MEDV.values.reshape(-1, 1)
test_y = test.MEDV.values.reshape(-1, 1)
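`train_test_split` shuffles the rows randomly, so each run produces a different split and slightly different losses. A minimal sketch of a repeatable 70/30 split follows. It assumes a fixed `random_state` (not used in the original) and uses a synthetic 506-row array, the row count of the Boston dataset, in place of the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset: 506 rows, one column of row ids.
rows = np.arange(506).reshape(-1, 1)

# random_state is an addition for reproducibility; 42 is an arbitrary choice.
train_a, test_a = train_test_split(rows, train_size=0.7, random_state=42)
train_b, test_b = train_test_split(rows, train_size=0.7, random_state=42)

same_split = bool((train_a == train_b).all() and (test_a == test_b).all())
print(len(train_a), len(test_a), same_split)
```

With the seed fixed, repeated runs train and validate on the same rows, which makes loss curves from different experiments comparable.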
1.3 To Tensor
import torch

train_x = torch.from_numpy(train_x).type(torch.FloatTensor)
test_x = torch.from_numpy(test_x).type(torch.FloatTensor)
train_y = torch.from_numpy(train_y).type(torch.FloatTensor)
test_y = torch.from_numpy(test_y).type(torch.FloatTensor)
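The cast to `FloatTensor` matters because NumPy arrays default to float64 while `nn.Linear` parameters are float32, and mixing the two raises a dtype error in the forward pass. A small sketch on a toy array (the values are arbitrary) shows the dtypes before and after conversion, along with `.float()` as an equivalent shorter spelling:

```python
import numpy as np
import torch

arr = np.array([[1.0, 2.0], [3.0, 4.0]])   # NumPy defaults to float64

t64 = torch.from_numpy(arr)                # shares memory with arr, stays float64
t32 = t64.type(torch.FloatTensor)          # converted copy in float32
t32_alt = torch.from_numpy(arr).float()    # same conversion, shorter spelling

print(t64.dtype, t32.dtype, t32_alt.dtype)
```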
1.4 Data restructuring
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

batch = 16  # assumed value; the batch size is not stated in this post

train_ds = TensorDataset(train_x, train_y)
train_dl = DataLoader(train_ds, batch_size=batch, shuffle=True)
test_ds = TensorDataset(test_x, test_y)
test_dl = DataLoader(test_ds, batch_size=batch * 2)
This is the same as before.
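To see what the `DataLoader` actually yields, the sketch below wraps random tensors of a plausible training shape and inspects the batches. The sizes are assumptions for illustration: 354 rows (70% of the 506-row dataset), 13 features, and a batch size of 16.

```python
import math
import torch
from torch.utils.data import TensorDataset, DataLoader

n_samples, n_features, batch = 354, 13, 16   # illustrative sizes only

x = torch.randn(n_samples, n_features)
y = torch.randn(n_samples, 1)
dl = DataLoader(TensorDataset(x, y), batch_size=batch, shuffle=True)

# Each iteration yields one (features, target) pair of tensors.
batch_sizes = [xb.shape[0] for xb, yb in dl]
xb, yb = next(iter(dl))

print(len(dl), tuple(xb.shape), tuple(yb.shape), batch_sizes[-1])
```

The last batch is smaller than the others whenever the row count is not a multiple of the batch size, which is worth remembering when per-batch losses are averaged later.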
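1.5 Network definition
from torch import nn

class LinearModel(nn.Module):
    def __init__(self):
        super(LinearModel, self).__init__()
        # 13 input features -> 1 predicted value (MEDV)
        self.linear = nn.Linear(13, 1)

    def forward(self, inputs):
        logits = self.linear(inputs)
        return logits
We have 13 feature variables here, so the linear layer takes 13 inputs and produces a single output.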
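Concretely, `nn.Linear(13, 1)` holds a 1×13 weight matrix and one bias, and computes `x @ W.T + b` for each row. The following sketch checks this on random input; the batch of 4 rows and the seed are arbitrary choices.

```python
import torch
from torch import nn

torch.manual_seed(0)
layer = nn.Linear(13, 1)
x = torch.randn(4, 13)                        # 4 samples, 13 features each

out = layer(x)                                # what the model's forward pass returns
manual = x @ layer.weight.T + layer.bias      # the same computation written out

n_params = sum(p.numel() for p in layer.parameters())   # 13 weights + 1 bias
print(tuple(out.shape), n_params)
```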
1.6 Training
lr = 1e-6      # assumed value; the learning rate is not stated in this post
epochs = 500   # inferred from the log below, which ends at epoch 500

model = LinearModel()
loss_fn = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=lr)  # define the optimizer
train_loss = []
test_loss = []
for epoch in range(epochs + 1):
    model.train()
    for xb, yb in train_dl:
        pred = model(xb)
        loss = loss_fn(pred, yb)
        loss.backward()
        opt.step()
        opt.zero_grad()
    if epoch % 10 == 0:
        # every 10 epochs, record the mean per-batch loss on both sets
        model.eval()
        with torch.no_grad():
            train_epoch_loss = sum(loss_fn(model(xb), yb) for xb, yb in train_dl)
            test_epoch_loss = sum(loss_fn(model(xb), yb) for xb, yb in test_dl)
        train_loss.append(train_epoch_loss.item() / len(train_dl))
        test_loss.append(test_epoch_loss.item() / len(test_dl))
        template = "epoch:{:2d}, train loss:{:.5f}, validation loss:{:.5f}"
        print(template.format(epoch, train_loss[-1], test_loss[-1]))
print('Training complete')
epoch: 0, train loss:469.15608, validation loss:440.95737
epoch:10, train loss:101.80890, validation loss:109.48333
epoch:20, train loss:91.18239, validation loss:100.17014
epoch:30, train loss:100.83169, validation loss:97.70323
epoch:40, train loss:89.96843, validation loss:97.37273
epoch:50, train loss:94.20027, validation loss:96.82300
......
epoch:480, train loss:74.97700, validation loss:81.29946
epoch:490, train loss:74.74702, validation loss:80.76858
epoch:500, train loss:89.31947, validation loss:83.06767
Training complete
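The reported numbers are the mean of per-batch losses, so a smaller final batch counts as much as a full one. If an exact per-sample MSE is wanted, one option is to sum squared errors and divide by the sample count. The sketch below is an assumption-laden illustration, not the original method: `dataset_mse` is a hypothetical helper, and the data and model are random stand-ins.

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset, DataLoader

def dataset_mse(model, dl):
    """Per-sample MSE over every batch in dl (hypothetical helper)."""
    loss_sum = nn.MSELoss(reduction='sum')   # sum of squared errors, not the mean
    total, count = 0.0, 0
    model.eval()
    with torch.no_grad():
        for xb, yb in dl:
            total += loss_sum(model(xb), yb).item()
            count += yb.numel()
    return total / count

torch.manual_seed(0)
x = torch.randn(10, 13)
y = torch.randn(10, 1)
model = nn.Linear(13, 1)
dl = DataLoader(TensorDataset(x, y), batch_size=4)   # batches of 4, 4 and 2 rows

exact = dataset_mse(model, dl)
with torch.no_grad():
    full = nn.MSELoss()(model(x), y).item()          # MSE on all rows at once
    batch_mean = sum(nn.MSELoss()(model(xb), yb).item() for xb, yb in dl) / len(dl)
print(exact, full, batch_mean)
```

Here `exact` matches the loss computed on the whole tensor, while `batch_mean` generally differs a little because the 2-row batch is weighted like a 4-row one.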
1.7 Results display
import matplotlib.pyplot as plt

plt.plot(range(len(train_loss)), train_loss, label='train_loss')
plt.plot(range(len(test_loss)), test_loss, label='test_loss')
plt.xlabel('evaluation step (one per 10 epochs)')
plt.ylabel('MSE loss')
plt.legend()
plt.show()
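An MSE value is easier to interpret after taking its square root, which puts the error back in the units of the target. In the standard Boston dataset MEDV is recorded in thousands of dollars, so the arithmetic for the final validation loss in the log above can be sketched as follows; the variable names are illustrative.

```python
import math

final_test_mse = 83.06767            # validation loss at epoch 500 in the log above
rmse = math.sqrt(final_test_mse)     # error in MEDV units (thousands of dollars)

print(f"RMSE: {rmse:.2f}")
```

An RMSE of roughly 9 means the single linear layer is typically off by about nine thousand dollars on the validation rows.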