1. LSTM
A traditional recurrent network (RNN) uses its memory cell as short-term memory to make predictions on sequential data. But as the sequence grows longer, the network has to be unrolled over more and more time steps, and during backpropagation the gradient is a product taken across all of those steps, so it tends to vanish. To address this, Hochreiter et al. proposed the LSTM (Long Short-Term Memory) network in 1997. The LSTM introduces three gates (input, forget, and output) along with the following quantities:
- 細(xì)胞態(tài):表征長(zhǎng)期記憶
- 候選態(tài): 等待存入長(zhǎng)期記憶的候選態(tài)(歸納出的儿礼、待存入細(xì)胞態(tài)的新知識(shí))
- 記憶體: 短期記憶咖杂,屬于長(zhǎng)期記憶的一部分
- xt: 當(dāng)前時(shí)刻的輸入特征
- ht-1: 上個(gè)時(shí)刻的短期記憶
- wi,wf,wo: 待訓(xùn)練參數(shù)矩陣
- bi,bf,bo: 待訓(xùn)練偏置項(xiàng)
- σ: sigmoid激活函數(shù)(使門限的范圍在0-1之間)
- wc: 待訓(xùn)練參數(shù)矩陣
- bc: 待訓(xùn)練偏置項(xiàng)
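Written out with these symbols, the per-step update is the standard LSTM formulation:

f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)            (forget gate)
i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)            (input gate)
o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)            (output gate)
\tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)     (candidate state)
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t         (cell state)
h_t = o_t \odot \tanh(c_t)                              (memory cell output)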
1.1 The LSTM Computation Process
An example (from the course instructor): LSTM works like you listening to my lecture. What you currently hold in your head, the content of slides 1 through 45, is the long-term memory c_t. It is built from two parts. One part is the content of slides 1 through 44, i.e. the previous long-term memory c_{t-1}. You cannot remember all of it word for word; you inevitably forget some, so c_{t-1} is multiplied by the forget gate, and this product is what survives in your head of the past. The other part is what I am presenting right now: new knowledge about to enter your head. It, too, has two sources: slide 45, which I am showing at this moment, is the current input x_t, and what still lingers from slide 44 is the previous short-term memory h_{t-1}. Your brain condenses x_t and h_{t-1} into the candidate state c̃_t, the memory about to be stored. Multiplying c̃_t by the input gate and adding it to the retained past yields the new long-term memory. Later, when you retell this lecture to a friend, you cannot reproduce it word for word either; what you pass on is the long-term memory filtered through the output gate, which is exactly the memory cell's output h_t. When recurrent layers are stacked, the input x_t of the second layer is the output h_t of the first layer: the second layer receives the essence extracted by the first. You can think of me as the first recurrent layer, distilling each slide from a stack of papers and handing it to you; you, as the second layer, receive my long-term memory passed through the tanh activation and multiplied by the output gate, i.e. the short-term memory h_t. That is the whole LSTM computation.
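To make the arithmetic concrete, here is a minimal numpy sketch of a single LSTM step (an illustration of the equations above, not the course's code; the weight shapes assume each gate acts on the concatenation [h_{t-1}, x_t]):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_i, b_i, W_f, b_f, W_o, b_o, W_c, b_c):
    z = np.concatenate([h_prev, x_t])     # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)          # forget gate: how much of c_{t-1} to keep
    i_t = sigmoid(W_i @ z + b_i)          # input gate: how much new knowledge to store
    o_t = sigmoid(W_o @ z + b_o)          # output gate: how much memory to reveal
    c_tilde = np.tanh(W_c @ z + b_c)      # candidate state distilled from x_t and h_{t-1}
    c_t = f_t * c_prev + i_t * c_tilde    # long-term memory: kept past + gated new knowledge
    h_t = o_t * np.tanh(c_t)              # short-term memory handed to the next step/layer
    return h_t, c_t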
1.2 LSTM in TensorFlow
tf.keras.layers.LSTM(
    units,                  # number of memory cells
    return_sequences=False  # whether to return the output at every time step
)
return_sequences=True   # output h_t at every time step
return_sequences=False  # output h_t only at the last time step (default)
# Example
model = tf.keras.Sequential([
    LSTM(80, return_sequences=True),  # every time step outputs h_t, feeding the next LSTM layer
    Dropout(0.2),
    LSTM(100),                        # only the last time step outputs h_t
    Dropout(0.2),
    Dense(1)
])
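A quick shape check makes the difference visible (a toy example, assuming a batch of 32 windows of 60 time steps with 1 feature each):

import tensorflow as tf
x = tf.random.normal((32, 60, 1))
print(tf.keras.layers.LSTM(80, return_sequences=True)(x).shape)   # (32, 60, 80): h_t at every step
print(tf.keras.layers.LSTM(80, return_sequences=False)(x).shape)  # (32, 80): h_t at the last step only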
1.3 Stock Price Prediction with LSTM
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.keras.layers import Dense, Dropout, LSTM
import os
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import math
maotai = pd.read_csv("./SH600519.csv")  # read the stock data file
training_set = maotai.iloc[0:2126, 2:3].values  # opening prices (column C) of the first 2126 days as the training set
test_set = maotai.iloc[2126:, 2:3].values  # opening prices of the remaining 300 days as the test set
# Normalization
sc = MinMaxScaler(feature_range=(0, 1))  # scale every value into the range (0, 1)
training_set_scaled = sc.fit_transform(training_set)  # fit the scaler on the training set and transform it
test_set = sc.transform(test_set)  # transform the test set with the training-set statistics
x_train = []
y_train = []
x_test = []
y_test = []
# Use 60 consecutive days of opening prices as the input features x_train, and day 61's price as the label
for i in range(60, len(training_set_scaled)):
    x_train.append(training_set_scaled[i-60: i, 0])
    y_train.append(training_set_scaled[i, 0])
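# Shuffle x_train and y_train with the same seed so each 60-day window stays aligned with its label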
np.random.seed(8)
np.random.shuffle(x_train)
np.random.seed(8)
np.random.shuffle(y_train)
tf.random.set_seed(8)
x_train, y_train = np.array(x_train), np.array(y_train)
# Reshape x_train to the RNN input format: [number of samples, unrolled time steps, features per step].
# Here the whole training set is fed in at once: x_train.shape[0] samples, each feeding 60 opening
# prices to predict the 61st day's price, so the unrolled step count is 60; each time step sees a
# single day's opening price, so the feature count per step is 1.
x_train = np.reshape(x_train, (x_train.shape[0], 60, 1))
# Build the test set the same way
for i in range(60, len(test_set)):
    x_test.append(test_set[i-60:i, 0])
    y_test.append(test_set[i, 0])
x_test, y_test = np.array(x_test), np.array(y_test)
x_test = np.reshape(x_test, (x_test.shape[0], 60, 1))
model = tf.keras.Sequential([
    LSTM(80, return_sequences=True),
    Dropout(0.2),
    LSTM(100),
    Dropout(0.2),
    Dense(1)
])
model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss="mean_squared_error")
check_point_save_path = "./checkpoint_stock/lstm_stock.ckpt"
if os.path.exists(check_point_save_path + ".index"):
    print("******load model******")
    model.load_weights(check_point_save_path)
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=check_point_save_path,
    save_weights_only=True,
    save_best_only=True,
    monitor="val_loss"
)
history = model.fit(x_train, y_train, batch_size=64, epochs=50,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()
with open("./lstm_stock_weight.txt", "w") as f:
    for v in model.trainable_variables:
        f.write(str(v.name) + "\n")
        f.write(str(v.shape) + "\n")
        f.write(str(v.numpy()) + "\n")
loss = history.history["loss"]
val_loss = history.history["val_loss"]
plt.plot(loss, label="Training Loss")
plt.plot(val_loss, label="Validation Loss")
plt.title("Training and Validation Loss")
plt.legend()
plt.show()
# Run the model on the test set
predicted_stock_price = model.predict(x_test)
# Undo the normalization of the predictions
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
# Undo the normalization of the ground truth
real_stock_price = sc.inverse_transform(test_set[60:])
# Plot the real prices against the predicted prices
plt.plot(real_stock_price, color="red", label="MaoTai Stock Price")
plt.plot(predicted_stock_price, color="blue", label="Predicted MaoTai Stock Price")
plt.title("MaoTai Stock Price Prediction")
plt.xlabel("Time")
plt.ylabel("MaoTai Stock Price")
plt.legend()
plt.show()
# Evaluate the model: mean squared error, root mean squared error, mean absolute error
mse = mean_squared_error(predicted_stock_price, real_stock_price)
rmse = math.sqrt(mse)
mae = mean_absolute_error(predicted_stock_price, real_stock_price)
print("Mean squared error: %.6f" % mse)
print("Root mean squared error: %.6f" % rmse)
print("Mean absolute error: %.6f" % mae)
Sample results:
Mean squared error: 3709.382565
Root mean squared error: 60.904701
Mean absolute error: 53.672697
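For reference, with \hat{y}_i the predicted and y_i the real opening price over the n test points, the metrics printed above are:

MSE = \frac{1}{n} \sum_{i=1}^{n} (\hat{y}_i - y_i)^2
RMSE = \sqrt{MSE}
MAE = \frac{1}{n} \sum_{i=1}^{n} |\hat{y}_i - y_i|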
2. GRU
In 2014, Cho et al. simplified the LSTM structure and proposed the GRU network. The GRU fuses long-term and short-term memory into a single memory cell h_t: h_t combines the past information h_{t-1} with the new information h̃_t, where the new information is determined jointly by the current input and by h_{t-1} passed through the reset gate. Both gates again take values between 0 and 1. In the forward pass, applying the memory-cell update formula directly yields h_t at every time step.
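Written out (in one common convention, following Cho et al.'s formulation; some references swap the roles of z_t and 1 - z_t), with update gate z_t and reset gate r_t:

z_t = \sigma(W_z \cdot [h_{t-1}, x_t])                      (update gate)
r_t = \sigma(W_r \cdot [h_{t-1}, x_t])                      (reset gate)
\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t])       (new information)
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t       (memory cell update)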
2.1 GRU in TensorFlow
tf.keras.layers.GRU(
    units,                  # number of memory cells
    return_sequences=False  # whether to return the output at every time step
)
return_sequences=True   # output h_t at every time step
return_sequences=False  # output h_t only at the last time step (default)
# Example
model = tf.keras.Sequential([
    GRU(80, return_sequences=True),  # every time step outputs h_t, feeding the next GRU layer
    Dropout(0.2),
    GRU(100),                        # only the last time step outputs h_t
    Dropout(0.2),
    Dense(1)
])
2.2 Stock Price Prediction with GRU
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
from tensorflow.keras.layers import Dense, Dropout, GRU
import os
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import math
maotai = pd.read_csv("./SH600519.csv")  # read the stock data file
training_set = maotai.iloc[0:2126, 2:3].values  # opening prices (column C) of the first 2126 days as the training set
test_set = maotai.iloc[2126:, 2:3].values  # opening prices of the remaining 300 days as the test set
# Normalization
sc = MinMaxScaler(feature_range=(0, 1))  # scale every value into the range (0, 1)
training_set_scaled = sc.fit_transform(training_set)  # fit the scaler on the training set and transform it
test_set = sc.transform(test_set)  # transform the test set with the training-set statistics
x_train = []
y_train = []
x_test = []
y_test = []
# Use 60 consecutive days of opening prices as the input features x_train, and day 61's price as the label
for i in range(60, len(training_set_scaled)):
    x_train.append(training_set_scaled[i-60: i, 0])
    y_train.append(training_set_scaled[i, 0])
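# Shuffle x_train and y_train with the same seed so each 60-day window stays aligned with its label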
np.random.seed(8)
np.random.shuffle(x_train)
np.random.seed(8)
np.random.shuffle(y_train)
tf.random.set_seed(8)
x_train, y_train = np.array(x_train), np.array(y_train)
# Reshape x_train to the RNN input format: [number of samples, unrolled time steps, features per step].
# Here the whole training set is fed in at once: x_train.shape[0] samples, each feeding 60 opening
# prices to predict the 61st day's price, so the unrolled step count is 60; each time step sees a
# single day's opening price, so the feature count per step is 1.
x_train = np.reshape(x_train, (x_train.shape[0], 60, 1))
# Build the test set the same way
for i in range(60, len(test_set)):
    x_test.append(test_set[i-60:i, 0])
    y_test.append(test_set[i, 0])
x_test, y_test = np.array(x_test), np.array(y_test)
x_test = np.reshape(x_test, (x_test.shape[0], 60, 1))
model = tf.keras.Sequential([
    GRU(80, return_sequences=True),
    Dropout(0.2),
    GRU(100),
    Dropout(0.2),
    Dense(1)
])
model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss="mean_squared_error")
check_point_save_path = "./checkpoint_stock/gru_stock.ckpt"
if os.path.exists(check_point_save_path + ".index"):
    print("******load model******")
    model.load_weights(check_point_save_path)
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=check_point_save_path,
    save_weights_only=True,
    save_best_only=True,
    monitor="val_loss"
)
history = model.fit(x_train, y_train, batch_size=64, epochs=50,
                    validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()
with open("./gru_stock_weight.txt", "w") as f:
    for v in model.trainable_variables:
        f.write(str(v.name) + "\n")
        f.write(str(v.shape) + "\n")
        f.write(str(v.numpy()) + "\n")
loss = history.history["loss"]
val_loss = history.history["val_loss"]
plt.plot(loss, label="Training Loss")
plt.plot(val_loss, label="Validation Loss")
plt.title("Training and Validation Loss")
plt.legend()
plt.show()
# Run the model on the test set
predicted_stock_price = model.predict(x_test)
# Undo the normalization of the predictions
predicted_stock_price = sc.inverse_transform(predicted_stock_price)
# Undo the normalization of the ground truth
real_stock_price = sc.inverse_transform(test_set[60:])
# Plot the real prices against the predicted prices
plt.plot(real_stock_price, color="red", label="MaoTai Stock Price")
plt.plot(predicted_stock_price, color="blue", label="Predicted MaoTai Stock Price")
plt.title("MaoTai Stock Price Prediction")
plt.xlabel("Time")
plt.ylabel("MaoTai Stock Price")
plt.legend()
plt.show()
# Evaluate the model: mean squared error, root mean squared error, mean absolute error
mse = mean_squared_error(predicted_stock_price, real_stock_price)
rmse = math.sqrt(mse)
mae = mean_absolute_error(predicted_stock_price, real_stock_price)
print("Mean squared error: %.6f" % mse)
print("Root mean squared error: %.6f" % rmse)
print("Mean absolute error: %.6f" % mae)
Sample results:
Mean squared error: 863.598247
Root mean squared error: 29.387042
Mean absolute error: 24.093177