1. Summary
This article explains how to use a BiLSTM-CNN-Attention model to forecast time-series data.
Main steps:
- Slice the time series into blocks to build 3D data blocks of shape (samples, time steps, features)
- Build the model: a convolutional layer, a BiLSTM layer, and an attention layer stacked in order; the attention layer can sit in the middle or at the front, and the two placements give different results (a rough sketch of this stacking follows the list)
- Train the model, then use the trained model to make predictions
- Tune the hyperparameters and save the model
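A minimal sketch of this Conv1D - BiLSTM - attention ordering (illustrative only; the layer sizes are placeholders and the full model actually used in this article is given in section 4):

from tensorflow.keras import layers, Model

def sketch_model(time_steps=12, n_features=43, lstm_units=64):
    inp = layers.Input(shape=(time_steps, n_features))
    x = layers.Conv1D(64, kernel_size=1, activation='relu')(inp)                 # local feature extraction
    x = layers.Bidirectional(layers.LSTM(lstm_units, return_sequences=True))(x)  # temporal dependencies
    weights = layers.Dense(x.shape[-1], activation='softmax')(x)                 # soft attention weights
    x = layers.Multiply()([x, weights])                                          # re-weight the sequence
    out = layers.Dense(1, activation='linear')(layers.Flatten()(x))              # regression head
    return Model(inp, out)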
2榨惰、數(shù)據(jù)介紹
For the complete code and a description of the data, please visit my download page. Experienced readers should be able to reproduce this without the packaged code; beginners can follow the link below:
cnn+lstm+attention for time-series forecasting
3居凶、相關(guān)技術(shù)
BiLSTM: two LSTM networks, one running forward over the sequence and one running backward, together called a bidirectional LSTM (BiLSTM). The idea is to feed the same input sequence into a forward LSTM and a backward LSTM, concatenate the hidden states of the two networks, and feed the combined result to the output layer for prediction (see the shape check below).
Attention mechanism
1D convolution (Conv1D)
CNN + LSTM + attention network architecture diagram
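To make the "forward plus backward, then concatenate" idea concrete, a minimal shape check (assuming TensorFlow 2.x; the sizes are placeholders):

import numpy as np
from tensorflow.keras import layers

x = np.random.rand(8, 12, 43).astype('float32')             # (batch, time_steps, features)
bilstm = layers.Bidirectional(layers.LSTM(64, return_sequences=True))
print(bilstm(x).shape)                                       # (8, 12, 128): forward and backward states concatenated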
4. Complete Code and Steps
The code was run with the following package versions:
tensorflow==2.5.0
numpy==1.19.5
keras==2.6.0
matplotlib==3.5.2
Main program entry point:
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.layers import Conv1D, Bidirectional, Multiply, LSTM
from keras.layers.core import *
from keras.models import *
from sklearn.metrics import mean_absolute_error
from keras import backend as K
# CuDNNLSTM is the GPU-only fused LSTM; in TF 2.x the regular LSTM layer picks up the
# cuDNN kernel automatically, so LSTM can be swapped in when no GPU is available
from tensorflow.python.keras.layers import CuDNNLSTM
from my_utils.read_write import pdReadCsv  # custom helper that wraps pandas.read_csv
import numpy as np

SINGLE_ATTENTION_VECTOR = False
import os

os.environ['CUDA_VISIBLE_DEVICES'] = '0'  # use the first GPU
os.environ["TF_KERAS"] = '1'
# Attention mechanism: learn a softmax weight for each timestep/feature and
# multiply it element-wise back onto the input sequence
def attention_3d_block(inputs):
    # inputs shape: (batch, time_steps, feature_dim)
    input_dim = int(inputs.shape[2])
    a = inputs
    a = Dense(input_dim, activation='softmax')(a)
    # Permute reorders the dimensions according to the given pattern, e.g. (2, 1) swaps
    # the 1st and 2nd axes; (1, 2) leaves the order unchanged and is used here mainly
    # to give the attention weights a named layer
    a_probs = Permute((1, 2), name='attention_vec')(a)
    # Multiply is a layer that multiplies (element-wise) a list of inputs
    output_attention_mul = Multiply()([inputs, a_probs])
    return output_attention_mul
# Create 3D time-series blocks: X holds windows of length look_back, Y holds the next step
def create_dataset(dataset, look_back):
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - 1):
        a = dataset[i:(i + look_back), :]
        dataX.append(a)
        dataY.append(dataset[i + look_back, :])
    TrainX = np.array(dataX)
    Train_Y = np.array(dataY)
    return TrainX, Train_Y
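# Quick sanity check of the windowing (illustrative): for a toy array of shape (10, 2)
# and look_back=3 the function returns X with shape (6, 3, 2) and Y with shape (6, 2), e.g.
#   demo_X, demo_Y = create_dataset(np.arange(20, dtype=float).reshape(10, 2), 3)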
# Build the CNN-BiLSTM model and add the attention mechanism
def attention_model():
    inputs = Input(shape=(TIME_STEPS, INPUT_DIMS))
    # Convolution layer followed by dropout
    x = Conv1D(filters=64, kernel_size=1, activation='relu')(inputs)  # , padding='same'
    x = Dropout(0.3)(x)
    # On a GPU you can use CuDNNLSTM; on a CPU use the regular LSTM layer instead
    lstm_out = Bidirectional(CuDNNLSTM(lstm_units, return_sequences=True))(x)
    lstm_out = Dropout(0.3)(lstm_out)
    attention_mul = attention_3d_block(lstm_out)
    # Flatten squeezes the sequence output into a 1D vector; it is usually placed
    # between convolutional/recurrent layers and the fully connected layer
    attention_mul = Flatten()(attention_mul)
    # output = Dense(1, activation='sigmoid')(attention_mul)  # for classification
    output = Dense(1, activation='linear')(attention_mul)  # regression output
    model = Model(inputs=[inputs], outputs=output)
    return model
# Min-max normalisation; also return the fitted y scaler so predictions can be inverted later
def fit_size(x, y):
    from sklearn import preprocessing
    x_MinMax = preprocessing.MinMaxScaler()
    y_MinMax = preprocessing.MinMaxScaler()
    x = x_MinMax.fit_transform(x)
    y = y_MinMax.fit_transform(y)
    return x, y, y_MinMax


# Collapse a 3D block back to 2D by keeping only the last timestep of each window
def flatten(X):
    flattened_X = np.empty((X.shape[0], X.shape[2]))
    for i in range(X.shape[0]):
        flattened_X[i] = X[i, (X.shape[1] - 1), :]
    return flattened_X
src = r'E:\dat'
path = r'E:\dat'
trials_path = r'E:\dat'
train_path = src + r'merpre.csv'
df = pdReadCsv(train_path, ',')
df = df.replace("--", '0')  # replace placeholder strings with 0
df.fillna(0, inplace=True)  # fill missing values with 0

INPUT_DIMS = 43  # number of input features per timestep
TIME_STEPS = 12  # window length (look_back)
lstm_units = 64


# Split features and target; train and test here come from the same group
def load_data(df_train):
    X_train = df_train.drop(['Per'], axis=1)
    y_train = df_train['wap'].values.reshape(-1, 1)
    return X_train, y_train, X_train, y_train
groups = df.groupby(['Per'])
for name, group in groups:
    X_train, y_train, X_test, y_test = load_data(group)
    # Normalise
    train_x, train_y, train_y_MinMax = fit_size(X_train, y_train)
    test_x, test_y, test_y_MinMax = fit_size(X_test, y_test)
    train_X, _ = create_dataset(train_x, TIME_STEPS)
    _, train_Y = create_dataset(train_y, TIME_STEPS)
    print(train_X.shape, train_Y.shape)
    m = attention_model()
    m.summary()
    m.compile(loss='mae', optimizer='Adam', metrics=['mae'])
    model_path = r'me_pre\\'
    callbacks = [
        # Stop training when val_loss has not improved for 2 consecutive epochs
        EarlyStopping(monitor='val_loss', patience=2, verbose=0),
        ModelCheckpoint(model_path, monitor='val_loss', save_best_only=True, verbose=0),
    ]
    m.fit(train_X, train_Y, batch_size=32, epochs=111, shuffle=True, verbose=1,
          validation_split=0.1, callbacks=callbacks)
    # m.fit(train_X, train_Y, epochs=111, batch_size=32)
    test_X, _ = create_dataset(test_x, TIME_STEPS)
    _, test_Y = create_dataset(test_y, TIME_STEPS)
    pred_y = m.predict(test_X)
    # Undo the normalisation so the error is reported on the original scale
    inv_pred_y = test_y_MinMax.inverse_transform(pred_y)
    inv_test_Y = test_y_MinMax.inverse_transform(test_Y)
    mae = int(mean_absolute_error(inv_test_Y, inv_pred_y))
    print('test_mae : ', mae)
    mae = str(mae)
    print(name)
    m.save(
        model_path + name[0] + '_' + name[1] + '_' + name[2] + '_' + mae + '.h5')
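A saved model can be reloaded later for prediction. A minimal sketch, assuming the standard Keras loader (the file name below is a placeholder for whatever m.save wrote out; layers imported from outside the public API, such as CuDNNLSTM, may need to be passed via custom_objects):

from tensorflow.keras.models import load_model

saved = load_model(r'me_pre\some_group_12.h5',              # placeholder file name
                   custom_objects={'CuDNNLSTM': CuDNNLSTM})
preds = saved.predict(test_X)                                # windows shaped (n, TIME_STEPS, INPUT_DIMS), as in training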
If you need the data or commissioned code, please message me privately; a PyTorch version is also available: cnn+lstm+attention for time-series forecasting.
Reference paper:
A multi-channel attention mechanism text classification model based on CNN and LSTM (基于CNN和LSTM的多通道注意力機(jì)制文本分類(lèi)模型)