金融相關(guān)時間序列分析全指南

來源：yv.l1.pnn - kesci.com
原文鏈接：時間序列分析入門齿兔，看這一篇就夠了
點(diǎn)擊以上鏈接?? 不用配置環(huán)境，直接在線運(yùn)行
數(shù)據(jù)集下載：DJIA 30股票時間序列

本文包含時間序列的數(shù)據(jù)結(jié)構(gòu)、可視化、統(tǒng)計(jì)學(xué)理論及模型介紹腔长，主要側(cè)重金融相關(guān)數(shù)據(jù)實(shí)踐。翻譯自：kaggle - everything you can do with a time series

# 導(dǎo)入必要的包
import os
import warnings
warnings.filterwarnings('ignore')
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight') 
# Above is a special style template for matplotlib, highly useful for visualizing time series data
%matplotlib inline
from pylab import rcParams
from plotly import tools
import plotly.plotly as py
from plotly.offline import init_notebook_mode, iplot
init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.figure_factory as ff
import statsmodels.api as sm
from numpy.random import normal, seed
from scipy.stats import norm
from statsmodels.tsa.arima_model import ARMA
from statsmodels.tsa.stattools import adfuller
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima_model import ARIMA
import math
from sklearn.metrics import mean_squared_error
print(os.listdir("../input"))

Matplotlib is building the font cache using fc-list. This may take a moment.

['DJIA3716', 'djia307822']

1.介紹日期與時間

1.1 導(dǎo)入時間序列數(shù)據(jù)

使用的數(shù)據(jù)

Google Stocks Data
Humidity in different world cities
Microsoft Stocks Data
Pressure in different world cities

這里使用parse_dates參數(shù)將所需的時間序列列導(dǎo)入為datetime列验残，并使用index_col參數(shù)將其選擇為數(shù)據(jù)框的索引捞附。

google = pd.read_csv('/home/kesci/input/DJIA3716/GOOGL_2006-01-01_to_2018-01-01.csv', 
index_col='Date', parse_dates=['Date'])

google.head()

1.2 清洗以及準(zhǔn)備時間序列數(shù)據(jù)

如何準(zhǔn)備數(shù)據(jù)？

Google 數(shù)據(jù)沒有缺失值您没，所以我們這里不用處理缺失值鸟召。如果有缺失值的話，我們可以使用 fillna() 方法并設(shè)置參數(shù) ffill 來填充缺失值紊婉。

該參數(shù) ffill 意味著用空白處前面的的最后一個的有效觀察值來填補(bǔ)缺失值药版。

google.isnull().any()

Open      False
High      False
Low       False
Close     False
Volume    False
Name      False
dtype: bool

google.isnull().sum()

Open      0
High      0
Low       0
Close     0
Volume    0
Name      0
dtype: int64

以上代碼表示這幾列都沒有缺失值。

1.3 可視化數(shù)據(jù)集

plt.figure()
google['2008':'2010'].plot(subplots=True, figsize=(10,12))
plt.title('2008年至2010年Google股票各個屬性的表現(xiàn)')
plt.show()

1.4 Timestamps和Periods

什么是timestamps 與 periods喻犁？為什么說他們很有用槽片？

Timestamps 用于表示時間點(diǎn). Periods 代表時間間隔.

Periods 可用于檢查給定期間內(nèi)是否有特定事件.

它們也可以轉(zhuǎn)換為彼此的形式。

# 創(chuàng)建一個Timestamp
timestamp = pd.Timestamp(2017, 1, 1, 12)
timestamp

Timestamp('2017-01-01 12:00:00')

# 創(chuàng)建一個period
period = pd.Period('2017-01-01')
period

Period('2017-01-01', 'D')

# 檢查給定timestamp是否在給定period內(nèi)
period.start_time < timestamp < period.end_time

True

# timestamp 轉(zhuǎn)換為 period
new_period = timestamp.to_period(freq='H')
new_period

Period('2017-01-01 12:00', 'H')

# period 轉(zhuǎn)換為 timestamp
new_timestamp = period.to_timestamp(freq='H', how='start')
new_timestamp

Timestamp('2017-01-01 00:00:00')

1.5 使用 date_range

什么是 date_range ?

date_range 是一種返回固定頻率datetimeindex的方法肢础。

當(dāng)創(chuàng)建自己的時間序列屬性——以用于預(yù)先存在的數(shù)據(jù)还栓，或圍繞您創(chuàng)建的時間序列屬性排列整個數(shù)據(jù)時，此函數(shù)非常有用传轰。

# 創(chuàng)建一個 datetimeindex 剩盒，使用日頻率（freq 默認(rèn)是 daily）
dr1 = pd.date_range(start='1/1/18', end='1/9/18')
dr1

DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08',
               '2018-01-09'],
              dtype='datetime64[ns]', freq='D')

# 創(chuàng)建一個 datetimeindex ，使用月（month）頻率
dr2 = pd.date_range(start='1/1/18', end='1/1/19', freq='M')
dr2

DatetimeIndex(['2018-01-31', '2018-02-28', '2018-03-31', '2018-04-30',
               '2018-05-31', '2018-06-30', '2018-07-31', '2018-08-31',
               '2018-09-30', '2018-10-31', '2018-11-30', '2018-12-31'],
              dtype='datetime64[ns]', freq='M')

# 創(chuàng)建一個 datetimeindex 慨蛙，并且不指定起始日期辽聊，只使用 periods
dr3 = pd.date_range(end='1/4/2014', periods=8)
dr3

DatetimeIndex(['2013-12-28', '2013-12-29', '2013-12-30', '2013-12-31',
               '2014-01-01', '2014-01-02', '2014-01-03', '2014-01-04'],
              dtype='datetime64[ns]', freq='D')

# 創(chuàng)建一個 datetimeindex ，并且指定起始日期期贫，終止日期與 periods
dr4 = pd.date_range(start='2013-04-24', end='2014-11-27', periods=3)
dr4

DatetimeIndex(['2013-04-24', '2014-02-09', '2014-11-27'], dtype='datetime64[ns]', freq=None)

1.6 使用 to_datetime

pandas.to_datetime() 用于將參數(shù)轉(zhuǎn)換為日期時間跟匆。

在這里，DataFrame轉(zhuǎn)換為日期時間Series通砍。

df = pd.DataFrame({'year': [2015, 2016], 'month': [2, 3], 'day': [4, 5]})
df

# 轉(zhuǎn)化為時間Series
df = pd.to_datetime(df)
df

0   2015-02-04
1   2016-03-05
dtype: datetime64[ns]

# 轉(zhuǎn)化為Timestamp
df = pd.to_datetime(df)
df

Timestamp('2017-01-01 00:00:00')

1.7 移動（Shift）和滯后（Lags）

我們可以使用可選的時間頻率將索引按所需的周期數(shù)移動玛臂。

可讓我們將時間序列與其自身的過去進(jìn)行比較時，這很有用

google.loc['2008':'2009'].High.plot(legend=True)
google.loc['2008':'2009'].High.shift(10).plot(legend=True,label='移動后High')
plt.show()

1.8 重新采樣（Resampling）

Upsampling - 向上采樣封孙，時間序列從低頻到高頻（每月到每天）重新采樣迹冤。它涉及填充或內(nèi)插丟失的數(shù)據(jù)

Downsampling - 向下采樣，時間序列從高頻重新采樣到低頻（每周一次到每月一次）虎忌。它涉及現(xiàn)有數(shù)據(jù)的匯總泡徙。

這里要用一個極其不完整的數(shù)據(jù)集來展示

pressure = pd.read_csv('/home/kesci/input/djia307822/pressure.csv', 
index_col='datetime', parse_dates=['datetime'])
pressure.tail()

可以看到有非常多的缺失值需要我們?nèi)ヌ畛?/p>

pressure = pressure.iloc[1:]
pressure = pressure.fillna(method='ffill')
pressure.tail()

pressure.head(3)

# 向上填充
pressure = pressure.fillna(method='bfill')
pressure.head(3)

解釋一下

首先，我們使用 ffill 參數(shù)向下填充缺失值膜蠢。

然后锋勺，我們使用 bfill 向上填充缺失值蚀瘸。

# downsampling前查看數(shù)據(jù)形狀
pressure.shape

(45252, 36)

# 使用3日頻率進(jìn)行downsample，這里使用了mean （減少數(shù)據(jù)量——高頻到低頻）
pressure = pressure.resample('3D').mean()
pressure.head()

# downsampling后數(shù)據(jù)形狀
pressure.shape

(629, 36)

可以看見庶橱，downsample后剩下的行要少得多贮勃。

現(xiàn)在，我們將從3天的頻率 upsample 到每日頻率苏章，即低頻到高頻

pressure = pressure.resample('D').pad()
pressure.head()

pressure.shape

(1885, 36)

行數(shù)再次增加寂嘉。

如果使用正確，Resample的操作是很酷炫的枫绅。

2. 金融與統(tǒng)計(jì)

2.1 百分比變化量

# div為除法泉孩，shift()表示向后移動一期，這里是一天并淋，因?yàn)槭侨諗?shù)據(jù)
google['Change'] = google.High.div(google.High.shift())
google['Change'].plot(figsize=(20,8))

2.2 股票收益

plt.figure()
google['Return'] = google.Change.sub(1).mul(100)
google['Return'].plot(figsize=(20,8))
plt.axhline(y=0, color='r', linestyle='-')
plt.show()

# 另一個計(jì)算收益的方法
plt.figure()
google.High.pct_change().mul(100).plot(figsize=(20,6),color='b') 
google['Return'].plot(figsize=(20,8))
plt.axhline(y=0, color='r', linestyle='-')
plt.show()

2.3 連續(xù)的行中的絕對數(shù)值變化

# 用差分
google.High.diff().plot(figsize=(20,6))
plt.axhline(y=0, color='r', linestyle='-')
plt.show()

2.4 比較兩個或多個時間序列

我們將通過歸一化兩個時間序列再進(jìn)行比較寓搬。

這是通過將所有時間序列的每個時間序列元素除以第一個元素來實(shí)現(xiàn)的。

這樣县耽，兩個系列都從同一點(diǎn)開始句喷，可以輕松進(jìn)行比較。

# 我們將比較微軟和Google的股價
microsoft = pd.read_csv('/home/kesci/input/djia307822/MSFT_2006-01-01_to_2018-01-01.csv', index_col='Date', parse_dates=['Date'])

# 可視化
plt.figure()
google.High.plot()
microsoft.High.plot()
plt.legend(['Google','Microsoft'])
plt.show()

# 標(biāo)準(zhǔn)化并進(jìn)行比較
# 全部都除以第一行的數(shù)據(jù)
# 兩支股票都從100開始
normalized_google = google.High.div(google.High.iloc[0]).mul(100)
normalized_microsoft = microsoft.High.div(microsoft.High.iloc[0]).mul(100)

# 可視化
plt.figure()
normalized_google.plot()
normalized_microsoft.plot()
plt.legend(['Google','Microsoft'])
plt.show()

您可以清楚地看到Google在一段時間內(nèi)的表現(xiàn)勝過微軟兔毙。

2.5 Window函數(shù)

Window 函數(shù)用于標(biāo)識子periods唾琼，計(jì)算子periods的子度量metrics。

Rolling - 大小相同澎剥，且滑動步數(shù)相同

Expanding - 包含所有先前值

# Rolling window 函數(shù)
# Rolling 90天
rolling_google = google.High.rolling('90D').mean()

# 與Rolling前數(shù)據(jù)進(jìn)行比較
google.High.plot()
rolling_google.plot()
plt.legend(['High','Rolling Mean'])
plt.show()

可看出锡溯，Rolling均值圖是原始圖的平滑版本。

# Expanding window 函數(shù)
microsoft_mean = microsoft.High.expanding().mean()
microsoft_std = microsoft.High.expanding().std()
microsoft.High.plot()
microsoft_mean.plot()
microsoft_std.plot()
plt.legend(['High','Expanding Mean','Expanding Standard Deviation'])
plt.show()

2.6 OHLC圖

O:代表開盤價
Open H:代表最高價
High L: 代表最低價
Low C: 代表收盤價

OHLC圖是顯示特定時間段的開盤價哑姚，最高價祭饭，最低價和收盤價的任何類型的價格圖。開盤-高-低-收盤圖表（或OHLC圖）用作交易工具來可視化和分析證券叙量，貨幣甜癞，股票，債券宛乃，商品等隨時間的價格變化。

OHLC圖可解釋當(dāng)日價格-當(dāng)今市場的情緒蒸辆，并通過產(chǎn)生的模式預(yù)測未來的價格變化征炼。

OHLC圖上的y軸用于價格標(biāo)度，而x軸是時間標(biāo)度躬贡。在每個單個時間段上谆奥，OHLC圖都會繪制一個代表兩個范圍的符號：最高和最低交易價格，以及該單個時間段（例如一天）中的開盤價和收盤價拂玻。

在范圍符號上酸些，高和低價格范圍由主垂直線的長度表示宰译。開盤價和收盤價由刻度線的垂直位置表示，這些刻度線出現(xiàn)在高低垂直線的左側(cè)（代表開盤價）和右側(cè)（代表收盤價）魄懂。

可以為每個OHLC圖符號分配顏色沿侈，以區(qū)分市場是“看漲”（收盤價高于開盤價）還是“看漲”（收盤價低于開盤價）。

# 2008六月的OHLC圖
trace = go.Ohlc(x=google['06-2008'].index,
                open=google['06-2008'].Open,
                high=google['06-2008'].High,
                low=google['06-2008'].Low,
                close=google['06-2008'].Close)
data = [trace]
iplot(data, filename='simple_ohlc')

# 2008年OHLC圖
trace = go.Ohlc(x=google['2008'].index,
                open=google['2008'].Open,
                high=google['2008'].High,
                low=google['2008'].Low,
                close=google['2008'].Close)
data = [trace]
iplot(data, filename='simple_ohlc')

# 2006-2018年OLHC圖
trace = go.Ohlc(x=google.index,
                open=google.Open,
                high=google.High,
                low=google.Low,
                close=google.Close)
data = [trace]
iplot(data, filename='simple_ohlc')

2.7 蠟燭圖

這種圖表用作交易工具市栗，以可視化和分析證券缀拭，衍生工具，貨幣填帽，股票蛛淋，債券，商品等隨時間的價格變動篡腌。盡管燭臺圖中使用的符號類似于箱形圖褐荷，但它們的功能不同，不要彼此混淆嘹悼。

蠟燭圖通過使用類似燭臺的符號來顯示價格信息的多個維度叛甫，例如開盤價，收盤價绘迁，最高價和最低價合溺。每個符號代表單個時間段（分鐘，小時缀台，天棠赛，月等）的交易活動。每個燭臺符號均沿時間軸繪制在x軸上膛腐，以顯示一段時間內(nèi)的交易活動睛约。

蠟燭符號中的主要矩形稱為real body，用于顯示該時間段的開盤價和收盤價之間的范圍哲身。從實(shí)體的底部和頂部延伸的線被稱為上下陰影（或燈芯）辩涝。每個陰影代表所表示的時間段內(nèi)交易的最高或最低價格。當(dāng)市場看漲（收盤價高于開盤價）時勘天，機(jī)構(gòu)的顏色通常為白色或綠色怔揩。但是，當(dāng)市場看跌（收盤價低于開盤價）時脯丝，機(jī)構(gòu)通常會被涂成黑色或紅色商膊。

蠟燭圖非常適合檢測和預(yù)測一段時間內(nèi)的市場趨勢，并且通過每個蠟燭符號的顏色和形狀來解釋市場的日常情緒很有用宠进。例如晕拆，身體越長，銷售或購買壓力就越大材蹬。雖然這是一個非常短的主體实幕，但這表明該時間段內(nèi)價格變動很小吝镣。

蠟燭圖通過各種指標(biāo)（例如形狀和顏色）以及燭臺圖表中可以找到的許多可識別模式，幫助揭示市場心理（買賣雙方所經(jīng)歷的恐懼和貪婪）昆庇∧┘郑總共有42種公認(rèn)的模式，分為簡單模式和復(fù)雜模式凰锡。燭臺圖表中的這些模式對于顯示價格關(guān)系很有用未舟，并可用于預(yù)測市場的未來走勢。您可以在此處找到每個模式的列表和說明掂为。

請記住裕膀，蠟燭圖不表示開盤價和收盤價之間發(fā)生的事件，僅表示兩種價格之間的關(guān)系勇哗。因此昼扛，您無法確定在那個時間段內(nèi)交易的波動性。

Source: Datavizcatalogue

# 2008三月蠟燭圖
trace = go.Candlestick(x=google['03-2008'].index,
                open=google['03-2008'].Open,
                high=google['03-2008'].High,
                low=google['03-2008'].Low,
                close=google['03-2008'].Close)
data = [trace]
iplot(data, filename='simple_candlestick')

# 2008蠟燭圖
trace = go.Candlestick(x=google['2008'].index,
                open=google['2008'].Open,
                high=google['2008'].High,
                low=google['2008'].Low,
                close=google['2008'].Close)
data = [trace]
iplot(data, filename='simple_candlestick')

# 2006-2018蠟燭圖
trace = go.Candlestick(x=google.index,
                open=google.Open,
                high=google.High,
                low=google.Low,
                close=google.Close)
data = [trace]
iplot(data, filename='simple_candlestick')

2.8 自回歸系數(shù)Autocorrelation與偏自回歸系數(shù)Partial Autocorrelation

自回歸系數(shù)Autocorrelation - 自相關(guān)函數(shù)（ACF）可測量序列在不同滯后情況下欲诺，與其自身的相關(guān)關(guān)系抄谐。
偏自回歸系數(shù)Partial Autocorrelation - 偏自相關(guān)函數(shù)在另一角度上可以解釋為該序列相對于其過去滯后值的做回歸后的回歸系數(shù)∪欧ǎ可以用與標(biāo)準(zhǔn)線性回歸相同的方式解釋這些術(shù)語蛹含，即：在使其他滯后（lags）保持不變的同時，特定滯后變化的貢獻(xiàn)塞颁。

Source: Quora

自回歸系數(shù)Autocorrelation

# Google股票自回歸系數(shù)
plot_acf(google.High,lags=365,title="Lag=365的自回歸系數(shù)")
plt.show()

偏自回歸系數(shù)Partial Autocorrelation

# google票價偏自回歸系數(shù)
plot_pacf(google.High,lags=365)
plt.show()

# 微軟收盤價的PACF
plot_pacf(microsoft["Close"],lags=25)
plt.show()

在此浦箱，只有第0，第1和第20個滯后在統(tǒng)計(jì)上是有顯著的祠锣。

3. 時間序列分解與隨機(jī)游走

3.1. 趨勢酷窥、季節(jié)性與噪聲

這些是時間序列的組成部分

趨勢 - 時間序列的向上或向下一致的斜率
季節(jié)性 - 時間序列的清晰周期性模式（例如正弦函數(shù)）
噪聲 - 離群值或缺失值

google["High"].plot(figsize=(16,8))

# 現(xiàn)在我們分解它
rcParams['figure.figsize'] = 11, 9
decomposed_google_volume = sm.tsa.seasonal_decompose(google["High"],freq=360) 
figure = decomposed_google_volume.plot()
plt.show()

從上面的分解圖我們可以看到：

上圖中明顯有上升趨勢。
還可以看到統(tǒng)一的季節(jié)性變化伴网。
代表異常值和缺失值的非均勻噪聲

3.2. 白噪聲

白噪聲具有如下性質(zhì)：

恒定均值
恒定方差
所有滯后之間的相關(guān)關(guān)系均為0

# 畫出白噪聲
rcParams['figure.figsize'] = 16, 6
white_noise = np.random.normal(loc=0, scale=1, size=1000)
# 以上：loc表示期望蓬推，scale表示方差
plt.plot(white_noise)

# 白噪聲之間的自相關(guān)系數(shù)圖
plot_acf(white_noise,lags=20)
plt.show()

所有滯后在置信區(qū)間（陰影部分）內(nèi)，在統(tǒng)計(jì)上不顯著

3.3. 隨機(jī)游走

隨機(jī)游走是一種數(shù)學(xué)對象澡腾，它描述由某些數(shù)學(xué)空間（例如整數(shù)）上的一系列隨機(jī)步長構(gòu)成的路徑沸伏。

總的來說我們在談?wù)摴善钡臅r候：

$今日股價 = 前一天的股價 + 噪聲$

$P_t = P_{t-1}+\epsilon_t$

隨機(jī)游走是不能很好地被預(yù)測的，因?yàn)樵肼暰哂须S機(jī)性动分。

有漂移的隨機(jī)游走

（漂移 $\mu$ 具有0均值的特性）

$P_t - P_{t-1} = \mu + \epsilon_t$

Augmented Dickey-Fuller test

Augmented Dickey-Fuller Test 是一種統(tǒng)計(jì)學(xué)的檢測方法毅糟，用于檢測一個有某種趨勢的時間序列的平穩(wěn)性。是一種重要的單根檢測方法刺啦。
其初始假設(shè)null hypothesis是該序列不穩(wěn)定(存在單位根)，檢測可以有兩種途徑:

一是統(tǒng)計(jì)量小于特定拒絕域值纠脾；
二是p-value大于相應(yīng)域值玛瘸。如果是蜕青，則拒絕假設(shè)，認(rèn)為序列是穩(wěn)定的糊渊。

# 對谷歌和微軟股票交易量進(jìn)行Augmented Dickey-Fuller檢驗(yàn)
adf = adfuller(microsoft["Volume"])
print("p-value of microsoft: {}".format(float(adf[1])))
adf = adfuller(google["Volume"])
print("p-value of google: {}".format(float(adf[1])))

p-value of microsoft: 0.00032015252776520505
p-value of google: 6.510719605768349e-07

因?yàn)槲④浀?p-value 為 0.0003201525 小于 0.05, 所以拒絕 $H_0$ 右核，我們有充足理由認(rèn)為它不是隨機(jī)游走.

因?yàn)楣雀璧?p-value 0.0000006510 小于 0.05, 所以拒絕 $H_0$ ，我們有充足理由認(rèn)為它不是隨機(jī)游走.

生成隨機(jī)游走

seed(42)
rcParams['figure.figsize'] = 16, 6

# 從高斯分布生成隨機(jī)數(shù)
random_walk = normal(loc=0, scale=0.01, size=1000)
plt.plot(random_walk)
plt.show()

fig = ff.create_distplot([random_walk],['Random Walk'],bin_size=0.001)
iplot(fig, filename='Basic Distplot')

3.4 平穩(wěn)性

平穩(wěn)時間序列是一種統(tǒng)計(jì)特性渺绒，例如均值埃篓，方差舷手，自相關(guān)等不論時間怎么變化，他們都是恒定的。

強(qiáng)平穩(wěn)性：是一個隨著時間的推移丧没，其無條件聯(lián)合概率分布不會改變的隨機(jī)過程。因此冠摄，諸如均值和方差之類的參數(shù)也不會隨時間變化而變化桐经。其條件性非常強(qiáng)，一般很難滿足主到。
弱平穩(wěn)性：在整個過程中茶行，均值，方差登钥，自相關(guān)都是“恒定”的過程

平穩(wěn)性很重要畔师，因?yàn)橐蕾嚂r間的非平穩(wěn)序列具有太多的參數(shù)，無法在對時間序列建模時考慮到牧牢。

diff()（差分）方法可以輕松地將非平穩(wěn)序列轉(zhuǎn)換為平穩(wěn)序列看锉。

我們將嘗試解析出上述分解后的時間序列中的季節(jié)成分。

# 繪制原始的非平穩(wěn)時間序列
decomposed_google_volume.trend.plot()

# 新的平穩(wěn)的時間序列（使用diff()做了一階差分）
decomposed_google_volume.trend.diff().plot()

4. 用統(tǒng)計(jì)工具建模

建議記住統(tǒng)計(jì)模型的英文名稱即可结执，不必糾結(jié)其中文譯名

4.1 AR 模型

自回歸模型（AR）模型是一種隨機(jī)過程的表示度陆；因此，自回歸模型指定輸出變量線性地依賴于其自身的先前值和隨機(jī)項(xiàng)（一個不完全可預(yù)測的項(xiàng)）献幔。因此該模型采用隨機(jī)差分方程的形式來表示懂傀。

AR(1)

$R_t = \mu + \phi R_{t-1} + \epsilon_t$

由于公式右邊只有一階滯后項(xiàng)( $R_{t-1}$ )，這也成為在時間 $t$ 的一階 AR模型蜡感，其中 $\mu$ 是常數(shù)而 $\epsilon$ 是隨機(jī)游走

如果 $\phi = 1$ , 那么它是隨機(jī)游走.

如果 $\phi = 0$ , 它是白噪聲.

如果 $-1 < \phi < 1$ , 它是平穩(wěn)的.

如果 $\phi$ 是 -ve, 存在均值回歸.

如果 $\phi$ 是 +ve, 存在動量.

AR(2)

$R_t = \mu + \phi_1 R_{t-1} + \phi_2 R_{t-2} + \epsilon_t$

AR(3)

$R_t = \mu + \phi_1 R_{t-1} + \phi_2 R_{t-2} + \cdots + \phi_p R_{t-p} + \epsilon_t$

模擬 AR(1) 過程

# AR(1) model:AR parameter = +0.9
rcParams['figure.figsize'] = 16, 12
plt.subplot(4,1,1)
ar1 = np.array([1, -0.9]) # We choose -0.9 as AR parameter is +0.9
ma1 = np.array([1])
AR1 = ArmaProcess(ar1, ma1)
sim1 = AR1.generate_sample(nsample=1000)
plt.title('AR(1) model: AR parameter = +0.9')
plt.plot(sim1)
# 后面會介紹MA(q)模型
# AR(1) MA(1) AR parameter = -0.9
plt.subplot(4,1,2)
ar2 = np.array([1, 0.9]) # We choose +0.9 as AR parameter is -0.9
ma2 = np.array([1])
AR2 = ArmaProcess(ar2, ma2)
sim2 = AR2.generate_sample(nsample=1000)
plt.title('AR(1) model: AR parameter = -0.9')
plt.plot(sim2)
# AR(2) MA(1) AR parameter = 0.9
plt.subplot(4,1,3)
ar3 = np.array([2, -0.9]) # 我們選擇 -0.9 作為 AR 參數(shù)： +0.9
ma3 = np.array([1])
AR3 = ArmaProcess(ar3, ma3)
sim3 = AR3.generate_sample(nsample=1000)
plt.title('AR(2) model: AR parameter = +0.9')
plt.plot(sim3)
# AR(2) MA(1) AR parameter = -0.9
plt.subplot(4,1,4)
ar4 = np.array([2, 0.9]) # 我們選擇 +0.9 作為 AR 參數(shù)： -0.9
ma4 = np.array([1])
AR4 = ArmaProcess(ar4, ma4)
sim4 = AR4.generate_sample(nsample=1000)
plt.title('AR(2) model: AR parameter = -0.9')
plt.plot(sim4)
plt.show()

預(yù)測

model = ARMA(sim1, order=(1,0))
result = model.fit()
print(result.summary())
print("μ={} ,?={}".format(result.params[0],result.params[1]))

                              ARMA Model Results                              
==============================================================================
Dep. Variable:                      y   No. Observations:                 1000
Model:                     ARMA(1, 0)   Log Likelihood               -1415.701
Method:                       css-mle   S.D. of innovations              0.996
Date:                Wed, 16 Oct 2019   AIC                           2837.403
Time:                        05:23:12   BIC                           2852.126
Sample:                             0   HQIC                          2842.998
                                                                              
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.7072      0.288      2.454      0.014       0.142       1.272
ar.L1.y        0.8916      0.014     62.742      0.000       0.864       0.919
                                    Roots                                    
=============================================================================
                  Real          Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
AR.1            1.1216           +0.0000j            1.1216            0.0000
-----------------------------------------------------------------------------
μ=0.707201867170821 ,?=0.8915815672242373

$\phi$ 在 0.9 附近蹬蚁，我們可以選其作為我們AR模型的參數(shù)。

預(yù)測模型

# Predicting simulated AR(1) model 
result.plot_predict(start=900, end=1010)
plt.show()

rmse = math.sqrt(mean_squared_error(sim1[900:1011], result.predict(start=900,end=999)))
print("The root mean squared error is {}.".format(rmse))

The root mean squared error is 1.0408054564799076.

y 是預(yù)測結(jié)果的可視化

# 預(yù)測Minneapolis的氣壓水平
humid = ARMA(pressure['Minneapolis'].diff().iloc[2:].values, order=(1,0))
res = humid.fit()
res.plot_predict(start=1000, end=1100)
plt.show()

再試試谷歌股票

# Predicting closing prices of google
humid = ARMA(google["Close"].diff().iloc[1:].values, order=(1,0))
res = humid.fit()
res.plot_predict(start=900, end=1010)
plt.show()

也許會有更好的模型

4.2 MA 模型

即移動平均模型 (MA) model專門針對單變量時間序列建模郑兴。

移動平均模型指定輸出變量線性地取決于隨機(jī)（不完全可預(yù)測）項(xiàng)的當(dāng)前值和各種過去值犀斋。

MA(1)

$R_t = \mu + \epsilon_t + \theta \epsilon_{t-1}$

可理解為 $今日回報 = 均值 + 今日噪聲 + 前一天的噪聲$

因?yàn)榈仁接疫呏挥幸浑A滯后項(xiàng), 所以此為一階MA過程。

模擬 MA(1)

rcParams['figure.figsize'] = 16, 6
ar1 = np.array([1])
ma1 = np.array([1, -0.5])
MA1 = ArmaProcess(ar1, ma1)
sim1 = MA1.generate_sample(nsample=1000)
plt.plot(sim1)

預(yù)測模擬的 MA 模型

model = ARMA(sim1, order=(0,1))
result = model.fit()
print(result.summary())
print("μ={} ,θ={}".format(result.params[0],result.params[1]))

                              ARMA Model Results                              
==============================================================================
Dep. Variable:                      y   No. Observations:                 1000
Model:                     ARMA(0, 1)   Log Likelihood               -1423.276
Method:                       css-mle   S.D. of innovations              1.004
Date:                Wed, 16 Oct 2019   AIC                           2852.553
Time:                        05:30:51   BIC                           2867.276
Sample:                             0   HQIC                          2858.148
                                                                              
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.0228      0.014     -1.652      0.099      -0.050       0.004
ma.L1.y       -0.5650      0.027    -20.797      0.000      -0.618      -0.512
                                    Roots                                    
=============================================================================
                  Real          Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
MA.1            1.7699           +0.0000j            1.7699            0.0000
-----------------------------------------------------------------------------
μ=-0.022847164219747917 ,θ=-0.5650012133945569

用 MA 模型做預(yù)測

model = ARMA(pressure['Minneapolis'].diff().iloc[2:].values, order=(0,3))
result = model.fit()
print(result.summary())
print("μ={} ,θ={}".format(result.params[0],result.params[1]))
result.plot_predict(start=1000, end=1100)
plt.show()

                              ARMA Model Results                              
==============================================================================
Dep. Variable:                      y   No. Observations:                 1883
Model:                     ARMA(0, 3)   Log Likelihood               -5242.754
Method:                       css-mle   S.D. of innovations              3.915
Date:                Wed, 16 Oct 2019   AIC                          10495.508
Time:                        05:31:42   BIC                          10523.212
Sample:                             0   HQIC                         10505.712
                                                                              
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0003      0.026      0.010      0.992      -0.050       0.050
ma.L1.y    -1.101e-06      0.018  -6.06e-05      1.000      -0.036       0.036
ma.L2.y    -1.162e-06      0.018  -6.39e-05      1.000      -0.036       0.036
ma.L3.y       -0.7176      0.027    -26.450      0.000      -0.771      -0.664
                                    Roots                                    
=============================================================================
                  Real          Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
MA.1           -0.5585           -0.9673j            1.1170           -0.3333
MA.2           -0.5585           +0.9673j            1.1170            0.3333
MA.3            1.1170           -0.0000j            1.1170           -0.0000
-----------------------------------------------------------------------------
μ=0.0002522391526654763 ,θ=-1.1005662984144147e-06

image

rmse = math.sqrt(mean_squared_error(pressure['Minneapolis'].diff().iloc[1000:1101].values, result.predict(start=1000,end=1100)))
print("The root mean squared error is {}.".format(rmse))

The root mean squared error is 3.0713710764189415.

4.3 ARMA 模型

自回歸移動平均模型 (ARMA) 用兩個多項(xiàng)式來簡要描述（弱）平穩(wěn)隨機(jī)過程情连，一個多項(xiàng)式用于自回歸叽粹，第二個多項(xiàng)式用于移動平均值。它是AR和MA模型的融合。

ARMA(1,1)

$R_t = \mu + \phi R_{t-1} + \epsilon_t + \theta \epsilon_{t-1}$

基本上

今日回報 = 均值 + 前一天的回報 + 噪聲 + 昨天的噪聲

使用 ARMA 作預(yù)測

# 預(yù)測Microsoft庫存量
model = ARMA(microsoft["Volume"].diff().iloc[1:].values, order=(3,3))
result = model.fit()
print(result.summary())
print("μ={}, ?={}, θ={}".format(result.params[0],result.params[1],result.params[2]))
result.plot_predict(start=1000, end=1100)
plt.show()

                              ARMA Model Results                              
==============================================================================
Dep. Variable:                      y   No. Observations:                 3018
Model:                     ARMA(3, 3)   Log Likelihood              -55408.974
Method:                       css-mle   S.D. of innovations       22751608.082
Date:                Wed, 16 Oct 2019   AIC                         110833.948
Time:                        05:32:58   BIC                         110882.047
Sample:                             0   HQIC                        110851.244
                                                                              
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const       -2.03e+04   9915.585     -2.047      0.041   -3.97e+04    -863.030
ar.L1.y        0.2053      0.160      1.287      0.198      -0.107       0.518
ar.L2.y        0.7297      0.179      4.081      0.000       0.379       1.080
ar.L3.y       -0.1413      0.057     -2.467      0.014      -0.254      -0.029
ma.L1.y       -0.8117      0.157     -5.166      0.000      -1.120      -0.504
ma.L2.y       -0.7692      0.258     -2.979      0.003      -1.275      -0.263
ma.L3.y        0.5853      0.130      4.494      0.000       0.330       0.841
                                    Roots                                    
=============================================================================
                  Real          Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
AR.1           -1.1772           +0.0000j            1.1772            0.5000
AR.2            1.1604           +0.0000j            1.1604            0.0000
AR.3            5.1820           +0.0000j            5.1820            0.0000
MA.1           -1.1579           +0.0000j            1.1579            0.5000
MA.2            1.0075           +0.0000j            1.0075            0.0000
MA.3            1.4647           +0.0000j            1.4647            0.0000
-----------------------------------------------------------------------------
μ=-20297.220675944438, ?=0.20526115960300076, θ=0.7297015548306405

image

rmse = math.sqrt(mean_squared_error(microsoft["Volume"].diff().iloc[1000:1101].values, result.predict(start=1000,end=1100)))
print("The root mean squared error is {}.".format(rmse))

The root mean squared error is 38038294.04431978.

ARMA 模型展現(xiàn)出了比 AR 與 MA 更好的結(jié)果.

4.4 ARIMA 模型

自回歸綜合移動平均（ARIMA）模型是自回歸移動平均（ARMA）模型的概括虫几。

這兩個模型都適合于時間序列數(shù)據(jù)锤灿，以更好地理解數(shù)據(jù)或預(yù)測序列中的未來點(diǎn)（預(yù)測）。

ARIMA模型適用于數(shù)據(jù)顯示出非平穩(wěn)特性的某些情況辆脸，其可以執(zhí)行一次或多次差分以消除非平穩(wěn)性但校。

ARIMA 模型的形式: ARIMA(p,d,q): p 是 AR 參數(shù), d 是差分參數(shù), q 是 MA 參數(shù)。

ARIMA(1,0,0)

等價于一階自回歸模型AR(1)

ARIMA(1,0,1)

等價與ARMA(1,1)

ARIMA(1,1,1)

$\Delta y_t = a_1 \Delta y_{t-1} + \epsilon_t + b_1 \epsilon_{t-1}$

其中 $\Delta y_t = y_t - y_{t-1}$

使用 ARIMA 做預(yù)測

# 預(yù)測microsoft股票交易量
rcParams['figure.figsize'] = 16, 6
model = ARIMA(microsoft["Volume"].diff().iloc[1:].values, order=(2,1,0))
result = model.fit()
print(result.summary())
result.plot_predict(start=700, end=1000)
plt.show()

                             ARIMA Model Results                              
==============================================================================
Dep. Variable:                    D.y   No. Observations:                 3017
Model:                 ARIMA(2, 1, 0)   Log Likelihood              -56385.467
Method:                       css-mle   S.D. of innovations       31647215.008
Date:                Wed, 16 Oct 2019   AIC                         112778.933
Time:                        05:33:01   BIC                         112802.981
Sample:                             1   HQIC                        112787.581
                                                                              
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const       9984.0302   2.48e+05      0.040      0.968   -4.75e+05    4.95e+05
ar.L1.D.y     -0.8716      0.016    -53.758      0.000      -0.903      -0.840
ar.L2.D.y     -0.4551      0.016    -28.071      0.000      -0.487      -0.423
                                    Roots                                    
=============================================================================
                  Real          Imaginary           Modulus         Frequency
-----------------------------------------------------------------------------
AR.1           -0.9575           -1.1315j            1.4823           -0.3618
AR.2           -0.9575           +1.1315j            1.4823            0.3618
-----------------------------------------------------------------------------

rmse = math.sqrt(mean_squared_error(microsoft["Volume"].diff().iloc[700:1001].values, result.predict(start=700,end=1000)))
print("The root mean squared error is {}.".format(rmse))

The root mean squared error is 61937614.65140498.

考慮到輕微的滯后啡氢，這是一個很好的模型状囱。

4.5 VAR 模型

向量自回歸（VAR）是一種隨機(jī)過程模型，用于捕獲多個時間序列之間的線性相互依賴性倘是。

VAR模型通過允許多個evolving variable來概括單變量AR模型亭枷。

VAR中的所有變量都以相同的方式進(jìn)入模型：每個變量都有一個方程式，該方程式根據(jù)其自身的滯后值辨绊，其他模型變量的滯后值和誤差項(xiàng)來解釋其演化奶栖。

VAR建模不需要像具有聯(lián)立方程的結(jié)構(gòu)模型那樣，需要了解背后有什么“推力”在影響變量：唯一要求的先驗(yàn)知識是一個假設(shè)會在跨時間相互影響的變量列表门坷。

# 預(yù)測谷歌和微軟的收盤價格
train_sample = pd.concat([google["Close"].diff().iloc[1:],microsoft["Close"].diff().iloc[1:]],axis=1)
model = sm.tsa.VARMAX(train_sample,order=(2,1),trend='c')
result = model.fit(maxiter=1000,disp=False)
print(result.summary())
predicted_result = result.predict(start=0, end=1000)
result.plot_diagnostics()
# 計(jì)算誤差
rmse = math.sqrt(mean_squared_error(train_sample.iloc[1:1002].values, predicted_result.values))
print("The root mean squared error is {}.".format(rmse))

                           Statespace Model Results                           
==============================================================================
Dep. Variable:     ['Close', 'Close']   No. Observations:                 3018
Model:                     VARMA(2,1)   Log Likelihood              -12184.882
                          + intercept   AIC                          24403.764
Date:                Wed, 16 Oct 2019   BIC                          24505.974
Time:                        05:33:18   HQIC                         24440.517
Sample:                             0                                         
                               - 3018                                         
Covariance Type:                  opg                                         
===================================================================================
Ljung-Box (Q):                77.46, 79.29   Jarque-Bera (JB):   48076.50, 14930.54
Prob(Q):                        0.00, 0.00   Prob(JB):                   0.00, 0.00
Heteroskedasticity (H):         3.31, 1.62   Skew:                      1.15, -0.03
Prob(H) (two-sided):            0.00, 0.00   Kurtosis:                 22.42, 13.90
                           Results for equation Close                          
===============================================================================
                  coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------
const           0.3663      0.290      1.265      0.206      -0.201       0.934
L1.Close       -0.3546      0.534     -0.664      0.507      -1.402       0.693
L1.Close       -0.0416      5.614     -0.007      0.994     -11.044      10.961
L2.Close        0.0126      0.034      0.375      0.707      -0.053       0.079
L2.Close        0.3345      0.443      0.755      0.450      -0.533       1.203
L1.e(Close)     0.3946      0.534      0.739      0.460      -0.652       1.441
L1.e(Close)    -0.2233      5.642     -0.040      0.968     -11.281      10.835
                           Results for equation Close                          
===============================================================================
                  coef    std err          z      P>|z|      [0.025      0.975]
-------------------------------------------------------------------------------
const           0.0185      0.028      0.656      0.512      -0.037       0.074
L1.Close        0.0316      0.053      0.591      0.554      -0.073       0.136
L1.Close       -0.3797      0.613     -0.619      0.536      -1.582       0.822
L2.Close        0.0016      0.004      0.450      0.653      -0.005       0.009
L2.Close       -0.0433      0.044     -0.986      0.324      -0.129       0.043
L1.e(Close)    -0.0293      0.053     -0.549      0.583      -0.134       0.075
L1.e(Close)     0.3336      0.614      0.544      0.587      -0.869       1.536
                                Error covariance matrix                                 
========================================================================================
                           coef    std err          z      P>|z|      [0.025      0.975]
----------------------------------------------------------------------------------------
sqrt.var.Close           6.9016      0.041    167.252      0.000       6.821       6.982
sqrt.cov.Close.Close     0.2924      0.005     57.757      0.000       0.282       0.302
sqrt.var.Close           0.4809      0.003    163.163      0.000       0.475       0.487
========================================================================================


The root mean squared error is 3.6743099933249246.

image

4.6.1 SARIMA models

SARIMA模型對于季節(jié)性時間序列的建模非常有用宣鄙，在季節(jié)性時間序列中，給定季節(jié)的平均值和其他統(tǒng)計(jì)數(shù)據(jù)在各年中都不是平穩(wěn)的默蚌。定義的SARIMA模型是非季節(jié)自回歸綜合移動平均（ARMA）和自回歸移動平均（ARIMA）模型的直接擴(kuò)展

# Predicting closing price of Google'
train_sample = google["Close"].diff().iloc[1:].values
model = sm.tsa.SARIMAX(train_sample,order=(4,0,4),trend='c')
result = model.fit(maxiter=1000,disp=False)
print(result.summary())
predicted_result = result.predict(start=0, end=500)
result.plot_diagnostics()
# calculating error
rmse = math.sqrt(mean_squared_error(train_sample[1:502], predicted_result))
print("The root mean squared error is {}.".format(rmse))

                           Statespace Model Results                           
==============================================================================
Dep. Variable:                      y   No. Observations:                 3018
Model:               SARIMAX(4, 0, 4)   Log Likelihood              -10098.505
Date:                Wed, 16 Oct 2019   AIC                          20217.009
Time:                        05:33:31   BIC                          20277.133
Sample:                             0   HQIC                         20238.629
                               - 3018                                         
Covariance Type:                  opg                                         
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
intercept      0.1014      0.047      2.137      0.033       0.008       0.194
ar.L1          0.1878      0.008     24.882      0.000       0.173       0.203
ar.L2          1.1923      0.006    190.557      0.000       1.180       1.205
ar.L3          0.2170      0.007     32.332      0.000       0.204       0.230
ar.L4         -0.9787      0.008   -127.674      0.000      -0.994      -0.964
ma.L1         -0.1934      0.008    -23.486      0.000      -0.210      -0.177
ma.L2         -1.1942      0.007   -166.466      0.000      -1.208      -1.180
ma.L3         -0.2249      0.008    -28.546      0.000      -0.240      -0.209
ma.L4          0.9738      0.008    116.740      0.000       0.957       0.990
sigma2        47.1073      0.408    115.374      0.000      46.307      47.908
===================================================================================
Ljung-Box (Q):                       67.27   Jarque-Bera (JB):             52919.32
Prob(Q):                              0.00   Prob(JB):                         0.00
Heteroskedasticity (H):               3.30   Skew:                             1.21
Prob(H) (two-sided):                  0.00   Kurtosis:                        23.37
===================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
The root mean squared error is 4.398362762846205.

plt.plot(train_sample[1:502],color='red')
plt.plot(predicted_result,color='blue')
plt.legend(['Actual','Predicted'])
plt.title('Google Closing prices')
plt.show()

4.6.3 動態(tài)因子模型

動態(tài)因子模型是用于多元時間序列的靈活模型冻晤，其中觀察到的內(nèi)生變量是外生協(xié)變量和未觀察因子的線性函數(shù)，具有向量自回歸結(jié)構(gòu)绸吸。

未觀察到的因素也可能是外部協(xié)變量的函數(shù)鼻弧。

因變量方程中的擾動可能是自相關(guān)的。

# 預(yù)測微軟和谷歌的收盤價
train_sample = pd.concat([google["Close"].diff().iloc[1:],microsoft["Close"].diff().iloc[1:]],axis=1)
model = sm.tsa.DynamicFactor(train_sample, k_factors=1, factor_order=2)
result = model.fit(maxiter=1000,disp=False)
print(result.summary())
predicted_result = result.predict(start=0, end=1000)
result.plot_diagnostics()
# 計(jì)算誤差
rmse = math.sqrt(mean_squared_error(train_sample.iloc[1:1002].values, predicted_result.values))
print("The root mean squared error is {}.".format(rmse))

                                   Statespace Model Results                                  
=============================================================================================
Dep. Variable:                    ['Close', 'Close']   No. Observations:                 3018
Model:             DynamicFactor(factors=1, order=2)   Log Likelihood              -12198.578
Date:                               Wed, 16 Oct 2019   AIC                          24409.156
Time:                                       05:33:39   BIC                          24445.230
Sample:                                            0   HQIC                         24422.128
                                              - 3018                                         
Covariance Type:                                 opg                                         
===================================================================================
Ljung-Box (Q):                77.67, 95.04   Jarque-Bera (JB):   48193.46, 15038.05
Prob(Q):                        0.00, 0.00   Prob(JB):                   0.00, 0.00
Heteroskedasticity (H):         3.36, 1.62   Skew:                      1.14, -0.04
Prob(H) (two-sided):            0.00, 0.00   Kurtosis:                 22.44, 13.94
                          Results for equation Close                          
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
loading.f1    -6.9110      1.822     -3.794      0.000     -10.481      -3.341
                          Results for equation Close                          
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
loading.f1    -0.2922      0.077     -3.784      0.000      -0.444      -0.141
                        Results for factor equation f1                        
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
L1.f1          0.0292      0.021      1.382      0.167      -0.012       0.071
L2.f1          0.0137      0.016      0.865      0.387      -0.017       0.045
                            Error covariance matrix                             
================================================================================
                   coef    std err          z      P>|z|      [0.025      0.975]
--------------------------------------------------------------------------------
sigma2.Close  6.184e-05     25.182   2.46e-06      1.000     -49.355      49.355
sigma2.Close     0.2327      0.045      5.119      0.000       0.144       0.322
================================================================================

Warnings:
[1] Covariance matrix calculated using the outer product of gradients (complex-step).
The root mean squared error is 3.682205152159832.

可供參考文獻(xiàn):

Manipulating Time Series Data in Python
Introduction to Time Series Analysis in Python
Visualizing Time Series Data in Python
VAR models and LSTM
State space models

最后編輯于：2020.03.16 14:57:35

?著作權(quán)歸作者所有,轉(zhuǎn)載或內(nèi)容合作請聯(lián)系作者

人面猴
序言：七十年代末锦茁，一起剝皮案震驚了整個濱河市攘轩，隨后出現(xiàn)的幾起案子，更是在濱河造成了極大的恐慌码俩，老刑警劉巖度帮，帶你破解...
沈念sama閱讀 219,188評論 6贊 508
死咒
序言：濱河連續(xù)發(fā)生了三起死亡事件，死亡現(xiàn)場離奇詭異稿存，居然都是意外死亡笨篷，警方通過查閱死者的電腦和手機(jī)，發(fā)現(xiàn)死者居然都...
沈念sama閱讀 93,464評論 3贊 395
救了他兩次的神仙讓他今天三更去死
文/潘曉璐我一進(jìn)店門瓣履，熙熙樓的掌柜王于貴愁眉苦臉地迎上來率翅，“玉大人，你說我怎么就攤上這事袖迎∶岢簦” “怎么了腺晾？”我有些...
開封第一講書人閱讀 165,562評論 0贊 356
道士緝兇錄：失蹤的賣姜人
文/不壞的土叔我叫張陵，是天一觀的道長辜贵。經(jīng)常有香客問我丘喻，道長，這世上最難降的妖魔是什么念颈？我笑而不...
開封第一講書人閱讀 58,893評論 1贊 295
?港島之戀（遺憾婚禮）
正文為了忘掉前任，我火速辦了婚禮连霉，結(jié)果婚禮上榴芳，老公的妹妹穿的比我還像新娘。我一直安慰自己跺撼，他們只是感情好窟感，可當(dāng)我...
茶點(diǎn)故事閱讀 67,917評論 6贊 392
惡毒庶女頂嫁案：這布局不是一般人想出來的
文/花漫我一把揭開白布。她就那樣靜靜地躺著歉井，像睡著了一般柿祈。火紅的嫁衣襯著肌膚如雪。梳的紋絲不亂的頭發(fā)上哩至，一...
開封第一講書人閱讀 51,708評論 1贊 305
城市分裂傳說
那天躏嚎，我揣著相機(jī)與錄音，去河邊找鬼菩貌。笑死卢佣，一個胖子當(dāng)著我的面吹牛，可吹牛的內(nèi)容都是我干的箭阶。我是一名探鬼主播虚茶，決...
沈念sama閱讀 40,430評論 3贊 420
雙鴛鴦連環(huán)套：你想象不到人心有多黑
文/蒼蘭香墨我猛地睜開眼，長吁一口氣：“原來是場噩夢啊……” “哼仇参！你這毒婦竟也來了嘹叫？” 一聲冷哼從身側(cè)響起，我...
開封第一講書人閱讀 39,342評論 0贊 276
萬榮殺人案實(shí)錄
序言：老撾萬榮一對情侶失蹤诈乒，失蹤者是張志新（化名）和其女友劉穎罩扇，沒想到半個月后，有當(dāng)?shù)厝嗽跇淞掷锇l(fā)現(xiàn)了一具尸體抓谴，經(jīng)...
沈念sama閱讀 45,801評論 1贊 317
?護(hù)林員之死
正文獨(dú)居荒郊野嶺守林人離奇死亡暮蹂，尸身上長有42處帶血的膿包…… 初始之章·張勛以下內(nèi)容為張勛視角年9月15日...
茶點(diǎn)故事閱讀 37,976評論 3贊 337
?白月光啟示錄
正文我和宋清朗相戀三年，在試婚紗的時候發(fā)現(xiàn)自己被綠了癌压。大學(xué)時的朋友給我發(fā)了我未婚夫和他白月光在一起吃飯的照片仰泻。...
茶點(diǎn)故事閱讀 40,115評論 1贊 351
活死人
序言：一個原本活蹦亂跳的男人離奇死亡，死狀恐怖滩届，靈堂內(nèi)的尸體忽然破棺而出集侯，到底是詐尸還是另有隱情被啼，我是刑警寧澤，帶...
沈念sama閱讀 35,804評論 5贊 346
?日本核電站爆炸內(nèi)幕
正文年R本政府宣布棠枉，位于F島的核電站浓体，受9級特大地震影響，放射性物質(zhì)發(fā)生泄漏辈讶。R本人自食惡果不足惜命浴，卻給世界環(huán)境...
茶點(diǎn)故事閱讀 41,458評論 3贊 331
男人毒藥：我在死后第九天來索命
文/蒙蒙一、第九天我趴在偏房一處隱蔽的房頂上張望贱除。院中可真熱鬧生闲，春花似錦、人聲如沸月幌。這莊子的主人今日做“春日...
開封第一講書人閱讀 32,008評論 0贊 22
一樁弒父案，背后竟有這般陰謀
文/蒼蘭香墨我抬頭看了看天上的太陽扯躺。三九已至捉兴，卻和暖如春，著一層夾襖步出監(jiān)牢的瞬間录语，已是汗流浹背倍啥。一陣腳步聲響...
開封第一講書人閱讀 33,135評論 1贊 272
情欲美人皮
我被黑心中介騙來泰國打工，沒想到剛下飛機(jī)就差點(diǎn)兒被人妖公主榨干…… 1. 我叫王不留澎埠，地道東北人逗栽。一個月前我還...
沈念sama閱讀 48,365評論 3贊 373
代替公主和親
正文我出身青樓，卻偏偏與公主長得像失暂，于是被迫代替她去往敵國和親彼宠。傳聞我的和親對象是個殘疾皇子，可洞房花燭夜當(dāng)晚...
茶點(diǎn)故事閱讀 45,055評論 2贊 355

金融相關(guān)時間序列分析全指南

1.介紹日期與時間

1.1 導(dǎo)入時間序列數(shù)據(jù)

使用的數(shù)據(jù)

1.2 清洗以及準(zhǔn)備時間序列數(shù)據(jù)

如何準(zhǔn)備數(shù)據(jù)？

1.3 可視化數(shù)據(jù)集

1.4 Timestamps和Periods

什么是timestamps 與 periods喻犁？ 為什么說他們很有用槽片？

1.5 使用 date_range

什么是 date_range ?

1.6 使用 to_datetime

1.7 移動（Shift）和滯后（Lags）

1.8 重新采樣（Resampling）

2. 金融與統(tǒng)計(jì)

2.1 百分比變化量

2.2 股票收益

2.3 連續(xù)的行中的絕對數(shù)值變化

2.4 比較兩個或多個時間序列

2.5 Window函數(shù)

2.6 OHLC圖

2.7 蠟燭圖

2.8 自回歸系數(shù)Autocorrelation與偏自回歸系數(shù)Partial Autocorrelation

自回歸系數(shù)Autocorrelation

偏自回歸系數(shù)Partial Autocorrelation

3. 時間序列分解與隨機(jī)游走

3.1. 趨勢酷窥、季節(jié)性與噪聲

3.2. 白噪聲

3.3. 隨機(jī)游走

有漂移的隨機(jī)游走

相關(guān)的統(tǒng)計(jì)檢驗(yàn)

Augmented Dickey-Fuller test

生成隨機(jī)游走

3.4 平穩(wěn)性

4. 用統(tǒng)計(jì)工具建模

4.1 AR 模型

AR(1)

AR(2)

AR(3)

模擬 AR(1) 過程

預(yù)測

預(yù)測模型

4.2 MA 模型

MA(1)

模擬 MA(1)

預(yù)測模擬的 MA 模型

用 MA 模型做預(yù)測

4.3 ARMA 模型

ARMA(1,1)

使用 ARMA 作預(yù)測

4.4 ARIMA 模型

ARIMA(1,0,0)

ARIMA(1,0,1)

ARIMA(1,1,1)

使用 ARIMA 做預(yù)測

4.5 VAR 模型

4.6.1 SARIMA models

4.6.3 動態(tài)因子模型

推薦閱讀更多精彩內(nèi)容

什么是timestamps 與 periods喻犁？為什么說他們很有用槽片？