ARIMA
Notation
- T-n: a prior or lag time
- T: current time and point of reference
- T+n: future or forecast time
Components
- Level: baseline value
- Trend: often a linear increase or decrease over time
- Seasonality: repeating patterns over time
- Noise: cannot be explained by the model
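The four components above can be illustrated by building a synthetic series from them. This is only a sketch: it assumes an additive model (level + trend + seasonality + noise), a sinusoidal seasonal pattern, and Gaussian noise, none of which the notes specify. The function name and all parameters are hypothetical.

```python
import math
import random


def synthetic_series(n, level=10.0, slope=0.5, period=4,
                     amplitude=2.0, noise_sd=0.0, seed=0):
    """Build y_t = level + trend_t + seasonal_t + noise_t (additive model, assumed)."""
    rng = random.Random(seed)
    series = []
    for t in range(n):
        trend = slope * t                                        # linear trend
        seasonal = amplitude * math.sin(2 * math.pi * t / period)  # repeats every `period` steps
        noise = rng.gauss(0.0, noise_sd)                         # part the model cannot explain
        series.append(level + trend + seasonal + noise)
    return series


y = synthetic_series(8)
```

With `noise_sd=0.0` the series is fully determined by level, trend, and seasonality, so values one full period apart differ only by the trend.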
Some concerns
Sample size
Whether forecasts are updated frequently over time or made once and remain static
Down-sampling or up-sampling
- Frequency
- Outliers
- Missing values
As a supervised machine learning problem
- Sliding window over a univariate or multivariate time series
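A minimal sketch of the sliding-window framing for the univariate case: past values become the input features and the following values become the target. The function name `sliding_window` and its parameters `n_lags` and `n_ahead` are hypothetical, chosen for illustration.

```python
def sliding_window(series, n_lags=1, n_ahead=1):
    """Return (X, y): each X row holds n_lags past values, each y row the next n_ahead values."""
    X, y = [], []
    for t in range(n_lags, len(series) - n_ahead + 1):
        X.append(series[t - n_lags:t])   # values at T-n_lags ... T-1
        y.append(series[t:t + n_ahead])  # values at T ... T+n_ahead-1
    return X, y


X, y = sliding_window([1, 2, 3, 4, 5], n_lags=2)
# X == [[1, 2], [2, 3], [3, 4]]
# y == [[3], [4], [5]]
```

For a multivariate series the same idea applies, with each window holding the lagged values of every variable.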
Q&A
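- (Python) What is the difference between autocorrelation_plot and plot_acf / plot_pacf?
- autocorrelation_plot (pandas) and plot_acf (statsmodels) both plot the sample autocorrelation function; plot_pacf plots the partial autocorrelation function instead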
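- Definition of stationarity
- { Yt } is strictly stationary if, for every lag k and all time points t1, t2, …, tn, the joint distribution of Y_t1, Y_t2, …, Y_tn is the same as that of Y_{t1-k}, Y_{t2-k}, …, Y_{tn-k}
- { Yt } is weakly stationary if
- the mean function is constant over all time
- gamma_{t, t-k} = gamma_{0, k} for all times t and lags k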
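Because the autocovariance of a weakly stationary series depends only on the lag k, it can be estimated from a single observed series. The sketch below computes the sample autocovariance and autocorrelation with the standard library. The function names are hypothetical, and dividing by n (rather than n - k) is one common convention, assumed here.

```python
def sample_autocovariance(series, k):
    """Estimate gamma_k, dividing by n (a common convention, assumed here)."""
    n = len(series)
    mean = sum(series) / n
    return sum((series[t] - mean) * (series[t - k] - mean)
               for t in range(k, n)) / n


def sample_acf(series, max_lag):
    """Return [rho_0, ..., rho_max_lag], where rho_k = gamma_k / gamma_0."""
    gamma_0 = sample_autocovariance(series, 0)
    return [sample_autocovariance(series, k) / gamma_0
            for k in range(max_lag + 1)]


rho = sample_acf([1, 2, 3, 4, 5], max_lag=1)
```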
- Methods for checking stationarity
- Line plot
- Split the data into two or more parts, then compare the mean and variance of each part
- Statistical test: ADF (augmented Dickey-Fuller test)
- Explanation
- H0: the time series has a unit root, meaning it is non-stationary
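The split-and-compare check from the list above can be sketched with the standard library. Splitting into equal contiguous chunks is an assumption made here for illustration, and `split_summary` is a hypothetical helper. Large differences in mean or variance between chunks hint at non-stationarity, but this is an informal check, not a substitute for the ADF test.

```python
from statistics import mean, pvariance


def split_summary(series, n_parts=2):
    """Split into contiguous chunks and return a (mean, variance) pair for each chunk."""
    size = len(series) // n_parts
    parts = [series[i * size:(i + 1) * size] for i in range(n_parts - 1)]
    parts.append(series[(n_parts - 1) * size:])  # last chunk takes any remainder
    return [(mean(p), pvariance(p)) for p in parts]


# A series with a linear trend: the chunk means differ, the variances do not.
summary = split_summary([float(v) for v in range(10)])
```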
- Transforms
- Difference
- Log
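- Useful when the spread of the series is positively correlated with its level, i.e. the larger the value, the larger the fluctuations around it
- The difference of the logs is commonly called the return
- Box-Cox / power transform
- Estimate lambda
- When lambda = 0, it reduces to the log transform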
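The three transforms above can be sketched in a few lines. The function names are hypothetical. The Box-Cox form used is the standard (y^lambda - 1) / lambda, and the series is assumed to be strictly positive for the log and Box-Cox cases. Estimating lambda is not shown; here it is passed in by hand.

```python
import math


def difference(series, lag=1):
    """Return y_t - y_{t-lag}; the result is `lag` elements shorter."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def log_returns(series):
    """Difference of logs: log(y_t) - log(y_{t-1}); requires positive values."""
    return difference([math.log(v) for v in series])


def box_cox(series, lam):
    """Box-Cox transform with a given lambda; lambda = 0 is the log transform."""
    if lam == 0:
        return [math.log(v) for v in series]
    return [(v ** lam - 1) / lam for v in series]


diffs = difference([1, 4, 9, 16])  # [3, 5, 7]
```

As lambda approaches 0, `box_cox` approaches the log transform, which is why lambda = 0 is handled as a special case.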
- Add Seasonality
- How to interpret the key results for ARIMA:
- For each coefficient, the null hypothesis is that the term is not significantly different from 0, which indicates that no association exists between the term and the response.
- https://support.minitab.com/en-us/minitab/18/help-and-how-to/modeling-statistics/time-series/how-to/arima/interpret-the-results/key-results/?SID=117600
- Residuals test
- Residual time series plot -> check whether any trend remains
- Q-Q plot -> points should lie on a straight line
- Residual ACF plot
- Ljung-Box test on the residuals
- Tests the autocorrelation coefficients jointly as a group, using the statistic Q
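The Q statistic can be written out directly. The sketch below uses the usual Ljung-Box form Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k), where r_k is the sample autocorrelation of the residuals at lag k. The function name and the `max_lag` parameter (h) are hypothetical. Turning Q into a p-value needs a chi-square distribution, which is left out here.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q = n(n+2) * sum_{k=1..h} r_k^2 / (n - k), with h = max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squared deviations
    total = 0.0
    for k in range(1, max_lag + 1):
        # sample autocorrelation at lag k
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


q = ljung_box_q([1, 2, 3, 4, 5], max_lag=1)
```

A large Q relative to the chi-square reference suggests the residuals are still autocorrelated, i.e. the model has left structure unexplained.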