[Note]
By popular request, here is the second part of the documentation. This part was compiled by my former colleague Thomas. For the complete md files, see our GitHub project https://github.com/SimaShanhe/tsfresh-feature-translation.
max_langevin_fixed_point(x, r, m)
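- Description: largest fixed point of the Langevin model
- Mathematical explanation: the largest fixed point of the dynamics, $\operatorname{argmax}_x \{h(x) = 0\}$, estimated from the polynomial $h(x)$ that has been fitted to the deterministic dynamics of the Langevin model
$$\dot{x}(t) = h(x(t)) + \mathcal{N}(0, R)$$
as described in:
Friedrich et al. (2000): Physics Letters A 271, p. 217-222 Extracting model equations from experimental data
For short time series, this method is highly dependent on the parameters.
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - m (int): order of the polynomial fitted to estimate the fixed points of the dynamics
  - r (float): number of quantiles used for averaging
- Returns: the largest fixed point of the deterministic dynamics (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the parameters r and m have already been obtained
ae = tsf.feature_extraction.feature_calculators.max_langevin_fixed_point(ts, r, m)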
mean(x)
- Description: calculates the mean of the series x
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.mean(ts)
mean_abs_change(x)
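- Description: mean of the absolute changes between consecutive points of the time series
- Returns the mean over the absolute differences between subsequent time series values:
$$\frac{1}{n-1} \sum_{i=1}^{n-1} | x_{i+1} - x_i |$$
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained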
ae = tsf.feature_extraction.feature_calculators.mean_abs_change(ts)
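To make the quantity concrete, here is a minimal standard-library sketch of the mean absolute difference between consecutive values. It is an illustration only, not the tsfresh implementation; the helper name and the sample series are assumptions chosen for the example.

```python
def mean_abs_change(values):
    """Mean of |x[i+1] - x[i]| over all consecutive pairs (illustrative helper)."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)

series = [1, 4, 2, 2]              # hypothetical sample data
result = mean_abs_change(series)   # absolute differences are 3, 2, 0
print(result)
```

With four points there are three differences, so the sum 5 is divided by 3.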
mean_change(x)
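- Description: mean of the changes between consecutive points of the time series
- Returns the mean over the differences between subsequent time series values:
$$\frac{1}{n-1} \sum_{i=1}^{n-1} (x_{i+1} - x_i) = \frac{1}{n-1} (x_n - x_1)$$
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained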
ae = tsf.feature_extraction.feature_calculators.mean_change(ts)
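Because the consecutive differences telescope, their mean depends only on the first and last value. The sketch below checks this on a small hypothetical series using only the standard library; the helper name is an assumption for illustration and this is not the tsfresh implementation.

```python
def mean_change(values):
    """Mean of x[i+1] - x[i] over all consecutive pairs (illustrative helper)."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs)

series = [1, 4, 2, 7]                                        # hypothetical sample data
direct = mean_change(series)                                 # differences are 3, -2, 5
telescoped = (series[-1] - series[0]) / (len(series) - 1)    # (7 - 1) / 3
print(direct, telescoped)
```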
mean_second_derivative_central(x)
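- Description: mean of the central approximation of the second derivative
- Returns the mean value of a central approximation of the second derivative:
$$\frac{1}{n-2} \sum_{i=1}^{n-2} \frac{1}{2} (x_{i+2} - 2 x_{i+1} + x_i)$$
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained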
ae = tsf.feature_extraction.feature_calculators.mean_second_derivative_central(ts)
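As a rough illustration of the central approximation, the following sketch averages half the second difference over all interior points. It assumes the term 0.5 * (x[i+2] - 2 * x[i+1] + x[i]) for each triple of neighbours; the helper name and data are hypothetical, and this is a sketch rather than the tsfresh code.

```python
def mean_second_derivative_central(values):
    """Average of 0.5 * (x[i+2] - 2*x[i+1] + x[i]) over all triples (illustrative)."""
    terms = [
        0.5 * (values[i + 2] - 2 * values[i + 1] + values[i])
        for i in range(len(values) - 2)
    ]
    return sum(terms) / len(terms)

series = [0, 1, 4, 9, 16]   # hypothetical data: x_t = t**2
result = mean_second_derivative_central(series)
print(result)               # every second difference is 2, so every term is 1.0
```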
median(x)
- Description: calculates and returns the median of the series x
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.median(ts)
minimum(x)
- Description: calculates the minimum of the series x
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.minimum(ts)
number_crossing_m(x, m)
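- Description: calculates the number of crossings of x on m. A crossing is defined as two consecutive values where the first is lower than m and the next is greater, or vice versa. If m is set to zero, the result is the number of zero crossings.
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - m (float): the threshold for the crossing
- Returns: the value of this feature (int)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the threshold m have already been obtained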
ae = tsf.feature_extraction.feature_calculators.number_crossing_m(ts, m)
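The crossing definition can be sketched with the standard library as follows. This is a minimal illustration under one assumption that the description leaves open: values exactly equal to m are treated as "not above" the threshold. The helper name and sample data are hypothetical.

```python
def number_crossing_m(values, m):
    """Count sign changes of (value > m) between consecutive points (illustrative)."""
    above = [v > m for v in values]  # assumption: v == m counts as not above
    return sum(1 for a, b in zip(above, above[1:]) if a != b)

series = [-2, 1, 3, -1, 0.5]                 # hypothetical sample data
zero_crossings = number_crossing_m(series, 0)
print(zero_crossings)                        # -2 -> 1, 3 -> -1, -1 -> 0.5
```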
number_cwt_peaks(x, n)
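- Description: this feature calculator searches for different peaks in x. To do so, x is smoothed by a Ricker wavelet for widths ranging from 1 to n. The calculator returns the number of peaks that occur at enough width scales and with a sufficiently high signal-to-noise ratio (SNR).
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - n (int): the maximum width to consider
- Returns: the value of this feature (int)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the width n have already been obtained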
ae = tsf.feature_extraction.feature_calculators.number_cwt_peaks(ts, n)
number_peaks(x, n)
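- Description: calculates the number of peaks of at least support n in the time series x. A peak of support n is defined as a subsequence of x in which a value occurs that is greater than its n neighbours to the left and to the right.
So in the sequence:
>>> x = [3,0,0,4,0,0,13]
4 is a peak of support 1 and 2, because in the subsequences:
>>> [0,4,0]
>>> [0,0,4,0,0]
4 is still the highest value. Here, however, 4 is not a peak of support 3, because 13 is the third neighbour to the right of 4 and is greater than 4.
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - n (int): the support of the peak
- Returns: the value of this feature (int)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the support n have already been obtained
ae = tsf.feature_extraction.feature_calculators.number_peaks(ts, n)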
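The worked example with x = [3,0,0,4,0,0,13] can be reproduced with a small standard-library sketch. It is meant only to illustrate the definition of a peak of support n and is not the tsfresh implementation; the helper name is an assumption.

```python
def number_peaks(values, n):
    """Count positions whose value exceeds all n neighbours on each side (illustrative)."""
    count = 0
    for i in range(n, len(values) - n):
        left = values[i - n:i]
        right = values[i + 1:i + n + 1]
        if all(values[i] > v for v in left) and all(values[i] > v for v in right):
            count += 1
    return count

x = [3, 0, 0, 4, 0, 0, 13]
peaks = {n: number_peaks(x, n) for n in (1, 2, 3)}
print(peaks)  # 4 is a peak for support 1 and 2, but not for support 3
```

Positions closer than n to either end are skipped here because they do not have n neighbours on both sides, which is why 13 itself is never counted.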
partial_autocorrelation(x, param)
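- Description: calculates the value of the partial autocorrelation function at the given lag.
- Formula: the lag-$k$ partial autocorrelation of a time series $\{x_t, t = 1 \ldots T\}$ equals the partial correlation of $x_t$ and $x_{t-k}$, adjusted for the intermediate variables $\{x_{t-1}, \ldots, x_{t-k+1}\}$ ([1]). Following [2], it can be defined as:
$$\alpha_k = \frac{\operatorname{Cov}(x_t, x_{t-k} \mid x_{t-1}, \ldots, x_{t-k+1})}{\sqrt{\operatorname{Var}(x_t \mid x_{t-1}, \ldots, x_{t-k+1}) \, \operatorname{Var}(x_{t-k} \mid x_{t-1}, \ldots, x_{t-k+1})}}$$
where (a) $x_t = f(x_{t-1}, \ldots, x_{t-k+1})$ and (b) $x_{t-k} = f(x_{t-1}, \ldots, x_{t-k+1})$ are AR(k-1) models that can be fitted by OLS. Note that in (a) the regression uses past values to predict $x_t$, whereas in (b) future values are used to calculate the past value $x_{t-k}$. [1] states that "for an AR(p), the partial autocorrelations [$\alpha_k$] will be nonzero for $k \le p$ and zero for $k > p$". This property is used to determine the lag of an AR process.
References:
[1] Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: forecasting and control. John Wiley & Sons.
[2] https://onlinecourses.science.psu.edu/stat510/node/62
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - param (list): contains dictionaries {"lag": val}, where the integer val indicates the lag to be returned
- Returns: the value of this feature (float)
- Function type: combiner
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the list param have already been obtained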
ae = tsf.feature_extraction.feature_calculators.partial_autocorrelation(ts, param)
percentage_of_reoccurring_datapoints_to_all_datapoints(x)
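- Description: returns the percentage of unique values that occur in the time series more than once
- Formula: number of distinct values occurring more than once / number of distinct values
This means the percentage is normalized to the number of unique values, rather than being the share of recurring values among all values.
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained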
ae = tsf.feature_extraction.feature_calculators.percentage_of_reoccurring_datapoints_to_all_datapoints(ts)
percentage_of_reoccurring_values_to_all_values(x)
- Description: returns the ratio of unique values that occur in the time series more than once
- Formula: number of data points occurring more than once / number of all data points
This means the ratio is normalized to the number of data points in the time series, in contrast to percentage_of_reoccurring_datapoints_to_all_datapoints.
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.percentage_of_reoccurring_values_to_all_values(ts)
quantile(x, q)
- Description: calculates the q quantile of x, that is, the value of x that is greater than q% of the ordered values of x.
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - q (float): the quantile to calculate
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the quantile q have already been obtained
ae = tsf.feature_extraction.feature_calculators.quantile(ts, q)
range_count(x, min, max)
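- Description: counts the observed values within the interval [min, max).
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - min (int or float): the inclusive lower bound of the range
  - max (int or float): the exclusive upper bound of the range
- Returns: the number of values within the range (int)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the bounds min and max have already been obtained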
ae = tsf.feature_extraction.feature_calculators.range_count(ts, min, max)
ratio_beyond_r_sigma(x, r)
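- Description: ratio of values that are more than r * std(x) (that is, r sigma) away from the mean of x.
- Parameters:
  - x (iterable): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the factor r have already been obtained
ae = tsf.feature_extraction.feature_calculators.ratio_beyond_r_sigma(ts, r)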
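A minimal sketch of the "r sigma" ratio using only the standard library is shown below. It assumes the population standard deviation (the same convention as NumPy's default); the helper name and the sample data are hypothetical, and this is not the tsfresh code itself.

```python
import statistics

def ratio_beyond_r_sigma(values, r):
    """Share of values farther than r standard deviations from the mean (illustrative)."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    far = [v for v in values if abs(v - mean) > r * std]
    return len(far) / len(values)

series = [0, 0, 0, 0, 10]   # hypothetical data: mean 2, standard deviation 4
ratio = ratio_beyond_r_sigma(series, 1)
print(ratio)                # only the value 10 lies more than 1 sigma from the mean
```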
ratio_value_number_to_time_series_length(x)
- Description: returns 1 if all values in the time series occur only once, and a value below 1 otherwise. In principle, it simply returns:
- Formula: number of unique values / number of all values
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.ratio_value_number_to_time_series_length(ts)
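The ratio of unique values to all values is easy to check by hand. The following standard-library sketch illustrates it on two hypothetical series; the helper name is an assumption and this is not the tsfresh implementation.

```python
def ratio_value_number_to_time_series_length(values):
    """Number of unique values divided by the number of values (illustrative)."""
    return len(set(values)) / len(values)

all_unique = ratio_value_number_to_time_series_length([1, 2, 3, 4])
with_repeats = ratio_value_number_to_time_series_length([1, 1, 2, 2])
print(all_unique, with_repeats)
```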
sample_entropy(x)
- Description: calculates and returns the sample entropy of the series x
References:
[1] http://en.wikipedia.org/wiki/Sample_Entropy
[2] https://www.ncbi.nlm.nih.gov/pubmed/10843903?dopt=Abstract
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.sample_entropy(ts)
set_property(key, value)
This method returns a decorator that sets the property key of the function to value.
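A decorator of this kind can be sketched in a few lines. The version below is an illustration of the pattern, not necessarily identical to the tsfresh source; the attribute name "fctype" and the decorated function are assumptions chosen for the example.

```python
def set_property(key, value):
    """Return a decorator that attaches `value` to the function under the name `key`."""
    def decorate(func):
        setattr(func, key, value)
        return func
    return decorate

@set_property("fctype", "simple")   # "fctype" is an illustrative attribute name
def mean(x):
    return sum(x) / len(x)

print(mean.fctype)       # the attribute set by the decorator
print(mean([1, 2, 3]))   # the function itself is unchanged
```

Attaching metadata this way lets calling code inspect a function, for example to tell simple calculators from combiners, without wrapping or altering its behaviour.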
skewness(x)
- Description: returns the sample skewness of x (calculated with the adjusted Fisher-Pearson standardized moment coefficient G1)
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.skewness(ts)
spkt_welch_density(x, param)
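- Description: this feature calculator estimates the cross power spectral density of the time series x at different frequencies. To do so, the time series is first transferred from the time domain to the frequency domain.
The feature calculator returns the power spectrum at the different frequencies.
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - param (list): contains dictionaries {"coeff": x}, where x is an int
- Returns: the different feature values (pandas.Series)
- Function type: combiner
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the list param have already been obtained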
ae = tsf.feature_extraction.feature_calculators.spkt_welch_density(ts, param)
standard_deviation(x)
- Description: returns the standard deviation of x
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.standard_deviation(ts)
sum_of_reoccurring_data_points(x)
- Description: returns the sum of all data points that occur in the time series more than once
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.sum_of_reoccurring_data_points(ts)
sum_of_reoccurring_values(x)
- Description: returns the sum of all values that occur in the time series more than once
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.sum_of_reoccurring_values(ts)
sum_values(x)
- Description: calculates the sum over the time series values
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.sum_values(ts)
symmetry_looking(x, param)
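- Description: Boolean variable denoting whether the distribution of x looks symmetric. This is the case if:
$$| \operatorname{mean}(X) - \operatorname{median}(X) | < r \cdot (\max(X) - \min(X))$$
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - r (float): the percentage of the range to compare with, passed inside param as a dictionary {"r": value}
- Returns: the value of this feature (bool)
- Function type: combiner
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the ratio r have already been obtained
param = [{"r": r}]  # the combiner expects a list of dictionaries
ae = tsf.feature_extraction.feature_calculators.symmetry_looking(ts, param)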
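The symmetry criterion that tsfresh documents for this feature, |mean(X) - median(X)| < r * (max(X) - min(X)), can be tried out with the standard library. The sketch below evaluates it for a single r; the helper name and both sample series are hypothetical, and it is not the tsfresh implementation.

```python
import statistics

def looks_symmetric(values, r):
    """True if |mean - median| is below r times the value range (illustrative)."""
    spread = max(values) - min(values)
    gap = abs(statistics.mean(values) - statistics.median(values))
    return gap < r * spread

symmetric = looks_symmetric([1, 2, 3, 4, 5], 0.1)   # mean equals median
skewed = looks_symmetric([1, 1, 1, 1, 11], 0.1)     # mean 3, median 1, range 10
print(symmetric, skewed)
```

A smaller r makes the test stricter, since the allowed gap between mean and median shrinks with it.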
time_reversal_asymmetry_statistic(x, lag)
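- Description: this function calculates the value of:
$$\frac{1}{n - 2 \cdot \text{lag}} \sum_{i=1}^{n - 2 \cdot \text{lag}} \left( x_{i + 2 \cdot \text{lag}}^2 \cdot x_{i + \text{lag}} - x_{i + \text{lag}} \cdot x_i^2 \right)$$
which is:
$$\mathbb{E}[L^2(X)^2 \cdot L(X) - L(X) \cdot X^2]$$
where $\mathbb{E}$ is the mean and $L$ is the lag operator. It was proposed in [1] as a promising feature to extract from time series.
References:
[1] Fulcher, B.D., Jones, N.S. (2014). Highly comparative feature-based time-series classification. Knowledge and Data Engineering, IEEE Transactions on 26, 3026–3037.
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - lag (int): the lag to use in the calculation of the feature
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the lag have already been obtained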
ae = tsf.feature_extraction.feature_calculators.time_reversal_asymmetry_statistic(ts, lag)
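The statistic can be evaluated by hand on a short series. The sketch below averages x[i + 2*lag]**2 * x[i + lag] - x[i + lag] * x[i]**2 over all valid i, and returns 0 when the series is too short for the chosen lag (an assumption made here for safety). The helper name and data are hypothetical; this is an illustration, not the tsfresh code.

```python
def time_reversal_asymmetry_statistic(values, lag):
    """Mean of x[i+2*lag]**2 * x[i+lag] - x[i+lag] * x[i]**2 (illustrative helper)."""
    n = len(values)
    if 2 * lag >= n:
        return 0.0  # assumption: too short for this lag, no valid terms
    terms = [
        values[i + 2 * lag] ** 2 * values[i + lag] - values[i + lag] * values[i] ** 2
        for i in range(n - 2 * lag)
    ]
    return sum(terms) / len(terms)

series = [1, 2, 3, 4]   # hypothetical sample data
stat = time_reversal_asymmetry_statistic(series, 1)
print(stat)             # terms are 16 and 36
```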
value_count(x, value)
- Description: counts the occurrences of value in the time series x
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
  - value (int or float): the value to be counted
- Returns: the count (int)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x and the value have already been obtained
ae = tsf.feature_extraction.feature_calculators.value_count(ts, value)
variance(x)
- Description: returns the variance of the series x
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (float)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained
ae = tsf.feature_extraction.feature_calculators.variance(ts)
variance_larger_than_standard_deviation(x)
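- Description: Boolean variable denoting whether the variance of x is greater than its standard deviation. This is equivalent to the variance of x being greater than 1.
- Parameters:
  - x (pandas.Series): the time series to calculate the feature of
- Returns: the value of this feature (bool)
- Function type: simple
- Code example:
#!/usr/bin/python3
import tsfresh as tsf
import pandas as pd
ts = pd.Series(x)  # assumes the data x has already been obtained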
ae = tsf.feature_extraction.feature_calculators.variance_larger_than_standard_deviation(ts)
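Since the standard deviation is the square root of the variance, the variance exceeds it exactly when the variance is greater than 1. The standard-library sketch below illustrates this on two hypothetical series, assuming the population variance; it is not the tsfresh implementation.

```python
import statistics

def variance_larger_than_standard_deviation(values):
    """True if the (population) variance exceeds its own square root (illustrative)."""
    var = statistics.pvariance(values)  # assumption: population variance
    return var > var ** 0.5

wide = variance_larger_than_standard_deviation([0, 4])     # variance 4, std 2
narrow = variance_larger_than_standard_deviation([0, 1])   # variance 0.25, std 0.5
print(wide, narrow)
```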