Time series data is an important form of structured data used in many fields, including finance, economics, ecology, neuroscience, and physics. Anything that is observed or measured at multiple points in time forms a time series. Many time series are fixed frequency: data points occur at regular intervals according to some rule, such as every 15 seconds, every 5 minutes, or once per month. Time series can also be irregular, with no fixed unit of time or offset between units. What time series data means depends on the application, and it mainly takes one of the following forms:
Timestamps: specific instants in time.
Fixed periods: such as the month of January 2007 or the full year 2010.
Intervals of time: indicated by a start and an end timestamp. Periods can be thought of as special cases of intervals.
Experiment or elapsed time: each timestamp is a measure of time relative to a particular start time. For example, the diameter of a cookie each second since it was placed in the oven.
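As a rough illustration, the four notions above map fairly naturally onto pandas scalar types. The sketch below is only one way to model them; the specific dates, the 90-second duration, and the variable names are made up for the example.

```python
import pandas as pd

# Timestamp: a specific instant in time
ts = pd.Timestamp("2007-01-15 09:30:00")

# Fixed period: the whole month of January 2007
month = pd.Period("2007-01", freq="M")

# Interval: an arbitrary span given by a start and an end timestamp
# (closed on the left, so the end point itself is excluded)
span = pd.Interval(
    pd.Timestamp("2007-01-01"), pd.Timestamp("2007-02-01"), closed="left"
)

# Elapsed time: a duration measured relative to some starting point
elapsed = pd.Timedelta(seconds=90)

print(ts in span)                                # True
print(month.start_time <= ts <= month.end_time)  # True
print(ts + elapsed)                              # 2007-01-15 09:31:30
```

Here the period and the interval cover the same month, which reflects the remark that a period can be seen as a special case of an interval.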
11.1 Date and Time Data Types and Tools
The Python standard library includes data types for date and time data, as well as calendar-related functionality. The datetime, time, and calendar modules are the main ones we will use. The datetime.datetime type (often written simply as datetime) is the most widely used:
In [10]: from datetime import datetime
In [11]: now = datetime.now()
In [12]: now
Out[12]: datetime.datetime(2017, 9, 25, 14, 5, 52, 72973)
In [13]: now.year, now.month, now.day
Out[13]: (2017, 9, 25)
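To show how the datetime and calendar modules fit together, here is a small sketch using only the standard library. It reuses the timestamp printed in the session above as a fixed value so the results do not depend on when it is run; the variable names are illustrative.

```python
from datetime import datetime
import calendar

# A fixed datetime (the value shown in the session above)
stamp = datetime(2017, 9, 25, 14, 5, 52, 72973)

# date() and time() split a datetime into its calendar and clock parts
print(stamp.date())   # 2017-09-25
print(stamp.time())   # 14:05:52.072973

# calendar.monthrange returns (weekday of the 1st, number of days in the month)
first_weekday, days_in_month = calendar.monthrange(stamp.year, stamp.month)
print(days_in_month)  # 30

# weekday() counts from Monday = 0
print(stamp.weekday())  # 0
```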
Generating Date Ranges
Although I used it earlier without explanation, you may have guessed that pandas.date_range generates a DatetimeIndex of a given length according to a particular frequency:
In [74]: index = pd.date_range('2012-04-01', '2012-06-01')
In [75]: index
Out[75]:
DatetimeIndex(['2012-04-01', '2012-04-02', '2012-04-03', '2012-04-04',
'2012-04-05', '2012-04-06', '2012-04-07', '2012-04-08',
'2012-04-09', '2012-04-10', '2012-04-11', '2012-04-12',
'2012-04-13', '2012-04-14', '2012-04-15', '2012-04-16',
'2012-04-17', '2012-04-18', '2012-04-19', '2012-04-20',
'2012-04-21', '2012-04-22', '2012-04-23', '2012-04-24',
'2012-04-25', '2012-04-26', '2012-04-27', '2012-04-28',
'2012-04-29', '2012-04-30', '2012-05-01', '2012-05-02',
'2012-05-03', '2012-05-04', '2012-05-05', '2012-05-06',
'2012-05-07', '2012-05-08', '2012-05-09', '2012-05-10',
'2012-05-11', '2012-05-12', '2012-05-13', '2012-05-14',
'2012-05-15', '2012-05-16', '2012-05-17', '2012-05-18',
'2012-05-19', '2012-05-20', '2012-05-21', '2012-05-22',
'2012-05-23', '2012-05-24', '2012-05-25', '2012-05-26',
'2012-05-27', '2012-05-28', '2012-05-29', '2012-05-30',
'2012-05-31', '2012-06-01'],
dtype='datetime64[ns]', freq='D')
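The call above passes a start and an end date and falls back to the default daily frequency. As a hedged sketch of the "length" and "frequency" parts of that description, the following uses the periods and freq parameters of pandas.date_range; the choice of 5 periods and the weekly "W-SUN" frequency is purely illustrative.

```python
import pandas as pd

# Fix the start and the number of points instead of giving an end date
by_length = pd.date_range(start="2012-04-01", periods=5)

# Same span as the example above, but sampled weekly on Sundays
weekly = pd.date_range("2012-04-01", "2012-06-01", freq="W-SUN")

print(by_length)
print(len(weekly))   # 9
print(weekly[-1])    # 2012-05-27 00:00:00
```

Because 2012-04-01 falls on a Sunday, the weekly index starts on the start date itself and ends on the last Sunday on or before 2012-06-01.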