DataFrame.loc study notes
Access a group of rows and columns by label(s) or a boolean array.
.loc[] is primarily label based, but may also be used with a boolean array.
Allowed inputs are:
A single label, e.g. 5 or 'a' (note that 5 is interpreted as a label of the index, and never as an integer position along the index).
A list or array of labels, e.g. ['a', 'b', 'c'].
A slice object with labels, e.g. 'a':'f'.
Note: contrary to usual Python slices, both the start and the stop are included in a .loc slice.
A boolean array of the same length as the axis being sliced, e.g. [True, False, True].
A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing (one of the above).
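The input kinds listed above can be tried side by side. This is a minimal sketch, assuming a hypothetical Series `s` (not part of the original notes) whose integer labels deliberately differ from their positions; all variable names are illustrative.

```python
import pandas as pd

# Hypothetical data: the labels 5, 6, 7 are NOT the positions 0, 1, 2.
s = pd.Series(['x', 'y', 'z'], index=[5, 6, 7])

by_label = s.loc[5]                      # 5 is looked up as a label -> 'x'
by_list = s.loc[[7, 5]]                  # list of labels, order is preserved
label_slice = s.loc[5:6]                 # label slice: both 5 and 6 are included
by_mask = s.loc[[True, False, True]]     # boolean list, same length as the axis
by_callable = s.loc[lambda ser: ser == 'y']  # callable returning a boolean Series

print(by_label)
print(label_slice.tolist())
```

Because `5` is treated as a label, `s.loc[5]` returns the first element here, even though position 5 does not exist.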
Examples
Create a DataFrame:
import pandas as pd
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])
df
            max_speed  shield
cobra               1       2
viper               4       5
sidewinder          7       8
Single label. Note this returns the row as a Series.
df.loc['viper']
max_speed    4
shield       5
Name: viper, dtype: int64
List of labels. Note using [[]] returns a DataFrame containing the matching rows.
df.loc[['viper', 'sidewinder']]
            max_speed  shield
viper               4       5
sidewinder          7       8
Single label for row and column. This returns a single value.
df.loc['cobra', 'shield']
2
Slice with labels for row and single label for column. Both the start and the stop of the slice are included.
df.loc['cobra':'viper', 'max_speed']
cobra    1
viper    4
Name: max_speed, dtype: int64
Boolean list with the same length as the row axis.
df.loc[[False, False, True]]
            max_speed  shield
sidewinder          7       8
Conditional that returns a boolean Series.
df.loc[df['shield'] > 6]
            max_speed  shield
sidewinder          7       8
Conditional that returns a boolean Series, with column labels specified.
df.loc[df['shield'] > 6, ['max_speed']]
            max_speed
sidewinder          7
Callable that returns a boolean Series.
df.loc[lambda df: df['shield'] == 8]
            max_speed  shield
sidewinder          7       8
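The type of object that .loc returns depends on the shape of the indexer. This is a short sketch of that rule, reusing the example DataFrame from these notes; the variable names `row`, `frame`, `cell` and `column` are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

row = df.loc['viper']               # single label          -> Series
frame = df.loc[['viper']]           # list with one label   -> DataFrame
cell = df.loc['cobra', 'shield']    # row and column label  -> scalar
column = df.loc[:, 'shield']        # full slice + label    -> Series

print(type(row).__name__, type(frame).__name__, type(column).__name__)
```

Wrapping a label in a list is a common way to keep a result two-dimensional when later code expects a DataFrame.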
Setting values
Set value for all items matching the list of labels.
df.loc[['viper', 'sidewinder'], ['shield']] = 50
df
            max_speed  shield
cobra               1       2
viper               4      50
sidewinder          7      50
Set value for an entire row.
df.loc['cobra'] = 10
df
            max_speed  shield
cobra              10      10
viper               4      50
sidewinder          7      50
Set value for an entire column.
df.loc[:, 'max_speed'] = 30
df
            max_speed  shield
cobra              30      10
viper              30      50
sidewinder         30      50
Set value for rows matching a condition (here a boolean Series; a callable that returns one is also accepted).
df.loc[df['shield'] > 35] = 0
df
            max_speed  shield
cobra              30      10
viper               0       0
sidewinder          0       0
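A row condition and a column label can be combined in one assignment, so that only part of each matching row is changed. This is a minimal sketch on a fresh copy of the first example DataFrame; the threshold 4 is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Set max_speed to 0 only in rows where shield is greater than 4.
# The shield column itself is left unchanged.
df.loc[df['shield'] > 4, 'max_speed'] = 0

print(df)
```

Doing the selection and the assignment in a single .loc call writes to the original DataFrame rather than to an intermediate selection.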
Getting values on a DataFrame with an index that has integer labels
# Another example using integers for the index
df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=[7, 8, 9], columns=['max_speed', 'shield'])
df
   max_speed  shield
7          1       2
8          4       5
9          7       8
Slice with integer labels for rows. Both the start and the stop of the slice are included.
df.loc[7:9]
   max_speed  shield
7          1       2
8          4       5
9          7       8
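With an integer index it is easy to mistake labels for positions. The sketch below contrasts the two on the same data, assuming the integer-labelled DataFrame from this example; the flag `label_zero_missing` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=[7, 8, 9], columns=['max_speed', 'shield'])

by_label = df.loc[7:9]       # labels 7, 8 and 9: stop is included
by_position = df.iloc[0:2]   # positions 0 and 1: stop is excluded

# There is no row labelled 0, so .loc raises KeyError instead of
# falling back to the first position.
try:
    df.loc[0]
    label_zero_missing = False
except KeyError:
    label_zero_missing = True

print(len(by_label), len(by_position), label_zero_missing)
```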
Getting values with a MultiIndex
A number of examples using a DataFrame with a MultiIndex.
# Create a DataFrame with a MultiIndex
tuples = [
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii')
]
index = pd.MultiIndex.from_tuples(tuples)
values = [[12, 2], [0, 4], [10, 20],
          [1, 4], [7, 1], [16, 36]]
df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index)
df
                     max_speed  shield
cobra      mark i           12       2
           mark ii           0       4
sidewinder mark i           10      20
           mark ii           1       4
viper      mark ii           7       1
           mark iii         16      36
Single label. Note this returns a DataFrame with a single index.
df.loc['cobra']
         max_speed  shield
mark i          12       2
mark ii          0       4
Single index tuple. Note this returns a Series.
df.loc[('cobra', 'mark ii')]
max_speed    0
shield       4
Name: (cobra, mark ii), dtype: int64
Single label for each of the two index levels. Similar to passing in a tuple, this returns a Series.
df.loc['cobra', 'mark i']
max_speed    12
shield        2
Name: (cobra, mark i), dtype: int64
Single tuple. Note using [[]] returns a DataFrame.
df.loc[[('cobra', 'mark ii')]]
               max_speed  shield
cobra mark ii          0       4
Single tuple for the index with a single label for the column.
df.loc[('cobra', 'mark i'), 'shield']
2
Slice from index tuple to single label.
df.loc[('cobra', 'mark i'):'viper']
                     max_speed  shield
cobra      mark i           12       2
           mark ii           0       4
sidewinder mark i           10      20
           mark ii           1       4
viper      mark ii           7       1
           mark iii         16      36
Slice from index tuple to index tuple.
df.loc[('cobra', 'mark i'):('viper', 'mark ii')]
                     max_speed  shield
cobra      mark i           12       2
           mark ii           0       4
sidewinder mark i           10      20
           mark ii           1       4
viper      mark ii           7       1
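The two slice forms differ only in where they stop. This is a small sketch that checks the difference programmatically on the same MultiIndex data; the names `to_label` and `to_tuple` are illustrative.

```python
import pandas as pd

tuples = [
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii')
]
index = pd.MultiIndex.from_tuples(tuples)
values = [[12, 2], [0, 4], [10, 20],
          [1, 4], [7, 1], [16, 36]]
df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index)

# Stopping at the bare label 'viper' keeps every row under 'viper'.
to_label = df.loc[('cobra', 'mark i'):'viper']

# Stopping at a full tuple ends at that exact row, which is still included.
to_tuple = df.loc[('cobra', 'mark i'):('viper', 'mark ii')]

print(len(to_label), len(to_tuple), to_tuple.index[-1])
```

Label slicing on a MultiIndex like this relies on the index being sorted, which holds for the tuples used here.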
DataFrame.iloc study notes
Purely integer-location based indexing for selection by position.
.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.
Allowed inputs are:
An integer, e.g. 5.
A list or array of integers, e.g. [4, 3, 0].
A slice object with ints, e.g. 1:7.
A boolean array.
A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing (one of the above). This is useful in method chains, when you don't have a reference to the calling object, but would like to base your selection on some value.
.iloc will raise IndexError if a requested indexer is out-of-bounds, except slice indexers, which allow out-of-bounds indexing (this conforms with Python/NumPy slice semantics).
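The out-of-bounds rule can be checked directly. This is a minimal sketch, assuming a hypothetical three-row DataFrame; the positions 5 and 10 are arbitrary values chosen to fall outside it.

```python
import pandas as pd

# Hypothetical three-row frame; valid positions are 0, 1 and 2.
df = pd.DataFrame({'a': [1, 100, 1000]})

# A single out-of-bounds position raises IndexError.
try:
    df.iloc[5]
    raised = False
except IndexError:
    raised = True

# Slices are clipped to the available range instead of raising.
clipped = df.iloc[1:10]   # rows at positions 1 and 2
empty = df.iloc[5:10]     # no rows at all

print(raised, len(clipped), len(empty))
```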
Create an example DataFrame:
mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
          {'a': 100, 'b': 200, 'c': 300, 'd': 400},
          {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000}]
df = pd.DataFrame(mydict)
df
      a     b     c     d
0     1     2     3     4
1   100   200   300   400
2  1000  2000  3000  4000
Indexing just the rows
With a scalar integer.
>>> type(df.iloc[0])
<class 'pandas.core.series.Series'>  # row 0 is returned as a Series
>>> df.iloc[0]
a    1
b    2
c    3
d    4
Name: 0, dtype: int64
With a list of integers.
df.iloc[[0]]
   a  b  c  d
0  1  2  3  4
type(df.iloc[[0]])
<class 'pandas.core.frame.DataFrame'>  # row 0 is returned as a DataFrame
# Return rows 0 and 1
df.iloc[[0, 1]]
     a    b    c    d
0    1    2    3    4
1  100  200  300  400
With a slice object.
# Return rows 0 through 2
df.iloc[:3]
      a     b     c     d
0     1     2     3     4
1   100   200   300   400
2  1000  2000  3000  4000
With a boolean list of the same length as the row index.
df.iloc[[True, False, True]]
      a     b     c     d
0     1     2     3     4
2  1000  2000  3000  4000
With a callable, useful in method chains. The x passed to the lambda is the DataFrame being sliced. This selects the rows whose index label is even.
# Return the rows with even index labels
df.iloc[lambda x: x.index % 2 == 0]
      a     b     c     d
0     1     2     3     4
2  1000  2000  3000  4000
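To illustrate why a callable helps in a method chain, the sketch below sorts the example DataFrame first and then filters the intermediate result, which has no variable name of its own. The use of `sort_values` as the preceding step is an assumption made for illustration, not part of the original notes.

```python
import pandas as pd

mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
          {'a': 100, 'b': 200, 'c': 300, 'd': 400},
          {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000}]
df = pd.DataFrame(mydict)

# The lambda receives the sorted intermediate frame, not the original df.
result = (
    df.sort_values('a', ascending=False)
      .iloc[lambda x: x.index % 2 == 0]
)

print(result.index.tolist())
```

The rows keep the sorted order, so the row labelled 2 comes before the row labelled 0 in `result`.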
Indexing both axes
You can mix the indexer types for the index and columns. Use : to select the entire axis.
With scalar integers.
df.iloc[0, 1]
2
With lists of integers.
df.iloc[[0, 2], [1, 3]]
      b     d
0     2     4
2  2000  4000
With slice objects.
df.iloc[1:3, 0:3]
      a     b     c
1   100   200   300
2  1000  2000  3000
With a boolean list whose length matches the columns.
df.iloc[:, [True, False, True, False]]
      a     c
0     1     3
1   100   300
2  1000  3000
With a callable function that expects the Series or DataFrame.
df.iloc[:, lambda df: [0, 2]]
      a     c
0     1     3
1   100   300
2  1000  3000
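Each positional selection on both axes has a label-based counterpart. This is a short sketch comparing the two on the example DataFrame, where the row labels happen to equal the positions; the variable names are illustrative.

```python
import pandas as pd

mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
          {'a': 100, 'b': 200, 'c': 300, 'd': 400},
          {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000}]
df = pd.DataFrame(mydict)

# Positional slices exclude the stop; label slices include it,
# so iloc[1:3, 0:3] corresponds to loc[1:2, 'a':'c'].
same_slice = df.iloc[1:3, 0:3].equals(df.loc[1:2, 'a':'c'])

# Lists of positions map one-to-one onto lists of labels.
same_list = df.iloc[[0, 2], [1, 3]].equals(df.loc[[0, 2], ['b', 'd']])

print(same_slice, same_list)
```

The stop values differ between the two slices because .iloc follows Python slice semantics while .loc includes both endpoints.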