1. Read the first 1000 rows of an Excel file:
    df = pd.read_excel('path', nrows=1000)
    df.head(10), df.tail(10)
2. Select certain rows:
    df.iloc[1], df.iloc[1:9], df.iloc[:10], df.loc['indexA'], df.loc['indexA':'indexC']
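A small sketch of position-based versus label-based selection. The frame and its labels ('indexA' to 'indexD') are made up for illustration. The point to notice is that an iloc slice excludes its end position, while a loc slice includes its end label.

```python
import pandas as pd

# Hypothetical frame with string index labels, for illustration only
df = pd.DataFrame(
    {'colA': [10, 20, 30, 40]},
    index=['indexA', 'indexB', 'indexC', 'indexD'],
)

by_position = df.iloc[1:3]            # positions 1 and 2; the end position is excluded
by_label = df.loc['indexA':'indexC']  # indexA through indexC; the end label is included

print(len(by_position))  # 2
print(len(by_label))     # 3
```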
3. Select certain columns:
    df['colA'], df.colA, df[['colA', 'colB', 'colC']]
4. Select the rows that match a condition:
    df[df['colA'] > 10]
    df[(df['colA'] > 10) & (df['colB'] == 'test')]
    df.query('(colA > 10) & (colB == "test")')
    df.where((df['colA'] > 10) & (df['colB'] == 'test'))  # keeps every row; non-matching rows become NaN
    isin ---> df[df['colA'].isin(['A', 'B'])]
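The filtering styles above can be compared side by side on a toy frame (the data here is invented). Note that df.where expects a boolean condition rather than a query string, and that it masks values instead of dropping rows.

```python
import pandas as pd

# Invented sample data
df = pd.DataFrame({
    'colA': [5, 15, 25, 8],
    'colB': ['test', 'test', 'other', 'test'],
})

mask = (df['colA'] > 10) & (df['colB'] == 'test')

filtered = df[mask]                                   # only the matching rows
queried = df.query('(colA > 10) & (colB == "test")')  # same rows, condition given as a string
masked = df.where(mask)                               # same shape as df, NaN where mask is False
members = df[df['colA'].isin([5, 8])]                 # rows whose colA is one of the listed values

print(filtered)
print(masked)
```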
5. Select certain columns of the rows that match a condition:
    df.loc[(df['colA'] > 10) & (df['colB'] == 'test'), ['colC', 'colD', 'colE']]
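A minimal sketch of combined row and column selection, assuming a made-up frame. df.loc is indexed with square brackets: the first argument is a boolean mask for the rows, the second a list of column names.

```python
import pandas as pd

# Invented sample data
df = pd.DataFrame({
    'colA': [5, 15, 25],
    'colB': ['test', 'test', 'other'],
    'colC': [1.0, 2.0, 3.0],
    'colD': ['x', 'y', 'z'],
})

# Rows where both conditions hold, restricted to colC and colD
subset = df.loc[(df['colA'] > 10) & (df['colB'] == 'test'), ['colC', 'colD']]
print(subset)
```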
6. Drop columns:
    df.drop(columns=['colA', 'colB'])
7. Add a column:
    df['new_col'] = [1, 2, 3, 4, 5]  # a list needs one value per row
    df['new_col'] = df['colA'] * df['colB']
    df['new_col'] = df['colA'].apply(lambda x: x ** 2)
    df['new_col'] = df['colA'].apply(lambda x: str(x) + '_' + str(x))
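The ways of adding a column can be tried on a small invented frame. The column names below (from_list, product, squared, label, label_vec) are illustrative only. The last line shows a vectorized alternative to apply for string concatenation.

```python
import pandas as pd

# Invented sample data
df = pd.DataFrame({'colA': [1, 2, 3], 'colB': [10, 20, 30]})

df['from_list'] = [7, 8, 9]                          # one value per row
df['product'] = df['colA'] * df['colB']              # element-wise arithmetic
df['squared'] = df['colA'].apply(lambda x: x ** 2)   # per-element function
df['label'] = df['colA'].apply(lambda x: str(x) + '_' + str(x))
# Vectorized alternative to the apply above
df['label_vec'] = df['colA'].astype(str) + '_' + df['colA'].astype(str)

print(df)
```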
8. List the distinct values in a column:
    df['actual_weight'].unique()
9. Count the distinct values in a column:
    df['actual_weight'].nunique()
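10. Count how many times each value occurs in a column (equivalent to a groupby count):
    df['actual_weight'].value_counts()
    idxmax: index label of the column's maximum value: df['colA'].idxmax()
    idxmin: index label of the column's minimum value: df['colA'].idxmin()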
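A short sketch covering items 8 to 10 on invented weights. The string index is used on purpose to show that idxmax and idxmin return index labels, not integer positions.

```python
import pandas as pd

# Invented sample data with a string index
df = pd.DataFrame(
    {'actual_weight': [2.5, 1.0, 2.5, 4.0, 2.5]},
    index=['a', 'b', 'c', 'd', 'e'],
)

distinct = df['actual_weight'].unique()      # distinct values, in order of first appearance
n_distinct = df['actual_weight'].nunique()   # number of distinct non-missing values
counts = df['actual_weight'].value_counts()  # occurrences per value, most frequent first
heaviest = df['actual_weight'].idxmax()      # index label of the largest value
lightest = df['actual_weight'].idxmin()      # index label of the smallest value

print(counts)
print(heaviest, lightest)  # d b
```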
11. Sort by one or more columns:
    df.sort_values(by=['colA', 'colB', 'colC'])
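When sorting by several columns, a direction can be given per column through the ascending parameter. A minimal sketch on invented data:

```python
import pandas as pd

# Invented sample data
df = pd.DataFrame({'colA': [2, 1, 2, 1], 'colB': [1, 2, 3, 4]})

# colA ascending, ties broken by colB descending
ordered = df.sort_values(by=['colA', 'colB'], ascending=[True, False])
print(ordered)
```

sort_values returns a new frame and leaves df unchanged.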
12. Count the non-missing values in a column:
    df['colA'].count()
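To see how count differs from the row count, here is a sketch on an invented column with two missing values:

```python
import pandas as pd

# Invented sample data with two missing values
df = pd.DataFrame({'colA': [1.0, float('nan'), 3.0, float('nan')]})

non_missing = df['colA'].count()   # NaN entries are skipped
total = len(df['colA'])            # every row is counted
missing = df['colA'].isna().sum()  # number of NaN entries

print(non_missing, total, missing)  # 2 4 2
```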
13. Using cut:
    c = pd.DataFrame({'math': [21, 39, 20, 11, 98, 72]})
    bins = [0, 20, 40, 80, 90, 100]
    c['cuts'] = pd.cut(c['math'], bins)  # intervals are right-closed: (0, 20], (20, 40], ...
    c.groupby(by=['cuts'], observed=False).count()  # observed=False keeps empty bins
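The same scores can be binned with named labels instead of interval objects. The grade names below are hypothetical. Because the bins are right-closed by default, a score of 20 falls in the first bin and 21 in the second.

```python
import pandas as pd

c = pd.DataFrame({'math': [21, 39, 20, 11, 98, 72]})
bins = [0, 20, 40, 80, 90, 100]
labels = ['E', 'D', 'C', 'B', 'A']  # hypothetical grade names, one per bin

c['grade'] = pd.cut(c['math'], bins, labels=labels)

# observed=False keeps the empty bin (80, 90] in the result with a count of 0
per_grade = c.groupby('grade', observed=False)['math'].count()
print(per_grade)
```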
14. Join two DataFrames (aligned on the index):
    df1[['id', 'poi_id']].join(df2['process_date'])
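join aligns rows on the index and performs a left join by default, so rows of the left frame without a match get NaN. A sketch with invented data, where df2 deliberately lacks index 1:

```python
import pandas as pd

# Invented sample data; df2 has no row for index 1
df1 = pd.DataFrame({'id': [1, 2, 3], 'poi_id': [101, 102, 103]})
df2 = pd.DataFrame({'process_date': ['2020-01-01', '2020-01-02']}, index=[0, 2])

joined = df1[['id', 'poi_id']].join(df2['process_date'])  # left join on the index
print(joined)
```

To join on a column instead of the index, pd.merge with its on parameter is the usual choice.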
15. Concatenate two DataFrames (stack the rows):
    pd.concat([df1, df2])  # df1.append(df2) was removed in pandas 2.0
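A minimal sketch of stacking rows with pd.concat, used here because DataFrame.append is no longer available in pandas 2.0 and later. The frames are invented. By default the original index values are kept, so ignore_index=True is useful when a fresh index is wanted.

```python
import pandas as pd

# Invented sample data
df1 = pd.DataFrame({'id': [1, 2]})
df2 = pd.DataFrame({'id': [3, 4]})

stacked = pd.concat([df1, df2])                         # keeps the original index: 0, 1, 0, 1
renumbered = pd.concat([df1, df2], ignore_index=True)   # fresh index: 0, 1, 2, 3

print(stacked)
print(renumbered)
```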