1. Merging data with pandas concat, appending a row (https://blog.csdn.net/weixin_47661174/article/details/124698328)
pd.concat([df1,df2])
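A minimal sketch of appending one row with concat. The frame contents and the names `new_row` and `combined` are made up for illustration; the only assumption is that the new row has the same columns as the existing frame.

```python
import pandas as pd

df1 = pd.DataFrame({"name": ["apple", "banana"], "price": [3, 2]})

# Wrap the new row in a one-row DataFrame with the same columns.
new_row = pd.DataFrame({"name": ["cherry"], "price": [9]})

# ignore_index=True renumbers the result 0..n-1 instead of repeating index 0.
combined = pd.concat([df1, new_row], ignore_index=True)
print(combined)
```

Without ignore_index=True the appended row would keep its own index label 0, which leaves a duplicate label in the result.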
2. Converting between Series, DataFrame (pandas), and ndarray (numpy) (https://blog.csdn.net/qq_36743482/article/details/114678409)
ndarray => Series
npa = np.arange(12)
ser = pd.Series(npa)
Series => ndarray
npa_s = np.array(ser)
ndarray => DataFrame
npa2 = npa.reshape(3, -1)
df = pd.DataFrame(npa2)
DataFrame => ndarray
npa_d = np.array(df)
npa_v = df.values # npa_d and npa_v are identical
DataFrame -> Series
type(df[0]) # pandas.core.series.Series
Series -> DataFrame
pd.DataFrame(ser)
3. Two ways to convert a Series to a DataFrame in Python (https://zhuanlan.zhihu.com/p/469512251)
pd.DataFrame([j.to_dict()]) # a Series has conversion methods such as to_frame and to_dict
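The two conversions give differently shaped results, which the following sketch tries to make visible. The Series `s` and its values are illustrative assumptions.

```python
import pandas as pd

s = pd.Series({"name": "apple", "price": 3})

# to_frame keeps the Series index as row labels: one column, two rows.
as_column = s.to_frame(name="value")

# Going through a dict turns the index labels into column names: one row.
as_row = pd.DataFrame([s.to_dict()])

print(as_column.shape)
print(as_row.shape)
```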
4. Reading specific rows with pandas (https://blog.csdn.net/weixin_39025679/article/details/109216669)
https://blog.csdn.net/bianxia123456/article/details/111396760
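df.loc[0:m] # loc is a DataFrame indexer (numpy has no loc); the label slice includes row m
python.pandas.DataFrame initialization, writing from a dict, writing by slice, and saving to CSV: a collection of issues (https://zhuanlan.zhihu.com/p/489099818)
df3.loc[['No.1','No.3'],['name','color']] # '[]' selects the specific rows and columns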
name color
No.1 apple red
No.3 watermelon green
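The selection above can be reproduced with a small frame. This is a sketch only: the 'No.2' row and the price column are invented so that the selection has something to leave out.

```python
import pandas as pd

df3 = pd.DataFrame(
    {
        "name": ["apple", "banana", "watermelon"],
        "color": ["red", "yellow", "green"],
        "price": [3, 2, 8],
    },
    index=["No.1", "No.2", "No.3"],
)

# A list of row labels and a list of column labels selects exactly that grid.
subset = df3.loc[["No.1", "No.3"], ["name", "color"]]
print(subset)
```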
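5. pandas: locating a row, selecting columns, and summing a column
for i in range(len(all_data)):
    # print(all_data['飛靶號'][0])
    # print(all_data[i])
    if all_data['飛靶號'].iloc[i] == '退電品':  # check row i (the original always checked row 0)
        print(i)
        print(all_data.iloc[i])
        # all_data = np.delete(all_data,i,axis=0)
        exit(1)
X=all_data[['溫度','PH','L','A','B']]  # feature columns ('溫度' is temperature)
y=all_data[['二次染時']]  # target column (second dyeing time)
X.head()
x_train[['溫度']].apply(lambda x:x.sum())  # sum of the temperature column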
6. Initializing an empty numpy array with dtype object (https://www.cnpython.com/qa/1341898)
a = np.empty((12,), dtype=object)
7. Python: adding a row or a column to a numpy array; inserting, deleting, querying, and updating numpy arrays (https://blog.csdn.net/qq_40765537/article/details/105869910)
import numpy as np
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = np.array([2,5,8])
print(np.r_[a,[b]])
Output:
[[1 2 3]
[4 5 6]
[7 8 9]
[2 5 8]]
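For comparison, a short sketch that adds the same vector once as a row and once as a column. The use of np.c_, np.vstack, and np.column_stack here is an addition to the note, not something taken from the linked article.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([2, 5, 8])

with_row = np.r_[a, [b]]   # b becomes a fourth row: shape (4, 3)
with_col = np.c_[a, b]     # b becomes a fourth column: shape (3, 4)

# Function-call equivalents of the two index tricks above.
same_row = np.vstack([a, b])
same_col = np.column_stack([a, b])

print(with_row.shape, with_col.shape)
```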
8. [Python data processing] Writing a DataFrame to Excel with pandas (https://blog.csdn.net/chengyikang20/article/details/90139384)
When saving data generated in PyCharm to an Excel file with to_excel from pandas, this error is raised: numpy.ndarray object has no attribute to_excel (https://blog.csdn.net/m0_67870771/article/details/124603745)
import pandas as pd
file_path = 'E:/data/2.xlsx' # destination path, file name, and file type
df = pd.DataFrame(data) # wrap the ndarray in a DataFrame first; an ndarray has no to_excel
df.to_excel(file_path)
9. Summing numpy columns in Python (https://www.csdn.net/tags/MtTaAgxsMDQ2MjQtYmxvZwO0O0OO0O0O.html)
x.sum(axis=0)
10. How to extract a row or a column of a matrix with numpy in Python (https://www.yisu.com/zixun/179241.html)
A row of the matrix
a[1]
Out[32]: array([3, 4, 5])
A column of the matrix
a[:,1]
Out[33]: array([1, 4, 7])
11. Selecting specific rows and columns in numpy (https://blog.csdn.net/goodxin_ie/article/details/109659893)
x[[0,1]][:,[0,3]]
Out[31]:
array([[0, 3],
[4, 7]])
x_test = np.empty((1, 15), dtype=object) # test dataset
test_feibahao = x_test[:, 0] # take the flybar-number (飛靶號) column of the test set
12. Deleting rows in Numpy (multiple rows) (https://blog.csdn.net/God_WZH/article/details/122575683)
https://blog.csdn.net/A_JI_97/article/details/116235753
Deleting rows:
x1 = np.delete(x, 0, axis=0)
y1 = np.delete(y, 1, axis=0)
print(x1)
print(y1)
13. Swapping rows and columns in numpy (https://blog.csdn.net/m0_37294838/article/details/102743533)
14. How to easily export a numpy array (matrix) from Python to Excel? (https://www.cnpython.com/qa/1678585)
import numpy
numpy.savetxt(r'your\location\yourfile.csv', numpy_array, delimiter=',')  # raw string so the backslashes are not read as escapes
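15. Merging two numpy matrices in Python (http://t.zoukankan.com/itdyb-p-5735911.html)
We randomly generated two matrices a and b; the merge operations follow:
hstack() merges along each row (the arrays are placed side by side)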
np.hstack((a,b))
array([[ 8., 5., 1., 9.],
[ 1., 6., 8., 5.]])
vstack() merges along each column (the arrays are stacked vertically)
np.vstack((a,b))
array([[ 8., 5.],
[ 1., 6.],
[ 1., 9.],
[ 8., 5.]])
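A small sketch of the shape change, using values that reproduce the two outputs above. The concrete a and b are inferred from those outputs rather than stated in the note.

```python
import numpy as np

a = np.array([[8.0, 5.0], [1.0, 6.0]])
b = np.array([[1.0, 9.0], [8.0, 5.0]])

side_by_side = np.hstack((a, b))   # columns are added: shape (2, 4)
stacked = np.vstack((a, b))        # rows are added: shape (4, 2)

# Both are special cases of np.concatenate with an explicit axis.
assert (side_by_side == np.concatenate((a, b), axis=1)).all()
assert (stacked == np.concatenate((a, b), axis=0)).all()

print(side_by_side.shape, stacked.shape)
```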
16. Python tutorial: initializing a numpy array to the same value (https://blog.csdn.net/sinat_38682860/article/details/111314885)
import numpy as np
a = np.ones((4,4)) * 10
[[10. 10. 10. 10.]
[10. 10. 10. 10.]
[10. 10. 10. 10.]
[10. 10. 10. 10.]]
17. Data analysis basics: comparing, filtering, and deduplicating numpy array data (https://blog.csdn.net/ayouleyang/article/details/103757741)
18. [Numpy] Computing the mean, median, and mode with Numpy (https://blog.csdn.net/u013066730/article/details/108844068)
import numpy as np
Mean
np.mean(nums)
Median
np.median(nums)
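Mode
from scipy import stats
stats.mode(nums)[0][0]  # works on older SciPy; from SciPy 1.11 a 1-D input returns a scalar, so use stats.mode(nums).mode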
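If SciPy is not available, the mode can also be found with numpy alone. This is a sketch of one alternative, not the method from the linked article; the sample values in `nums` are made up.

```python
import numpy as np

nums = np.array([1, 2, 2, 3, 3, 3, 4])

mean = np.mean(nums)
median = np.median(nums)

# np.unique returns the sorted distinct values and how often each occurs;
# the mode is the value with the highest count (the smallest one on a tie).
values, counts = np.unique(nums, return_counts=True)
mode = values[np.argmax(counts)]

print(mean, median, mode)
```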
19. Reading values from same-named columns of an Excel sheet with Python (https://blog.csdn.net/qq_41821067/article/details/121798607)
import pandas as pd
df = pd.read_excel('test1.xls',header=0)  # the Excel file sits in the same folder as the .py file
result = []
for s_li in df.columns:
    # print the column name
    print(s_li)
    if 'I' in str(s_li):
        result.append(df[s_li])
print(result)
pd.DataFrame(result).to_excel(r'F:\python_project\result.xls')  # output path
20.
import statistics
l = ['溫度','PH','DL1','DA1','DB1','DL2','一次染時']
l = ['wendu','ph','dl1','da1','db1','dl2','yici']
print(pandas_data.shape)
for i in l:
    print(i+' variance: %f' % np.var(pandas_data[i]),i+' standard deviation: %f' % np.std(pandas_data[i]),i+' max: %f' % np.max(pandas_data[i]),
          i+' min: %f' % np.min(pandas_data[i]),i+' mean: %f' % np.mean(pandas_data[i]),i+' median: %f' % np.median(pandas_data[i]))
    print(i+' mode:', statistics.mode(pandas_data[i]))
    print()
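21. DataFrame: selecting one row or column, or N rows and N columns (https://blog.csdn.net/qq_42140717/article/details/124350979)
Select a row by a known index label:
df.loc[a]
Select a row by position when the index label is unknown:
df[1:2]  # the lower bound is inclusive, so the second row is [1:2]
Select N rows by position:
df[0:10]
Select a column by name:
df['name']
Select a column by position when its name is unknown:
df.iloc[:,2]
Select N columns by name:
df[['name','name2']]
Select N rows of a named column:
df['name'][0:4]
Select N rows and M columns by position:
df.iloc[0:N,0:M]
iloc selects purely by integer position. loc selects by index labels and column names. If the index contains duplicate labels, loc returns every matching row, and a label slice can raise a KeyError when the duplicated index is not sorted. Without loc or iloc, df['name'][0:4] picks the named column first and then the rows by position.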
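The difference between label-based and position-based selection is easiest to see on an index that does not start at 0. The frames below are illustrative assumptions, not data from the note.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["a", "b", "c"], "score": [1, 2, 3]},
    index=[10, 20, 30],
)

row_by_label = df.loc[10]        # the row whose index label is 10
row_by_position = df.iloc[0]     # the first row, whatever its label
label_slice = df.loc[10:30]      # both ends inclusive: 3 rows
position_slice = df.iloc[0:2]    # end exclusive: 2 rows

# With a duplicated label, loc returns every matching row.
dup = pd.DataFrame({"score": [1, 2, 3]}, index=["x", "x", "y"])
matches = dup.loc["x"]

print(len(label_slice), len(position_slice), matches.shape)
```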
22. # How to select specified rows and columns of a numpy array
https://blog.csdn.net/goodxin_ie/article/details/109659893
b = a[c]  # first take the desired rows
b = b[:,d]  # then take the desired columns
print(b)
x[[0,1]][:,[0,3]]
Out[31]:
array([[0, 3],
[4, 7]])
23. Concatenating and merging numpy arrays in Python (https://blog.csdn.net/qq_39516859/article/details/80666070)
Horizontal stacking
np.hstack((a,b))
array([[ 0, 1, 2, 0, 2, 4],
[ 3, 4, 5, 6, 8, 10],
[ 6, 7, 8, 12, 14, 16]])
np.concatenate((a,b),axis=1)
array([[ 0, 1, 2, 0, 2, 4],
[ 3, 4, 5, 6, 8, 10],
[ 6, 7, 8, 12, 14, 16]])
data = pd.read_csv(f, low_memory=False)
25. Several ways to read a CSV file in Python (with examples) (https://blog.csdn.net/qq_43160348/article/details/124331781)
import pandas as pd
df = pd.read_csv('../data_pro/audito_whole.csv')
print(df)
26. [Python] Filtering rows that do or do not contain null values (https://blog.csdn.net/qq_40264559/article/details/124508563)
test = test[test['性別'].notna()] # drop the rows where the 性別 (gender) column is null
test
27. Pandas: creating an empty DataFrame and adding rows and columns to it (https://blog.csdn.net/qq_53817374/article/details/123771713)
import pandas as pd
df = pd.DataFrame(data=None,columns=['時間','車牌','北緯','東經'])  # time, license plate, north latitude, east longitude
df
Concatenation (pandas.concat usage explained) (https://cloud.tencent.com/developer/news/372041)
pd.concat([df1,df2,df3]) defaults to axis=0, so the frames are concatenated along axis 0 (rows are stacked).
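A brief sketch contrasting the default axis with axis=1. The two single-column frames are assumptions chosen to keep the shapes easy to read.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2]})
df2 = pd.DataFrame({"a": [3, 4]})

rows = pd.concat([df1, df2])           # axis=0: rows are stacked, index is 0, 1, 0, 1
cols = pd.concat([df1, df2], axis=1)   # axis=1: frames are aligned on the index, side by side

print(rows.shape, cols.shape)
```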
28. # [Python notes] Reading each row of data with Pandas
for indexs in data.index:
    print(data.loc[indexs].values[0:-1])
29. ## pandas error handling: A value is trying to be set on a copy of a slice from a DataFrame
quchong = df_all.drop_duplicates(subset='虛擬飛靶號')
print(quchong.shape)
quchong.insert(loc=6, column='hour', value='')
new_data = quchong.copy()
for i in range(quchong.shape[0]):
    # write through a single .iloc call; chained indexing such as new_data['hour'].iloc[i] = ... triggers the warning
    new_data.iloc[i, new_data.columns.get_loc('hour')] = int(quchong['一次化拋進槽時間'].iloc[i][11:13])
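The same hour extraction can usually be done without a loop, which avoids chained assignment altogether. This is a sketch under assumptions: the column names `id` and `start_time` are hypothetical stand-ins for the real ones, and the timestamps are assumed to be strings shaped like 'YYYY-MM-DD HH:MM:SS'.

```python
import pandas as pd

df_all = pd.DataFrame(
    {
        "id": ["A", "A", "B"],
        "start_time": [
            "2022-08-01 09:15:00",
            "2022-08-01 09:15:00",
            "2022-08-01 14:30:00",
        ],
    }
)

# An explicit copy makes later writes unambiguous, so pandas has no reason to warn.
dedup = df_all.drop_duplicates(subset="id").copy()

# Characters 11 and 12 of the timestamp string hold the hour.
dedup["hour"] = dedup["start_time"].str[11:13].astype(int)
print(dedup)
```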
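30. Five common ways to add a new column in pandas (https://www.jb51.net/article/251192.htm)
df.insert(loc=2, column='c', value=3) # insert column c, filled with 3, at position 2 (after the last column of a two-column df)
print('After inserting column c:\n', df)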
31. Deduplicating data in Python (pandas) (https://blog.csdn.net/qq_39012566/article/details/98633780)
1. Deduplicate on whole rows.
DataFrame.drop_duplicates()
2. Deduplicate on one particular column.
DataFrame.drop_duplicates(subset='column_name')
32. pandas code for extracting (selecting) rows with multiple conditions using AND, OR, and NOT (https://blog.csdn.net/qq_18351157/article/details/105403779)
print((df['age'] < 35) & ~(df['state'] == 'NY'))
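The boolean Series printed above can be used directly as a row filter. A minimal sketch with invented data follows; the operators are & for AND, | for OR, and ~ for NOT, and each condition needs its own parentheses.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["Ann", "Bob", "Cid", "Dee"],
        "age": [24, 42, 30, 68],
        "state": ["NY", "CA", "CA", "NY"],
    }
)

# AND combined with NOT: younger than 35 and not in NY.
mask = (df["age"] < 35) & ~(df["state"] == "NY")
selected = df[mask]

# OR: younger than 35, or in NY.
either = df[(df["age"] < 35) | (df["state"] == "NY")]

print(selected)
print(either)
```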
33. Sorting data in Pandas (https://blog.csdn.net/weixin_47661174/article/details/124697231)
df.sort_values(by="aqi")
34. Setting different legends (https://blog.csdn.net/qq_44039983/article/details/123510020)
plt.legend(['line1', 'line2'])
35. Implementing in and not in with pandas: usage and notes (https://blog.csdn.net/weixin_43064185/article/details/91374033)
IN
something.isin(somewhere)
NOT IN
~something.isin(somewhere)
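A concrete sketch of the two patterns, where `something` is a DataFrame column and `somewhere` is a plain list. The city names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"city": ["Paris", "Lima", "Oslo", "Cairo"]})
wanted = ["Lima", "Oslo"]

in_rows = df[df["city"].isin(wanted)]        # IN
not_in_rows = df[~df["city"].isin(wanted)]   # NOT IN

print(in_rows)
print(not_in_rows)
```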
36. pandas: groupby() for group means, maxima, and so on (https://blog.csdn.net/DoyWang/article/details/109137700)
df.groupby('group_column')['value_column'].mean()
a = sort_new_data.groupby('虛擬飛靶號')[['GLOSS_P1','GLOSS_P3']].mean()
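A small sketch of the grouped mean over two value columns. The GLOSS column names follow the snippet above, while the grouping column `batch` and all values are assumptions.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "batch": ["x", "x", "y"],
        "GLOSS_P1": [1.0, 3.0, 5.0],
        "GLOSS_P3": [2.0, 4.0, 6.0],
    }
)

# One row per group; the group key becomes the index of the result.
means = df.groupby("batch")[["GLOSS_P1", "GLOSS_P3"]].mean()
print(means)
```

Calling .reset_index() on the result turns the group key back into an ordinary column.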
37. pandas join explained (https://blog.csdn.net/bitcarmanlee/article/details/113311113)
import pandas as pd
def joindemo():
    age_df = pd.DataFrame({'name': ['lili', 'lucy', 'tracy', 'mike'],
                           'age': [18, 28, 24, 36]})
    score_df = pd.DataFrame({'name': ['tony', 'mike', 'akuda', 'tracy'],
                             'score': ['A', 'B', 'C', 'B']})
    # join matches on= against the other frame's index, so index score_df by name first
    result = age_df.join(score_df.set_index('name'), on='name')
    print(result)
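When both frames carry the key as an ordinary column, merge is often the more direct tool. The sketch below reuses the two frames from the note; the choice of merge and the `how` values are an addition, not part of the linked article.

```python
import pandas as pd

age_df = pd.DataFrame({"name": ["lili", "lucy", "tracy", "mike"],
                       "age": [18, 28, 24, 36]})
score_df = pd.DataFrame({"name": ["tony", "mike", "akuda", "tracy"],
                         "score": ["A", "B", "C", "B"]})

# Left merge: keep every row of age_df; names without a score get NaN.
left = age_df.merge(score_df, on="name", how="left")

# Inner merge: keep only names present in both frames.
inner = age_df.merge(score_df, on="name", how="inner")

print(left)
print(inner)
```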
38. Finding the maximum and minimum of each column in pandas (https://blog.csdn.net/Mtf007/article/details/108909604)
df.min()
Returns the minimum of each column
df.max()
Returns the maximum of each column
39. Writing to different sheets with Pandas to_excel() without overwriting them (https://blog.csdn.net/shykevin/article/details/111244838)
with pd.ExcelWriter('789.xlsx') as writer:
    df1.to_excel(writer, sheet_name='Sheet1', index=False, header=True)
    df2.to_excel(writer, sheet_name='Sheet2', index=False, header=True)
    df3.to_excel(writer, sheet_name='Sheet3', index=False, header=True)
40. Inserting rows and columns at a given position in a DataFrame (https://blog.csdn.net/weixin_46599926/article/details/126164876)
Insert data as the first column
df.insert(0,"col0",[99,99])
41. A brief look at np.vstack() and np.hstack() in Numpy (https://blog.csdn.net/nanhuaibeian/article/details/100597342)
42. Inserting rows and columns at a given position in a DataFrame (https://blog.csdn.net/weixin_46599926/article/details/126164876)
Insert data as the first column
df.insert(0,"col0",[99,99])
43. pandas: deleting a row or column (https://blog.csdn.net/weixin_43914402/article/details/121077282)
test_data.drop(test_data[test_data['虛擬飛靶號'] == i].index)
44. Configuring the axes
plt.rcParams['font.sans-serif']=['SimHei'] # display Chinese labels
plt.rcParams['axes.unicode_minus']=False
fig = plt.figure(figsize=(35, 12))
plt.scatter(feiba, pr,c='r', alpha=0.5, marker='.') # predicted values
plt.scatter(feiba, y_test,c='b', alpha=0.5, marker='.') # actual values
plt.legend(['預測值','真實值']) # predicted, actual
plt.xticks(rotation=-90)
plt.xlabel('飛靶號') # flybar number
plt.ylabel('化拋時間') # chemical polishing time
45. Details of using df.isnull (https://blog.csdn.net/ningyanggege/article/details/80752299)
46. How to move a column's position in pandas (https://blog.csdn.net/Ghjkku/article/details/125021162)
https://blog.csdn.net/weixin_43848614/article/details/126315910
mid = df['采集時間'] # keep the values of the 采集時間 (collection time) column
df.pop('采集時間') # remove the column
df.insert(0, '采集時間', mid) # re-insert it as the first column
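Because pop returns the removed column, the three steps can be collapsed into one line. A sketch with a hypothetical column name `ts` standing in for the collection-time column:

```python
import pandas as pd

df = pd.DataFrame({"a": [1], "b": [2], "ts": [3]})

# pop removes 'ts' and returns it; insert then places it at position 0.
df.insert(0, "ts", df.pop("ts"))

print(list(df.columns))
```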
47. Python: renaming the columns of a Pandas DataFrame (https://blog.csdn.net/weixin_44556353/article/details/125295463)
Assign to the attribute directly to redefine all column names at once
data.columns = ['city','name','post','pay','request','number']
48. astype(): converting a DataFrame to string
pd_1 = pd_1[need_columns].astype('string')
49. Changing the order of columns in a DataFrame (https://blog.csdn.net/The_dream1/article/details/122688517)
order = ['date', 'time', 'open', 'high', 'low', 'close', 'volumefrom', 'volumeto']
df = df[order]
50. Python-Pandas: transposing a DataFrame (swapping rows and columns) (https://blog.csdn.net/shenyinwudi/article/details/118639251)
df_T = pd.DataFrame(df.values.T,columns=index_row,index=index_colums)
print(df_T)
51. Pandas data analysis 25: styling pandas DataFrames (https://blog.csdn.net/weixin_46277779/article/details/126344626)
df.head().style.highlight_null(null_color='blue')  # on pandas 2.0 and later the argument is named color
Highlight the maximum, yellow by default
df.head().style.highlight_max()
52. Python pandas dataframe: counting the elements of a column above or below a threshold (https://cloud.tencent.com/developer/ask/sof/356849)
import pandas as pd
df = pd.DataFrame({'c1': ['A', 'B','C','D','E'], 'c2': [3, 1, 0,2,5]})
count = (df['c2'] >= 3).sum()  # count the True values; .count().shape[0] would return the number of columns instead
print(count) # prints 2
53. Modifying a single value
jiaozheng_pd.at[i, '預測值'] = jiaozheng_pd.at[i, '一次化拋時間'] + 10
54. Differences between at, loc, and iloc on a pandas DataFrame
at accepts only a column name for the column, not an integer position
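For single cells, at is the label-based accessor and iat is its position-based counterpart; loc and iloc follow the same split but also accept slices and lists. A minimal sketch with invented labels:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2], "y": [3, 4]}, index=["r1", "r2"])

df.at["r1", "y"] = 30    # row label and column name
df.iat[1, 0] = 20        # row position and column position

by_label = df.loc["r1", "y"]   # same cell as df.at["r1", "y"]
by_position = df.iloc[1, 0]    # same cell as df.iat[1, 0]

print(df)
```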
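55. How to append elements to a numpy array in Python (https://m.py.cn/jishu/jichu/23441.html)
list_b = np.empty([0,3], dtype=int)
for i in range(10000):
    list_b = np.append(list_b, [[1,2,3]], axis=0)  # axis=0 appends a row; without axis the result is flattened to 1-D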
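np.append copies the whole array on every call, so for many rows it is usually cheaper to collect them in a Python list and convert once. This is a sketch of that alternative, not the approach from the linked article; the row contents are made up.

```python
import numpy as np

rows = []
for i in range(1000):
    rows.append([i, i + 1, i + 2])   # list.append does not copy the earlier rows

arr = np.array(rows)                 # a single conversion at the end
print(arr.shape)
```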
56. Adding a new column to a Python dataframe (https://blog.csdn.net/julyclj55555/article/details/122450287)
Specify the column name and assign the values:
data['addlist']=[1,2]
57胳蛮、