2019-06-26 Implementing VBA's Dictionary Functionality in Python

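When merging grade sheets, duplicate names are common because there are many classes. On top of that, some subject teachers like to sort the raw scores or compute averages and other simple statistics in place, and absent students are handled inconsistently, so the student lists differ from sheet to sheet and simply copying each subject's scores across is bound to go wrong. Deduplicating with the VBA Dictionary makes it easy to merge the subjects. For example, with class, name, and exam number as the key, both the duplicate-name problem and fully duplicated student records are resolved during the merge. Below I explore how to get the same behavior in Python.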
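To make the idea concrete, here is a minimal sketch of the dictionary approach using a plain Python dict. The sample rows and the helper name `merge_scores` are made up for illustration and are not taken from the real grade sheets.

```python
# Hypothetical per-subject records: (class, name, exam number, score).
chinese = [(1, "陳美", "1001", 90), (2, "陳美", "2001", 85)]
# The math list is in a different order and contains one fully duplicated row.
math = [(2, "陳美", "2001", 70), (1, "陳美", "1001", 95), (1, "陳美", "1001", 95)]


def merge_scores(subjects):
    """Merge {subject: rows} into {(class, name, exam number): {subject: score}}."""
    merged = {}
    for subject, rows in subjects.items():
        for cls, name, exam_no, score in rows:
            key = (cls, name, exam_no)
            # A repeated key overwrites the earlier entry, so duplicates collapse.
            merged.setdefault(key, {})[subject] = score
    return merged


merged = merge_scores({"語文": chinese, "數學": math})
for key, scores in merged.items():
    print(key, scores)
```

Because the key is a tuple, two students with the same name stay separate as long as their class or exam number differs, while an exact repeat of a row simply overwrites itself.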

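The header columns of the grade sheet are: '班級', '姓名', '考號', '考場', '座位號', '語文', '數學', '外語', '物理', '化學', '生物', '政治', '歷史', '地理', 'Unnamed: 14', 'Unnamed: 15' (class, name, exam number, exam room, seat number, Chinese, math, foreign language, physics, chemistry, biology, politics, history, geography, plus two unnamed columns).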

The grade sheet contains 969 students in total. One of them, 陳美, is a duplicate when class and name are used as the key, but not when class, name, and exam number are used as the key.
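The difference between the two keys can be checked with `DataFrame.duplicated`. The sketch below uses a three-row toy table; the exam numbers and the second student's name are invented for illustration.

```python
import pandas as pd

# Toy data (hypothetical): two rows share class and name but not exam number.
df = pd.DataFrame({
    "班級": [1, 1, 2],
    "姓名": ["陳美", "陳美", "王芳"],
    "考號": ["1001", "1002", "2001"],
})

# keep=False flags every member of a duplicate group, not only the repeats.
dup_by_class_name = df.duplicated(subset=["班級", "姓名"], keep=False)
dup_by_all_three = df.duplicated(subset=["班級", "姓名", "考號"], keep=False)

print(df[dup_by_class_name])   # rows that clash on class + name
print(dup_by_all_three.any())  # whether anything clashes on all three columns
```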

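First, use class and name as the key, implemented with pandas.

Step 1: set up the output file rs.xls and the log file test2.txt. The log records some of the program's output along the way, which helps with debugging.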

import numpy as np

import pandas as pd

import os

from os.path import exists

# Change the current working directory

os.chdir(r'D:\test\source2')

# Store the names of the files in the current directory in a list

file = os.listdir("./")

result = "rs.xls"

if exists(result):

    os.remove(result)

Read in the Excel file to be merged and write its header column names to test2.txt.

df_0 = pd.read_excel(file[0])

print(df_0.columns)

print('Header columns:\n',str(df_0.columns),file=open(r'D:\test\test2.txt', "a"))
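The `print(..., file=open(...))` pattern above opens the log file on every call and never closes it explicitly. A small helper is one way to tidy that up. This is only a sketch: `log` is a hypothetical name, and a temporary file stands in for the real log path.

```python
import os
import tempfile


def log(message, path):
    """Append one message to the log file and close the file afterwards."""
    with open(path, "a", encoding="utf-8") as fh:
        print(message, file=fh)


# A temporary file stands in for D:\test\test2.txt in this sketch.
log_path = os.path.join(tempfile.mkdtemp(), "test2.txt")
log("Header columns: ['班級', '姓名']", log_path)
log("Keys: ['班級', '姓名']", log_path)

with open(log_path, encoding="utf-8") as fh:
    lines = fh.read().splitlines()
print(lines)
```

Passing `encoding="utf-8"` explicitly keeps the Chinese column names readable regardless of the system's default encoding.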

Step 2: handle the key columns

First write a key reference to test2.txt, so that when the program asks for the key columns the answer can be copied and pasted.

print("Key reference: ['班級','姓名']\n",file=open(r'D:\test\test2.txt', "a"))

keyw=eval(input("Enter the key columns to merge on, e.g. for class + name enter ['班級','姓名']:"))

print('Keys:\n',str(keyw),file=open(r'D:\test\test2.txt', "a"))
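Note that `eval` runs whatever expression is typed in. A more defensive way to read the key list, shown here as a sketch with a hypothetical helper name `parse_keys`, is `ast.literal_eval`, which accepts only Python literals.

```python
import ast


def parse_keys(text):
    """Parse text such as "['班級','姓名']" into a list of column names."""
    # literal_eval raises ValueError or SyntaxError for anything that is not a literal.
    value = ast.literal_eval(text.strip())
    if isinstance(value, str):
        value = [value]  # allow a single quoted column name
    if not (isinstance(value, (list, tuple)) and all(isinstance(v, str) for v in value)):
        raise ValueError("expected a list of column names")
    return list(value)


print(parse_keys("['班級','姓名']"))
```

In the script this would be used as `keyw = parse_keys(input(...))` in place of the `eval` call.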

Step 3: build the dictionary data structure

data_dict2=df_0.set_index(keyw).T.to_dict('list')
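The one-liner above is compact but hides what the dictionary looks like. As far as I can tell, it maps each key tuple to the list of remaining column values, and a repeated key keeps the values of its last row (pandas may also warn that the columns are not unique). The loop below is a sketch that builds the same shape explicitly on invented toy data.

```python
import pandas as pd

# Toy data (hypothetical): 陳美 appears twice in class 1.
df = pd.DataFrame({
    "班級": [1, 1, 2],
    "姓名": ["陳美", "陳美", "王芳"],
    "語文": [90, 88, 75],
    "數學": [80, 82, 60],
})
keyw = ["班級", "姓名"]
value_cols = [c for c in df.columns if c not in keyw]

data_dict = {}
for _, row in df.iterrows():
    key = tuple(row[k] for k in keyw)
    # Assigning to an existing key overwrites it, so the last row wins.
    data_dict[key] = [row[c] for c in value_cols]

print(len(data_dict))
for key, values in data_dict.items():
    print(key, values)
```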

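Finally, write the result to an Excel sheet

If the dictionary is written out directly, the keys become the column names in the Excel sheet, so it has to be transposed first and the column names fixed afterwards.

1. Fix the column names

colf=df_0.columns.tolist()# convert the column names to a list

col_list = [item for item in colf if item not in keyw] + [item for item in keyw if item not in colf]# remove the key columns from the list of column names

2. Transpose and write out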

d1=pd.DataFrame(data_dict2)

d1=d1.T

d1.rename(columns=dict(enumerate(col_list)),inplace=True)

d1.to_excel(r'D:\test\rs.xls', index=True)
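For comparison, pandas can deduplicate on the key columns directly, without the round trip through a dictionary. This is a swapped-in technique rather than the method used above, shown on invented toy data; `keep="last"` is chosen to mirror the dictionary's last-row-wins behavior, although the row order of the result may differ.

```python
import pandas as pd

# Toy data (hypothetical): 陳美 appears twice in class 1.
df = pd.DataFrame({
    "班級": [1, 1, 2],
    "姓名": ["陳美", "陳美", "王芳"],
    "語文": [90, 88, 75],
    "數學": [80, 82, 60],
})
keyw = ["班級", "姓名"]

# Keep one row per key; the key columns stay ordinary columns.
deduped = df.drop_duplicates(subset=keyw, keep="last").reset_index(drop=True)
print(deduped)
```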

Full code

import numpy as np

import pandas as pd

import os

from os.path import exists

# Change the current working directory

os.chdir(r'D:\test\source2')

# Store the names of the files in the current directory in a list

file = os.listdir("./")

result = "rs.xls"

if exists(result):

    os.remove(result)

df_0 = pd.read_excel(file[0])

print(df_0.columns)

print('Header columns:\n',str(df_0.columns),file=open(r'D:\test\test2.txt', "a"))

print("Key reference: ['班級','姓名']\n",file=open(r'D:\test\test2.txt', "a"))

keyw=eval(input("Enter the key columns to merge on, e.g. for class + name enter ['班級','姓名']:"))

print('Keys:\n',str(keyw),file=open(r'D:\test\test2.txt', "a"))

colf=df_0.columns.tolist()# convert the column names to a list

col_list = [item for item in colf if item not in keyw] + [item for item in keyw if item not in colf]# remove the key columns from the list of column names

print(col_list)

data_dict2=df_0.set_index(keyw).T.to_dict('list')

d1=pd.DataFrame(data_dict2)

d1=d1.T

d1.rename(columns=dict(enumerate(col_list)),inplace=True)# restore the column names after transposing

d1.to_excel(r'D:\test\rs.xls', index=True)

Results

Original sheet

[Figure: Original 1]

[Figure: Original 2]

[Figure: Result 1]

[Figure: Result 2]

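Summary:

The goal was achieved: with class and name as the key, 陳美 appears only once, and the Result 2 figure shows that the head count dropped by one.

Shortcoming: the first column of the output Excel sheet contains merged cells, which I did not want. I am a pandas beginner; I don't understand why this happens and couldn't fix it myself. Pointers from readers would be much appreciated.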
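One possible way around the merged cells, offered as a sketch rather than a definitive fix: the merging most likely comes from the tuple keys turning into a two-level row index, which `to_excel` writes with merged cells by default. Turning the index back into ordinary columns before writing avoids that. The helper name `flatten_key_index` and the toy data are assumptions for illustration.

```python
import pandas as pd


def flatten_key_index(d1, keyw):
    """Turn the key row index into ordinary columns named after keyw."""
    out = d1.copy()
    out.index = out.index.set_names(list(keyw))
    return out.reset_index()


# Toy stand-in for d1 as built above: tuple keys become a two-level index after .T
d1 = pd.DataFrame({(1, "陳美"): [88, 82], (2, "王芳"): [75, 60]}).T
d1 = d1.rename(columns={0: "語文", 1: "數學"})

flat = flatten_key_index(d1, ["班級", "姓名"])
print(flat.columns.tolist())
print(flat)
```

Writing `flat` with `index=False` should then give one plain row per student. Alternatively, passing `merge_cells=False` to the original `to_excel` call keeps the index columns but writes them without merging.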

This is not well written; criticism and corrections are welcome.

Last edited: 2019.06.26 11:03:38
