[Python] Office Automation: Extracting Information from Excel into Word Tables

For reprints, please credit: 陳熹 chenx6542@foxmail.com (Jianshu ID: 半為花間酒)
To reprint within a WeChat official account, please contact the official account: 早起Python

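The main things you can learn from this article:

openpyxl: reading an Excel file and getting its content

docx: reading and writing Word files

Small tricks you will pick up:

os: getting the path to the desktop

win32com: batch converting doc to docx (Windows users only)

(A download link for the original data files is at the end of the article)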

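Today a reader of the 早起python official account came to us with a request:
(Because the files are private, the specific content has been modified)

The data in each column has to be filled into a Word template according to certain rules. The rules and the template look roughly like this:

These are the parts that need to be filled in; the full template is somewhat more complex:

There is one more requirement: the final Word file must be named as follows:
column C values, deduplicated and joined with & + G2 + the sum of column V + column P values, deduplicated and joined with & + today's date (e.g. 2020年04月22日) + 驗貨報告 (inspection report)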

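Judging from the requirements and the file formats, reading, writing and parsing these files is fairly complex, and both the coding and the thinking will take a while. So one question should be settled first:
Is the workload large, or will the task recur over the long term, so that Python really frees up your hands?

If not, the job can simply be done by hand, and automating it loses its point.

OK, let's start coding.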

1. Parsing the Excel data

Unzip the original data and put the resulting folder on the desktop.
You can of course put it somewhere else; in that case, specify the absolute path.

from openpyxl import load_workbook
import os

# Get the path to the desktop
def GetDesktopPath():
    return os.path.join(os.path.expanduser("~"), 'Desktop')

path = GetDesktopPath() + '/資料/' # build the folder path once so it can be reused later

workbook = load_workbook(filename=path + '數據.xlsx')
sheet = workbook.active # get the active sheet

# The data range can be read in code, which is handy when batch processing in a loop
# Get the range that contains data
print(sheet.dimensions)
# A1:W10
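
As an alternative to string concatenation, the same paths could be built with pathlib, which also makes it easy to check that the folder exists before loading anything. This is only a sketch: the helper names `desktop_dir` and `data_dir` are hypothetical, and the folder and file names are the ones used in this article.

```python
from pathlib import Path


def desktop_dir() -> Path:
    """Return the current user's Desktop folder (it may not exist on every system)."""
    return Path.home() / "Desktop"


def data_dir(folder_name: str = "資料") -> Path:
    """Return the data folder on the desktop; the default name is the one used here."""
    return desktop_dir() / folder_name


if __name__ == "__main__":
    folder = data_dir()
    if not folder.is_dir():
        print(f"Folder not found: {folder}")
    else:
        # A Path can be passed to load_workbook after converting it with str()
        print(folder / "數據.xlsx")
```
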

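openpyxl offers several ways to read cells:

cells = sheet['A1:A4']  # the 4 cells A1-A4, as a tuple of rows
cells = sheet['A'] # column A, as a flat tuple of cells
cells = sheet['A:C'] # columns A-C, as a tuple of columns
cells = sheet[5] # row 5, as a flat tuple of cells

# Note: a range such as 'A1:A4' comes back as a nested tuple
for cell in sheet['A1:A4']:
    print(cell[0].value) # each item is itself a tuple, so take the cell out of it before reading the value

# Get all the cells in a range
# iter_cols can be used instead to go column by column
for row in sheet.iter_rows(min_row=1, max_row=3,
                           min_col=2, max_col=4):
    for cell in row:
        print(cell.value)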
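
In openpyxl a single row or column comes back as a flat tuple of cells, while a range comes back as a tuple of tuples, so code that indexes with `cell[0]` only works for the second shape. A small helper can hide the difference. The sketch below is an illustration, not part of the original script: `cell_values` is a hypothetical name, and the stand-in objects only imitate the `.value` attribute of real cells so that the example runs without a workbook.

```python
from types import SimpleNamespace


def cell_values(cells):
    """Return the values of a flat or nested tuple of cell-like objects as a flat list."""
    values = []
    for item in cells:
        if isinstance(item, tuple):      # nested: one row (or column) of cells
            values.extend(cell.value for cell in item)
        else:                            # flat: a single cell
            values.append(item.value)
    return values


# Stand-ins for openpyxl cells, used only so the sketch runs on its own
a1, a2, b1, b2 = (SimpleNamespace(value=v) for v in ("a1", "a2", "b1", "b2"))

flat = (a1, a2)                  # same shape as sheet['A'] or sheet[5]
nested = ((a1, b1), (a2, b2))    # same shape as sheet['A1:B2']

print(cell_values(flat))    # ['a1', 'a2']
print(cell_values(nested))  # ['a1', 'b1', 'a2', 'b2']
```

With a real worksheet, `cell_values(sheet['C2:C10'])` would give the same list as a comprehension over `cell[0].value`.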

With the principle clear, we can now parse the data out of the Excel file.

# SQE
SQE = sheet['Q2'].value

# Supplier & manufacturer
supplier = sheet['G2'].value

# Purchase order number
C2_10 = sheet['C2:C10'] # returns a tuple of rows, each of which is a tuple of cells
# Use a list comprehension; the same pattern is used below
vC2_10 = [str(cell[0].value) for cell in C2_10]
# Simple dedup with set, joined with ',' for filling in the Word table
order_num = ','.join(set(vC2_10))
# Simple dedup with set, joined with '&' for naming the Word file
order_num_title = '&'.join(set(vC2_10))

# Product model
T2_10 = sheet['T2:T10']
vT2_10 = [str(cell[0].value) for cell in T2_10]
ptype = ','.join(set(vT2_10))

# Product description
P2_10 = sheet['P2:P10']
vP2_10 = [str(cell[0].value) for cell in P2_10]
info = ','.join(set(vP2_10))
info_title = '&'.join(set(vP2_10))

# Date
# Use the datetime library to get today's date and format it
import datetime
today = datetime.datetime.today()
time = today.strftime('%Y年%m月%d日')

# Inspected quantity
V2_10 = sheet['V2:V10']
vV2_10 = [int(cell[0].value) for cell in V2_10]
total_num = sum(vV2_10) # total quantity

# Number of inspected boxes
W2_10 = sheet['W2:W10']
vW2_10 = [int(cell[0].value) for cell in W2_10]
box_num = sum(vW2_10)


# Build the final Word file name
title = f'{order_num_title}-{supplier}-{total_num}-{info_title}-{time}-驗貨報告'

print(title)
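
One thing to keep in mind with `set` is that it does not preserve order, so the joined order numbers (and therefore the file name) may come out in a different order from one run to the next. The sketch below shows one possible way to keep the first-seen order and to assemble the name in a single function. The names `unique_in_order`, `format_date` and `build_title` are hypothetical, the sample values are made up, and the date is formatted by hand only to avoid passing non-ASCII text through `strftime`.

```python
import datetime


def unique_in_order(values):
    """Drop duplicates while keeping first-seen order (dicts preserve insertion order)."""
    return list(dict.fromkeys(values))


def format_date(day: datetime.date) -> str:
    """Format a date as e.g. 2020年04月22日 without going through strftime."""
    return f"{day.year}年{day.month:02d}月{day.day:02d}日"


def build_title(order_nums, supplier, quantities, descriptions, day):
    """Assemble the report file name from column values that were already extracted."""
    parts = [
        "&".join(unique_in_order(order_nums)),
        str(supplier),
        str(sum(quantities)),
        "&".join(unique_in_order(descriptions)),
        format_date(day),
        "驗貨報告",
    ]
    return "-".join(parts)


# Made-up sample values, for illustration only
print(build_title(
    ["PO1", "PO2", "PO1"], "ACME", [10, 20, 30],
    ["cable", "plug", "cable"], datetime.date(2020, 4, 22),
))
# PO1&PO2-ACME-60-cable&plug-2020年04月22日-驗貨報告
```
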

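That's the Excel part done; next comes filling in the Word tables.

Here we assume the Word file being read is in .docx format, although the reader's files were actually in .doc format.

Windows users can convert doc files with the code below, provided win32com is installed.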

# pip install pypiwin32
from win32com import client

docx_path = path + '模板.docx'

# Function that converts doc to docx
def doc2docx(doc_path,docx_path):
    word = client.Dispatch("Word.Application")
    doc = word.Documents.Open(doc_path)
    doc.SaveAs(docx_path, 16) # 16 = wdFormatDocumentDefault, i.e. .docx
    doc.Close()
    word.Quit()
    print('\n doc file converted to docx \n')

if not os.path.exists(docx_path):
    doc2docx(docx_path[:-1], docx_path) # dropping the trailing 'x' gives the .doc path

There is no good solution for Mac for now; if you have an idea, feel free to get in touch.
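
One route that may be worth trying on Mac or Linux is LibreOffice's headless converter. The sketch below was not part of the original workflow and has not been tested against the reader's files. It assumes LibreOffice is installed and that its `soffice` executable is on the PATH (on macOS it usually sits inside the application bundle and may have to be added manually). The function names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(doc_path, out_dir, soffice="soffice"):
    """Build the LibreOffice command line that converts one .doc file to .docx."""
    return [
        soffice, "--headless",
        "--convert-to", "docx",
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def doc2docx_libreoffice(doc_path, out_dir=None):
    """Convert doc_path to .docx in out_dir (default: same folder) and return the new path."""
    doc_path = Path(doc_path)
    out_dir = Path(out_dir) if out_dir else doc_path.parent
    if shutil.which("soffice") is None:
        raise RuntimeError("soffice was not found on PATH")
    subprocess.run(build_convert_command(doc_path, out_dir), check=True)
    return out_dir / (doc_path.stem + ".docx")
```

Formatting may not survive the conversion exactly as it does through Word, so the converted template is worth checking by eye before filling it in.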

Once we have a file in docx format, we can carry on.


docx_path = path + '模板.docx'

# pip install python-docx
from docx import Document

# Instantiate the document
document = Document(docx_path)

# Read all the tables in the Word file
tables = document.tables
# print(len(tables))
# 15

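Once you have worked out which table is which, you can fill each one in accordingly.

A table is used much like a sheet in openpyxl, but note that its indices start at 0, as in plain Python (openpyxl rows and columns start at 1).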

tables[0].cell(1, 1).text = SQE

tables[1].cell(1, 1).text = supplier
tables[1].cell(2, 1).text = supplier
tables[1].cell(3, 1).text = ptype
tables[1].cell(4, 1).text = info
tables[1].cell(5, 1).text = order_num
tables[1].cell(7, 1).text = time

for i in range(2, 11):
    tables[6].cell(i, 0).text = str(sheet[f'T{i}'].value)
    tables[6].cell(i, 1).text = str(sheet[f'P{i}'].value)
    tables[6].cell(i, 2).text = str(sheet[f'C{i}'].value)
    tables[6].cell(i, 4).text = str(sheet[f'V{i}'].value)
    tables[6].cell(i, 5).text = str(sheet[f'V{i}'].value)
    tables[6].cell(i, 6).text = '0'
    tables[6].cell(i, 7).text = str(sheet[f'W{i}'].value)
    tables[6].cell(i, 8).text = '0'

tables[6].cell(12, 4).text = str(total_num)
tables[6].cell(12, 5).text = str(total_num)
tables[6].cell(12, 7).text = str(box_num)

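There are two details here:

Data written to Word must be a string, so values read from Excel have to be converted with str.
This is also the part that takes the most time and energy: the table may contain merged cells and similar complications, so the row and column numbers you see may not be the real ones, and you have to keep testing with code. The code above skips the fourth column (index 3); try it and see why.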
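
One way to do that testing is to dump every cell position together with its text and compare the output with what Word displays. The sketch below is an illustration under assumptions: `dump_table` is a hypothetical helper that relies only on the `rows`, `cells` and `text` attributes of python-docx tables, and the stand-in table exists solely so that the example runs without a document. Its repeated text imitates how a horizontally merged cell can appear at more than one column index.

```python
from types import SimpleNamespace


def dump_table(table):
    """Return (row_index, col_index, text) for every cell position in a table."""
    cells = []
    for r, row in enumerate(table.rows):
        for c, cell in enumerate(row.cells):
            cells.append((r, c, cell.text))
    return cells


def _row(*texts):
    """Build a stand-in row whose cells only carry a text attribute."""
    return SimpleNamespace(cells=[SimpleNamespace(text=t) for t in texts])


# Stand-in for a python-docx table, used only so the sketch runs on its own
fake_table = SimpleNamespace(rows=[_row("Model", "Qty", "Qty"), _row("A1", "5", "5")])

for r, c, text in dump_table(fake_table):
    print(r, c, repr(text))
```

With the real template, calling `dump_table(tables[6])` and looking for neighbouring positions that repeat the same text is a quick way to spot which indices belong to a merged cell.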

for i in range(2, 11):
    tables[13].cell(i - 1, 0).text = str(sheet[f'T{i}'].value)
    tables[13].cell(i - 1, 1).text = str(sheet[f'U{i}'].value)
    tables[13].cell(i - 1, 2).text = str(sheet[f'U{i}'].value)
    tables[13].cell(i - 1, 3).text = str(sheet[f'U{i}'].value)

That more or less completes the requirement; remember to save.

document.save(path + f'{title}.docx')
print('\nFile generated')
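
Because the file name is assembled from cell values, it can end up containing characters that Windows does not accept in file names, for example a slash in a product description, and the save would then fail. A small guard could be applied to the title before saving. This is a sketch under that assumption; `safe_filename` is a hypothetical helper and the sample title is made up.

```python
import re

# Characters that Windows does not allow in file names
_ILLEGAL = re.compile(r'[\\/:*?"<>|]')


def safe_filename(name: str, replacement: str = "_") -> str:
    """Replace characters that are illegal in Windows file names and trim whitespace."""
    return _ILLEGAL.sub(replacement, name).strip()


print(safe_filename("PO1&PO2-ACME-60-cable 3/4 inch-驗貨報告"))
# PO1&PO2-ACME-60-cable 3_4 inch-驗貨報告
```

It would be used as `document.save(path + safe_filename(title) + '.docx')`.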

Finally, here is the complete code

from openpyxl import load_workbook
from docx import Document
import datetime
# pip install pypiwin32
# from win32com import client
import os


# Get the path to the desktop
def GetDesktopPath():
    return os.path.join(os.path.expanduser("~"), 'Desktop')

path = GetDesktopPath() + '/資料/' # build the folder path once so it can be reused later

workbook = load_workbook(filename=path + '數據.xlsx')
sheet = workbook.active # get the active sheet

# Get the range that contains data
# print(sheet.dimensions)
# A1:W10

# SQE
SQE = sheet['Q2'].value

# Supplier & manufacturer
supplier = sheet['G2'].value

# Purchase order number
C2_10 = sheet['C2:C10'] # returns a tuple of rows, each of which is a tuple of cells
vC2_10 = [str(cell[0].value) for cell in C2_10]
order_num = ','.join(set(vC2_10))
order_num_title = '&'.join(set(vC2_10))

# Product model
T2_10 = sheet['T2:T10']
vT2_10 = [str(cell[0].value) for cell in T2_10]
ptype = ','.join(set(vT2_10))

# Product description
P2_10 = sheet['P2:P10']
vP2_10 = [str(cell[0].value) for cell in P2_10]
info = ','.join(set(vP2_10))
info_title = '&'.join(set(vP2_10))

# Date
today = datetime.datetime.today()
time = today.strftime('%Y年%m月%d日')

# Inspected quantity
V2_10 = sheet['V2:V10']
vV2_10 = [int(cell[0].value) for cell in V2_10]
total_num = sum(vV2_10) # total quantity

# Number of inspected boxes
W2_10 = sheet['W2:W10']
vW2_10 = [int(cell[0].value) for cell in W2_10]
box_num = sum(vW2_10)

title = f'{order_num_title}-{supplier}-{total_num}-{info_title}-{time}-驗貨報告'

print(title)

doc_path = path + '模板.doc'
docx_path = doc_path + 'x'

# Function that converts doc to docx
# def doc2docx(doc_path,docx_path):
#     word = client.Dispatch("Word.Application")
#     doc = word.Documents.Open(doc_path)
#     doc.SaveAs(docx_path, 16)
#     doc.Close()
#     word.Quit()
#     print('\n doc file converted to docx \n')

# if not os.path.exists(docx_path):
#     doc2docx(doc_path, docx_path)

document = Document(docx_path)

# Read all the tables in the Word file
tables = document.tables
# print(len(tables))
# 15

# Start filling in the tables
tables[0].cell(1, 1).text = SQE

tables[1].cell(1, 1).text = supplier
tables[1].cell(2, 1).text = supplier
tables[1].cell(3, 1).text = ptype
tables[1].cell(4, 1).text = info
tables[1].cell(5, 1).text = order_num
tables[1].cell(7, 1).text = time

for i in range(2, 11):
    tables[6].cell(i, 0).text = str(sheet[f'T{i}'].value)
    tables[6].cell(i, 1).text = str(sheet[f'P{i}'].value)
    tables[6].cell(i, 2).text = str(sheet[f'C{i}'].value)
    tables[6].cell(i, 4).text = str(sheet[f'V{i}'].value)
    tables[6].cell(i, 5).text = str(sheet[f'V{i}'].value)
    tables[6].cell(i, 6).text = '0'
    tables[6].cell(i, 7).text = str(sheet[f'W{i}'].value)
    tables[6].cell(i, 8).text = '0'

tables[6].cell(12, 4).text = str(total_num)
tables[6].cell(12, 5).text = str(total_num)
tables[6].cell(12, 7).text = str(box_num)

for i in range(2, 11):
    tables[13].cell(i - 1, 0).text = str(sheet[f'T{i}'].value)
    tables[13].cell(i - 1, 1).text = str(sheet[f'U{i}'].value)
    tables[13].cell(i - 1, 2).text = str(sheet[f'U{i}'].value)
    tables[13].cell(i - 1, 3).text = str(sheet[f'U{i}'].value)

document.save(path + f'{title}.docx')
print('File generated')

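Final words

If there is an area of office automation you are interested in, or you have a concrete case you would like to solve with Python,

feel free to get in touch with me, or leave a message directly on the 早起python official account.

We will pick interesting cases, solve them free of charge, and publish tutorials sharing the experience so that more people benefit.

If you submit a case, please state the requirements clearly and provide the original data, already processed on your side.

We anonymize the data before publishing a tutorial, to protect privacy.

Original data download:
https://pan.baidu.com/s/1YFZPT7KViB5O-oQe4y_6HQ
Extraction code: ym7p

Last edited: 2020.05.02 09:14:31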

© Copyright belongs to the author. For reprints or content cooperation, please contact the author.
