Sometimes a task has to make a large number of network, disk, or database requests, for example crawling web pages in the background or computing offline statistics. The task takes many iterations to finish, and each iteration waits on network IO, disk IO, or a database connection, so the total run time is long. Such tasks share one property, though: the iterations do not have to run in order. That makes them a very good fit for multithreading or multiprocessing.
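In Python, the GIL prevents threads from executing Python bytecode in parallel, so there is no true multithreading. For that reason the official page of the threadpool package itself recommends using multiprocessing instead: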
This module is OBSOLETE and is only provided on PyPI to support old projects that still use it.
Please DO NOT USE IT FOR NEW PROJECTS!
Use modern alternatives like the multiprocessing module in the standard library or even an asynchroneous approach with asyncio.
Basic usage of multiprocessing:
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))
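Called this way, map blocks the main process until every result is ready and returns the results in input order. The asynchronous style, which lets the main process submit all the tasks without waiting and so gets the most out of multiple processes, looks like this: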
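from multiprocessing import Pool
import time

def f(x):
    time.sleep(1)
    return x, x*x

if __name__ == '__main__':
    res_list = []
    with Pool(5) as pool:
        # apply: submit every task first; apply_async returns immediately
        for i in range(1, 20):
            res = pool.apply_async(f, [i])
            res_list.append(res)
        # print: collect the results only after all tasks are submitted
        for res in res_list:
            print(res.get())
        # stop accepting new tasks, then wait for the workers to exit
        pool.close()
        pool.join()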
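A few points need special attention:
- Call res.get() only after all the tasks have been submitted, and collect the results together; calling it right after each submit turns the code back into synchronous execution
- pool.join() makes the main process wait until all child processes have finished before it exits
- pool.close() must be called before join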
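The first point is easiest to see by timing both orderings. The following is a minimal sketch, not a benchmark: `slow_square` and `timed` are hypothetical helpers introduced only for illustration, and the 0.5 second sleep stands in for one IO-bound iteration.

```python
from multiprocessing import Pool
import time


def slow_square(x):
    # Hypothetical stand-in for one IO-bound iteration.
    time.sleep(0.5)
    return x * x


def timed(label, func):
    # Run func(), print how long it took, and return its result.
    start = time.perf_counter()
    result = func()
    print(f"{label}: {time.perf_counter() - start:.1f}s")
    return result


if __name__ == '__main__':
    with Pool(4) as pool:
        # get() right after each submit: each task waits for the previous one.
        timed("get inside the loop",
              lambda: [pool.apply_async(slow_square, [i]).get() for i in range(4)])

        # Submit everything first, then collect: the tasks overlap.
        def submit_then_get():
            pending = [pool.apply_async(slow_square, [i]) for i in range(4)]
            return [res.get() for res in pending]

        timed("get after the loop", submit_then_get)

        pool.close()
        pool.join()
```

With four workers and four tasks, the first variant should take roughly four sleeps in a row, while the second should take roughly one sleep plus process overhead. Exact numbers depend on the machine.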
You can also create Process objects directly:
from multiprocessing import Process
import os
import time

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())

def f(name):
    time.sleep(1)
    info('function f')
    print('hello', name)

if __name__ == '__main__':
    info('main line')
    for i in range(10):
        p = Process(target=f, args=(i,))
        p.start()
        # p.join()
Notes:
- os.getppid() and os.getpid() return the IDs of the parent process and of the current process
- If you want the processes to run one after another, call p.join() right after p.start()
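The effect of where p.join() is called can be sketched as follows. This is an illustrative example under stated assumptions: `work` and `run` are hypothetical names that do not appear in the code above, and the 0.5 second sleep is an arbitrary stand-in for real work.

```python
from multiprocessing import Process
import os
import time


def work(name):
    # Hypothetical task: sleep briefly, then report which process ran it.
    time.sleep(0.5)
    print('hello', name, 'from pid', os.getpid())


def run(count, join_immediately):
    # Start `count` processes and return the elapsed wall-clock time.
    procs = []
    start = time.perf_counter()
    for i in range(count):
        p = Process(target=work, args=(i,))
        p.start()
        if join_immediately:
            p.join()  # wait here, so the next process starts only afterwards
        procs.append(p)
    for p in procs:
        p.join()  # joining an already finished process returns at once
    return time.perf_counter() - start


if __name__ == '__main__':
    print('join after each start: %.1fs' % run(3, True))
    print('start all, then join:  %.1fs' % run(3, False))
```

Joining inside the loop serializes the processes, so the elapsed time grows with the number of processes. Starting them all and joining afterwards keeps them running side by side while the main process still waits for all of them before exiting.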