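Concurrency summary
Basic assignment
1. Yangguang Wenzheng (陽光問政) (zhaopin, scraping job postings): coroutines, threads, processes, distributed,
concurrent reads, writing into a single file
Extension assignments
2. Taobao order scraping: coroutines, threads, processes, distributed,
concurrent reads, writing into a single file
3. Distributed assignment: Taobao A, Taobao B, Taobao C job system
4. Scraping email addresses from web pages: coroutines, threads, processes, distributed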
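Coroutines:
import gevent
from gevent import monkey
monkey.patch_all()  # patch blocking calls so greenlets switch automatically
tasklist = []
for i in range(N):
    tasklist.append(gevent.spawn(download, xclist[i], file))
gevent.joinall(tasklist)  # wait for all greenlets to finish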
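Threads:
import threading
threadlist = []
for i in range(N):
    # each thread gets its own URL list; xclist[i] is assumed here, as in the other snippets
    mythread = threading.Thread(target=download, args=(xclist[i],))
    mythread.start()
    threadlist.append(mythread)  # add to the thread list
for thd in threadlist:
    thd.join()  # wait for every thread to finish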
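For the "concurrent reads, writing into a single file" requirement, the thread version needs a lock around the shared file handle. The following is a minimal sketch rather than the original download code: `fake_download`, `worker`, `run_threads`, and the example URLs are hypothetical stand-ins, and no network access is performed.

```python
import os
import tempfile
import threading


def fake_download(url):
    # Hypothetical stand-in for a real HTTP request; returns one line of text.
    return "data from " + url


def worker(urls, outfile, lock):
    for url in urls:
        line = fake_download(url)
        # Only one thread writes at a time, so lines never interleave.
        with lock:
            outfile.write(line + "\n")


def run_threads(url_groups, path):
    lock = threading.Lock()
    with open(path, "w", encoding="utf-8") as outfile:
        threads = [
            threading.Thread(target=worker, args=(urls, outfile, lock))
            for urls in url_groups
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()  # wait for every thread before the file is closed


if __name__ == "__main__":
    groups = [["http://example.com/a", "http://example.com/b"],
              ["http://example.com/c"]]
    out = os.path.join(tempfile.gettempdir(), "threads_out.txt")
    run_threads(groups, out)
    with open(out, encoding="utf-8") as f:
        print(f.read())
```

The order of lines in the output file depends on thread scheduling; the lock only guarantees that each line is written whole.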
Processes:
import multiprocessing
import time
# inside download(): queue.put(mygetstr)  # push data onto the queue
queue = multiprocessing.Queue()  # passes data between processes
processlist = []
for i in range(N):
    process = multiprocessing.Process(target=download, args=(xclist[i], queue))
    process.start()
    processlist.append(process)
print("start")
for p in processlist:
    p.join()  # wait for all processes to exit; can block if workers put a lot of unread data
print("okok")
time.sleep(5)
while not queue.empty():
    data = queue.get()
    print("get", data)
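The `time.sleep(5)` and `queue.empty()` pair above depends on timing. One alternative, sketched here under the assumption that each worker puts exactly one result on the queue, is to call `get()` once per worker and only then `join()`. The `download` body below is a hypothetical placeholder that does no real network I/O, and `run_processes` is an illustrative name.

```python
import multiprocessing


def download(urls, queue):
    # Hypothetical placeholder: a real worker would fetch each URL here.
    result = [url.upper() for url in urls]
    queue.put(result)  # exactly one put() per worker


def run_processes(url_groups):
    queue = multiprocessing.Queue()
    processes = [
        multiprocessing.Process(target=download, args=(urls, queue))
        for urls in url_groups
    ]
    for p in processes:
        p.start()
    # One blocking get() per worker: no sleep, no polling of queue.empty().
    results = [queue.get() for _ in processes]
    for p in processes:
        p.join()  # the queue has already been drained at this point
    return results


if __name__ == "__main__":
    print(run_processes([["a", "b"], ["c"]]))
```

Results arrive in completion order, not submission order, so a caller that needs a stable order would have to tag each result or sort afterwards.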
Distributed:
server, client
Concurrent reading:
queue = multiprocessing.Manager().Queue()  # queue shared across multiple processes
processlist = []
for urllist in xclist:
    process = multiprocessing.Process(target=go, args=(urllist, queue))
    process.start()
    processlist.append(process)  # start multiple processes
readprocess = multiprocessing.Process(target=readdata, args=(queue,))  # start the reader
readprocess.start()
processlist.append(readprocess)
for p in processlist:
    p.join()  # wait for all processes to exit
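The snippet above does not show how `readdata` knows when to stop. A common pattern, sketched here as an assumption and not as the original implementation, is to join the producers first and then send a sentinel value (`None`) so the single reader process can close the output file and exit. The bodies of `go` and `readdata`, the `run` helper, and the extra `path` argument are hypothetical.

```python
import multiprocessing
import os
import tempfile


def go(urls, queue):
    # Hypothetical producer: a real one would download each URL.
    for url in urls:
        queue.put("fetched " + url)


def readdata(queue, path):
    # Single consumer: the only process that touches the output file.
    with open(path, "w", encoding="utf-8") as f:
        while True:
            item = queue.get()
            if item is None:  # sentinel: all producers have finished
                break
            f.write(item + "\n")


def run(url_groups, path):
    with multiprocessing.Manager() as manager:
        queue = manager.Queue()
        reader = multiprocessing.Process(target=readdata, args=(queue, path))
        reader.start()
        producers = [
            multiprocessing.Process(target=go, args=(urls, queue))
            for urls in url_groups
        ]
        for p in producers:
            p.start()
        for p in producers:
            p.join()      # every producer has put all of its items
        queue.put(None)   # tell the reader there is nothing more to read
        reader.join()


if __name__ == "__main__":
    out = os.path.join(tempfile.gettempdir(), "reader_out.txt")
    run([["a", "b"], ["c"]], out)
    with open(out, encoding="utf-8") as f:
        print(f.read())
```

Because only the reader writes to the file, no lock is needed; the queue serializes the writes.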