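This lesson did not feel too hard. At first I could not reach the site at all, and it turned out I needed a VPN to get past the firewall. Fine. The other problem was file naming: if the slice taken from the image URL is too short, the names repeat and each download overwrites the previous one. I only noticed this after downloading for quite a while.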
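To make the naming problem concrete, here is a minimal sketch. The two URLs are made up for illustration (the real links on the site will look different), and it assumes the images share the same file name at the end of the URL, which is what makes the slice collide.

```python
# Hypothetical image links; only the folder part differs between them.
links = [
    'http://example.com/images/111/superthumb.jpg',
    'http://example.com/images/222/superthumb.jpg',
]

# Naming by the last 8 characters alone: both files get the same name,
# so the second download would overwrite the first.
sliced_names = [link[-8:] for link in links]
print(sliced_names)

# Prefixing a running counter keeps every name unique.
counted_names = [str(i) + link[-8:] for i, link in enumerate(links, start=1)]
print(counted_names)
```

The counter does not make the names meaningful, but it is enough to stop files from overwriting each other.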
My results
My code
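from bs4 import BeautifulSoup
import requests
import urllib.request

# Pages 1 to 20 of the Taylor Swift inspiration board
urls = ['http://weheartit.com/inspirations/taylorswift?page={}&before='.format(i) for i in range(1, 21)]
file_path = '/Users/mac/Desktop/taylor/'  # this folder must already exist

# Collect the image links from every page first
download_links = []
for url in urls:
    wb_data = requests.get(url)
    soup = BeautifulSoup(wb_data.text, 'lxml')
    imgs = soup.select('#main-container > div > div > div > div > div > a > img')
    for img in imgs:
        download_links.append(img.get('src'))

# The counter i runs across all pages, so no two files share a name.
# enumerate() also covers the last link, which zip() with range(1, len(...)) skipped.
for i, item in enumerate(download_links, start=1):
    urllib.request.urlretrieve(item, file_path + str(i) + item[-8:])
print('done')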
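Summary
- Files with the same name get overwritten, which is annoying. Adding a variable that increments with each download guarantees the names never repeat.
- It seems that with a VPN running, there is no need to configure a proxy in the code.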
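For the case where no VPN is running, a proxy can be set in code instead. This is only a sketch using the standard library: the proxy address `http://127.0.0.1:1080` and the helper name `enable_proxy` are assumptions for illustration, not something I actually used in this exercise.

```python
import urllib.request

# Hypothetical local proxy address; replace it with your own.
PROXY_URL = 'http://127.0.0.1:1080'
PROXIES = {'http': PROXY_URL, 'https': PROXY_URL}

# Build an opener that routes HTTP and HTTPS requests through the proxy.
opener = urllib.request.build_opener(urllib.request.ProxyHandler(PROXIES))


def enable_proxy():
    """Install the opener globally so urlretrieve() also goes through the proxy."""
    urllib.request.install_opener(opener)
```

Calling `enable_proxy()` once before the download loop would apply the proxy to `urllib.request.urlretrieve`. For the page requests, `requests.get` accepts a dictionary of the same shape through its `proxies` argument.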