Yesterday I woke up at 2 a.m. and read an article by 向右奔跑, which prompted me to try cross-page data scraping with Scrapy, using Jianshu's seven-day trending list as the example.
1. The items.py code
from scrapy.item import Item, Field


class SevendayItem(Item):
    article_url = Field()  # article link, scraped from the front page
    author = Field()
    article = Field()
    date = Field()
    word = Field()
    view = Field()
    comment = Field()
    like = Field()
    gain = Field()
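As you can see, the data I want to scrape is not all on a single page (the article link, for instance, comes from the front page), so the spider has to scrape across pages.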