1. The Request object
class Request(object_ref):

    def __init__(self, url, callback=None, method='GET', headers=None, body=None,
                 cookies=None, meta=None, encoding='utf-8', priority=0,
                 dont_filter=False, errback=None, flags=None, cb_kwargs=None):

        self._encoding = encoding  # this one has to be set first
        self.method = str(method).upper()
        self._set_url(url)
        self._set_body(body)
        if not isinstance(priority, int):
            raise TypeError("Request priority not an integer: %r" % priority)
        self.priority = priority

        if callback is not None and not callable(callback):
            raise TypeError('callback must be a callable, got %s' % type(callback).__name__)
        if errback is not None and not callable(errback):
            raise TypeError('errback must be a callable, got %s' % type(errback).__name__)
        self.callback = callback
        self.errback = errback

        self.cookies = cookies or {}
        self.headers = Headers(headers or {}, encoding=encoding)
        self.dont_filter = dont_filter

        self._meta = dict(meta) if meta else None
        self._cb_kwargs = dict(cb_kwargs) if cb_kwargs else None
        self.flags = [] if flags is None else list(flags)
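  When writing a spider, you create a Request object whenever crawling a page requires sending a new request, for example to fetch the next page of data. The class accepts a number of parameters; the most commonly used are:
  1) url: the URL this Request is sent to;
  2) callback: the callback function executed after the downloader has finished downloading the corresponding data;
  3) method: the HTTP method. The default is GET, and it can be set to other methods;
  4) headers: the request headers. Headers that never change can be set once in settings.py; headers that vary can be passed when the request is sent;
  5) meta: used frequently, to pass data between different requests;
  6) encoding: the encoding. The default is utf-8, which is usually fine;
  7) dont_filter: tells the scheduler not to filter this request as a duplicate. It is mostly used when the same request has to be sent more than once;
  8) errback: the function executed when an error occurs.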
2. The Response object
  A Response object is normally built for you by Scrapy, so as a developer you do not need to worry about how to create one, only about how to use it. A Response has many attributes that help with extracting data; the main ones are:
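  1) meta: the meta dict of the request that produced this response, which can be used to keep data linked across several requests;
  2) encoding: the encoding used to encode and decode the response text;
  3) text: the response data as a unicode string;
  4) body: the response data as a bytes string;
  5) xpath: the XPath selector;
  6) css: the CSS selector.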