Fuzzy matching in XPath with contains()
//div[contains(text(),"history-loadmore") and not(contains(@class, "history-loadmore hide"))]
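A minimal sketch of how such a contains() / not(contains()) filter behaves in lxml. The markup is made up for illustration, and this variant tests the class attribute in both conditions, whereas the expression above tests text() in the first one.

```python
from lxml import etree

# Hypothetical markup, invented for this illustration.
SAMPLE = """
<div>
  <div class="history-loadmore">load more</div>
  <div class="history-loadmore hide">load more</div>
  <div class="footer">footer</div>
</div>
"""

root = etree.HTML(SAMPLE)

# contains(@class, ...) is a substring match on the attribute value;
# not(contains(...)) then excludes the hidden variant.
visible = root.xpath(
    '//div[contains(@class, "history-loadmore") '
    'and not(contains(@class, "history-loadmore hide"))]'
)
visible_classes = [div.get("class") for div in visible]
print(visible_classes)
```

Because contains() is a plain substring test, it also matches class values that merely include the given string, so the not(...) clause is what keeps the hidden element out.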
Selecting sibling nodes
# Next sibling node
//div[@class='listpage']/span/following-sibling::a[1]
# Previous sibling node
//div[@class='address-row']/table/tbody/tr[@id='submitTime']/preceding-sibling::tr[1]
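The two sibling axes can be tried out with lxml as follows. This is a small sketch on invented markup; the point is that `[1]` picks the nearest sibling in the direction of the axis.

```python
from lxml import etree

# Hypothetical pagination markup, invented for this illustration.
SAMPLE = """
<div class="listpage">
  <a href="/page/0">first</a>
  <a href="/page/1">prev</a>
  <span>2</span>
  <a href="/page/3">next</a>
  <a href="/page/9">last</a>
</div>
"""

root = etree.HTML(SAMPLE)

# following-sibling::a[1] is the first <a> after the <span>.
next_link = root.xpath("//div[@class='listpage']/span/following-sibling::a[1]")
# preceding-sibling is a reverse axis, so [1] is the <a> closest before the <span>.
prev_link = root.xpath("//div[@class='listpage']/span/preceding-sibling::a[1]")

next_href = next_link[0].get("href")
prev_href = prev_link[0].get("href")
print(next_href, prev_href)
```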
Getting the parent node
//div[@class='page-box house-lst-page-box']/parent::div
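A short sketch of the parent axis in lxml, again on invented markup. The `id="wrapper"` attribute is hypothetical and only there so the result is easy to check.

```python
from lxml import etree

# Hypothetical markup, invented for this illustration.
SAMPLE = """
<div id="wrapper">
  <div class="page-box house-lst-page-box">1 2 3</div>
</div>
"""

root = etree.HTML(SAMPLE)

# parent::div selects the parent only if it is a <div>.
parents = root.xpath("//div[@class='page-box house-lst-page-box']/parent::div")
# ".." is the abbreviated form and selects the parent whatever its tag is.
shorthand = root.xpath("//div[@class='page-box house-lst-page-box']/..")

parent_ids = [p.get("id") for p in parents]
shorthand_ids = [p.get("id") for p in shorthand]
print(parent_ids, shorthand_ids)
```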
Selecting by position in XPath
# Position greater than 1 (everything except the first)
//li[position()>1]
# The last one
//li[last()]
# The second to last
//li[last()-1]
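The three positional expressions can be compared side by side with lxml. This is a sketch on a made-up four-item list; `texts` is a helper introduced only for this example.

```python
from lxml import etree

# Hypothetical list, invented for this illustration.
SAMPLE = "<ul><li>a</li><li>b</li><li>c</li><li>d</li></ul>"

root = etree.HTML(SAMPLE)


def texts(expr):
    """Return the text of every <li> matched by the XPath expression."""
    return [li.text for li in root.xpath(expr)]


after_first = texts("//li[position()>1]")  # every item except the first
last_item = texts("//li[last()]")          # the last item
second_last = texts("//li[last()-1]")      # the item before the last
print(after_first, last_item, second_last)
```

Positions are counted among the `li` children of each parent, so on a page with several lists `//li[last()]` returns the last item of every list.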
Filtering a list by date
# Strip the "更新时间-" ("update time") label, then compare the remaining date as a number
//span[@class='light' and number(translate(text(),'更新时间-',''))>20171204]/../../../../h3/a/@href
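A sketch of the date filter on invented markup. The nesting (four levels between the span and the element holding the h3) and the "更新时间-" label are assumptions chosen to fit the expression, not taken from a real page.

```python
from lxml import etree

# Hypothetical list markup: each item holds a title link and, four levels
# below the item element, a span with the update date.
SAMPLE = """
<ul>
  <li>
    <h3><a href="/old">Old post</a></h3>
    <div><p><em><span class="light">更新时间-20171130</span></em></p></div>
  </li>
  <li>
    <h3><a href="/new">New post</a></h3>
    <div><p><em><span class="light">更新时间-20171210</span></em></p></div>
  </li>
</ul>
"""

root = etree.HTML(SAMPLE)

# translate() deletes every character of the label (including the hyphen),
# number() converts what is left, and ../../../.. climbs back to the item.
hrefs = root.xpath(
    "//span[@class='light' and "
    "number(translate(text(),'更新时间-',''))>20171204]"
    "/../../../../h3/a/@href"
)
print(hrefs)
```

If the text does not reduce to a number, number() yields NaN and the comparison is false, so such rows are simply skipped.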
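Getting an element's HTML with XPath
from lxml import etree
# html is assumed to be an lxml tree, e.g. html = etree.HTML(page_source)
content_html = html.xpath("//div[@class='show-content-free']")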
content_html = etree.tostring(content_html[0], encoding='UTF-8', pretty_print=False, method='html')
content_html = content_html.decode()
The XPath string() function
# string() returns a single string rather than a list, so it must not be indexed with [0]
content_text = html.xpath("string(//div[@class='show-content-free'])")
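A small sketch of what string() returns in lxml, on invented markup. The result is one string made of all descendant text, so indexing it yields a single character.

```python
from lxml import etree

# Hypothetical markup, invented for this illustration.
SAMPLE = '<div class="show-content-free"><p>Hello <b>world</b></p><p>!</p></div>'

html = etree.HTML(SAMPLE)

# string(...) concatenates every text node under the first matching element.
content_text = html.xpath("string(//div[@class='show-content-free'])")
first_char = content_text[0]  # indexing a string gives one character
print(repr(content_text), repr(first_char))
```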
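Getting an element's HTML from a response with XPath
# response is assumed to be an lxml tree built from the response body
content_html = response.xpath("//div[@class='txt_con']")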
content_html = etree.tostring(content_html[0], encoding='UTF-8', pretty_print=False, method='html')
content_html = content_html.decode()
Getting all the text inside an element (with requests)
content_text = response.xpath("//div[@id='ctrlfscont']")
content_text = content_text[0].xpath('string(.)').encode('utf-8').strip().decode()
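A sketch of the relative form `string(.)`, evaluated on an element that has already been selected. The markup is invented, and `tree` stands in for whatever parsed document the response was turned into, which is an assumption here.

```python
from lxml import etree

# Hypothetical markup, invented for this illustration.
SAMPLE = """
<div id="ctrlfscont">
  <p>First <b>paragraph</b>.</p>
  <p>Second paragraph.</p>
</div>
"""

tree = etree.HTML(SAMPLE)  # assumption: the response body parsed with lxml
nodes = tree.xpath("//div[@id='ctrlfscont']")

# string(.) collects all text below the context element; strip() trims the ends.
content_text = nodes[0].xpath("string(.)").strip()

# Indentation between the tags survives, so split into clean lines if needed.
lines = [ln.strip() for ln in content_text.splitlines() if ln.strip()]
print(lines)
```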