Use text() as a marker to pin down a position.
Test text
<table>
<tbody>
<tr class="result1">
<th class="field-name">Type</th>
<td>Electronic Thesis or Dissertation</td>
</tr>
<tr class="result2">
<th class="field-name">Type</th>
<td>Text</td>
</tr>
<tr class="result1">
<th class="field-name">Type</th>
<td>Image</td>
</tr>
<tr class="result2">
<th class="field-name">Type</th>
<td>StillImage</td>
</tr>
<tr class="result1">
<th class="field-name">Language</th>
<td>fr</td>
</tr>
<tr class="result2">
<th class="field-name">Identifier</th>
<td>
<a onclick="ga('send', 'event', 'External-link', 'Identifier', '/full.php?id=1183922'); return logDownload('1183922');"
title="View original record">http://www.theses.fr/2016SACLS038</a></td>
</tr>
</tbody>
</table>
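Test 1
//th[.='Type'] # selects every th whose text is Type
To get the first value, start from those th nodes, step up to the parent, and take the text of the td under it with ../td/text(). Because only the first value is needed, wrap the whole expression in parentheses and add an index.
(//th[.='Type']/../td/text())[1] # returns the expected result: Electronic Thesis or Dissertation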
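As a rough check of the expression above, here is a minimal Python sketch. It assumes the standard library's xml.etree.ElementTree (Python 3.7 or later), which implements only a subset of XPath: it has no text() step and no parenthesised index, so the sketch selects the td elements and reads .text from the first one instead. The sample markup is a trimmed copy of the test text.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the test markup, wrapped in <table> so it parses as XML.
SAMPLE = """
<table><tbody>
<tr class="result1"><th class="field-name">Type</th><td>Electronic Thesis or Dissertation</td></tr>
<tr class="result2"><th class="field-name">Type</th><td>Text</td></tr>
<tr class="result1"><th class="field-name">Language</th><td>fr</td></tr>
</tbody></table>
"""

root = ET.fromstring(SAMPLE)

# Locate each th by its text, step up to the parent tr, then down to its td.
type_cells = root.findall(".//th[.='Type']/../td")
type_values = [td.text for td in type_cells]

# XPath positions start at 1, so [1] in the expression is index 0 here.
first_type = type_values[0]
print(first_type)
```

With a full XPath 1.0 engine the parentheses matter: without them, [1] applies to the last step only and picks the first text node of every matching td, not the first item of the whole result.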
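Matching varying attribute values with contains
Suppose the attribute value changes but follows a pattern, such as class='result1' or class='result2' in the markup below, and we need the contents of all of those rows.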
<tr class="result1">
<th class="field-name">Type</th>
<td>Electronic Thesis or Dissertation</td>
</tr>
<tr class="result2">
<th class="field-name">Type</th>
<td>Text</td>
</tr>
<tr class="result1">
<th class="field-name">Type</th>
<td>Image</td>
</tr>
<tr class="result2">
<th class="field-name">Type</th>
<td>StillImage</td>
</tr>
<tr class="result1">
<th class="field-name">Language</th>
<td>fr</td>
</tr>
XPath syntax
//tr[contains(@class,'result')] # selects every tr whose class contains result
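To make the matching rule concrete, here is a small Python sketch. The standard library's xml.etree.ElementTree does not implement contains(), so this is not the XPath itself: it reproduces the same selection with a plain substring check on the class attribute. The helper name rows_with_class_containing and the extra "header" row are hypothetical, added only to show a row that does not match.

```python
import xml.etree.ElementTree as ET

# Two rows from the sample plus one hypothetical row that should not match.
SAMPLE = """
<table><tbody>
<tr class="result1"><th class="field-name">Type</th><td>Text</td></tr>
<tr class="result2"><th class="field-name">Language</th><td>fr</td></tr>
<tr class="header"><th>Field</th><td>Value</td></tr>
</tbody></table>
"""


def rows_with_class_containing(root, fragment):
    """Return the tr elements whose class attribute contains fragment."""
    # A missing class attribute is treated as the empty string, so it never matches.
    return [tr for tr in root.iter("tr") if fragment in tr.get("class", "")]


root = ET.fromstring(SAMPLE)
matched = rows_with_class_containing(root, "result")
matched_classes = [tr.get("class") for tr in matched]
print(matched_classes)
```

Because contains() is a substring test, a class such as "no-results" would match as well. If the values always begin with the same prefix, starts-with(@class,'result') is the stricter choice in XPath 1.0.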
Getting multiple values
<div class="accordion-tabbed__tab-mobile ">
<a href="#" data-id="a2" data-db-target-for="a2" title="Costa M. L."
class="author-name accordion-tabbed__control visible-x"><span>Costa M. L.</span><i aria-hidden="true"
class="icon-arrow_d_n"></i></a>
<div data-db-target-of="a2" class="author-info accordion-tabbed__content"><p>PhD, FRCS (Tr & Orth), Clinical
Senior Lecturer</p>
<p class="author-type"></p>
<p></p>
<p>1Clinical Sciences Institute University of Warwick Medical School, Clinical Sciences Building, University
Hospital, Clifford Bridge Road, Coventry CV2 2DX, UK.</p>
<div class="bottom-info"><p><a href="/author/Costa%2C+M+L">
Search for more papers by this author
</a></p></div>
</div>
</div>
demo
We need a single XPath that covers the author's name, position, and institution.
//div[a/span/text() and div/p/text() and div/div/p/a/text()] # selects each div that contains all three pieces
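The expression above returns the enclosing div, so the individual fields still have to be read relative to it. The following Python sketch shows one way to do that, assuming the standard library's xml.etree.ElementTree. Its XPath subset cannot express the full predicate, so the sketch uses the simpler [a] test to find the outer block. The <body> wrapper, the helper name extract_author, and the rule "first non-empty paragraph is the position, last is the institution" are assumptions made for this sample, not something the page guarantees.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the author markup; & is escaped so it parses as XML.
SAMPLE = """
<body>
<div class="accordion-tabbed__tab-mobile ">
<a href="#" title="Costa M. L." class="author-name"><span>Costa M. L.</span></a>
<div class="author-info accordion-tabbed__content">
<p>PhD, FRCS (Tr &amp; Orth), Clinical Senior Lecturer</p>
<p class="author-type"></p>
<p>University of Warwick Medical School, Coventry CV2 2DX, UK.</p>
<div class="bottom-info"><p><a href="/author/Costa%2C+M+L">Search for more papers by this author</a></p></div>
</div>
</div>
</body>
"""


def extract_author(block):
    """Read name, position and institution relative to one author div."""
    name = block.findtext("a/span", default="").strip()
    # Keep only the paragraphs that actually carry text.
    paragraphs = [p.text.strip() for p in block.findall("div/p")
                  if p.text and p.text.strip()]
    position = paragraphs[0] if paragraphs else ""
    institution = paragraphs[-1] if len(paragraphs) > 1 else ""
    return {"name": name, "position": position, "institution": institution}


root = ET.fromstring(SAMPLE)
# [a] keeps only the divs with a direct <a> child, which is the outer block here.
authors = [extract_author(div) for div in root.findall(".//div[a]")]
print(authors)
```

Selecting the container first and reading each field relative to it keeps the values of one author together, which is harder to guarantee when three separate absolute expressions are run over a page with several authors.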