How fingerprint detection works
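- When Selenium drives a browser to scrape data, it exposes some predefined JavaScript variables, and a site can use these variables to tell whether the visitor is running under a Selenium driver.
- A typical example is "window.navigator.webdriver": outside Selenium its value is undefined (newer browser versions report false instead), while under Selenium its value is true.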
- Besides navigator, there are other telltale strings (they may differ from browser to browser). Common ones are listed below:
webdriver
__driver_evaluate
__webdriver_evaluate
__selenium_evaluate
__fxdriver_evaluate
__driver_unwrapped
__webdriver_unwrapped
__selenium_unwrapped
__fxdriver_unwrapped
_Selenium_IDE_Recorder
_selenium
calledSelenium
_WEBDRIVER_ELEM_CACHE
ChromeDriverw
driver-evaluate
webdriver-evaluate
selenium-evaluate
webdriverCommand
webdriver-evaluate-response
__webdriverFunc
__webdriver_script_fn
__$webdriverAsyncExecutor
__lastWatirAlert
__lastWatirConfirm
__lastWatirPrompt
$chrome_asyncScriptInfo
$cdc_asdjflasutopfhvcZLmcfl_
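To find out which of a site's scripts reference these markers, one option is to scan the script text for them. The following is a minimal sketch, not part of the original setup: the function name `find_markers`, the shortened `MARKERS` tuple, and the sample script are all illustrative assumptions.

```python
# Illustrative subset of the marker strings listed above (not exhaustive).
MARKERS = (
    "webdriver",
    "__driver_evaluate",
    "__webdriver_evaluate",
    "__selenium_evaluate",
    "_Selenium_IDE_Recorder",
    "calledSelenium",
    "$cdc_asdjflasutopfhvcZLmcfl_",
)


def find_markers(script_text, markers=MARKERS):
    """Return the markers that occur as substrings of script_text, in marker order."""
    return [m for m in markers if m in script_text]


# Hypothetical detection snippet, made up for this example.
sample = 'if (navigator.webdriver || window["calledSelenium"]) { block(); }'
print(find_markers(sample))
# → ['webdriver', 'calledSelenium']
```

Because this is a plain substring match, a short marker such as `webdriver` also matches inside longer names like `__webdriver_evaluate`, so a hit only indicates that a script is worth a closer look.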
Countering fingerprint detection
- WebDriver configuration
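from selenium import webdriver

options = webdriver.ChromeOptions()
# Exclude the "enable-automation" switch from Chrome's launch switches
options.add_experimental_option("excludeSwitches", ["enable-automation"])
driver = webdriver.Chrome(options=options)
# Register a script that runs before any page script, so navigator.webdriver reads as undefined
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    "source": """Object.defineProperty(navigator, 'webdriver', {get: () => undefined})""",
})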
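A quick way to check whether the override took effect is to read "navigator.webdriver" back from the page. This is a sketch under assumptions: `webdriver_flag_hidden` is a hypothetical helper name, and it relies only on Selenium's `execute_script`, which returns `None` for a JavaScript `undefined`.

```python
def webdriver_flag_hidden(driver):
    """Return True if navigator.webdriver does not report automation.

    `driver` is any object with Selenium's execute_script(script) method.
    A JavaScript `undefined` comes back as None and `false` as False;
    both count as hidden, and only a truthy value counts as exposed.
    """
    value = driver.execute_script("return navigator.webdriver")
    return not value
```

Called with the Chrome driver after a page has loaded, the helper should return `True`; a `False` result suggests the injected script did not run before the page's own scripts.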
- Rewriting the response with mitmproxy
# coding: utf-8
# modify_response.py
from mitmproxy import ctx


def response(flow):
    """Modify response data."""
    # Only rewrite the detection script itself
    if '/js/yoda.' in flow.request.url:
        # Mask the strings the script uses to detect Selenium
        for webdriver_key in ['webdriver', '__driver_evaluate', '__webdriver_evaluate', '__selenium_evaluate',
                              '__fxdriver_evaluate', '__driver_unwrapped', '__webdriver_unwrapped',
                              '__selenium_unwrapped', '__fxdriver_unwrapped', '_Selenium_IDE_Recorder', '_selenium',
                              'calledSelenium', '_WEBDRIVER_ELEM_CACHE', 'ChromeDriverw', 'driver-evaluate',
                              'webdriver-evaluate', 'selenium-evaluate', 'webdriverCommand',
                              'webdriver-evaluate-response', '__webdriverFunc', '__webdriver_script_fn',
                              '__$webdriverAsyncExecutor', '__lastWatirAlert', '__lastWatirConfirm',
                              '__lastWatirPrompt', '$chrome_asyncScriptInfo', '$cdc_asdjflasutopfhvcZLmcfl_']:
            ctx.log.info('Remove "{}" from {}.'.format(webdriver_key, flow.request.url))
            # Replace the double-quoted key with a name that no browser defines
            flow.response.text = flow.response.text.replace('"{}"'.format(webdriver_key), '"NO-SUCH-ATTR"')
        # Neutralize the direct property check and the remaining ChromeDriver references
        flow.response.text = flow.response.text.replace('t.webdriver', 'false')
        flow.response.text = flow.response.text.replace('ChromeDriver', '')
Start mitmdump with the script loaded (replace <port> with the listening port):
mitmdump.exe -p <port> -s modify_response.py
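The effect of the quoted-string replacement can be tried without running a proxy. The sketch below applies the same `str.replace` step to a made-up script body; `scrub`, the two-key list, and the sample text are assumptions for illustration and are not taken from a real script.

```python
def scrub(js_text, keys, placeholder="NO-SUCH-ATTR"):
    """Replace each double-quoted key in js_text with the double-quoted placeholder."""
    for key in keys:
        js_text = js_text.replace('"{}"'.format(key), '"{}"'.format(placeholder))
    return js_text


# Hypothetical script fragment, invented for this example.
sample = 'var names = ["__webdriver_evaluate", "calledSelenium", "length"];'
print(scrub(sample, ["__webdriver_evaluate", "calledSelenium"]))
# → var names = ["NO-SUCH-ATTR", "NO-SUCH-ATTR", "length"];
```

Only exact double-quoted occurrences are replaced, so a script that quotes the names with single quotes or builds them at run time would pass through unchanged.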