Flask Sentry
Sentry
Users and logs provide clues. Sentry provides answers.
What's Sentry?
Sentry fundamentally is a service that helps you monitor and fix crashes in realtime. The server is in Python, but it contains a full API for sending events from any language, in any application.
https://github.com/getsentry/sentry
We won't go into detail here; see the official site for specifics. In short, Sentry is an open-source error-log collection service that supports the mainstream languages.
Using Sentry with Flask
How do you use Sentry from Flask?
See the official site. Basic integration is very simple and takes about two lines of code.
Problem
After integrating Sentry, we found that the flask.Request.get_data method we had been using could no longer return the raw request body!
get_data(cache=True, as_text=False, parse_form_data=False)
This reads the buffered incoming data from the client into one bytestring. By default this is cached but that behavior can be changed by setting cache to False.
Usually it’s a bad idea to call this method without checking the content length first as a client could send dozens of megabytes or more to cause memory problems on the server.
Note that if the form data was already parsed this method will not return anything as form data parsing does not cache the data like this method does. To implicitly invoke form data parsing function set parse_form_data to True. When this is done the return value of this method will be an empty string if the form parser handles the data. This generally is not necessary as if the whole data is cached (which is the default) the form parser will use the cached data to parse the form data. Please be generally aware of checking the content length first in any case before calling this method to avoid exhausting server memory.
If as_text is set to True the return value will be a decoded unicode string.
Reading the Flask Sentry source code, we found that Sentry records the request information before every request. Given what it is meant to do, this implementation is to be expected.
The implementation accesses request.form or request.data.
Source code
werkzeug.wrappers.BaseRequest.data
Contains the incoming request data as string in case it came with a mimetype Werkzeug does not handle.
In other words, Sentry reads request.data before our own code does. When request.data cannot be decoded as UTF-8, that content is discarded, so whatever we read afterwards is empty.
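The read-once behaviour behind this can be reproduced with a toy model. This is only a sketch: `ToyRequest` is a hypothetical stand-in written for illustration, not Werkzeug's real request class, and its `form` property merely imitates the documented behaviour that form parsing consumes the input without caching it.

```python
import io


class ToyRequest:
    """Hypothetical stand-in for a request; NOT Werkzeug's real class."""

    def __init__(self, body):
        self.stream = io.BytesIO(body)  # the body can only be read once
        self._cached_data = None

    def get_data(self, cache=True):
        # Return the cached bytes if an earlier call stored them.
        if self._cached_data is not None:
            return self._cached_data
        data = self.stream.read()
        if cache:
            self._cached_data = data
        return data

    @property
    def form(self):
        # Imitates form parsing: if nothing is cached yet, the stream
        # is drained and the raw bytes are thrown away.
        if self._cached_data is None:
            self.stream.read()
        return {}


body = b"<msg>\xc4\xe3\xba\xc3</msg>"  # GBK bytes, not valid UTF-8

# Case 1: something (for example an error reporter) touches form first.
req = ToyRequest(body)
_ = req.form
raw_after_form = req.get_data()

# Case 2: the raw body is cached before anything else touches it.
req = ToyRequest(body)
req.get_data(cache=True)
_ = req.form
raw_after_cache = req.get_data()
```

In the first case the raw bytes are gone by the time our code asks for them; in the second they survive because they were cached up front.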
Solution
Option 1
https://github.com/getsentry/raven-python/issues/457. We are certainly not the first to run into this problem.
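from flask import abort, request

@app.before_request
def enable_form_raw_cache():
    # Cache the raw body before Sentry reads request.form / request.data.
    if request.path.startswith('/redacted'):
        # content_length is None when the Content-Length header is missing.
        if (request.content_length or 0) > 1024 * 1024:  # 1 MB
            abort(413)  # Payload too large
        request.get_data(parse_form_data=False, cache=True)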
Option 2
from flask import request

@app.before_request
def enable_form_raw_cache():
    # Path prefixes whose raw request body must be preserved.
    cache_path_list = [
        '/PATH_FOO',
        '/PATH_BAR',
    ]
    path = request.path
    if any(path.startswith(i) for i in cache_path_list):
        request.get_data(parse_form_data=False, cache=True)
The overall idea in both cases is to cache request.data before Sentry accesses it, and to do so selectively.
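The two handlers above share one decision: which requests get their raw body cached, and which are rejected as too large. A framework-independent sketch of that decision might look like the following. `should_cache_raw_body`, `CACHE_PREFIXES`, and `MAX_BODY_BYTES` are hypothetical names introduced here, and treating a missing Content-Length as 0 is an assumption.

```python
# Placeholder path prefixes, as in the second handler above.
CACHE_PREFIXES = ("/PATH_FOO", "/PATH_BAR")
# 1 MB size limit, as in the first handler above.
MAX_BODY_BYTES = 1024 * 1024


def should_cache_raw_body(path, content_length):
    """Return 'skip', 'reject', or 'cache' for a request (hypothetical helper)."""
    # str.startswith accepts a tuple and matches any of its prefixes.
    if not path.startswith(CACHE_PREFIXES):
        return "skip"
    # Assumption: a missing Content-Length (None) is treated as 0.
    if (content_length or 0) > MAX_BODY_BYTES:
        return "reject"
    return "cache"
```

Inside a before_request handler, "reject" would map to abort(413) and "cache" to request.get_data(parse_form_data=False, cache=True).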
N more things
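Further thoughts
- Why is there data that is not UTF-8 encoded?
  - This one is a bit of a joke: we have to receive GBK-encoded XML data. We won't go into the root cause of that here.
- Why doesn't Flask cache all of the raw objects?
  - This is probably a good question. One possible reason is that keeping too many raw objects consumes memory. Also, once the request data has been read out of the stream and parsed into structured form, there is no longer any need to keep the raw object.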
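On the first question, a short sketch shows why GBK-encoded XML trips up anything that assumes UTF-8. The sample payload is made up for illustration.

```python
# Made-up GBK-encoded XML body (two Chinese characters inside <msg>).
payload = "<msg>\u4f60\u597d</msg>".encode("gbk")

# Decoding as UTF-8 fails because the GBK byte pairs are not valid UTF-8.
try:
    payload.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the encoding the sender actually used works.
text = payload.decode("gbk")
```

This is why the raw bytes have to be preserved: only our own code knows that the body should be decoded as GBK.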