Translated from:
http://www.haproxy.org/#doc1.4
1. Quick reminder about HTTP
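When HAProxy runs in HTTP mode, both the request and the response are fully analyzed and indexed, so it becomes possible to match on almost anything found in an HTTP message.
Understanding how HTTP requests and responses are formed makes it much easier to write correct rules in the configuration.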
1.1. The HTTP transaction model
The HTTP protocol is transaction-driven: each request gets one and only one response. Traditionally, the client establishes a connection to the server, sends an HTTP request, the server replies with a response, and the connection is closed. A new request requires a new connection:
[CON1] [REQ1] ... [RESP1] [CLO1] [CON2] [REQ2] ... [RESP2] [CLO2] ...
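This mode is called "HTTP close" mode: there are as many connections as there are HTTP transactions. Since the server closes the connection right after sending the response, the client does not need to know the content length.
Due to the transactional nature of the protocol, it was possible to improve on this. Between two consecutive transactions, the server does not close the connection right after the first response.
In this mode, the server must tell the client the length of the response content so that the client does not wait indefinitely. A special header is used for this: "Content-length". This mode is called "keep-alive" mode: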
[CON] [REQ1] ... [RESP1] [REQ2] ... [RESP2] [CLO] ...
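This mode reduces the latency between two transactions and relieves the server of some connection setup and teardown work. It is generally better than "HTTP close" mode, but not always, because clients often limit their number of concurrent connections to a small value.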
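To illustrate why "Content-length" matters in keep-alive mode, here is a minimal Python sketch that splits two back-to-back responses read from a single connection. The function name `split_responses` and the sample byte stream are assumptions made for illustration; the sketch assumes CRLF line endings and that every response carries a "Content-length" header, and it does not handle chunked encoding.

```python
def split_responses(stream: bytes) -> list[tuple[bytes, bytes]]:
    """Split back-to-back HTTP responses into (head, body) pairs.

    Illustrative only: assumes CRLF line endings and a Content-length
    header on every response.
    """
    responses = []
    pos = 0
    while pos < len(stream):
        # The head ends at the first empty line.
        head_end = stream.index(b"\r\n\r\n", pos)
        head = stream[pos:head_end]
        length = 0
        # Skip the status line, then look for Content-length (any case).
        for line in head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        body_start = head_end + 4
        responses.append((head, stream[body_start:body_start + length]))
        # Without the length we could not know where the next response starts.
        pos = body_start + length
    return responses


# Hypothetical sample: two responses on one keep-alive connection.
stream = (
    b"HTTP/1.1 200 OK\r\nContent-length: 5\r\n\r\nhello"
    b"HTTP/1.1 200 OK\r\nContent-length: 3\r\n\r\nbye"
)
bodies = [body for _, body in split_responses(stream)]
```

In "HTTP close" mode the same client could simply read until the connection closes, which is why the length is not needed there.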
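The last improvement is "pipelining" mode. It still uses keep-alive, but the client does not wait for the first response before sending the second request. This is useful for fetching the large number of images that make up a page: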
[CON] [REQ1] [REQ2] ... [RESP1] [RESP2] [CLO] ...
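This mode clearly improves performance because the network latency between one request and the next is eliminated. Many HTTP agents do not correctly support pipelining, since HTTP offers no way to associate a response with its request. For this reason, the server must send its responses in exactly the same order as the requests were received.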
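Because HTTP carries no identifier linking a response to its request, a pipelining client can only pair them by arrival order. The following Python sketch shows that first-in, first-out pairing; `match_pipelined` and the sample strings are hypothetical names used only for illustration.

```python
from collections import deque


def match_pipelined(requests, responses):
    """Pair each response with the oldest request still awaiting a reply."""
    pending = deque(requests)
    pairs = []
    for response in responses:
        if not pending:
            raise ValueError("response received with no outstanding request")
        # The only available correlation is the order of arrival.
        pairs.append((pending.popleft(), response))
    return pairs


pairs = match_pipelined(
    ["GET /a.png", "GET /b.png"],
    ["200 OK", "404 Not Found"],
)
```

If a server answered out of order, this pairing would silently attach the wrong response to each request, which is why strict ordering is required.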
By default HAProxy operates in a "tunnel-like" mode with regard to persistent connections: for each connection, it processes the first request and forwards everything else (including additional requests) to the selected server. Once established, the connection is persistent on both the client and server sides.
With "option http-server-close", HAProxy keeps connections persistent on the client side while handling every incoming request individually and dispatching it to a backend server; the server side works in "HTTP close" mode.
With "option httpclose", both the client and server sides work in "HTTP close" mode.
If a server misbehaves in "HTTP close" mode, "option forceclose" or "option http-pretend-keepalive" may help.
1.2. HTTP request
First, let's consider this HTTP request :
  Line     Contents
  number
     1     GET /serv/login.php?lang=en&profile=2 HTTP/1.1
     2     Host: www.mydomain.com
     3     User-agent: my small browser
     4     Accept: image/jpeg, image/gif
     5     Accept: image/png
1.2.1. The Request line
Line 1 is the "request line". It is always composed of 3 fields, usually separated by whitespace (LWS) :
- a METHOD : GET
- a URI : /serv/login.php?lang=en&profile=2
- a version tag : HTTP/1.1
This structure is easy to parse, and HAProxy parses it by itself, so there is no need to write complex regular expressions to extract the fields.
Note: LWS (linear white spaces) are commonly spaces, but can also be tabs or line feeds/carriage returns followed by spaces/tabs.
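As a rough illustration of how simple the request line is to split, here is a Python sketch. The helper name `parse_request_line` is an assumption for this example, and `str.split()` with no argument is used as an approximation of LWS handling since it splits on any run of spaces, tabs or line breaks.

```python
def parse_request_line(line: str) -> tuple[str, str, str]:
    """Split a request line into (method, uri, version)."""
    fields = line.split()  # any run of whitespace acts as one separator
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    method, uri, version = fields
    return method, uri, version


method, uri, version = parse_request_line(
    "GET /serv/login.php?lang=en&profile=2 HTTP/1.1"
)
```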
The URI itself can have several forms :
- a "relative URI" :
  /serv/login.php?lang=en&profile=2
  It is a complete URL without the host part. This is generally what servers, reverse proxies and transparent proxies receive.
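- an "absolute URI", also called a "URL", composed of :
  scheme: the protocol name followed by "://"
  host: a host name or IP address
  port: optional, written as ":PORT"
  relative URI: begins at the first "/" after the address part
  This is generally what proxies receive, but a server supporting HTTP/1.1 must accept this form too.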
- a star ('*') :
  This form is only accepted in association with the OPTIONS method and is not relayable. It is used to query the capabilities of the next hop.
- an address:port combination : 192.168.0.12:80
  This is used with the CONNECT method, which establishes TCP tunnels through HTTP proxies, generally for HTTPS but sometimes for other protocols too.
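The four URI forms listed above can be told apart with a few string checks. The Python sketch below is only an approximation for illustration: `classify_request_uri` and its return labels are hypothetical, and it does not validate the URI beyond what is needed to pick a form.

```python
def classify_request_uri(method: str, uri: str) -> str:
    """Return which of the four request-URI forms `uri` looks like."""
    if uri == "*":
        # Only meaningful together with the OPTIONS method.
        return "star" if method == "OPTIONS" else "invalid"
    if uri.startswith("/"):
        # Checked before "://" so that a query string containing a URL
        # does not make a relative URI look absolute.
        return "relative"
    if "://" in uri:
        return "absolute"
    if method == "CONNECT" and ":" in uri:
        return "address:port"
    return "invalid"


forms = [
    classify_request_uri("GET", "/serv/login.php?lang=en&profile=2"),
    classify_request_uri("GET", "http://192.168.0.12:8080/serv/login.php"),
    classify_request_uri("OPTIONS", "*"),
    classify_request_uri("CONNECT", "192.168.0.12:80"),
]
```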
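The relative URI /serv/login.php?lang=en&profile=2 has two sub-parts.
/serv/login.php, the part before the question mark, is the "path". It is typically the relative path to an object on the server.
lang=en&profile=2, the part after the question mark, is the "query string". It is mostly used with GET requests sent to dynamic scripts, and its meaning is specific to the language, framework or application in use.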
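For readers who want to reproduce this split, Python's standard `urllib.parse` module separates the path from the query string. This is only an illustration of the two sub-parts, not a description of how HAProxy does it internally.

```python
from urllib.parse import parse_qs, urlsplit

parts = urlsplit("/serv/login.php?lang=en&profile=2")

path = parts.path            # the part before the question mark
query_string = parts.query   # the part after the question mark

# parse_qs maps each parameter name to the list of its values.
params = parse_qs(query_string)
```

How the parameters in `params` are interpreted is entirely up to the application receiving the request.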
1.2.2. The request headers
The headers start at the second line. They are composed of a name at the
beginning of the line, immediately followed by a colon (':'). Traditionally,
an LWS is added after the colon but that's not required. Then come the values.
Multiple identical headers may be folded into one single line, delimiting the
values with commas, provided that their order is respected. This is commonly
encountered in the "Cookie:" field. A header may span over multiple lines if
the subsequent lines begin with an LWS. In the example in 1.2, lines 4 and 5
define a total of 3 values for the "Accept:" header.
In the example, the headers start at line 2, each in the form "header_name: value" :
     2     Host: www.mydomain.com
     3     User-agent: my small browser
     4     Accept: image/jpeg, image/gif
     5     Accept: image/png
           (empty line)
Lines 4 and 5 can be folded into a single line :
Accept: image/jpeg, image/gif, image/png
Contrary to a common mis-conception, header names are not case-sensitive, and
their values are not either if they refer to other header names (such as the
"Connection:" header).
In short, header names are case-insensitive.
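The folding of repeated headers and the case-insensitivity of their names can be sketched together in Python. `merge_headers` is a hypothetical helper for illustration; it lower-cases names as one possible way to compare them and joins values in the order they were seen.

```python
def merge_headers(lines):
    """Fold repeated headers into one comma-separated value per name."""
    merged = {}
    for line in lines:
        name, _, value = line.partition(":")
        key = name.strip().lower()  # names compare case-insensitively
        merged.setdefault(key, []).append(value.strip())
    # Order of values is preserved, as folding requires.
    return {key: ", ".join(values) for key, values in merged.items()}


headers = merge_headers([
    "Host: www.mydomain.com",
    "Accept: image/jpeg, image/gif",
    "ACCEPT: image/png",
])
```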
The end of the headers is indicated by the first empty line. People often say
that it's a double line feed, which is not exact, even if a double line feed
is one valid form of empty line.
In short, the headers end with an empty line; a double line feed (LF LF) is one valid form of empty line.
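A small Python sketch of locating the end of the headers, accepting both CRLF and bare LF line endings. The name `split_head` and the regular expression are assumptions chosen for this illustration.

```python
import re

# An empty line is a line break immediately followed by another line break;
# each break may be CRLF or a bare LF.
_EMPTY_LINE = re.compile(rb"\r?\n\r?\n")


def split_head(raw: bytes):
    """Return (head, rest) split at the first empty line, or None."""
    match = _EMPTY_LINE.search(raw)
    if match is None:
        return None  # headers are not complete yet
    return raw[:match.start()], raw[match.end():]


crlf = split_head(b"Host: www.mydomain.com\r\n\r\nbody")
lf = split_head(b"Host: www.mydomain.com\n\nbody")
```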
Fortunately, HAProxy takes care of all these complex combinations when indexing
headers, checking values and counting them, so there is no reason to worry
about the way they could be written, but it is important not to accuse an
application of being buggy if it does unusual, valid things.
In short, HAProxy parses all of these forms correctly.
Important note:
As suggested by RFC2616, HAProxy normalizes headers by replacing line breaks
in the middle of headers by LWS in order to join multi-line headers. This
is necessary for proper analysis and helps less capable HTTP parsers to work
correctly and not to be fooled by such complex constructs.
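The normalization described in this note can be approximated in Python as follows. This is a sketch of the idea only, not HAProxy's actual implementation; the function name `unfold_headers` is hypothetical, and it collapses each line break plus the leading whitespace of the continuation line into a single space.

```python
import re


def unfold_headers(head: str) -> str:
    """Join continuation lines (those starting with LWS) to the previous one."""
    return re.sub(r"\r?\n[ \t]+", " ", head)


folded = "Accept: image/jpeg,\r\n  image/gif\r\nHost: www.mydomain.com"
unfolded = unfold_headers(folded)
```

After unfolding, each header occupies exactly one line, so a simpler line-based parser can process it.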
1.3. HTTP response
Here is an HTTP response :
  Line     Contents
  number
     1     HTTP/1.1 200 OK
     2     Content-length: 350
     3     Content-Type: text/html
As a special case, HTTP supports so called "Informational responses" as status
codes 1xx. These messages are special in that they don't convey any part of the
response, they're just used as sort of a signaling message to ask a client to
continue to post its request for instance. In the case of a status 100 response
the requested information will be carried by the next non-100 response message
following the informational one. This implies that multiple responses may be
sent to a single request, and that this only works when keep-alive is enabled
(1xx messages are HTTP/1.1 only). HAProxy handles these messages and is able to
correctly forward and skip them, and only process the next non-100 response. As
such, these messages are neither logged nor transformed, unless explicitly
stated otherwise. Status 101 messages indicate that the protocol is changing
over the same connection and that haproxy must switch to tunnel mode, just as
if a CONNECT had occurred. Then the Upgrade header would contain additional
information about the type of protocol the connection is switching to.
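The handling of informational responses described above can be summarized with a short Python sketch. It models only the decision logic on status codes, under the assumption that the responses to one request arrive as a simple sequence; `process_responses` and its labels are hypothetical.

```python
def process_responses(status_codes):
    """Decide what to do with the responses received for one request."""
    for code in status_codes:
        if code == 101:
            # Protocol switch: stop parsing HTTP and tunnel the connection.
            return ("tunnel", code)
        if 100 <= code <= 199:
            # Other informational messages are forwarded and skipped.
            continue
        # The first non-1xx response is the one that gets processed.
        return ("final", code)
    return ("incomplete", None)


after_continue = process_responses([100, 200])
after_upgrade = process_responses([101])
```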
1.3.1. The Response line
Line 1 is the "response line". It is always composed of 3 fields :
- a version tag : HTTP/1.1
- a status code : 200
- a reason : OK
The status code is always 3-digit. The first digit indicates a general status :
- 1xx = informational message to be skipped (eg: 100, 101)
- 2xx = OK, content is following (eg: 200, 206)
- 3xx = OK, no content following (eg: 302, 304)
- 4xx = error caused by the client (eg: 401, 403, 404)
- 5xx = error caused by the server (eg: 500, 502, 503)
Please refer to RFC2616 for the detailed meaning of all such codes. The
"reason" field is just a hint, but is not parsed by clients. Anything can be
found there, but it's a common practice to respect the well-established
messages. It can be composed of one or multiple words, such as "OK", "Found",
or "Authentication Required".
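As an illustration, the response line can be split and its status class looked up from the first digit. The Python sketch below uses a hypothetical helper `parse_status_line`; the class labels mirror the list above, and the reason is returned as-is since it is only a hint.

```python
STATUS_CLASSES = {
    "1": "informational",
    "2": "OK, content is following",
    "3": "OK, no content following",
    "4": "error caused by the client",
    "5": "error caused by the server",
}


def parse_status_line(line: str):
    """Split a response line into (version, code, reason, status class)."""
    parts = line.split(None, 2)  # the reason may contain spaces
    if len(parts) < 2:
        raise ValueError("response line needs a version and a status code")
    version, code = parts[0], parts[1]
    reason = parts[2] if len(parts) == 3 else ""
    if len(code) != 3 or not code.isdigit():
        raise ValueError("status code must be exactly 3 digits")
    return version, int(code), reason, STATUS_CLASSES.get(code[0], "unknown")


ok = parse_status_line("HTTP/1.1 200 OK")
auth = parse_status_line("HTTP/1.1 401 Authentication Required")
```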
HAProxy may emit the following status codes by itself :
  Code  When / reason
  200   access to stats page, and when replying to monitoring requests
  301   when performing a redirection, depending on the configured code
  302   when performing a redirection, depending on the configured code
  303   when performing a redirection, depending on the configured code
  307   when performing a redirection, depending on the configured code
  308   when performing a redirection, depending on the configured code
  400   for an invalid or too large request
  401   when an authentication is required to perform the action (when
        accessing the stats page)
  403   when a request is forbidden by a "block" ACL or "reqdeny" filter
  408   when the request timeout strikes before the request is complete
  500   when haproxy encounters an unrecoverable internal error, such as a
        memory allocation failure, which should never happen
  502   when the server returns an empty, invalid or incomplete response, or
        when an "rspdeny" filter blocks the response.
  503   when no server was available to handle the request, or in response to
        monitoring requests which match the "monitor fail" condition
  504   when the response timeout strikes before the server responds
The 4xx and 5xx error codes above may be customized (see "errorloc" in section
4.2).
1.3.2. The response headers
Response headers work exactly like request headers, and as such, HAProxy uses
the same parsing function for both. Please refer to paragraph 1.2.2 for more
details.