Diagram: TCP three-way handshake for connection setup and four-way handshake for teardown
The meaning of the backlog parameter of listen() in TCP/IP
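1. The client sends a SYN to the server and moves to the SYN_SENT state. When the server receives the SYN, it moves the connection to SYN_RCVD and places the request in the syns queue.
2. The server replies to the client with SYN+ACK. When the client receives it, the client moves to ESTABLISHED and sends an ACK to the server.
3. When the server receives the ACK, it moves the connection to ESTABLISHED and transfers the request from the syns queue to the accept queue.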
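The three steps can be sketched as a toy model of the two queues. This is only an illustration under simplifying assumptions: the class name ListenSocketModel and its methods are hypothetical, and the real kernel applies additional rules (timers, retransmission, syncookies) that are not modeled here.

```python
from collections import deque


class ListenSocketModel:
    """Toy model of the syns queue and accept queue; not real kernel logic."""

    def __init__(self, syn_backlog, accept_backlog):
        self.syn_backlog = syn_backlog
        self.accept_backlog = accept_backlog
        self.syns_queue = {}          # client id -> "SYN_RCVD"
        self.accept_queue = deque()   # established, waiting for accept()

    def on_syn(self, client):
        """Step 1: a SYN arrives; queue the half-open connection."""
        if len(self.syns_queue) >= self.syn_backlog:
            return False              # syns queue full: the SYN is dropped
        self.syns_queue[client] = "SYN_RCVD"
        return True

    def on_ack(self, client):
        """Step 3: the final ACK arrives; promote to the accept queue."""
        if client not in self.syns_queue:
            return False
        if len(self.accept_queue) >= self.accept_backlog:
            return False              # accept queue full: stays half-open
        del self.syns_queue[client]
        self.accept_queue.append(client)
        return True

    def accept(self):
        """The application takes one established connection, if any."""
        return self.accept_queue.popleft() if self.accept_queue else None
```

With syn_backlog=2 and accept_backlog=1, a third SYN is refused while two half-open entries are pending, and a second ACK cannot be promoted until the application calls accept().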
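The Linux kernel maintains two queues for a listening socket: the syns queue and the accept queue.
syns queue
Holds requests in the half-open (SYN_RCVD) state. Its size is set through /proc/sys/net/ipv4/tcp_max_syn_backlog, typically 512 by default, but this setting only takes effect when the system's syncookies feature is disabled. The TCP SYN flood, a common denial-of-service attack on the Internet, works by creating a large number of half-open requests and then abandoning them, so that the syns queue has no room left for legitimate requests.
accept queue
Holds requests in the fully established state. Its size is capped by /proc/sys/net/core/somaxconn: when listen() is called, the kernel takes the smaller of the backlog argument passed in and the system parameter somaxconn.
On Linux, backlog specifies the size of the complete queue, while the size of the incomplete queue can be configured system-wide by the administrator through /proc/sys/net/ipv4/tcp_max_syn_backlog.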
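The "smaller of the two" rule can be written down in a few lines. The sketch below is an approximation of the rule as described here, not kernel code; the helper names read_somaxconn and effective_backlog are hypothetical, and the fallback of 128 is only an assumption for systems where the /proc file cannot be read.

```python
from pathlib import Path

SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def read_somaxconn(default=128):
    """Return the system somaxconn, or `default` if it cannot be read."""
    try:
        return int(SOMAXCONN_PATH.read_text().strip())
    except (OSError, ValueError):
        return default


def effective_backlog(requested, somaxconn):
    """Accept-queue limit after capping the listen() argument."""
    return min(requested, somaxconn)


if __name__ == "__main__":
    limit = read_somaxconn()
    for requested in (5, 128, 65535):
        print(requested, "->", effective_backlog(requested, limit))
```

Passing a very large backlog to listen() therefore does not enlarge the accept queue unless somaxconn is raised as well.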
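If the accept queue is full, the server may report ECONNREFUSED (Connection refused) to the client. On Linux TCP the default is instead to drop the handshake packet silently so that the client retransmits; a reset is sent only when net.ipv4.tcp_abort_on_overflow is set to 1.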
[Why PHP 5.5.6 FPM: "Changed default listen() backlog to 65535"]
The reason given there is that "a backlog of 65535 is too large and causes the nginx in front (or any other client) to time out". The submitter also worked through an example: assuming FPM handles 5000 QPS, processing all 65535 queued requests takes about 13 s. By then the front-end nginx (or other client) has already timed out and closed the connection, so when FPM finishes and writes to that socket it finds the connection closed and gets "error: Broken Pipe". In nginx, redis, and apache the default backlog is 511, so the suggestion is to use 511 here as well.
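The submitter's arithmetic is easy to reproduce. This is a small sketch using the figures quoted above (5000 QPS, backlogs of 65535 and 511); the function name drain_time_seconds is hypothetical, and it assumes a constant processing rate.

```python
def drain_time_seconds(queued_requests, qps):
    """Time to work through a full queue at a constant processing rate."""
    return queued_requests / qps


if __name__ == "__main__":
    for backlog in (65535, 511):
        seconds = drain_time_seconds(backlog, 5000)
        print(f"backlog={backlog}: about {seconds:.2f} s to drain")
```

The last request in a full queue of 65535 waits roughly 13 s, while with 511 the wait is about a tenth of a second.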
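backlog is defined as the size of the queue of sockets that are connected but have not yet been handled by accept() (that is, not the queue of SYN-state sockets). If this queue is full, an ECONNREFUSED error may be sent to the client, i.e. the "Connection refused" defined in the Linux header /usr/include/asm-generic/errno.h; if the underlying protocol supports retransmission, the request may instead be ignored so that a later retry can succeed.
Before Linux 2.2, the backlog size covered both the half-open and the fully established queues. Since Linux 2.2 it has been split into two backlogs that separately limit the queue of incomplete connections in the SYN_RCVD state and the queue of completed connections in the ESTABLISHED state. The queue targeted by the common TCP SYN flood DoS attack is the one sized by /proc/sys/net/ipv4/tcp_max_syn_backlog; see "Diagnosing and handling TCP flood attacks (SYN Flood)".
When listen() is called, the kernel takes the minimum of the backlog argument and the system setting /proc/sys/net/core/somaxconn as the size of the queue of connections that have reached ESTABLISHED, completed the TCP handshake, and are waiting for the server program to accept() them. Before kernel 2.4.25 this limit was hard-coded as the constant SOMAXCONN, with a default of 128. Since kernel 2.4.25 it can be changed through /proc/sys/net/core/somaxconn (or persistently via /etc/sysctl.conf and the like). I have roughly sketched the flow in a diagram, as follows: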
Starting uwsgi with a listen value greater than 128 fails with the error:
Listen queue size is greater than the system max net.core.somaxconn (128).
This value comes from:
cat /proc/sys/net/core/somaxconn
128
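A startup check of this kind could look roughly like the following. This is not uwsgi's actual code, only a hypothetical sketch that refuses an oversized listen value instead of silently capping it; the function name check_listen_queue is an assumption.

```python
def check_listen_queue(requested, somaxconn):
    """Reject a listen queue size that exceeds the system maximum."""
    if requested > somaxconn:
        raise ValueError(
            "Listen queue size is greater than the system max "
            f"net.core.somaxconn ({somaxconn})."
        )
    return requested


if __name__ == "__main__":
    print(check_listen_queue(100, 128))
```

Failing loudly makes the mismatch visible at startup, whereas a plain listen() call would quietly use the smaller value.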
- Checking how large a process's backlog is
For sockets in the LISTEN state, Send-Q is the backlog size
> ss -tl
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:8011 *:*
LISTEN 0 128 127.0.0.1:11211 *:*
LISTEN 0 128 *:8012 *:*
LISTEN 0 10 *:9005 *:*
LISTEN 0 128 *:8013 *:*
LISTEN 0 5 *:9006 *:*
LISTEN 0 128 127.0.0.1:2222 *:*
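Output like the above can be turned into a small lookup table. The sketch below parses a few rows copied from the listing; the function name listen_backlogs is hypothetical, and it assumes the whitespace-separated column layout shown here, where for LISTEN sockets Recv-Q is the current accept-queue length and Send-Q is the backlog.

```python
SAMPLE_SS_OUTPUT = """\
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:8011 *:*
LISTEN 0 10 *:9005 *:*
LISTEN 0 5 *:9006 *:*
"""


def listen_backlogs(ss_output):
    """Map local address -> (queued connections, backlog) for LISTEN rows."""
    result = {}
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not a listening socket.
        if len(fields) < 5 or fields[0] != "LISTEN":
            continue
        recv_q, send_q, local = int(fields[1]), int(fields[2]), fields[3]
        result[local] = (recv_q, send_q)
    return result


if __name__ == "__main__":
    for address, (queued, backlog) in listen_backlogs(SAMPLE_SS_OUTPUT).items():
        print(address, "queued:", queued, "backlog:", backlog)
```

A Recv-Q value that stays close to Send-Q would suggest the application is not calling accept() fast enough.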
Some processes do not listen on an IP port but on a UNIX socket,
so the following command can be used to find the socket a process listens on and its backlog
ss -pl|grep owan
u_str LISTEN 0 13 /tmp/owan_web.sock 27171486 * 0
# I set the listen backlog of owan_web.sock to 13, and it is shown correctly here
sudo netstat -a -p --unix|grep owan
unix 2 [ ACC ] STREAM LISTENING 27171486 5424/uwsgi /tmp/owan_web.sock
unix 3 [ ] STREAM CONNECTING 0 - /tmp/owan_web.sock
unix 3 [ ] STREAM CONNECTING 0 - /tmp/owan_web.sock
unix 3 [ ] STREAM CONNECTING 0 - /tmp/owan_web.sock
unix 3 [ ] STREAM CONNECTING 0 - /tmp/owan_web.sock
# Similarly, connection states can be looked up by pid; with multiple requests in flight there are states such as CONNECTED and CONNECTING
Linux 2.4.7 a backlog of 3 given to listen() results in up to 6 connections being queued.
Tested: when uwsgi listens on a UNIX socket and the listen backlog is set to 5, the concurrency used in an ab stress test cannot exceed 5+3,
otherwise the following error is reported:
connect() to unix:/tmp/owan_web.sock failed (11: Resource temporarily unavailable) while connecting to upstream
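The effect can be reproduced without uwsgi or nginx. The sketch below is an assumption-laden illustration: it binds a throwaway UNIX socket (the path demo.sock is hypothetical), never calls accept(), and counts how many non-blocking connects succeed before one fails. On Linux the failure is typically errno 11 (EAGAIN), matching the nginx message above; the exact count and errno depend on the platform.

```python
import os
import socket
import tempfile


def fill_unix_backlog(backlog, max_attempts=64):
    """Connect without accepting; return (queued_count, errno_or_None)."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo.sock")   # hypothetical socket path
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    clients = []
    failure = None
    try:
        server.bind(path)
        server.listen(backlog)                 # accept() is never called
        for _ in range(max_attempts):
            client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            client.setblocking(False)
            try:
                client.connect(path)
            except OSError as exc:             # e.g. EAGAIN when the queue is full
                failure = exc.errno
                client.close()
                break
            clients.append(client)             # keep it open so it stays queued
        return len(clients), failure
    finally:
        for client in clients:
            client.close()
        server.close()
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(tmpdir)


if __name__ == "__main__":
    queued, err = fill_unix_backlog(5)
    print("connections queued:", queued, "errno on failure:", err)
```

Because the client is non-blocking, as nginx is, a full queue surfaces immediately as an error rather than as a stalled connect.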
ab -c 132 -n 1000 http://runmo.ouwan.com/mobi/essayDetail/?id=3
Stress test result: quite normal
Server Software: nginx/1.4.6
Server Hostname: mo.ouwan.com
Server Port: 80
Document Path: /mobi/essayDetail/?id=3
Document Length: 29903 bytes
Concurrency Level: 130
Time taken for tests: 2.668 seconds
Complete requests: 122
Failed requests: 0
Total transferred: 3666466 bytes
HTML transferred: 3648166 bytes
Requests per second: 45.72 [#/sec] (mean)
Time per request: 2843.376 [ms] (mean)
Time per request: 21.872 [ms] (mean, across all concurrent requests)
Transfer rate: 1341.83 [Kbytes/sec] received
However, when uwsgi listens on an IP port (runserver's backlog defaults to 5), the ab stress test concurrency can go far beyond 5
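nginx high-concurrency tuning
Kernel level:
net.core.somaxconn = 4096 upper limit on connections waiting in a listen queue
net.ipv4.tcp_tw_recycle = 1 fast recycling of TIME_WAIT TCP connections (known to break clients behind NAT; removed in Linux 4.12)
net.ipv4.tcp_tw_reuse = 1 reuse of TIME_WAIT TCP connections
net.ipv4.tcp_syncookies = 0 no SYN flood protection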
ulimit -n 30000
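The ulimit -n value corresponds to the RLIMIT_NOFILE resource limit, which a process can inspect for itself. This is a minimal sketch using Python's standard resource module (Unix only); the helper names nofile_limits and can_raise_soft_limit are hypothetical.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def can_raise_soft_limit(target, hard):
    """An unprivileged process may raise its soft limit up to the hard limit."""
    return hard == resource.RLIM_INFINITY or target <= hard


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft:", soft, "hard:", hard)
    print("30000 reachable without privileges:", can_raise_soft_limit(30000, hard))
```

Each accepted connection consumes a file descriptor, so a large listen backlog is of little use if this limit is reached first.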
Nginx level:
Fix: in nginx.conf, increase worker_connections
worker_connections 10240;
worker_rlimit_nofile 10000;
keepalive_timeout 0;
Which concurrency strategy the web server uses is the key factor in its maximum concurrency
uwsgi: your server socket listen backlog is limited to 100 connections
Note that a "listen backlog" of 100 connections doesn't mean that your server can only handle 100 simultaneous (or total) connections - this is instead dependent on the number of configured processes or threads. The listen backlog is a socket setting telling the kernel how to limit the number of outstanding (as yet unaccepted) connections in the listen queue of a listening socket. If the number of pending connections exceeds the specified size, new ones are automatically rejected. A functioning server regularly servicing its connections should not require a large backlog size.
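That distinction can be seen with a tiny loopback server. The sketch below assumes a local TCP socket on 127.0.0.1 with an OS-chosen port and a backlog of 1; the function name serve_sequentially is hypothetical. Because every connection is accepted promptly, the total number handled is not bounded by the backlog.

```python
import socket


def serve_sequentially(total, backlog=1):
    """Handle `total` connections one after another on a tiny backlog."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)
    handled = 0
    try:
        server.bind(("127.0.0.1", 0))          # port 0: let the OS pick one
        server.listen(backlog)
        address = server.getsockname()
        for _ in range(total):
            with socket.create_connection(address, timeout=5) as client:
                conn, _peer = server.accept()  # drains the listen queue
                with conn:
                    conn.settimeout(5)
                    client.sendall(b"ping")
                    if conn.recv(4, socket.MSG_WAITALL) == b"ping":
                        handled += 1
        return handled
    finally:
        server.close()


if __name__ == "__main__":
    print("handled:", serve_sequentially(10, backlog=1))
```

The backlog only matters for connections that arrive faster than the server accepts them.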
upstream prematurely closed connection while reading response header from upstream,
104: Connection reset by peer
http://www.cppblog.com/thisisbin/archive/2010/02/07/107444.html
http://www.reibang.com/p/e6f2036621f4
https://stackoverflow.com/questions/12893379/listen-queue-length-in-socket-programing-in-c
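UNIX Domain Socket vs. TCP/IP Socket
The socket API was originally designed for network communication, but an IPC mechanism was later built on the socket framework: the UNIX Domain Socket.
Network sockets can also be used for inter-process communication on the same host (through the loopback address 127.0.0.1),
but UNIX Domain Sockets are more efficient for IPC: they bypass the network protocol stack, with no packing and unpacking, checksum calculation, or sequence-number and acknowledgment bookkeeping; application-layer data is simply copied from one process to another.
Compared with TCP sockets on the same host, UNIX domain sockets are reported to transfer data about twice as fast.
This is because an IPC mechanism is inherently reliable communication, whereas network protocols are designed for unreliable communication.
UNIX Domain Sockets also provide both stream-oriented and datagram-oriented APIs, similar to TCP and UDP, but message-oriented UNIX Domain Sockets are reliable as well: messages are neither lost nor reordered.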
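The two API flavors can be compared with socket.socketpair. This is a minimal sketch, assuming a Unix-like system; the function names datagram_roundtrip and stream_roundtrip are hypothetical. The datagram variant preserves message boundaries and order, while the stream variant delivers a plain byte stream.

```python
import socket


def datagram_roundtrip(messages):
    """Send messages over an AF_UNIX datagram pair; return them as received."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        for message in messages:
            sender.send(message)
        # Each recv() returns exactly one message, in sending order.
        return [receiver.recv(4096) for _ in messages]
    finally:
        sender.close()
        receiver.close()


def stream_roundtrip(messages):
    """Send messages over an AF_UNIX stream pair; return the byte stream."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        for message in messages:
            sender.sendall(message)
        expected = sum(len(message) for message in messages)
        data = b""
        while len(data) < expected:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            data += chunk
        return data                            # boundaries are not preserved
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(datagram_roundtrip([b"a", b"bc"]))
    print(stream_roundtrip([b"a", b"bc"]))
```

Small messages are used so that the sends never block on a full socket buffer.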
A UNIX socket is an inter-process communication mechanism that allows bidirectional data exchange between processes running on the same machine.
IP sockets (especially TCP/IP sockets) are a mechanism allowing communication between processes over the network.
In some cases, you can use TCP/IP sockets to talk with processes running on the same computer (by using the loopback interface).
UNIX domain sockets know that they’re executing on the same system, so they can avoid some checks and operations (like routing);
which makes them faster and lighter than IP sockets.
So if you plan to communicate with processes on the same host, this is a better option than IP sockets.
strace: commands for tracing a process
strace -tfp PID
strace -o output.txt cmd
strace -o output.txt -fp PID
# Count how many times the command makes each system call
strace -c command
# The -o option saves the output of strace to a file.
sudo strace -o process_strace -p 3229
# The -e option restricts the output to specific system calls (for example open, write, and so on)
strace -e open cat dead.letter
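When these invocations are scripted, it can help to assemble the argument list in one place. The helper below is a hypothetical sketch (strace_argv is not part of any library); it only builds the argv for the flags used above (-t, -f, -c, -o, -e, -p) and leaves running it, for example with subprocess.run, to the caller.

```python
def strace_argv(command=None, pid=None, output=None, follow_forks=False,
                timestamps=False, count=False, syscalls=None):
    """Build an strace argument list for a command or an existing pid."""
    if (command is None) == (pid is None):
        raise ValueError("give exactly one of command or pid")
    argv = ["strace"]
    if timestamps:
        argv.append("-t")                      # prefix lines with the time
    if follow_forks:
        argv.append("-f")                      # also trace child processes
    if count:
        argv.append("-c")                      # per-syscall summary
    if output:
        argv += ["-o", output]                 # write the trace to a file
    if syscalls:
        argv += ["-e", "trace=" + ",".join(syscalls)]
    if pid is not None:
        argv += ["-p", str(pid)]               # attach to a running process
    else:
        argv += list(command)
    return argv


if __name__ == "__main__":
    print(strace_argv(pid=3229, output="process_strace"))
    print(strace_argv(command=["cat", "dead.letter"], syscalls=["open"]))
```

On newer systems file opens usually go through openat, so passing both "open" and "openat" in syscalls may be needed to see them.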
Related references: 1. https://lesliezhu.github.io/2014/06/19/strace%E5%91%BD%E4%BB%A4%E6%9F%A5%E7%9C%8B%E8%BF%9B%E7%A8%8B%E7%9A%84%E7%B3%BB%E7%BB%9F%E8%B0%83%E7%94%A8/
2.https://linux.cn/article-3935-1.html