Wireshark is arguably the go-to tool for network troubleshooting: it is packed with practical features. This article works through a course lab, using Wireshark to diagnose some typical network scenarios.
1 Setting Up the Test Environment
Both hosts run CentOS 8.5, with IPs 192.168.31.50 and 192.168.31.200:
[root@localhost ~]# cat /etc/centos-release
CentOS Linux release 8.5.2111
[root@localhost ~]# ip addr show ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:0c:29:cc:ff:99 brd ff:ff:ff:ff:ff:ff
inet 192.168.31.50/24 brd 192.168.31.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet6 fe80::31bf:afff:c603:4fa6/64 scope link noprefixroute
valid_lft forever preferred_lft forever
[root@MiWiFi-RA72-srv wrk-master]# ip addr show ens33
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 00:0c:29:ed:16:d1 brd ff:ff:ff:ff:ff:ff
inet 192.168.31.200/24 brd 192.168.31.255 scope global dynamic noprefixroute ens33
valid_lft 39956sec preferred_lft 39956sec
inet6 fe80::20c:29ff:feed:16d1/64 scope link
valid_lft forever preferred_lft forever
[root@MiWiFi-RA72-srv wrk-master]#
The test environment is built with Docker:
docker run --network=host --name=good -itd nginx
docker run --name nginx --network=host -itd feisky/nginx:latency
Verify that both instances respond:
[root@localhost ~]# curl http://192.168.31.50:8080
...
<h1>Welcome to nginx!</h1>
....
[root@localhost ~]# curl http://192.168.31.50
...
<h1>Welcome to nginx!</h1>
....
Both respond normally.
1.1 A Puzzling Side Issue
Now use hping3 to test the performance of the two nginx instances:
[root@localhost ~]# hping3 -c 3 -S -p 80 192.168.31.50
HPING 192.168.31.50 (ens33 192.168.31.50): S set, 40 headers + 0 data bytes
--- 192.168.31.50 hping statistic ---
3 packets transmitted, 0 packets received, 100% packet loss
round-trip min/avg/max = 0.0/0.0/0.0 ms
Strange: hping3 gets no responses at all. The first suspect is the firewall:
[root@localhost ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:firewalld(1)
The firewall is off, so let's capture packets and see what is going on:
[root@localhost ~]# tcpdump -i lo port 80 -w good_nginx.pcap
dropped privs to tcpdump
tcpdump: listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes
^C120 packets captured
240 packets received by filter
0 packets dropped by kernel
[root@localhost ~]# tshark -r good_nginx.pcap
Running as user "root" and group "root". This could be dangerous.
1 0.000000 192.168.31.50 → 192.168.31.50 TCP 54 1797 → 80 [SYN] Seq=0 Win=512 Len=0
2 0.000013 192.168.31.50 → 192.168.31.50 TCP 58 80 → 1797 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495
3 0.000016 192.168.31.50 → 192.168.31.50 TCP 54 1797 → 80 [RST] Seq=1 Win=0 Len=0
4 1.000585 192.168.31.50 → 192.168.31.50 TCP 54 1798 → 80 [SYN] Seq=0 Win=512 Len=0
5 1.000597 192.168.31.50 → 192.168.31.50 TCP 58 80 → 1798 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495
6 1.000601 192.168.31.50 → 192.168.31.50 TCP 54 1798 → 80 [RST] Seq=1 Win=0 Len=0
7 2.001198 192.168.31.50 → 192.168.31.50 TCP 54 1799 → 80 [SYN] Seq=0 Win=512 Len=0
8 2.001209 192.168.31.50 → 192.168.31.50 TCP 58 80 → 1799 [SYN, ACK] Seq=0 Ack=1 Win=43690 Len=0 MSS=65495
9 2.001213 192.168.31.50 → 192.168.31.50 TCP 54 1799 → 80 [RST] Seq=1 Win=0 Len=0
tshark shows that after each SYN, the server answers with a SYN-ACK, and then our side immediately sends an RST. Next, check whether the SYN backlog is set too small:
#sysctl -a|grep tcp_max_syn_backlog
net.ipv4.tcp_max_syn_backlog = 128
Check the listening ports:
[root@localhost ~]# ss -ln|grep 80
tcp LISTEN 0 128 0.0.0.0:8080 0.0.0.0:*
tcp LISTEN 0 128 0.0.0.0:80 0.0.0.0:*
tcp LISTEN 0 128 [::]:80 [::]:*
[root@localhost ~]#
OK, it seems the next step is to use netstat to see what the failed TCP connections look like:
#netstat -s
TcpExt:
387 resets received for embryonic SYN_RECV sockets
This counter keeps increasing, which means connections in the SYN_RECV state are being torn down by RSTs. (Later I found the successful case increments it the same way; this is simply how hping3 -S works: it sends bare SYN packets and then resets, i.e. the very DoS pattern from the previous lab.) Then, on a whim, since everything is on one machine, I tried 127.0.0.1, and it worked; the capture looked identical. Running hping3 from another machine also worked:
[root@localhost ~]# hping3 -I lo -n -c 3 -S -p 80 192.168.31.50
HPING 192.168.31.50 (lo 192.168.31.50): S set, 40 headers + 0 data bytes
len=44 ip=192.168.31.50 ttl=64 DF id=0 sport=80 flags=SA seq=0 win=43690 rtt=2.8 ms
len=44 ip=192.168.31.50 ttl=64 DF id=0 sport=80 flags=SA seq=1 win=43690 rtt=1.4 ms
len=44 ip=192.168.31.50 ttl=64 DF id=0 sport=80 flags=SA seq=2 win=43690 rtt=2.0 ms
--- 192.168.31.50 hping statistic ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 1.4/2.0/2.8 ms
[root@localhost ~]# hping3 -n -c 3 -S -p 80 127.0.0.1
HPING 127.0.0.1 (lo 127.0.0.1): S set, 40 headers + 0 data bytes
len=44 ip=127.0.0.1 ttl=64 DF id=0 sport=80 flags=SA seq=0 win=43690 rtt=5.8 ms
len=44 ip=127.0.0.1 ttl=64 DF id=0 sport=80 flags=SA seq=1 win=43690 rtt=3.4 ms
len=44 ip=127.0.0.1 ttl=64 DF id=0 sport=80 flags=SA seq=2 win=43690 rtt=2.0 ms
--- 127.0.0.1 hping statistic ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 2.0/3.7/5.8 ms
[root@localhost ~]# ^C
[root@localhost ~]# hping3 -n -c 3 -S -p 80 192.168.31.50
HPING 192.168.31.50 (ens33 192.168.31.50): S set, 40 headers + 0 data bytes
--- 192.168.31.50 hping statistic ---
3 packets transmitted, 0 packets received, 100% packet loss
round-trip min/avg/max = 0.0/0.0/0.0 ms
From the other machine everything is normal. I still don't understand why; if anyone knows the reason, please let me know:
[root@MiWiFi-RA72-srv ~]# hping3 -c 3 -S -p 80 192.168.31.50
HPING 192.168.31.50 (ens33 192.168.31.50): S set, 40 headers + 0 data bytes
len=46 ip=192.168.31.50 ttl=64 DF id=0 sport=80 flags=SA seq=0 win=29200 rtt=2.0 ms
len=46 ip=192.168.31.50 ttl=64 DF id=0 sport=80 flags=SA seq=1 win=29200 rtt=1.4 ms
len=46 ip=192.168.31.50 ttl=64 DF id=0 sport=80 flags=SA seq=2 win=29200 rtt=2.0 ms
--- 192.168.31.50 hping statistic ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 1.4/1.8/2.0 ms
2 Running the Experiments
2.1 Latency Testing with hping3
Use hping3 to measure the latency of both instances. To recap the options: -c sets the number of packets to send, -S sends SYN packets, -p specifies the server port, and the final argument is the target IP.
[root@MiWiFi-RA72-srv ~]# hping3 -c 3 -S -p 80 192.168.31.50
HPING 192.168.31.50 (ens33 192.168.31.50): S set, 40 headers + 0 data bytes
len=46 ip=192.168.31.50 ttl=64 DF id=0 sport=80 flags=SA seq=0 win=29200 rtt=1.8 ms
len=46 ip=192.168.31.50 ttl=64 DF id=0 sport=80 flags=SA seq=1 win=29200 rtt=1.3 ms
len=46 ip=192.168.31.50 ttl=64 DF id=0 sport=80 flags=SA seq=2 win=29200 rtt=2.7 ms
--- 192.168.31.50 hping statistic ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 1.3/1.9/2.7 ms
[root@MiWiFi-RA72-srv ~]# hping3 -c 3 -S -p 8080 192.168.31.50
HPING 192.168.31.50 (ens33 192.168.31.50): S set, 40 headers + 0 data bytes
len=46 ip=192.168.31.50 ttl=64 DF id=0 sport=8080 flags=SA seq=0 win=29200 rtt=0.6 ms
len=46 ip=192.168.31.50 ttl=64 DF id=0 sport=8080 flags=SA seq=1 win=29200 rtt=1.2 ms
len=46 ip=192.168.31.50 ttl=64 DF id=0 sport=8080 flags=SA seq=2 win=29200 rtt=2.0 ms
--- 192.168.31.50 hping statistic ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.6/1.2/2.0 ms
The latency on the two nginx ports is about the same, roughly 2 ms.
2.2 Testing Application Performance with wrk
The wrk options used are:
# --latency  print latency statistics
# -c 100     keep 100 connections open
# -t 4       use 4 threads
# --timeout 2  socket request timeout, in seconds
[root@MiWiFi-RA72-srv wrk-master]# wrk --latency -c 100 -t 4 --timeout 2 http://192.168.31.50/
Test port 8080 and port 80:
[root@MiWiFi-RA72-srv wrk-master]# wrk --latency -c 100 -t 4 --timeout 2 http://192.168.31.50:8080/
Running 10s test @ http://192.168.31.50:8080/
4 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 41.64ms 6.15ms 83.39ms 96.82%
Req/Sec 601.11 81.26 1.05k 80.00%
Latency Distribution
50% 42.08ms
75% 42.69ms
90% 43.51ms
99% 48.09ms
24007 requests in 10.04s, 19.48MB read
Requests/sec: 2391.46
Transfer/sec: 1.94MB
[root@MiWiFi-RA72-srv wrk-master]# wrk --latency -c 100 -t 4 --timeout 2 http://192.168.31.50
Running 10s test @ http://192.168.31.50
4 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 5.29ms 7.57ms 105.21ms 95.68%
Req/Sec 5.81k 2.07k 10.87k 63.66%
Latency Distribution
50% 3.36ms
75% 5.91ms
90% 9.66ms
99% 39.66ms
232036 requests in 10.09s, 188.76MB read
Requests/sec: 23005.97
Transfer/sec: 18.71MB
The gap between the two ports is huge: one does 2391 requests/sec with a 90th-percentile latency of about 43 ms, while the other does more than 23,000 requests/sec with a 90th-percentile latency of about 9 ms. Time to find out why.
2.3 Root-Cause Analysis
Capture packets again and open the file in Wireshark. There are two useful views.
Viewing the flow graph
The first way: filter down to a single stream, then open "Flow Graph" under the "Statistics" menu.
In the Wireshark packet list, right-click a packet and choose "Follow" > "TCP Stream", then close the popup window. The benefit is that the display filter box is now automatically set to the selected stream.
Next open the flow graph; be sure to tick "Limit to display filter" and choose "TCP Flows" and "Any" for the addresses.
Judging by the timestamps in the fast capture, continuity looks fine: the gaps between packets are small.
Now look at the high-latency capture:
This view does reveal the growing delays, but it is not very intuitive, so let's try the second method.
The round-trip time graph
From the "Statistics" menu, open "TCP Stream Graphs" and choose "Round Trip Time".
Now it is obvious: in the slow case many round trips take more than 40 ms, while in the normal case most points stay under 10 ms.
Delayed ACK
Going back to the first view, the high-latency packets are all acknowledgments, and 40 ms happens to be the minimum timeout of TCP delayed acknowledgment.
Delayed ACK is a TCP optimization for acknowledgments: a standalone ACK carries little information, so instead of acknowledging every segment immediately, TCP waits a little while (40 ms here) to see whether it has data of its own to send. If it does, the ACK piggybacks on that data; only if no data shows up does TCP send a standalone ACK. That wait is where the 40 ms delay comes from.
Checking the TCP man page, quick acknowledgments are only used when the TCP_QUICKACK socket option is set:
TCP_QUICKACK (since Linux 2.4.4)
Enable quickack mode if set or disable quickack mode if cleared. In quickack mode, acks are sent immediately, rather than delayed if
needed in accordance to normal TCP operation. This flag is not permanent, it only enables a switch to or from quickack mode. Subsequent
operation of the TCP protocol will once again enter/leave quickack mode depending on internal protocol processing and factors such as
delayed ack timeouts occurring and data transfer. This option should not be used in code intended to be portable.
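To make the option concrete, here is a minimal Python sketch (my own illustration, not something wrk does) that toggles TCP_QUICKACK on a Linux socket. The fallback constant 12 is the raw Linux value, used in case the Python build does not expose the name:

```python
import socket

def set_quickack(sock, enabled=True):
    """Enable or disable TCP_QUICKACK (Linux-only). The kernel may
    clear the flag again after internal protocol events, so apps
    typically re-set it after every recv()."""
    # CPython exposes socket.TCP_QUICKACK on Linux; fall back to the
    # raw value from <netinet/tcp.h> (12) if the constant is absent.
    opt = getattr(socket, "TCP_QUICKACK", 12)
    sock.setsockopt(socket.IPPROTO_TCP, opt, int(enabled))
    return sock.getsockopt(socket.IPPROTO_TCP, opt)

if __name__ == "__main__":
    # Demo on a loopback connection.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    cli = socket.create_connection(srv.getsockname())
    print(set_quickack(cli, True))
    cli.close()
    srv.close()
```

Note the man page's caveat: the flag is not permanent, which is why a real application would set it again after each read.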
This timeout is tied to the system clock frequency; it is defined by macros:
#define TCP_DELACK_MAX ((unsigned)(HZ/5))
#define TCP_DELACK_MIN ((unsigned)(HZ/25))
Check the clock frequency on this machine:
[root@MiWiFi-RA72-srv wrk-master]# cat /boot/config-4.18.0-305.3.1.el8.x86_64 |grep 'CONFIG_HZ='
CONFIG_HZ=1000
So the maximum delayed-ACK time is 1000/5 = 200 (ms) and the minimum is 1000/25 = 40 (ms). To disable delayed ACK on a socket:
int value = 1;
setsockopt(sock_fd, IPPROTO_TCP, TCP_QUICKACK, (char *)&value, sizeof(int));
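As a quick sanity check of the timings above, the macro arithmetic can be worked out directly (a standalone sketch, not kernel code):

```python
HZ = 1000  # CONFIG_HZ from this kernel's build config

# Mirror the kernel macros; the results are in jiffies.
TCP_DELACK_MAX = HZ // 5     # 200 jiffies
TCP_DELACK_MIN = HZ // 25    # 40 jiffies

# With HZ = 1000, one jiffy lasts 1 ms, so:
ms_per_jiffy = 1000 / HZ
print(TCP_DELACK_MAX * ms_per_jiffy)  # 200.0 (ms)
print(TCP_DELACK_MIN * ms_per_jiffy)  # 40.0 (ms)
```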
Now let's confirm whether wrk sets this option:
[root@MiWiFi-RA72-srv wrk-master]# strace -e trace=network -f wrk --latency -c 100 -t 4 --timeout 2 http://192.168.31.50
....
strace: Process 21316 attached
strace: Process 21318 attached
strace: Process 21319 attached
Running 10s test @ http://192.168.31.50
4 threads and 100 connections
strace: Process 21317 attached
[pid 21317] socket(AF_INET, SOCK_STREAM, IPPROTO_TCP) = 8
[pid 21317] connect(8, {sa_family=AF_INET, sin_port=htons(80), sin_addr=inet_addr("192.168.31.50")}, 16) = -1 EINPROGRESS (Operation now in progress)
[pid 21317] setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4) = 0
....
strace traces all network-related system calls. We can see
setsockopt(8, SOL_TCP, TCP_NODELAY, [1], 4)
so wrk sets TCP_NODELAY but never sets TCP_QUICKACK.
So the question becomes: with the same command, why does the other port have no problem?
Looking at the packets more carefully, in the slow case the sending side is waiting as well, which brings us to Nagle's algorithm.
Nagle's algorithm
Nagle's algorithm is another optimization that reduces the number of small TCP packets on the wire. The strategy:
1. If there is no sent-but-unacknowledged data, send immediately; 2. if unacknowledged data exists, wait until either all sent data has been acknowledged or the buffered data reaches the MSS, then send.
Here is a picture borrowed from the web to illustrate:
With the algorithm enabled, if we slowly type HELLO over telnet, the initial H can be sent right away even though the packet is tiny, because nothing is awaiting acknowledgment. After that, the sender must wait, since the ACK for H has not yet arrived; by the time it does, the characters E, L, L have accumulated and go out together, and so on. Without the algorithm, as long as the window allows, each character can be sent immediately.
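The send rule in the HELLO example can be sketched as a toy decision function (a simplification that ignores windows, corking, and retransmission):

```python
def nagle_can_send(buffered_bytes, mss, unacked_segments, nodelay=False):
    """Toy model of Nagle's send decision.

    Send immediately when nothing is awaiting an ACK (rule 1) or a
    full-sized segment is buffered (rule 2); otherwise hold the small
    segment back. TCP_NODELAY bypasses the algorithm entirely.
    """
    if nodelay:
        return True
    if unacked_segments == 0:       # rule 1: no unacknowledged data
        return True
    return buffered_bytes >= mss    # rule 2: a full MSS is ready

# 'H' goes out at once: nothing is unacknowledged yet.
print(nagle_can_send(1, 1460, unacked_segments=0))                 # True
# 'E' must wait: 'H' is still unacknowledged and 1 byte < MSS.
print(nagle_can_send(1, 1460, unacked_segments=1))                 # False
# With TCP_NODELAY, small segments are never held back.
print(nagle_can_send(1, 1460, unacked_segments=1, nodelay=True))   # True
```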
The algorithm is enabled by default; it can be disabled by setting the TCP_NODELAY socket option.
TCP_NODELAY
If set, disable the Nagle algorithm. This means that segments are always sent as soon as possible, even if there is only a small amount of data. When not set, data is buffered until there is a sufficient amount to send out, thereby avoiding the frequent sending of small packets, which results in poor utilization of the network. This option is overridden by TCP_CORK; however, setting this option forces an explicit flush of pending output, even if TCP_CORK is currently set.
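In Python, disabling Nagle comes down to one setsockopt call (an illustration of the option; nginx does the equivalent in C when tcp_nodelay is on):

```python
import socket

def disable_nagle(sock):
    """Turn off Nagle's algorithm so small writes leave immediately."""
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY)

if __name__ == "__main__":
    # Loopback demo.
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    cli = socket.create_connection(srv.getsockname())
    print(disable_nagle(cli))
    cli.close()
    srv.close()
```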
The nginx instance under test has TCP_NODELAY turned off, as shown here:
[root@localhost ~]# docker exec nginx cat /etc/nginx/nginx.conf|grep tcp_nodelay
tcp_nodelay off;
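To make the slow instance behave like the good one, the directive would simply be switched back to nginx's default inside nginx.conf (shown as a fragment, not a full config):

```
http {
    # nginx's default; the test image overrides it to "off"
    tcp_nodelay on;
}
```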
Combining the two algorithms
Each algorithm is fine on its own, but combined they can produce excessive delays. A diagram borrowed from another blogger explains the situation clearly:
In this experiment, wrk leaves delayed ACK enabled (it sets TCP_NODELAY but not TCP_QUICKACK), while the problematic nginx image enables Nagle's algorithm (tcp_nodelay off). Together they produce the large latency, 40 ms in our case.