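Testing DPVS requires a fairly complex environment. Following the simple fnat example in the official documentation, I tested single-machine two-arm fnat. There is not much to say about installation and compilation: just follow the GitHub instructions. But be sure to enable DEBUG mode and set the log level to DEBUG as well.
Test environment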
ubuntu 16.04.5
# uname -a
Linux jjh-dpvs-test0 4.4.0-116-generic #140-Ubuntu SMP Mon Feb 12 21:23:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
lspci -v | grep Eth
02:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
02:00.1 Ethernet controller: Intel Corporation I350 Gigabit Network Connection (rev 01)
06:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
07:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
The two I350 NICs are used for testing. The remaining NICs are used for ssh and are otherwise left alone for now.
IP allocation
                     +-------------------------------------+
                     |                dpvs                 |      +------------------+
                     |                                     |      |   real server    |
+-------------+      | +---------------+  +--------------+ |  +-->| 10.20.34.24:6379 |
|   client    |      | |     dpdk1     |  |    dpdk0     | |  |   +------------------+
| 10.34.38.43 +----->| |      VIP      |  |     LIP      +----+
+-------------+      | | 10.20.101.43: |  | 10.20.102.41 | |  |   +------------------+
                     | |     6379      |  |              | |  +-->|   real server    |
                     | +---------------+  +--------------+ |      | 10.20.74.41:6379 |
                     +-------------------------------------+      +------------------+
Client IP: 10.34.38.43 (test client NIC)
DPDK1 VIP: 10.20.101.43 (WAN NIC)
DPDK0 LIP: 10.20.102.41 (LAN NIC)
RS1: 10.20.34.24
RS2: 10.20.74.41
Configuring the service
Add the VIP to the WAN NIC
dpip addr add 10.20.101.43/32 dev dpdk1
Add the WAN default route
dpip route add default via 10.20.101.254 dev dpdk1
Ping the VIP from the client machine to make sure it is reachable
ping 10.20.101.43
PING 10.20.101.43 (10.20.101.43) 56(84) bytes of data.
64 bytes from 10.20.101.43: icmp_seq=1 ttl=58 time=3.66 ms
64 bytes from 10.20.101.43: icmp_seq=2 ttl=58 time=3.52 ms
Add the ipvs service with the round-robin algorithm
ipvsadm -A -t 10.20.101.43:6379 -s rr
Add the two RSs
ipvsadm -a -t 10.20.101.43:6379 -r 10.20.34.24:6379 -b
ipvsadm -a -t 10.20.101.43:6379 -r 10.20.74.41:6379 -b
Add the LAN LIP
ipvsadm --add-laddr -z 10.20.102.41 -t 10.20.101.43:6379 -F dpdk0
Add the dpdk0 default route
dpip route add default via 10.20.102.254 dev dpdk0
Ping the LIP from the client machine to make sure it is reachable
ping 10.20.102.41
PING 10.20.102.41 (10.20.102.41) 56(84) bytes of data.
64 bytes from 10.20.102.41: icmp_seq=1 ttl=58 time=3.52 ms
64 bytes from 10.20.102.41: icmp_seq=2 ttl=58 time=3.43 ms
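At this point the configuration is complete. I took a few detours here: for historical reasons, the switch configuration left the LIP unreachable. Thanks to Chunbo from the sys team for the help.
Testing the result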
redis-cli -h 10.20.101.43 -p 6379 get a
Accessing the redis service from the test machine failed, so let's track down exactly where things went wrong.
Run on the client machine
tcpdump port 6379 -i bond0 -n
Run on both RS machines
tcpdump port 6379 -i bond0 -n
Watch the dpvs log
tail -f /var/log/dpvs.log
Then access the redis service again
redis-cli -h 10.20.101.43 -p 6379 get a
Output on the test client
13:32:22.130615 IP 10.34.38.43.37943 > 10.20.101.43.6379: Flags [S], seq 1653003455, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:23.127957 IP 10.34.38.43.37943 > 10.20.101.43.6379: Flags [S], seq 1653003455, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
Two syn packets were sent in a row. In other words, the first syn timed out and was retried once.
Now the RS output
13:32:22.127008 IP 10.20.102.41.1029 > 10.20.34.24.6379: Flags [S], seq 338949052, win 29200, options [exp-9437,mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:22.127035 IP 10.20.34.24.6379 > 10.20.102.41.1029: Flags [S.], seq 930729927, ack 338949053, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:23.123551 IP 10.20.34.24.6379 > 10.20.102.41.1029: Flags [S.], seq 930729927, ack 338949053, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:23.124287 IP 10.20.102.41.1029 > 10.20.34.24.6379: Flags [S], seq 338949052, win 29200, options [exp-9437,mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:23.124304 IP 10.20.34.24.6379 > 10.20.102.41.1029: Flags [S.], seq 930729927, ack 338949053, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
13:32:25.123557 IP 10.20.34.24.6379 > 10.20.102.41.1029: Flags [S.], seq 930729927, ack 338949053, win 29200, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
We can see that RS 10.20.34.24 has already replied to the dpvs LIP 10.20.102.41 with syn+ack packets, but the third step of the handshake never completes.
Next, the dpvs log
IPVS: conn lookup: [6] TCP 10.34.38.43:37943 -> 10.20.101.43:6379 miss
SAPOOL: sa_pool_fetch: 10.20.102.41:1029 fetched!
IPVS: new conn: [6] TCP 10.34.38.43:37943 10.20.101.43:6379 10.20.102.41:1029 10.20.34.24:6379 refs 2
IPVS: state trans: TCP in [S...] 10.34.38.43:37943->10.20.34.24:6379 state NONE->SYN_RECV conn.refcnt 2
IPVS: conn lookup: [3] TCP 10.20.34.24:6379 -> 10.20.102.41:1029 miss
IPVS: tcp_conn_sched: [3] try sched non-SYN packet: [S.A.] 10.20.34.24:6379->10.20.102.41:1029
IPVS: conn lookup: [3] TCP 10.20.34.24:6379 -> 10.20.102.41:1029 miss
IPVS: tcp_conn_sched: [3] try sched non-SYN packet: [S.A.] 10.20.34.24:6379->10.20.102.41:1029
IPVS: conn lookup: [6] TCP 10.34.38.43:37943 -> 10.20.101.43:6379 hit
IPVS: conn lookup: [3] TCP 10.20.34.24:6379 -> 10.20.102.41:1029 miss
IPVS: tcp_conn_sched: [3] try sched non-SYN packet: [S.A.] 10.20.34.24:6379->10.20.102.41:1029
IPVS: conn lookup: [3] TCP 10.20.34.24:6379 -> 10.20.102.41:1029 miss
IPVS: tcp_conn_sched: [3] try sched non-SYN packet: [S.A.] 10.20.34.24:6379->10.20.102.41:1029
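First, we can see that local port 1029 was correctly fetched from sa_pool, the syn packet was then forwarded to the backend RS 10.20.34.24, and the state changed from NONE to SYN_RECV.
Then dpvs received the syn+ack reply from the RS, got a miss when looking up the session flow table, and dropped the packet. We can see that the outgoing data was sent by cpu [6], but the return traffic was received by cpu [3].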
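The miss can be reproduced in a toy model. The sketch below assumes, as the log suggests, that each worker core keeps its own private connection table and that the receiving core is chosen independently for each direction of the flow. The Worker class and its methods are hypothetical names for illustration, not DPVS APIs.

```python
from typing import Dict, Optional, Tuple

# (src_ip, src_port, dst_ip, dst_port); illustrative only, not a DPVS structure.
FlowKey = Tuple[str, int, str, int]
Endpoint = Tuple[str, int]


class Worker:
    """Toy model of one worker core with a private connection table."""

    def __init__(self, cpu: int) -> None:
        self.cpu = cpu
        self.conns: Dict[FlowKey, str] = {}

    def new_conn(self, client: Endpoint, vip: Endpoint,
                 lip: Endpoint, rs: Endpoint) -> None:
        # Index the connection in both directions, but only on this core.
        self.conns[(*client, *vip)] = "SYN_RECV"   # client -> VIP
        self.conns[(*rs, *lip)] = "SYN_RECV"       # RS -> LIP (return path)

    def lookup(self, key: FlowKey) -> Optional[str]:
        return self.conns.get(key)


workers = {cpu: Worker(cpu) for cpu in (3, 6)}

client, vip = ("10.34.38.43", 37943), ("10.20.101.43", 6379)
lip, rs = ("10.20.102.41", 1029), ("10.20.34.24", 6379)

# The SYN from the client lands on cpu 6, which creates the connection.
workers[6].new_conn(client, vip, lip, rs)

# The SYN+ACK from the RS is delivered to cpu 3, whose table is empty.
reply_key = (*rs, *lip)
miss = workers[3].lookup(reply_key)
hit = workers[6].lookup(reply_key)
print("cpu 3:", miss, "| cpu 6:", hit)
```

In this model the connection exists, just not on the core that receives the reply, which matches the repeated miss lines for [3] in the log.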
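Root cause
The symptoms point to an affinity problem with the return traffic. From the official issues and documentation, I learned that my test NIC, the I350, does not support flow director for now, so I can only test with 1 worker. I'll request a 10GbE NIC next week to test with, since performance testing still has to be done as well.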
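Why NIC-level steering matters here can be sketched in a few lines. The model below assumes that each worker only allocates local ports whose low bits equal its own index, and that the NIC steers a returning packet to the queue selected by the same bits of its destination port. This is a simplified illustration of the idea only: the worker count, mask, and function names are made up for the example and are not taken from DPVS.

```python
from typing import List

NUM_WORKERS = 8                 # assumed worker count; must be a power of two here
PORT_MASK = NUM_WORKERS - 1     # low bits of the port identify the owning worker


def alloc_ports(worker: int, start: int = 1025, count: int = 4) -> List[int]:
    """Return the first `count` local ports at or above `start` owned by `worker`."""
    ports: List[int] = []
    port = start
    while len(ports) < count:
        if port & PORT_MASK == worker:
            ports.append(port)
        port += 1
    return ports


def steer(dst_port: int) -> int:
    """Model of a NIC rule: pick the receive queue from the destination port."""
    return dst_port & PORT_MASK


for w in range(NUM_WORKERS):
    for p in alloc_ports(w):
        # A reply addressed to LIP:p comes back to the worker that allocated p.
        assert steer(p) == w

print(alloc_ports(6))
```

With a single worker there is only one connection table, so the return path cannot land on the wrong core, which is why the 1-worker setup is usable without this NIC feature.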
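A small reflection: with open-source software, if you don't understand the source code, some problems leave you with no idea where to start.
Update 20181204
With help from Chunbo and Wenqiang of the sys team, I switched to a 10GbE NIC and the simple fullnat test passed. The next step is single-machine performance testing, and finally ospf + fullnat.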