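Starting from a failure to get a connection from the Redis connection pool

Problem description

Colleagues from another business line found that their application in the test environment could never obtain a Redis connection, so I helped take a look.
First, the application error log: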

Caused by: org.springframework.data.redis.RedisConnectionFailureException: Cannot get Jedis connection; nested exception is redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.fetchJedisConnector(JedisConnectionFactory.java:97)
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.getConnection(JedisConnectionFactory.java:143)
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.getConnection(JedisConnectionFactory.java:41)
    at org.springframework.data.redis.core.RedisConnectionUtils.doGetConnection(RedisConnectionUtils.java:85)
    at org.springframework.data.redis.core.RedisConnectionUtils.getConnection(RedisConnectionUtils.java:55)
    at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:169)
    at org.springframework.data.redis.core.RedisTemplate.execute(RedisTemplate.java:149)
    ... 76 more
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    at redis.clients.util.Pool.getResource(Pool.java:22)
    at org.springframework.data.redis.connection.jedis.JedisConnectionFactory.fetchJedisConnector(JedisConnectionFactory.java:90)
    ... 83 more
Caused by: java.util.NoSuchElementException: Could not create a validated object, cause: ValidateObject failed
    at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:871)
    at redis.clients.util.Pool.getResource(Pool.java:20)
    ... 84 more

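Investigation

Determining the environment

The application connects to the Redis server through jedis, via spring-data-redis.
The code of this system has not been touched for a long time and I had forgotten the details, so I started with the jar versions in use.
List the relevant jars loaded by the application process: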

lsof -p 19377 | grep -E "jedis|pool|redis"

The output included these jars: commons-pool-1.3.jar, spring-data-redis-1.1.1.RELEASE.jar, jedis-2.1.0.jar
I then read through the relevant commons-pool code:

try {
    _factory.activateObject(latch.getPair().value);
    if(_testOnBorrow &&
            !_factory.validateObject(latch.getPair().value)) {
        throw new Exception("ValidateObject failed");
    }
    synchronized(this) {
        _numInternalProcessing--;
        _numActive++;
    }
    return latch.getPair().value;
}
catch (Throwable e) {
    PoolUtils.checkRethrow(e);
    // object cannot be activated or is invalid
    try {
        _factory.destroyObject(latch.getPair().value);
    } catch (Throwable e2) {
        PoolUtils.checkRethrow(e2);
        // cannot destroy broken object
    }
    synchronized (this) {
        _numInternalProcessing--;
        if (!newlyCreated) {
            latch.reset();
            _allocationQueue.add(0, latch);
        }
        allocate();
    }
    if(newlyCreated) {
        throw new NoSuchElementException("Could not create a validated object, cause: " + e.getMessage());
    }
    else {
        continue; // keep looping
    }
}

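So the client must have been configured with testOnBorrow, and validation of the connection failed. Because the object was newly created, the pool did not retry and instead threw the NoSuchElementException seen at the bottom of the stack trace.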
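To make the path from a failed validation to the logged exception concrete, here is a minimal sketch of the newly-created branch of the excerpt above. It is not the commons-pool implementation: BorrowSketch, borrowNew and the functional parameters are hypothetical names I made up, and the real pool also handles idle objects, latches and destroyObject.

```java
import java.util.NoSuchElementException;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical, heavily simplified stand-in for a pool's borrow path. */
final class BorrowSketch {

    /**
     * Creates a fresh object and validates it when testOnBorrow is set.
     * A failed validation is rethrown as NoSuchElementException, as in the
     * newlyCreated branch of the excerpt above.
     */
    static <T> T borrowNew(Supplier<T> factory, Predicate<T> validator, boolean testOnBorrow) {
        T candidate = factory.get();
        try {
            if (testOnBorrow && !validator.test(candidate)) {
                throw new IllegalStateException("ValidateObject failed");
            }
            return candidate;
        } catch (RuntimeException e) {
            // The real pool would also destroy the broken object here.
            throw new NoSuchElementException(
                    "Could not create a validated object, cause: " + e.getMessage());
        }
    }

    public static void main(String[] args) {
        try {
            // A validator that always fails, like a PING that never returns PONG.
            borrowNew(() -> "connection", c -> false, true);
        } catch (NoSuchElementException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With testOnBorrow switched off the same call simply returns the object, which is why the symptom only shows up in applications that validate on borrow.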

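There are several Java clients for Redis. This project uses spring-data-redis, which itself supports different client implementations such as jedis and lettuce. From the error log, the client implementation in use is jedis.
Looking at the JedisConnectionFactory source leads to JedisPool,
which defines the code that validates pooled objects.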

public boolean validateObject(final Object obj) {
    if (obj instanceof Jedis) {
        final Jedis jedis = (Jedis) obj;
        try {
            return jedis.isConnected() && jedis.ping().equals("PONG");
        } catch (final Exception e) {
            return false;
        }
    } else {
        return false;
    }
}
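The effect of this check can be sketched without a Redis server. The class below is my own illustration, not Jedis code: PingCheckSketch and isValid are hypothetical names, and the Supplier stands in for the PING round trip. I am assuming that an error reply reaches the caller as a runtime exception; a reply that is any string other than "PONG" fails the check as well.

```java
import java.util.function.Supplier;

/** Hypothetical model of a PING-based connection check. */
final class PingCheckSketch {

    /** Returns true only when the connection is up and PING yields exactly "PONG". */
    static boolean isValid(boolean connected, Supplier<String> ping) {
        try {
            return connected && "PONG".equals(ping.get());
        } catch (RuntimeException e) {
            // Any failure during the round trip counts as an invalid connection.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid(true, () -> "PONG"));
        System.out.println(isValid(true, () -> {
            throw new IllegalStateException("MISCONF Redis is configured to save RDB snapshots");
        }));
    }
}
```

The point is that the validator cannot tell a dead connection from a healthy connection to a server that refuses PING; both come back as false.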

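Inspecting the TCP packets with Wireshark and finding the cause

Anyone familiar with Redis knows that when a client sends "PING" the server replies with "PONG", and that this is commonly used to check a connection.
Since validation was failing, the next step was to capture packets and look at the request and response.

First, find the network interface name with ip a
Then use tcpdump to capture traffic on port 6379 of the eth1 interface.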

tcpdump -i eth1 port 6379 -w target.cap

Finally, analyze target.cap with Wireshark; the Redis plugin for Wireshark helps here.
Using the timestamp printed in the application error log, I found that at that moment the client (the application server) had sent an RST packet to the server (the Redis server).

ws_1.png

That looked suspicious, so I went back through the packets before it.

ws_2.png

As the capture shows, just above the arrow the client sent a PING command, and at the arrow the server should have replied with PONG. Instead it returned the following:

MISCONF Redis is configured to save RDB snapshots, but is currently not able to persist on disk. Commands that may modify the data set are disabled. Please check Redis logs for details about the error.

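This means the Redis server is configured for RDB snapshot persistence but currently cannot persist to disk, so commands that may modify the data set are disabled. (Reading the source, redis-3.2.9 server.c, shows that PING is rejected along with the write commands, while read commands should not be affected.)
The following code from processCommand(client *c) in redis-3.2.9 server.c handles the case where persistence has failed: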

/* Don't accept write commands if there are problems persisting on disk
     * and if this is a master instance. */
    if (((server.stop_writes_on_bgsave_err &&
          server.saveparamslen > 0 &&
          server.lastbgsave_status == C_ERR) ||
          server.aof_last_write_status == C_ERR) &&
        server.masterhost == NULL &&
        (c->cmd->flags & CMD_WRITE ||
         c->cmd->proc == pingCommand))
    {
        flagTransaction(c);
        if (server.aof_last_write_status == C_OK)
            addReply(c, shared.bgsaveerr);
        else
            addReplySds(c,
                sdscatprintf(sdsempty(),
                "-MISCONF Errors writing to the AOF file: %s\r\n",
                strerror(server.aof_last_write_errno)));
        return C_OK;
    }
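The condition is dense, so here it is restated as a boolean function. This is only my reading of the C excerpt above, written in Java to match the other examples; WriteGuardSketch, rejects and the parameter names are hypothetical, and each parameter corresponds to one term of the original expression.

```java
/** Hypothetical restatement of the persistence guard in processCommand. */
final class WriteGuardSketch {

    /** Returns true when the command would be answered with a MISCONF error. */
    static boolean rejects(boolean stopWritesOnBgsaveErr,
                           int saveParamsLen,
                           boolean lastBgsaveOk,
                           boolean aofLastWriteOk,
                           boolean isMaster,
                           boolean isWriteCommand,
                           boolean isPing) {
        // Either RDB saving is configured, guarded and failing, or the last AOF write failed.
        boolean persistenceBroken =
                (stopWritesOnBgsaveErr && saveParamsLen > 0 && !lastBgsaveOk) || !aofLastWriteOk;
        // Only a master rejects, and only write commands plus PING.
        return persistenceBroken && isMaster && (isWriteCommand || isPing);
    }

    public static void main(String[] args) {
        // PING on a master whose last background save failed.
        System.out.println(rejects(true, 1, false, true, true, false, true));
        // A read command such as GET in the same situation.
        System.out.println(rejects(true, 1, false, true, true, false, false));
    }
}
```

Seen this way, a PING-based health check on a master fails exactly when writes fail, even though reads on the same connection would still succeed.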

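After that, the client sent a QUIT command and the server replied OK, so the quit succeeded.
The MISCONF error says that something went wrong while persisting the RDB snapshot. So I checked the disk usage and the Redis log on the Redis server, and sure enough, the disk had run out of space.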

linux_df.png

At this point the problem was essentially identified: the disk on the Redis server was full. Because it is a test server, no disk monitoring had been configured. Freeing up space restored service.

Understanding the RST packet

One question remained: why was there an RST packet at all? Without it the problem would have been harder to find. I could still have found it by stepping through the Redis packets around the time in the error log, but the RST caught my attention from the start and let me locate the problem much faster.

A first look at RST

So why was there an RST packet?
First, some background on RST. (See TCP/IP Illustrated, Volume 1, section 18.7, Reset Segments.)
In summary, an RST packet is generated in any of the following situations:

  • A connection request to a port that does not exist
  • Aborting a connection
  • Detecting a half-open connection
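The first case in the list is easy to observe from Java. In the sketch below, ClosedPortSketch and isRefused are names I made up; it assumes a loopback interface is available and that nothing else grabs the port between closing the listener and connecting. When the peer answers the SYN with an RST, the connect call fails with ConnectException.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical demo: connecting to a port nobody listens on. */
final class ClosedPortSketch {

    /** Returns true when the connection attempt is refused. */
    static boolean isRefused() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        int port;
        // Bind an ephemeral port, remember its number, then close the listener.
        try (ServerSocket probe = new ServerSocket(0, 1, loopback)) {
            port = probe.getLocalPort();
        }
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(loopback, port), 2000);
            return false; // something was listening after all
        } catch (ConnectException e) {
            return true; // the SYN was answered with a reset
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("refused: " + isRefused());
    }
}
```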

How jedis and redis close a connection

Look at the packets just before the RST:

ws_3.png

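Checking several RST packets through Wireshark's Expert Information showed that every RST was preceded by a QUIT/OK exchange. That pointed to something at the framework level.
Going back to the GenericObjectPool code above: when an exception occurs in borrowObject, destroyObject() is called. destroyObject is not implemented by the pool itself; the implementation lives in the JedisPool mentioned above.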

public void destroyObject(final Object obj) throws Exception {
    if (obj instanceof Jedis) {
        final Jedis jedis = (Jedis) obj;
        if (jedis.isConnected()) {
            try {
                try {
                    jedis.quit();
                } catch (Exception e) {
                }
                jedis.disconnect();
            } catch (Exception e) {

            }
        }
    }
}

This ends up calling disconnect in redis.clients.jedis.Connection, which closes the input and output streams and then the socket.

public void disconnect() {
    if (isConnected()) {
        try {
            inputStream.close();
            outputStream.close();
            if (!socket.isClosed()) {
                socket.close();
            }
        } catch (IOException ex) {
            throw new JedisConnectionException(ex);
        }
    }
}

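This seemed to explain the RST packet:
The client sends QUIT and the server replies OK. (After receiving the reply to QUIT, the client calls disconnect, which tears down the connection.) Right after that the server starts closing the TCP connection and sends a FIN to the client port used in the earlier exchange, 51311, but the client has already called disconnect and dropped its connection to the server. The client can only send an RST, telling the server "you sent a close request to a port that no longer exists".

In newer jedis code, the logic that used to live in JedisPool has moved to JedisFactory, but it is otherwise largely unchanged.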

// 2.10 JedisFactory
@Override
  public void destroyObject(PooledObject<Jedis> pooledJedis) throws Exception {
    final BinaryJedis jedis = pooledJedis.getObject();
    if (jedis.isConnected()) {
      try {
        try {
          jedis.quit();
        } catch (Exception e) {
        }
        jedis.disconnect();
      } catch (Exception e) {

      }
    }
  }

@Override
public boolean validateObject(PooledObject<Jedis> pooledJedis) {
  final BinaryJedis jedis = pooledJedis.getObject();
  try {
    HostAndPort hostAndPort = this.hostAndPort.get();

    String connectionHost = jedis.getClient().getHost();
    int connectionPort = jedis.getClient().getPort();

    return hostAndPort.getHost().equals(connectionHost)
        && hostAndPort.getPort() == connectionPort && jedis.isConnected()
        && jedis.ping().equals("PONG");
  } catch (final Exception e) {
    return false;
  }
}

The Connection code that disconnect ultimately calls has changed, however.

public void disconnect() {
  if (isConnected()) {
    try {
      outputStream.flush();
      socket.close();
    } catch (IOException ex) {
      broken = true;
      throw new JedisConnectionException(ex);
    } finally {
      IOUtils.closeQuietly(socket);
    }
  }
}

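The earlier inputStream.close() and outputStream.close() calls have been replaced by outputStream.flush(). The reason is that jedis uses its own buffered RedisOutputStream, so the buffered content must be written to the underlying stream before socket.close().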
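Why the flush matters can be shown with a plain JDK stream. In this sketch java.io.BufferedOutputStream stands in for RedisOutputStream and a ByteArrayOutputStream stands in for the socket; both substitutions are my assumption for illustration, and FlushSketch is a hypothetical name.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: buffered bytes are invisible to the sink until flush(). */
final class FlushSketch {

    /** Returns the sink size before and after flushing a small buffered write. */
    static int[] bytesBeforeAndAfterFlush() throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream buffered = new BufferedOutputStream(sink, 8192);
        // A QUIT command in the Redis wire format, far smaller than the buffer.
        buffered.write("*1\r\n$4\r\nQUIT\r\n".getBytes(StandardCharsets.US_ASCII));
        int before = sink.size();
        buffered.flush();
        int after = sink.size();
        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = bytesBeforeAndAfterFlush();
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

If the underlying socket were closed between the write and the flush, those buffered bytes would never reach the wire.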
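Calling disconnect on the client does release resources quickly: it closes the client port and reclaims the file descriptor.
Consider the alternative: if the server has already released its file descriptor and closed the socket after QUIT, but the client never calls disconnect, the client keeps holding its resources until the process exits.
The figure below confirms this. In the first run I commented out the code in disconnect that closes the socket and let the program sleep for 10 seconds before exiting; the client connection was only closed when the process exited. In the second run I restored the code, and the client closed the connection and released its resources immediately after QUIT.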

ws_4.png

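System calls when a redis connection is opened and closed

This question bothered me for a whole day: how exactly was the RST packet produced? Whether on the client or the server, calling close should lead to the normal four-way teardown, shouldn't it?
I read the Redis server code that closes client connections several times (redis 3.2.9 networking.c#unlinkClient). It simply calls the close(fd) system call. To rule out interference I even started a fresh Redis instance and traced the system calls around the close with strace -f -p $pid -tt -T: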

[pid 25442] 10:29:42.299132 epoll_wait(3, {{EPOLLIN, {u32=4, u64=4}}}, 11024, 100) = 1 <0.004041>
[pid 25442] 10:29:42.303248 accept(4, {sa_family=AF_INET, sin_port=htons(52294), sin_addr=inet_addr("192.168.3.45")}, [16]) = 5 <0.000025>
[pid 25442] 10:29:42.303356 fcntl(5, F_GETFL) = 0x2 (flags O_RDWR) <0.000014>
[pid 25442] 10:29:42.303417 fcntl(5, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000010>
[pid 25442] 10:29:42.303456 setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000012>
[pid 25442] 10:29:42.303499 epoll_ctl(3, EPOLL_CTL_ADD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:42.303544 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 11024, 96) = 1 <0.073370>
[pid 25442] 10:29:42.376968 read(5, "*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n", 16384) = 31 <0.000014>
[pid 25442] 10:29:42.377071 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLOUT, {u32=5, u64=5}}) = 0 <0.000013>
[pid 25442] 10:29:42.377144 epoll_wait(3, {{EPOLLOUT, {u32=5, u64=5}}}, 11024, 22) = 1 <0.000017>
[pid 25442] 10:29:42.377210 write(5, "+OK\r\n", 5) = 5 <0.000034>
[pid 25442] 10:29:42.377304 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000025>
[pid 25442] 10:29:42.377377 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 11024, 22) = 1 <0.007943>
[pid 25442] 10:29:42.385376 read(5, "*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n", 16384) = 22 <0.000013>
[pid 25442] 10:29:42.385432 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLOUT, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:42.385477 epoll_wait(3, {{EPOLLOUT, {u32=5, u64=5}}}, 11024, 14) = 1 <0.000010>
[pid 25442] 10:29:42.385518 write(5, "$3\r\nbar\r\n", 9) = 9 <0.000019>
[pid 25442] 10:29:42.385567 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:42.385617 epoll_wait(3, {}, 11024, 14) = 0 <0.014075>
[pid 25442] 10:29:42.399742 epoll_wait(3, {}, 11024, 100) = 0 <0.100126>
[pid 25442] 10:29:42.499930 epoll_wait(3, {}, 11024, 100) = 0 <0.100126>
[pid 25442] 10:29:42.600115 epoll_wait(3, {}, 11024, 100) = 0 <0.100071>
[pid 25442] 10:29:42.700276 epoll_wait(3, {}, 11024, 100) = 0 <0.100131>
[pid 25442] 10:29:42.800482 epoll_wait(3, {}, 11024, 100) = 0 <0.100129>
[pid 25442] 10:29:42.900687 epoll_wait(3, {}, 11024, 100) = 0 <0.100141>
[pid 25442] 10:29:43.000895 epoll_wait(3, {}, 11024, 100) = 0 <0.100132>
[pid 25442] 10:29:43.101095 epoll_wait(3, {}, 11024, 100) = 0 <0.100131>
[pid 25442] 10:29:43.201305 epoll_wait(3, {}, 11024, 100) = 0 <0.100134>
[pid 25442] 10:29:43.301521 epoll_wait(3, {}, 11024, 100) = 0 <0.100136>
[pid 25442] 10:29:43.401725 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 11024, 100) = 1 <0.003552>
[pid 25442] 10:29:43.405350 read(5, "*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n", 16384) = 22 <0.000016>
[pid 25442] 10:29:43.405425 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLOUT, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:43.405477 epoll_wait(3, {{EPOLLOUT, {u32=5, u64=5}}}, 11024, 96) = 1 <0.000014>
[pid 25442] 10:29:43.405531 write(5, "$3\r\nbar\r\n", 9) = 9 <0.000022>
[pid 25442] 10:29:43.405601 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000011>
[pid 25442] 10:29:43.405660 epoll_wait(3, {}, 11024, 96) = 0 <0.096129>
[pid 25442] 10:29:43.501877 epoll_wait(3, {{EPOLLIN, {u32=5, u64=5}}}, 11024, 100) = 1 <0.003474>
[pid 25442] 10:29:43.505429 read(5, "*1\r\n$4\r\nQUIT\r\n", 16384) = 14 <0.000018>
[pid 25442] 10:29:43.505514 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN|EPOLLOUT, {u32=5, u64=5}}) = 0 <0.000015>
[pid 25442] 10:29:43.505578 epoll_wait(3, {{EPOLLOUT, {u32=5, u64=5}}}, 11024, 96) = 1 <0.000012>
[pid 25442] 10:29:43.505623 write(5, "+OK\r\n", 5) = 5 <0.000028>
[pid 25442] 10:29:43.505693 epoll_ctl(3, EPOLL_CTL_MOD, 5, {EPOLLIN, {u32=5, u64=5}}) = 0 <0.000016>
[pid 25442] 10:29:43.505764 epoll_ctl(3, EPOLL_CTL_DEL, 5, {0, {u32=5, u64=5}}) = 0 <0.000016>
[pid 25442] 10:29:43.505830 close(5)    = 0 <0.000111>
[pid 25442] 10:29:43.505992 epoll_wait(3, {}, 11024, 96) = 0 <0.096134>
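The payloads in the read() calls above are commands in the Redis wire format: an array header followed by length-prefixed strings. The encoder below is a small sketch of that framing for reading the trace, not jedis code; RespSketch and encode are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical encoder for the command framing visible in the trace. */
final class RespSketch {

    /** Encodes a command and its arguments as an array of bulk strings. */
    static String encode(String... parts) {
        StringBuilder sb = new StringBuilder();
        sb.append('*').append(parts.length).append("\r\n");
        for (String part : parts) {
            // Bulk string lengths are byte counts, not character counts.
            int byteLength = part.getBytes(StandardCharsets.UTF_8).length;
            sb.append('$').append(byteLength).append("\r\n").append(part).append("\r\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Lengths match the return values of read() in the trace: 31, 22 and 14.
        System.out.println(encode("SET", "foo", "bar").length());
        System.out.println(encode("GET", "foo").length());
        System.out.println(encode("QUIT").length());
    }
}
```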

Java client JUnit test code (adapted from the jedis test case JedisPoolTest#checkConnections):

    JedisPool pool = new JedisPool(new JedisPoolConfig(), hnp.getHost(), hnp.getPort(), 2000);
    Jedis jedis = pool.getResource();
    jedis.set("foo", "bar");
    assertEquals("bar", jedis.get("foo"));
    pool.returnResource(jedis);

    try {
      Thread.sleep(1*1000);
    } catch (InterruptedException e) {
      e.printStackTrace();
    }
    System.out.println("hello");
    jedis.get("foo");
    pool.destroy();
    assertTrue(pool.isClosed());

The relevant server-side system calls are:

setsockopt(5, SOL_TCP, TCP_NODELAY, [1], 4) = 0
...
close(5) = 0

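On the server side, the only socket option set on the connection is TCP_NODELAY, which disables Nagle's algorithm.

Socket settings in the jedis client

Just when I was stuck, it occurred to me that the Redis client might be setting some socket options.
Sure enough, in redis.clients.jedis.Connection, the class that manages connections in jedis, I found the socket options set on connect: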

public void connect() {
    if (!isConnected()) {
      try {
        socket = new Socket();
        // ->@wjw_add
        socket.setReuseAddress(true);
        socket.setKeepAlive(true); // Will monitor the TCP connection is
        // valid
        socket.setTcpNoDelay(true); // Socket buffer Whetherclosed, to
        // ensure timely delivery of data
        socket.setSoLinger(true, 0); // Control calls close () method,
        // the underlying socket is closed
        // immediately
        // <-@wjw_add

        socket.connect(new InetSocketAddress(host, port), connectionTimeout);
        socket.setSoTimeout(soTimeout);

        if (ssl) {
          if (null == sslSocketFactory) {
            sslSocketFactory = (SSLSocketFactory)SSLSocketFactory.getDefault();
          }
          socket = (SSLSocket) sslSocketFactory.createSocket(socket, host, port, true);
          if (null != sslParameters) {
            ((SSLSocket) socket).setSSLParameters(sslParameters);
          }
          if ((null != hostnameVerifier) &&
              (!hostnameVerifier.verify(host, ((SSLSocket) socket).getSession()))) {
            String message = String.format(
                "The connection to '%s' failed ssl/tls hostname verification.", host);
            throw new JedisConnectionException(message);
          }
        }

        outputStream = new RedisOutputStream(socket.getOutputStream());
        inputStream = new RedisInputStream(socket.getInputStream());
      } catch (IOException ex) {
        broken = true;
        throw new JedisConnectionException("Failed connecting to host " 
            + host + ":" + port, ex);
      }
    }
  }

The call socket.setSoLinger(true, 0); caught my attention.
The SCTP RFC explains SO_LINGER as follows:

If the l_linger value is set to 0, calling close() is the same as the ABORT primitive.

Next, SCTP_ABORT:

SCTP_ABORT: Setting this flag causes the specified association
to abort by sending an ABORT message to the peer. The ABORT
chunk will contain an error cause of 'User Initiated Abort'
with cause code 12. The cause-specific information of this
error cause is provided in msg_iov.

That was not very clear to me, so I looked at how TCP defines Abort.
The TCP RFC's explanation of Abort:

This command causes all pending SENDs and RECEIVES to be
aborted, the TCB to be removed, and a special RESET message to
be sent to the TCP on the other side of the connection.
Depending on the implementation, users may receive abort
indications for each outstanding SEND or RECEIVE, or may simply
receive an ABORT-acknowledgment.
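Note: TCB is an abstract control block (Transmission Control Block).

The SO_LINGER socket option used to force an abort

Only now did it make sense: because the jedis client calls socket.setSoLinger(true, 0); when connecting, closing the connection is equivalent to a TCP Abort. All data still being sent or received is discarded and a RESET is sent straight to the peer. This is also why jedis flushes its buffer before socket.close(), so that in-flight data is not lost.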
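The difference is easy to reproduce over loopback without Redis. In the sketch below, LingerResetSketch and observeClose are names I made up; it assumes a loopback interface and reports what the accepting side sees when the client closes with or without SO_LINGER set to zero. On the platforms I know of, a reset surfaces in Java as a SocketException on read, while a normal close shows up as end of stream.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;

/** Hypothetical demo: what the peer observes when a client closes its socket. */
final class LingerResetSketch {

    /** Returns "RESET", "EOF" or "DATA" depending on what the accepted socket reads. */
    static String observeClose(boolean lingerZero) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Socket client = new Socket(loopback, server.getLocalPort());
            try (Socket accepted = server.accept()) {
                accepted.setSoTimeout(5000); // do not hang forever if nothing arrives
                if (lingerZero) {
                    client.setSoLinger(true, 0); // close() becomes an abort
                }
                client.close();
                try {
                    int first = accepted.getInputStream().read();
                    return first == -1 ? "EOF" : "DATA";
                } catch (SocketException e) {
                    return "RESET";
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("linger 0: " + observeClose(true));
        System.out.println("default:  " + observeClose(false));
    }
}
```

Running a packet capture on the loopback interface while this executes should show the same contrast as the Wireshark screenshots: an RST in the first case and a FIN in the second.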
I removed the SO_LINGER setting from the client and finally saw a normal TCP teardown again.

ws_5.png

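Readers who want to go deeper can read net/ipv4/tcp.c in the Linux source. I skimmed it and the logic is clear (details differ between kernel versions): if SO_LINGER is enabled with a linger time of 0, close calls tcp_disconnect directly and sends an RST instead of going through the usual four-way teardown. I find this inelegant, though; a more graceful approach might be socket.setSoLinger(true, timeout) with a non-zero timeout.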
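For reference, this is how the option looks from the Java side. The sketch only sets and reads back the option on an unconnected socket; LingerOptionSketch and lingerSettings are hypothetical names, and the 5 second value is an arbitrary choice for illustration, not a recommendation from the jedis project.

```java
import java.io.IOException;
import java.net.Socket;

/** Hypothetical demo of the SO_LINGER settings available through java.net.Socket. */
final class LingerOptionSketch {

    /** Returns the values reported by getSoLinger() when disabled and when set to 5 seconds. */
    static int[] lingerSettings() throws IOException {
        try (Socket socket = new Socket()) {
            socket.setSoLinger(false, 0);
            int disabled = socket.getSoLinger(); // -1 means the option is off

            // With a positive timeout, close() may block up to that many seconds
            // while pending data is sent, instead of aborting the connection.
            socket.setSoLinger(true, 5);
            int enabled = socket.getSoLinger();
            return new int[] {disabled, enabled};
        }
    }

    public static void main(String[] args) throws IOException {
        int[] values = lingerSettings();
        System.out.println("disabled: " + values[0] + ", enabled: " + values[1]);
    }
}
```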
The GitHub jedis issue "Improving socket performance" describes adding the following four settings to improve performance.

socket.setReuseAddress(true);
socket.setKeepAlive(true);
socket.setTcpNoDelay(true);
socket.setSoLinger(true,0);

I left a comment on the issue asking about this and will update this post when there is news.

Summary

The application's Jedis pool could not obtain a Redis connection because the disk on the Redis server was full, so the server could not save snapshots (RDB snapshot). With testOnBorrow set to true, the application checks each connection with the Redis PING/PONG command, and it received the response MISCONF Redis is configured to save RDB snapshots instead of the normal PONG. jedis therefore judged the connection invalid and forcibly closed it.
After that I took a brief look at the RST flag in TCP. Once socket.setSoLinger(true, 0) has been set, closing the socket discards pending data and sends an RST to the peer.
There is a lot more to dig into here, and my own knowledge of network programming needs work. I plan to fill in the background and then study it further using well-written open-source projects such as redis and nginx.


References

  1. Jedis source: https://github.com/xetorthio/jedis
  2. Commons-pool source: https://github.com/apache/commons-pool
  3. Spring-data-redis source: https://github.com/spring-projects/spring-data-redis
  4. redis-wireshark source: https://github.com/jzwinck/redis-wireshark
  5. Redis source: https://github.com/antirez/redis
  6. TCP/IP Illustrated, online e-book: http://www.52im.net/topic-tcpipvol1.html
  7. SCTP RFC: https://tools.ietf.org/html/rfc6458
  8. TCP RFC: https://tools.ietf.org/html/rfc793
  9. Several situations in which RST appears on a TCP connection
  10. setsockopt()--Set Socket Options
  11. StackOverflow: What is AF_INET, and why do I need it?
  12. Socket options series: SO_LINGER (by the author of 《深入剖析Nginx》): http://www.lenky.info/archives/2013/02/2220