記一次應(yīng)用hang起問(wèn)題處理
一. 程序主體
處理邏輯應(yīng)用開(kāi)起10個(gè)線程分批處理任務(wù),主體代碼如下:
private Executor executor = Executors.newFixedThreadPool(THREAD_SIZE);
for (LoanGroup group : groups) {
LOGGER.info("商戶ID:{}轉(zhuǎn)換資產(chǎn)開(kāi)始...", group.getMerchantId());
Integer totalSize = loanMapper.countNeedCastLoans(group.getMerchantId());
Integer times = totalSize % BATCH_SIZE == 0 ? totalSize / BATCH_SIZE : totalSize / BATCH_SIZE + 1;
CountDownLatch countDownLatch = new CountDownLatch(times);
for (int i = 0; i < times; i++) {
final int time = i;
executor.execute(new Runnable() {
@Override
public void run() {
try {
processBatch(group.getMerchantId(), time);
} catch (Exception e) {
LOGGER.error("商戶ID:{}開(kāi)啟多線程轉(zhuǎn)換绊含,轉(zhuǎn)換批次:{}...錯(cuò)誤:{}", group.getMerchantId(), time, e);
} finally {
countDownLatch.countDown();
}
}
});
}
try {
countDownLatch.await();
} catch (InterruptedException e) {
e.printStackTrace();
}
二. 問(wèn)題現(xiàn)象
應(yīng)用執(zhí)行過(guò)程中發(fā)現(xiàn)任務(wù)日志不再打印,且可以確定的是當(dāng)前任務(wù)并沒(méi)有執(zhí)行完畢,通過(guò)jstat -gcutil 1000 10 采樣垃圾回收情況發(fā)現(xiàn)連續(xù)10次打印結(jié)果都沒(méi)有變化,通過(guò)jmap -heap pid 查看虛擬機(jī)內(nèi)存使用情況琐馆,發(fā)現(xiàn)虛擬機(jī)內(nèi)存還有足夠的空間使用。后又通過(guò)jstack pid dump出線程棧信息如下:
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.131-b11 mixed mode):
"Attach Listener" #47 daemon prio=9 os_prio=31 tid=0x00007ffe38e7a000 nid=0x8707 waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"pool-1-thread-10" #46 prio=5 os_prio=31 tid=0x00007ffe33e8f000 nid=0x8503 waiting on condition [0x00007000125a3000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000007415c10d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
"pool-1-thread-9" #45 prio=5 os_prio=31 tid=0x00007ffe33167800 nid=0x8303 runnable [0x000070001249e000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
- locked <0x0000000740cf8780> (a com.mysql.jdbc.util.ReadAheadInputStream)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3008)
at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:2212)
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1989)
at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:3410)
at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:470)
at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:3112)
at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:2341)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2736)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2501)
- locked <0x0000000740cf36c8> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858)
- locked <0x0000000740cf36c8> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1197)
- locked <0x0000000740cf36c8> (a com.mysql.jdbc.JDBC4Connection)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.execute(DruidPooledPreparedStatement.java:493)
10個(gè)線程中6個(gè)線程處于waiting狀態(tài)恒序,其他4個(gè)線程全如thread-9一樣處于RUNNABLE狀態(tài)瘦麸。而且處于RUNNABLE狀態(tài)的線程全都阻塞在at com.mysql.jdbc.util.ReadAheadInputStream.fill處。
三.分析處理
根據(jù)現(xiàn)象猜測(cè)可能是網(wǎng)絡(luò)抖動(dòng)丟包引起jdbc客戶端始終hang在套接字讀取數(shù)據(jù)處歧胁。
排查jdbc數(shù)據(jù)庫(kù)連接池配置以及mybatis的statement相關(guān)配置參數(shù)滋饲,發(fā)現(xiàn)并沒(méi)有對(duì)statement的timeout參數(shù)有相關(guān)配置,而且整個(gè)處理邏輯中也沒(méi)有用到事務(wù)喊巍。
定位問(wèn)題應(yīng)該就是引網(wǎng)絡(luò)抖動(dòng)引起而超時(shí)設(shè)置又沒(méi)有配置引起
解決方法:增加mybatis.configuration.default-statement-timeout 配置
參考資料:
1.http://www.importnew.com/2466.html
2.http://www.cubrid.org/blog/understanding-jdbc-internals-and-timeout-configuration