[Hive] HiveServer2 Out-of-Memory (OOM) Summary

1. Preface

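Users connect to HiveServer2 (version 3.1.2) through Beeline to run offline SQL jobs. After about a week of continuous operation, HiveServer2 would run out of memory (OOM), seriously affecting data queries and report generation. The problem was finally resolved after several rounds of fixes. The author has collected the issues fixed along the way so that others who hit the same problem are not left without a starting point.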

2. Cases

2.1 HIVE-16455

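HiveServer2 leaks file handles when the ADD JAR statement is used. In the lsof output below, the process still holds open handles to per-session copies of hive-contrib.jar that have already been deleted from /tmp: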

[root@host-10-17-80-111 ~]# lsof -p 29588 | grep "(deleted)"
java    29588 hive  391u   REG              252,3    125987  2099944 /tmp/57d98f5b-1e53-44e2-876b-6b4323ac24db_resources/hive-contrib.jar (deleted)
java    29588 hive  392u   REG              252,3    125987  2099946 /tmp/eb3184ad-7f15-4a77-a10d-87717ae634d1_resources/hive-contrib.jar (deleted)
java    29588 hive  393r   REG              252,3    125987  2099825 /tmp/e29dccfc-5708-4254-addb-7a8988fc0500_resources/hive-contrib.jar (deleted)
java    29588 hive  394r   REG              252,3    125987  2099833 /tmp/5153dd4a-a606-4f53-b02c-d606e7e56985_resources/hive-contrib.jar (deleted)
java    29588 hive  395r   REG              252,3    125987  2099827 /tmp/ff3cdb05-917f-43c0-830a-b293bf397a23_resources/hive-contrib.jar (deleted)
java    29588 hive  396r   REG              252,3    125987  2099822 /tmp/60531b66-5985-421e-8eb5-eeac31fdf964_resources/hive-contrib.jar (deleted)
java    29588 hive  397r   REG              252,3    125987  2099831 /tmp/78878921-455c-438c-9735-447566ed8381_resources/hive-contrib.jar (deleted)
java    29588 hive  399r   REG              252,3    125987  2099835 /tmp/0e5d7990-30cc-4248-9058-587f7f1ff211_resources/hive-contrib.jar (deleted)
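To track this leak over time, the lsof output can be summarized by script. The following is a minimal sketch, assuming the output has been captured as text; the function name `count_deleted_jars` and the regular expression are illustrative and not part of Hive or lsof. The sample consists of two lines taken from the output above.

```python
import re
from collections import Counter

# Two lines excerpted from the lsof output above.
SAMPLE = """\
java    29588 hive  391u   REG              252,3    125987  2099944 /tmp/57d98f5b-1e53-44e2-876b-6b4323ac24db_resources/hive-contrib.jar (deleted)
java    29588 hive  393r   REG              252,3    125987  2099825 /tmp/e29dccfc-5708-4254-addb-7a8988fc0500_resources/hive-contrib.jar (deleted)
"""

# Matches the path column of an lsof line whose file is a JAR marked "(deleted)".
DELETED_JAR = re.compile(r"(\S+\.jar) \(deleted\)\s*$")


def count_deleted_jars(lsof_output: str) -> Counter:
    """Count open-but-deleted JAR handles, grouped by JAR file name."""
    counts = Counter()
    for line in lsof_output.splitlines():
        match = DELETED_JAR.search(line)
        if match:
            # Keep only the file name; the directory differs per session.
            jar_name = match.group(1).rsplit("/", 1)[-1]
            counts[jar_name] += 1
    return counts


if __name__ == "__main__":
    for jar, n in count_deleted_jars(SAMPLE).items():
        print(f"{jar}: {n} deleted handle(s) still open")
```

If this count keeps rising between runs for the same HiveServer2 process, that is consistent with the handle leak described here.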

2.2 HIVE-24236

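This one is hard to reproduce: a connection leak can occur only under certain specific conditions. The symptom in the log is that TxnHandler can no longer obtain a database connection: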

2020-09-29T18:44:26,563 INFO  [Heartbeater-0]: txn.TxnHandler (TxnHandler.java:checkRetryable(3733)) - Non-retryable error in heartbeat(HeartbeatRequest(lockid:0, txnid:11908)) : Cannot get a connection, general error (SQLState=null, ErrorCode=0)
2020-09-29T18:44:26,564 ERROR [Heartbeater-0]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(201)) - MetaException(message:Unable to select from transaction database org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, general error
        at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:118)
        at org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3605)
        at org.apache.hadoop.hive.metastore.txn.TxnHandler.getDbConn(TxnHandler.java:3598)
        at org.apache.hadoop.hive.metastore.txn.TxnHandler.heartbeat(TxnHandler.java:2739)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.heartbeat(HiveMetaStore.java:8452)
        at sun.reflect.GeneratedMethodAccessor415.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
        at com.sun.proxy.$Proxy63.heartbeat(Unknown Source)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.heartbeat(HiveMetaStoreClient.java:3247)
        at sun.reflect.GeneratedMethodAccessor414.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:213)
        at com.sun.proxy.$Proxy64.heartbeat(Unknown Source)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.heartbeat(DbTxnManager.java:671)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.lambda$run$0(DbTxnManager.java:1102)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1898)
        at org.apache.hadoop.hive.ql.lockmgr.DbTxnManager$Heartbeater.run(DbTxnManager.java:1101)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
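The general shape of such a leak can be illustrated with a toy pool. This is a sketch under stated assumptions, not Hive's or DBCP's code: `TinyPool`, `leaky_call`, and `safe_call` are hypothetical names, and the failure is simulated. It shows how a connection that is not returned on an error path is lost for good, until the pool is exhausted and every caller fails with "Cannot get a connection".

```python
import queue


class TinyPool:
    """Hypothetical fixed-size connection pool, for illustration only."""

    def __init__(self, size: int):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    def borrow(self) -> str:
        try:
            return self._free.get_nowait()
        except queue.Empty:
            raise RuntimeError("Cannot get a connection") from None

    def give_back(self, conn: str) -> None:
        self._free.put(conn)

    def available(self) -> int:
        return self._free.qsize()


def leaky_call(pool: TinyPool, fail: bool) -> None:
    """Returns the connection only on the success path."""
    conn = pool.borrow()
    if fail:
        raise ValueError("simulated failure")  # conn is never given back
    pool.give_back(conn)


def safe_call(pool: TinyPool, fail: bool) -> None:
    """Returns the connection on every path via try/finally."""
    conn = pool.borrow()
    try:
        if fail:
            raise ValueError("simulated failure")
    finally:
        pool.give_back(conn)


if __name__ == "__main__":
    pool = TinyPool(2)
    for _ in range(2):
        try:
            leaky_call(pool, fail=True)
        except ValueError:
            pass
    print("free connections after two failed leaky calls:", pool.available())
```

Because the leak only happens on the failing path, it stays invisible while calls succeed, which matches how hard this case is to reproduce.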

2.3 HIVE-24552

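When loadDynamicPartitions (Hive.java) is called, it spawns multiple threads to handle the file moves. These threads may each open a HiveMetaStore connection, and those connections may not be closed promptly, so large numbers of them pile up. In the log below, the connections are only closed by the Finalizer thread, and the count has reached more than 43,000: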

2020-12-15T17:05:38.485Z hiveserver2-0 hiveserver2 1 a3671b96-74fb-4ee9-b186-aeff0de0bbec [mdc@18060 class="metastore.HiveMetaStoreClient" level="INFO" thread="Finalizer"] Closed a connection to metastore, current connections: 43901
2020-12-15T17:05:38.485Z hiveserver2-0 hiveserver2 1 a3671b96-74fb-4ee9-b186-aeff0de0bbec [mdc@18060 class="metastore.HiveMetaStoreClient" level="INFO" thread="Finalizer"] Closed a connection to metastore, current connections: 43900
2020-12-15T17:05:38.485Z hiveserver2-0 hiveserver2 1 a3671b96-74fb-4ee9-b186-aeff0de0bbec [mdc@18060 class="metastore.HiveMetaStoreClient" level="INFO" thread="Finalizer"] Closed a connection to metastore, current connections: 43899
2020-12-15T17:05:38.485Z hiveserver2-0 hiveserver2 1 a3671b96-74fb-4ee9-b186-aeff0de0bbec [mdc@18060 class="metastore.HiveMetaStoreClient" level="INFO" thread="Finalizer"] Closed a connection to metastore, current connections: 43898
2020-12-15T17:05:38.485Z hiveserver2-0 hiveserver2 1 a3671b96-74fb-4ee9-b186-aeff0de0bbec [mdc@18060 class="metastore.HiveMetaStoreClient" level="INFO" thread="Finalizer"] Closed a connection to metastore, current connections: 43897
2020-12-15T17:05:38.485Z hiveserver2-0 hiveserver2 1 a3671b96-74fb-4ee9-b186-aeff0de0bbec [mdc@18060 class="transport.TIOStreamTransport" level="WARN" thread="Finalizer"] Error closing output stream.
java.net.SocketException: Socket closed
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at java.io.FilterOutputStream.close(FilterOutputStream.java:158)
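The "current connections" counter in these messages is easy to monitor. The following is a minimal sketch, assuming the HiveServer2 log is available as text; `connection_counts`, `exceeds`, and the threshold of 1000 are illustrative choices rather than Hive settings. The sample contains only the message part of three lines from the log above.

```python
import re

# Message tails excerpted from the log above (prefix fields dropped).
SAMPLE = """\
Closed a connection to metastore, current connections: 43901
Closed a connection to metastore, current connections: 43900
Closed a connection to metastore, current connections: 43899
"""

CONNECTIONS = re.compile(r"current connections: (\d+)")


def connection_counts(log_text: str) -> list[int]:
    """Return every 'current connections' value, in log order."""
    return [int(m.group(1)) for m in CONNECTIONS.finditer(log_text)]


def exceeds(log_text: str, threshold: int) -> bool:
    """True if any reported connection count is above the threshold."""
    counts = connection_counts(log_text)
    return bool(counts) and max(counts) > threshold


if __name__ == "__main__":
    counts = connection_counts(SAMPLE)
    print("peak connections:", max(counts))
    # 1000 is an arbitrary alert level chosen for this example.
    print("above threshold:", exceeds(SAMPLE, 1000))
```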

2.4 HIVE-24858

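If a UDF JAR is registered in a session and a temporary function is created from it, the UDFClassLoader is not reclaimed by GC when the session is closed. The heap dump below shows what still references it: the contextClassLoader of a HiveServer2 handler-pool thread, and a WeakHashMap entry in Configuration.CACHE_CLASSES.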

Class Name                                                                                                                          | Shallow Heap | Retained Heap
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
contextClassLoader org.apache.hive.service.server.ThreadWithGarbageCleanup @ 0x7164deb50  HiveServer2-Handler-Pool: Thread-72 Thread|          128 |        79,072
referent java.util.WeakHashMap$Entry @ 0x7164e67d0                                                                                  |           40 |           824
'- [6] java.util.WeakHashMap$Entry[16] @ 0x71581aac0                                                                                |           80 |         5,056
   '- table java.util.WeakHashMap @ 0x71580f510                                                                                     |           48 |         6,920
      '- CACHE_CLASSES class org.apache.hadoop.conf.Configuration @ 0x71580f3d8                                                     |           64 |        74,528
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
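The two reference paths in this dump behave differently: the WeakHashMap entry alone would not keep the loader alive, but the strong reference from the pooled thread does. The sketch below is a Python analogy of that behavior, not Hive code. `Loader`, `WorkerThread`, and `open_session` are hypothetical stand-ins, and `weakref.WeakKeyDictionary` plays the role of the WeakHashMap.

```python
import gc
import weakref


class Loader:
    """Stand-in for a per-session class loader (hypothetical)."""


class WorkerThread:
    """Stand-in for a pooled handler thread that outlives sessions."""

    def __init__(self):
        self.context_loader = None


# Weak keys: an entry disappears once its key is otherwise unreachable.
cache = weakref.WeakKeyDictionary()


def open_session(worker: WorkerThread) -> None:
    loader = Loader()
    worker.context_loader = loader      # strong reference from the thread
    cache[loader] = {"classes": []}     # weak reference from the cache


def cached_loaders() -> int:
    gc.collect()
    return len(cache)


if __name__ == "__main__":
    worker = WorkerThread()
    open_session(worker)
    # The session is over, but the worker still points at the loader.
    print("entries while thread holds loader:", cached_loaders())
    worker.context_loader = None
    print("entries after reference is cleared:", cached_loaders())
```

As long as the long-lived thread keeps its reference, the weakly keyed cache entry and everything reachable from the loader remain on the heap.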

2.5 HIVE-26404

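HiveMetaStore becomes unresponsive because of long JVM garbage-collection pauses. Objects retained through the FileSystem cache, including their org.apache.hadoop.conf.Configuration (HiveConf) instances, occupy too much heap and create an OOM risk.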

 Class Name                                                                             | Shallow Heap | Retained Heap
----------------------------------------------------------------------------------------------------------------------
org.apache.hadoop.fs.FileSystem$Cache @ 0x45403fe70                                    |           32 |   108,671,824
|- <class> class org.apache.hadoop.fs.FileSystem$Cache @ 0x45410c3e0                   |            8 |           544
'- map java.util.HashMap @ 0x453ffb598                                                 |           48 |    92,777,232
   |- <class> class java.util.HashMap @ 0x4520382c8 System Class                       |           40 |           168
   |- entrySet java.util.HashMap$EntrySet @ 0x454077848                                |           16 |            16
   '- table java.util.HashMap$Node[32768] @ 0x463585b68                                |      131,088 |    92,777,168
      |- class java.util.HashMap$Node[] @ 0x4520b7790                                  |            0 |             0
      '- [1786] java.util.HashMap$Node @ 0x451998ce0                                   |           32 |         9,968
         |- <class> class java.util.HashMap$Node @ 0x4520b7728 System Class            |            8 |            32
         '- value org.apache.hadoop.hdfs.DistributedFileSystem @ 0x452990178           |           56 |         4,976
            |- <class> class org.apache.hadoop.hdfs.DistributedFileSystem @ 0x45402e290|            8 |         4,664
            |- uri java.net.URI @ 0x451a05cd0  hdfs://nameservice1                     |           80 |           432
            |- dfs org.apache.hadoop.hdfs.DFSClient @ 0x451f5d9b8                      |          128 |         3,824
            '- conf org.apache.hadoop.hive.conf.HiveConf @ 0x453a34b38                 |           80 |       250,160
----------------------------------------------------------------------------------------------------------------------

2.6 HIVE-22275

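When a single Hive session executes multiple SQL statements, OperationManager.queryIdOperation is not cleaned up properly, which creates an OOM risk. In the log below, several operations are removed under the same queryId, and some added operations have no matching removal: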

2019-09-13T08:37:36,785 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9]
2019-09-13T08:37:38,432 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083736_c49cf3cc-cfe8-48a1-bd22-8b924dfb0396 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9] with tag: null
2019-09-13T08:37:38,469 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb]
2019-09-13T08:37:52,662 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b983802c-1dec-4fa0-8680-d05ab555321b]
2019-09-13T08:37:56,239 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=75dbc531-2964-47b2-84d7-85b59f88999c]
2019-09-13T08:38:30,791 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b697c801-7da0-4544-bcfa-442eb1d3bd77]
2019-09-13T08:39:10,187 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=bda93c8f-0822-4592-a61c-4701720a1a5c]
2019-09-13T08:39:15,471 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb] with tag: null
2019-09-13T08:39:15,507 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b983802c-1dec-4fa0-8680-d05ab555321b] with tag: null
2019-09-13T08:39:15,538 INFO  [8eaa1601-f045-4ad5-9c2e-1e5944b75f6a HiveServer2-Handler-Pool: Thread-202]: operation.OperationManager (:()) - Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=75dbc531-2964-47b2-84d7-85b59f88999c] with tag: null
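The mismatch between "Adding operation" and "Removed queryId" entries can be checked mechanically. The following is a minimal sketch, assuming the log is available as text; the function names and regular expressions are illustrative and not part of Hive. The sample keeps only the message part of five lines from the log above, and a missing removal only means none was found within the text given.

```python
import re
from collections import defaultdict

# Message parts excerpted from the log above (timestamp/thread prefix dropped).
SAMPLE = """\
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9]
Removed queryId: hive_20190913083736_c49cf3cc-cfe8-48a1-bd22-8b924dfb0396 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=dfed4c18-a284-4640-9f4a-1a20527105f9] with tag: null
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb]
Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=b697c801-7da0-4544-bcfa-442eb1d3bd77]
Removed queryId: hive_20190913083910_c4809ca8-d8db-423c-8b6d-fbe3eee89971 corresponding to operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=24d0030c-0e49-45fb-a918-2276f0941cfb] with tag: null
"""

HANDLE = r"OperationHandle \[opType=\w+, getHandleIdentifier\(\)=([0-9a-f-]+)\]"
ADDED = re.compile(r"Adding operation: " + HANDLE)
REMOVED = re.compile(r"Removed queryId: (\S+) corresponding to operation: " + HANDLE)


def unremoved_operations(log_text: str) -> list[str]:
    """Handles that were added but have no 'Removed queryId' line."""
    added = [m.group(1) for m in ADDED.finditer(log_text)]
    removed = {m.group(2) for m in REMOVED.finditer(log_text)}
    return [handle for handle in added if handle not in removed]


def handles_per_query_id(log_text: str) -> dict[str, list[str]]:
    """Group removed handles by the queryId they were removed under."""
    groups = defaultdict(list)
    for m in REMOVED.finditer(log_text):
        groups[m.group(1)].append(m.group(2))
    return dict(groups)


if __name__ == "__main__":
    for handle in unremoved_operations(SAMPLE):
        print("added but never removed:", handle)
```

Running `handles_per_query_id` on the full log makes the second symptom visible: a queryId that maps to more than one handle.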

2.7 HIVE-24590

Log output files are not closed or deleted properly, so Log4j RandomAccessFileManager instances accumulate and take up too much heap, creating an OOM risk.

3. Summary

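The author runs HiveServer2 3.1.2. This version has quite a few memory-leak issues, so you can apply the fixes for the cases above and rebuild. If you run into other bugs or performance problems, it is worth checking the Hive community regularly.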
