Environment background
I set up a simple big-data development and test environment with Docker on a single cloud server: 1 master and 2 slaves, with the services started through a simple docker-compose orchestration.
The HMaster failed to start, and its log showed the following exception:
2018-08-19 13:49:29,121 FATAL [78e7a081d8b6:16000.activeMasterManager] master.HMaster: Unhandled exception. Starting shutdown.
org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-914767151-172.18.0.4-1533896696236:blk_1073741825_1001 file=/hbase/hbase.version
at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:976)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:632)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:874)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:926)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:200)
at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:608)
at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:691)
at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:509)
at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:166)
at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:141)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:741)
at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:205)
at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:2023)
at java.lang.Thread.run(Thread.java:748)
Checking the block information for the files on HDFS showed that a number of them were CORRUPT:
[root@78e7a081d8b6 logs]# hdfs fsck / -files -blocks
18/08/19 13:52:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://master:50070/fsck?ugi=root&files=1&blocks=1&path=%2F
FSCK started by root (auth:SIMPLE) from /172.18.0.3 for path / at Sun Aug 19 13:52:55 UTC 2018
/ <dir>
/hbase <dir>
/hbase/.tmp <dir>
/hbase/MasterProcWALs <dir>
/hbase/MasterProcWALs/state-00000000000000000015.log 0 bytes, 0 block(s): OK
/hbase/WALs <dir>
/hbase/WALs/zhaocan_slave1_1.zhaocan_default,16020,1534214426421 <dir>
/hbase/WALs/zhaocan_slave2_1.zhaocan_default,16020,1534214426422 <dir>
/hbase/archive <dir>
/hbase/corrupt <dir>
/hbase/data <dir>
/hbase/data/default <dir>
/hbase/data/default/soy_test1 <dir>
/hbase/data/default/soy_test1/.tabledesc <dir>
/hbase/data/default/soy_test1/.tabledesc/.tableinfo.0000000001 289 bytes, 1 block(s):
/hbase/data/default/soy_test1/.tabledesc/.tableinfo.0000000001: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741857
MISSING 1 blocks of total size 289 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741857_1033 len=289 MISSING!
/hbase/data/default/soy_test1/.tmp <dir>
/hbase/data/default/soy_test1/e332a1b73e8bcac9b69e446999e834fb <dir>
/hbase/data/default/soy_test1/e332a1b73e8bcac9b69e446999e834fb/.regioninfo 44 bytes, 1 block(s):
/hbase/data/default/soy_test1/e332a1b73e8bcac9b69e446999e834fb/.regioninfo: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741858
MISSING 1 blocks of total size 44 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741858_1034 len=44 MISSING!
/hbase/data/default/soy_test1/e332a1b73e8bcac9b69e446999e834fb/data <dir>
/hbase/data/default/soy_test1/e332a1b73e8bcac9b69e446999e834fb/recovered.edits <dir>
/hbase/data/default/soy_test1/e332a1b73e8bcac9b69e446999e834fb/recovered.edits/4.seqid 0 bytes, 0 block(s): OK
/hbase/data/default/t1 <dir>
/hbase/data/default/t1/.tabledesc <dir>
/hbase/data/default/t1/.tabledesc/.tableinfo.0000000001 766 bytes, 1 block(s):
/hbase/data/default/t1/.tabledesc/.tableinfo.0000000001: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741853
MISSING 1 blocks of total size 766 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741853_1029 len=766 MISSING!
/hbase/data/default/t1/.tmp <dir>
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8 <dir>
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8/.regioninfo 37 bytes, 1 block(s):
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8/.regioninfo: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741854
MISSING 1 blocks of total size 37 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741854_1030 len=37 MISSING!
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8/f1 <dir>
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8/f1/37aeca77a262451f89bc9745de640e31 4925 bytes, 1 block(s):
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8/f1/37aeca77a262451f89bc9745de640e31: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741859
MISSING 1 blocks of total size 4925 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741859_1035 len=4925 MISSING!
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8/f2 <dir>
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8/f3 <dir>
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8/recovered.edits <dir>
/hbase/data/default/t1/9384715a6b039dd6db92d729703a01d8/recovered.edits/14.seqid 0 bytes, 0 block(s): OK
/hbase/data/hbase <dir>
/hbase/data/hbase/meta <dir>
/hbase/data/hbase/meta/.tabledesc <dir>
/hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001 397 bytes, 1 block(s):
/hbase/data/hbase/meta/.tabledesc/.tableinfo.0000000001: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741828
MISSING 1 blocks of total size 397 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741828_1004 len=397 MISSING!
/hbase/data/hbase/meta/.tmp <dir>
/hbase/data/hbase/meta/1588230740 <dir>
/hbase/data/hbase/meta/1588230740/.regioninfo 32 bytes, 1 block(s):
/hbase/data/hbase/meta/1588230740/.regioninfo: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741827
MISSING 1 blocks of total size 32 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741827_1003 len=32 MISSING!
/hbase/data/hbase/meta/1588230740/.tmp <dir>
/hbase/data/hbase/meta/1588230740/info <dir>
/hbase/data/hbase/meta/1588230740/info/b4a2d55521614033bd88eeeb82c3fd4d 9013 bytes, 1 block(s):
/hbase/data/hbase/meta/1588230740/info/b4a2d55521614033bd88eeeb82c3fd4d: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741928
MISSING 1 blocks of total size 9013 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741928_1110 len=9013 MISSING!
/hbase/data/hbase/meta/1588230740/recovered.edits <dir>
/hbase/data/hbase/meta/1588230740/recovered.edits/31.seqid 0 bytes, 0 block(s): OK
/hbase/data/hbase/namespace <dir>
/hbase/data/hbase/namespace/.tabledesc <dir>
/hbase/data/hbase/namespace/.tabledesc/.tableinfo.0000000001 312 bytes, 1 block(s):
/hbase/data/hbase/namespace/.tabledesc/.tableinfo.0000000001: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741834
MISSING 1 blocks of total size 312 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741834_1010 len=312 MISSING!
/hbase/data/hbase/namespace/.tmp <dir>
/hbase/data/hbase/namespace/23dc1dbbc536758979b2bcdfb7b6d556 <dir>
/hbase/data/hbase/namespace/23dc1dbbc536758979b2bcdfb7b6d556/.regioninfo 42 bytes, 1 block(s):
/hbase/data/hbase/namespace/23dc1dbbc536758979b2bcdfb7b6d556/.regioninfo: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741835
MISSING 1 blocks of total size 42 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741835_1011 len=42 MISSING!
/hbase/data/hbase/namespace/23dc1dbbc536758979b2bcdfb7b6d556/info <dir>
/hbase/data/hbase/namespace/23dc1dbbc536758979b2bcdfb7b6d556/info/f642f548417241e3a1fbbe34103507a2 4963 bytes, 1 block(s):
/hbase/data/hbase/namespace/23dc1dbbc536758979b2bcdfb7b6d556/info/f642f548417241e3a1fbbe34103507a2: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741850
MISSING 1 blocks of total size 4963 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741850_1026 len=4963 MISSING!
/hbase/data/hbase/namespace/23dc1dbbc536758979b2bcdfb7b6d556/recovered.edits <dir>
/hbase/data/hbase/namespace/23dc1dbbc536758979b2bcdfb7b6d556/recovered.edits/16.seqid 0 bytes, 0 block(s): OK
/hbase/hbase.id 42 bytes, 1 block(s):
/hbase/hbase.id: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741826
MISSING 1 blocks of total size 42 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741826_1002 len=42 MISSING!
/hbase/hbase.version 7 bytes, 1 block(s):
/hbase/hbase.version: CORRUPT blockpool BP-914767151-172.18.0.4-1533896696236 block blk_1073741825
MISSING 1 blocks of total size 7 B
0. BP-914767151-172.18.0.4-1533896696236:blk_1073741825_1001 len=7 MISSING!
/hbase/oldWALs <dir>
/root <dir>
/root/hive <dir>
/root/hive/root <dir>
/root/hive/root/16e2f965-2d21-4504-992e-4d64079f5a51 <dir>
/root/hive/root/16e2f965-2d21-4504-992e-4d64079f5a51/_tmp_space.db <dir>
/root/hive/root/3a835fae-a5d9-4cfd-8f14-b2fc1b35e984 <dir>
/root/hive/root/3a835fae-a5d9-4cfd-8f14-b2fc1b35e984/_tmp_space.db <dir>
/root/hive/root/4e754087-66f8-471e-9717-5ecc7bebc29b <dir>
/root/hive/root/4e754087-66f8-471e-9717-5ecc7bebc29b/_tmp_space.db <dir>
/root/hive/root/e6fe8961-9813-4e26-9366-b323d5a16479 <dir>
/root/hive/root/e6fe8961-9813-4e26-9366-b323d5a16479/_tmp_space.db <dir>
/root/hive/warehouse <dir>
/root/hive/warehouse/t1 <dir>
/root/hive/warehouse/testdb.db <dir>
/root/hive/warehouse/testdb.db/soy_test1 <dir>
/test <dir>
Status: CORRUPT
Total size: 20869 B (Total open files size: 249 B)
Total dirs: 56
Total files: 18
Total symlinks: 0 (Files currently being written: 3)
Total blocks (validated): 13 (avg. block size 1605 B) (Total open file blocks (not validated): 3)
********************************
UNDER MIN REPL'D BLOCKS: 13 (100.0 %)
dfs.namenode.replication.min: 1
CORRUPT FILES: 13
MISSING BLOCKS: 13
MISSING SIZE: 20869 B
CORRUPT BLOCKS: 13
********************************
Minimally replicated blocks: 0 (0.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 0.0
Corrupt blocks: 13
Missing replicas: 0
Number of data-nodes: 2
Number of racks: 1
FSCK ended at Sun Aug 19 13:52:55 UTC 2018 in 8 milliseconds
The filesystem under path '/' is CORRUPT
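The listing above is long, and only the files flagged CORRUPT matter here. A minimal sketch for pulling just those paths out of the fsck output, assuming the line format shown above; the function name corrupt_files is hypothetical and not part of Hadoop:

```shell
#!/bin/sh
# Print the path of every file that fsck flagged as CORRUPT.
# Reads the output of `hdfs fsck / -files -blocks` on stdin.
# Assumes lines of the form:
#   <path>: CORRUPT blockpool <pool-id> block <block-id>
corrupt_files() {
    sed -n 's/^\(.*\): CORRUPT blockpool .*$/\1/p'
}

# Usage against a live cluster:
#   hdfs fsck / -files -blocks | corrupt_files
```

hdfs fsck / -list-corruptfileblocks should report similar information directly; the filter is mainly useful when all you have is a saved log.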
Since this is a development environment, I simply deleted the files with problem blocks and then started HBase again:
[root@78e7a081d8b6 logs]# hdfs fsck -delete
18/08/19 13:54:54 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connecting to namenode via http://master:50070/fsck?ugi=root&delete=1&path=%2F
FSCK started by root (auth:SIMPLE) from /172.18.0.3 for path / at Sun Aug 19 13:54:55 UTC 2018
.....Status: HEALTHY
Total size: 0 B (Total open files size: 249 B)
Total dirs: 56
Total files: 5
Total symlinks: 0 (Files currently being written: 3)
Total blocks (validated): 0 (Total open file blocks (not validated): 3)
Minimally replicated blocks: 0
Over-replicated blocks: 0
Under-replicated blocks: 0
Mis-replicated blocks: 0
Default replication factor: 3
Average block replication: 0.0
Corrupt blocks: 0
Missing replicas: 0
Number of data-nodes: 2
Number of racks: 1
FSCK ended at Sun Aug 19 13:54:55 UTC 2018 in 4 milliseconds
The filesystem under path '/' is HEALTHY
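Note what the two reports say together: the file count drops from 18 to 5 and the total size to 0 B, so fsck -delete removed the affected files rather than repairing them. When scripting this kind of check, the final line of the report is the easiest thing to key on. A small sketch, assuming the closing line format shown above; fsck_status is a hypothetical helper name:

```shell
#!/bin/sh
# Print the overall verdict (HEALTHY or CORRUPT) from fsck output on stdin.
# Assumes the report ends with a line such as:
#   The filesystem under path '/' is HEALTHY
fsck_status() {
    awk '/^The filesystem under path / { print $NF }'
}

# Usage against a live cluster:
#   if [ "$(hdfs fsck / | fsck_status)" = "HEALTHY" ]; then
#       echo "HDFS is healthy"
#   fi
```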
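Starting HBase now fails with the exception below: /hbase/hbase.version was among the deleted files, so HBase reads its layout version as null. In a development environment you can simply remove the whole directory with hdfs dfs -rm -r /hbase (the older hadoop fs -rmr /hbase form is deprecated) and let HBase recreate it on the next start.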
2018-08-19 13:56:13,652 FATAL [78e7a081d8b6:16000.activeMasterManager] master.HMaster: Failed to become active master
org.apache.hadoop.hbase.util.FileSystemVersionException: HBase file layout needs to be upgraded. You have version null and I want version 8. Consult http://hbase.apache.org/book.html for further information about upgrading HBase. Is your hbase.rootdir valid? If so, you may need to run 'hbase hbck -fixVersionFile'.
at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:712)
at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:509)
at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:166)
at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:141)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:741)
at org.apache.hadoop.hbase.master.HMaster.access$600(HMaster.java:205)
at org.apache.hadoop.hbase.master.HMaster$2.run(HMaster.java:2023)
at java.lang.Thread.run(Thread.java:748)
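Removing the HBase root directory can be wrapped in a small script with a dry-run switch, which is worth having for a command this destructive. This is only a sketch for a development setup: the function names run and reset_hbase_rootdir and the DRY_RUN variable are hypothetical, and /hbase is assumed to be the hbase.rootdir path because that is what the logs above show.

```shell
#!/bin/sh
# Dev/test only: this wipes ALL HBase data stored under the root directory.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_hbase_rootdir() {
    # HBase root directory on HDFS; defaults to /hbase as seen in the logs.
    rootdir="${1:-/hbase}"
    run hdfs dfs -rm -r "$rootdir"
}

# Usage:
#   DRY_RUN=1; reset_hbase_rootdir    # shows what would run
#   DRY_RUN=0; reset_hbase_rootdir    # actually deletes /hbase
```

Stop HBase before running it and start it again afterwards. With the root directory gone, the master should recreate the layout, including hbase.version, during startup.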