For a big-data practitioner, Hadoop, HBase, Hive, Spark, and Storm are all fairly mature systems by now; despite their poor maintainability and weak version compatibility, they are still going strong...
Hive is only a client, and in the mainstream Spark-on-YARN mode Spark is also just a client. As for Storm, after Twitter donated it to Apache its maintainers left and handed over control, and it effectively became an abandoned project; for that reason I have moved from the Storm streaming platform to KSQL...
So the big-data cluster systems we commonly run come down to Hadoop + HBase.
Hadoop consists of two modules: the distributed file system HDFS and the resource manager YARN.
This article covers the HBase database: a columnar NoSQL store built on top of Hadoop's HDFS module, whose LSM-tree storage engine gives it high read/write performance. It has become a key part of big-data services.
Why choose Hadoop 3 here?
Because Hadoop 2's compatibility is too weak, and both HBase and Spark are actively embracing Hadoop 3.
There is not much else to introduce, so let's get to the main topic.
Setting up an hbase-2.1 on hadoop-3.1.1 cluster
Due to limited time, neither Hadoop nor HBase is configured with HA; that will be added later.
The setup breaks down into the following five steps:
- Create the VMs: createvm.sh
- Package the dependencies: package.sh
- Deploy the cluster: deploy.sh
- Start the cluster: start.sh
- Test the cluster
Project: https://github.com/clojurians-org/my-env
Scripts: run.sh.d/hbase-example/{createvm.sh, package.sh, deploy.sh, start.sh}
Step 0: create the VMs (createvm.sh)
[larluo@larluo-nixos:~/my-env]$ cat run.sh.d/hbase-example/createvm.sh
set -e
my=$(cd -P -- "$(dirname -- "${BASH_SOURCE-$0}")" > /dev/null && pwd -P) && cd $my/../..
echo -e "\n==== bash nix.sh create-vm nixos-hbase-001" && bash nix.sh create-vm nixos-hbase-001
echo -e "\n==== bash nix.sh create-vm nixos-hbase-002" && bash nix.sh create-vm nixos-hbase-002
echo -e "\n==== bash nix.sh create-vm nixos-hbase-003" && bash nix.sh create-vm nixos-hbase-003
Three VMs are created: 192.168.56.101, 192.168.56.102, 192.168.56.103
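Before moving on, it may be worth confirming the three VMs are reachable from the host. A minimal sketch, assuming `ping` is available and ICMP is not blocked on the VirtualBox host-only network:

```shell
# Check that each newly created VM answers a single ping (1s timeout per host)
for ip in 192.168.56.101 192.168.56.102 192.168.56.103; do
  if ping -c 1 -W 1 "$ip" >/dev/null 2>&1; then
    echo "$ip up"
  else
    echo "$ip unreachable"
  fi
done
```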
Step 1: package the dependencies (package.sh)
[larluo@larluo-nixos:~/my-env]$ cat run.sh.d/hbase-example/package.sh
set -e
my=$(cd -P -- "$(dirname -- "${BASH_SOURCE-$0}")" > /dev/null && pwd -P) && cd $my/../..
echo -e "\n==== bash nix.sh build tgz.hbase-2.1.0" && bash nix.sh build tgz.hbase-2.1.0
echo -e "\n==== bash nix.sh export tgz.nix-2.0.4" && bash nix.sh export tgz.nix-2.0.4
echo -e "\n==== bash nix.sh export nix.rsync-3.1.3" && bash nix.sh export nix.rsync-3.1.3
echo -e "\n==== bash nix.sh export nix.gettext-0.19.8.1" && bash nix.sh export nix.gettext-0.19.8.1
echo -e "\n==== bash nix.sh export nix.openjdk-8u172b11" && bash nix.sh export nix.openjdk-8u172b11
echo -e "\n==== bash nix.sh export nix.zookeeper-3.4.13" && bash nix.sh export nix.zookeeper-3.4.13
echo -e "\n==== bash nix.sh export nix.hadoop-3.1.1" && bash nix.sh export nix.hadoop-3.1.1
Because HBase builds against Hadoop 2 by default, HBase on Hadoop 3 needs a custom build, which the packaging step invokes automatically.
For lack of time this build uses mvn directly rather than nix; it may be optimized later.
[larluo@larluo-nixos:~/my-env]$ cat nix.conf/hbase-2.1.0/build.sh
mvn package -Dhadoop.profile=3.0 -Dhadoop-three.version=3.1.1 -DskipTests assembly:single
Besides the base tools (nix, rsync, gettext), we export nix packages for the JDK, ZooKeeper, and Hadoop, and finally run the HBase build.
Step 2: deploy the cluster (deploy.sh)
[larluo@larluo-nixos:~/my-env]$ cat run.sh.d/hbase-example/deploy.sh
set -e
my=$(cd -P -- "$(dirname -- "${BASH_SOURCE-$0}")" > /dev/null && pwd -P) && cd $my/../..
echo -e "\n==== bash nix.sh create-user 192.168.56.101" && bash nix.sh create-user 192.168.56.101
echo -e "\n==== bash nix.sh create-user 192.168.56.102" && bash nix.sh create-user 192.168.56.102
echo -e "\n==== bash nix.sh create-user 192.168.56.103" && bash nix.sh create-user 192.168.56.103
echo -e "\n==== bash nix.sh install 192.168.56.101 tgz.nix-2.0.4" && bash nix.sh install 192.168.56.101 tgz.nix-2.0.4
echo -e "\n==== bash nix.sh install 192.168.56.102 tgz.nix-2.0.4" && bash nix.sh install 192.168.56.102 tgz.nix-2.0.4
echo -e "\n==== bash nix.sh install 192.168.56.103 tgz.nix-2.0.4" && bash nix.sh install 192.168.56.103 tgz.nix-2.0.4
echo -e "\n==== bash nix.sh install 192.168.56.101 nix.rsync-3.1.3" && bash nix.sh install 192.168.56.101 nix.rsync-3.1.3
echo -e "\n==== bash nix.sh install 192.168.56.102 nix.rsync-3.1.3" && bash nix.sh install 192.168.56.102 nix.rsync-3.1.3
echo -e "\n==== bash nix.sh install 192.168.56.103 nix.rsync-3.1.3" && bash nix.sh install 192.168.56.103 nix.rsync-3.1.3
echo -e "\n==== bash nix.sh install 192.168.56.101 nix.gettext-0.19.8.1" && bash nix.sh install 192.168.56.101 nix.gettext-0.19.8.1
echo -e "\n==== bash nix.sh install 192.168.56.102 nix.gettext-0.19.8.1" && bash nix.sh install 192.168.56.102 nix.gettext-0.19.8.1
echo -e "\n==== bash nix.sh install 192.168.56.103 nix.gettext-0.19.8.1" && bash nix.sh install 192.168.56.103 nix.gettext-0.19.8.1
echo -e "\n==== bash nix.sh install 192.168.56.101 nix.openjdk-8u172b11" && bash nix.sh install 192.168.56.101 nix.openjdk-8u172b11
echo -e "\n==== bash nix.sh install 192.168.56.102 nix.openjdk-8u172b11" && bash nix.sh install 192.168.56.102 nix.openjdk-8u172b11
echo -e "\n==== bash nix.sh install 192.168.56.103 nix.openjdk-8u172b11" && bash nix.sh install 192.168.56.103 nix.openjdk-8u172b11
echo -e "\n==== bash nix.sh install 192.168.56.101 nix.zookeeper-3.4.13" && bash nix.sh install 192.168.56.101 nix.zookeeper-3.4.13
echo -e "\n==== bash nix.sh install 192.168.56.102 nix.zookeeper-3.4.13" && bash nix.sh install 192.168.56.102 nix.zookeeper-3.4.13
echo -e "\n==== bash nix.sh install 192.168.56.103 nix.zookeeper-3.4.13" && bash nix.sh install 192.168.56.103 nix.zookeeper-3.4.13
echo -e "\n==== bash nix.sh install 192.168.56.101 nix.hadoop-3.1.1" && bash nix.sh install 192.168.56.101 nix.hadoop-3.1.1
echo -e "\n==== bash nix.sh install 192.168.56.102 nix.hadoop-3.1.1" && bash nix.sh install 192.168.56.102 nix.hadoop-3.1.1
echo -e "\n==== bash nix.sh install 192.168.56.103 nix.hadoop-3.1.1" && bash nix.sh install 192.168.56.103 nix.hadoop-3.1.1
echo -e "\n==== bash nix.sh import 192.168.56.101 tgz.hbase-2.1.0" && bash nix.sh import 192.168.56.101 tgz.hbase-2.1.0
echo -e "\n==== bash nix.sh import 192.168.56.102 tgz.hbase-2.1.0" && bash nix.sh import 192.168.56.102 tgz.hbase-2.1.0
echo -e "\n==== bash nix.sh import 192.168.56.103 tgz.hbase-2.1.0" && bash nix.sh import 192.168.56.103 tgz.hbase-2.1.0
This step mirrors the packaging step one-to-one, distributing each package to every server.
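The per-host repetition above can be collapsed into loops. Here is a sketch that echoes the same command sequence as a dry run (remove the leading `echo` to actually execute; the host and package lists are copied from the script):

```shell
# Hosts and packages, in the same order as deploy.sh
HOSTS="192.168.56.101 192.168.56.102 192.168.56.103"
PKGS="tgz.nix-2.0.4 nix.rsync-3.1.3 nix.gettext-0.19.8.1 nix.openjdk-8u172b11 nix.zookeeper-3.4.13 nix.hadoop-3.1.1"

# create the op user on every host
for h in $HOSTS; do echo bash nix.sh create-user "$h"; done
# install every package on every host (package-major order, as in the original)
for p in $PKGS; do
  for h in $HOSTS; do echo bash nix.sh install "$h" "$p"; done
done
# import the custom-built hbase tarball
for h in $HOSTS; do echo bash nix.sh import "$h" tgz.hbase-2.1.0; done
```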
Step 3: start the cluster (start.sh)
my=$(cd -P -- "$(dirname -- "${BASH_SOURCE-$0}")" > /dev/null && pwd -P) && cd $my/../..
# start zookeeper-3.4.13
export ZK_ALL="192.168.56.101:2181,192.168.56.102:2181,192.168.56.103:2181"
echo -e "\n==== bash nix.sh start 192.168.56.101:2181 zookeeper-3.4.13 --all ${ZK_ALL}" && bash nix.sh start 192.168.56.101:2181 zookeeper-3.4.13 --all ${ZK_ALL}
echo -e "\n==== bash nix.sh start 192.168.56.102:2181 zookeeper-3.4.13 --all ${ZK_ALL}" && bash nix.sh start 192.168.56.102:2181 zookeeper-3.4.13 --all ${ZK_ALL}
echo -e "\n==== bash nix.sh start 192.168.56.103:2181 zookeeper-3.4.13 --all ${ZK_ALL}" && bash nix.sh start 192.168.56.103:2181 zookeeper-3.4.13 --all ${ZK_ALL}
# start hadoop-3.1.1
echo -e "\n==== bash nix.sh start 192.168.56.101:9000 hadoop-3.1.1:namenode" && bash nix.sh start 192.168.56.101:9000 hadoop-3.1.1:namenode
export HDFS_MASTER="192.168.56.101:9000"
echo -e "\n==== bash nix.sh start 192.168.56.101:5200 hadoop-3.1.1:datanode" && bash nix.sh start 192.168.56.101:5200 hadoop-3.1.1:datanode --master ${HDFS_MASTER}
echo -e "\n==== bash nix.sh start 192.168.56.102:5200 hadoop-3.1.1:datanode" && bash nix.sh start 192.168.56.102:5200 hadoop-3.1.1:datanode --master ${HDFS_MASTER}
echo -e "\n==== bash nix.sh start 192.168.56.103:5200 hadoop-3.1.1:datanode" && bash nix.sh start 192.168.56.103:5200 hadoop-3.1.1:datanode --master ${HDFS_MASTER}
# start hbase-2.1.0
echo -e "\n==== bash nix.sh start 192.168.56.101:16010 hbase-2.1.0:master --zookeepers ${ZK_ALL} --hdfs.master ${HDFS_MASTER}"
bash nix.sh start 192.168.56.101:16010 hbase-2.1.0:master --zookeepers ${ZK_ALL} --hdfs.master ${HDFS_MASTER}
echo -e "\n==== bash nix.sh start 192.168.56.101:16030 hbase-2.1.0:regionserver --zookeepers ${ZK_ALL} --hdfs.master ${HDFS_MASTER}"
bash nix.sh start 192.168.56.101:16030 hbase-2.1.0:regionserver --zookeepers ${ZK_ALL} --hdfs.master ${HDFS_MASTER}
echo -e "\n==== bash nix.sh start 192.168.56.102:16030 hbase-2.1.0:regionserver --zookeepers ${ZK_ALL} --hdfs.master ${HDFS_MASTER}"
bash nix.sh start 192.168.56.102:16030 hbase-2.1.0:regionserver --zookeepers ${ZK_ALL} --hdfs.master ${HDFS_MASTER}
echo -e "\n==== bash nix.sh start 192.168.56.103:16030 hbase-2.1.0:regionserver --zookeepers ${ZK_ALL} --hdfs.master ${HDFS_MASTER}"
bash nix.sh start 192.168.56.103:16030 hbase-2.1.0:regionserver --zookeepers ${ZK_ALL} --hdfs.master ${HDFS_MASTER}
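The datanodes depend on the namenode and HBase depends on both ZooKeeper and HDFS, so if `nix.sh start` returns before a daemon is actually listening, a small readiness check between the stages can help. `wait_for_port` below is a hypothetical helper, not part of nix.sh; it uses bash's built-in `/dev/tcp` redirection:

```shell
# Poll until host:port accepts a TCP connection, or give up after $tries attempts (1s apart).
wait_for_port() {
  local host=$1 port=$2 tries=${3:-30} i
  for ((i = 0; i < tries; i++)); do
    # bash opens /dev/tcp/HOST/PORT as a TCP connection; close fd 3 again on success
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# e.g. block until the namenode RPC port is up before starting the datanodes:
# wait_for_port 192.168.56.101 9000 60 && echo "namenode ready"
```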
Step 4: test the cluster
- Test ZooKeeper
[larluo@larluo-nixos:~/my-env]$ echo ruok | nc 192.168.56.101 2181
imok
[larluo@larluo-nixos:~/my-env]$ echo ruok | nc 192.168.56.102 2181
imok
[larluo@larluo-nixos:~/my-env]$ echo ruok | nc 192.168.56.103 2181
imok
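The three checks above can be wrapped in a loop. A small sketch, assuming `nc` is installed and ZooKeeper's four-letter-word commands are enabled (the default in 3.4):

```shell
# zk_ok HOST PORT: succeed only if the node answers "imok" to the "ruok" probe
zk_ok() {
  [ "$(echo ruok | nc -w 2 "$1" "$2" 2>/dev/null)" = "imok" ]
}

for h in 192.168.56.101 192.168.56.102 192.168.56.103; do
  if zk_ok "$h" 2181; then echo "$h ok"; else echo "$h DOWN"; fi
done
```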
- Test HDFS
[larluo@larluo-nixos:~/my-env]$ su - op -c "/nix/store/p7wlb2b81dsw2kqjxnsrq4s62i8nn6xi-hadoop-3.1.1/bin/hdfs dfsadmin -fs hdfs://192.168.56.101:9000 -report"
Password:
WARNING: HADOOP_PREFIX has been replaced by HADOOP_HOME. Using value of HADOOP_PREFIX.
Configured Capacity: 31499022336 (29.34 GB)
Present Capacity: 15335188603 (14.28 GB)
DFS Remaining: 15334603519 (14.28 GB)
DFS Used: 585084 (571.37 KB)
DFS Used%: 0.00%
Replicated Blocks:
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
Erasure Coded Block Groups:
Low redundancy block groups: 0
Block groups with corrupt internal blocks: 0
Missing block groups: 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 192.168.56.101:9866 (192.168.56.101)
Hostname: 192.168.56.101
Decommission Status : Normal
Configured Capacity: 10499674112 (9.78 GB)
DFS Used: 175484 (171.37 KB)
Non DFS Used: 4045742724 (3.77 GB)
DFS Remaining: 5094927957 (4.75 GB)
DFS Used%: 0.00%
DFS Remaining%: 48.52%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 13
Last contact: Sun Aug 26 14:37:35 CST 2018
Last Block Report: Sun Aug 26 14:15:08 CST 2018
Num of Blocks: 11
Name: 192.168.56.102:9866 (192.168.56.102)
Hostname: 192.168.56.102
Decommission Status : Normal
Configured Capacity: 10499674112 (9.78 GB)
DFS Used: 204800 (200 KB)
Non DFS Used: 4016508928 (3.74 GB)
DFS Remaining: 5124132437 (4.77 GB)
DFS Used%: 0.00%
DFS Remaining%: 48.80%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 13
Last contact: Sun Aug 26 14:37:36 CST 2018
Last Block Report: Sun Aug 26 14:16:41 CST 2018
Num of Blocks: 11
Name: 192.168.56.103:9866 (192.168.56.103)
Hostname: 192.168.56.103
Decommission Status : Normal
Configured Capacity: 10499674112 (9.78 GB)
DFS Used: 204800 (200 KB)
Non DFS Used: 4025098240 (3.75 GB)
DFS Remaining: 5115543125 (4.76 GB)
DFS Used%: 0.00%
DFS Remaining%: 48.72%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 13
Last contact: Sun Aug 26 14:37:35 CST 2018
Last Block Report: Sun Aug 26 14:17:31 CST 2018
Num of Blocks: 11
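To assert on this report in a script rather than read it by eye, the live-datanode count can be extracted from the `Live datanodes (N):` line. A sketch (the sample line is hardcoded here; in practice pipe the `hdfs dfsadmin -report` output in):

```shell
# Pull N out of a "Live datanodes (N):" line from dfsadmin -report
live_count() {
  sed -n 's/.*Live datanodes (\([0-9][0-9]*\)).*/\1/p'
}

echo "Live datanodes (3):" | live_count    # prints 3
n=$(echo "Live datanodes (3):" | live_count)
[ "$n" -eq 3 ] && echo "all datanodes up"
```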
- Test HBase
hbase(main):001:0> create 'test', 'cf'
Created table test
Took 3.2068 seconds
=> Hbase::Table - test
hbase(main):002:0> list 'test'
TABLE
test
1 row(s)
Took 0.0540 seconds
=> ["test"]
hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1'
Took 0.2722 seconds
hbase(main):004:0> put 'test', 'row2', 'cf:b', 'value2'
Took 0.0136 seconds
hbase(main):005:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1535266220041, value=value1
row2 column=cf:b, timestamp=1535266224174, value=value2
2 row(s)
Took 0.1010 seconds
hbase(main):008:0>