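Note: Many readers have told me that they want to build a Hadoop platform from open-source components and then deploy Kylin on it, but ran into all kinds of problems. Here I deploy a complete environment for readers to use as a reference. If you still have problems, feel free to get in touch.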
System environment and component versions
Linux operating system:
cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
JDK version:
java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
Hadoop component versions:
Hive: apache-hive-1.2.1-bin
Hadoop: hadoop-2.7.2
HBase: hbase-1.1.9-bin
Zookeeper: zookeeper-3.4.6
Kylin version:
apache-kylin-1.5.4.1-hbase1.x-bin
The three nodes and their installed components (for testing only):
192.168.1.129 ldvl-kyli-a01 ldvl-kyli-a01.idc.dream.com
192.168.1.130 ldvl-kyli-a02 ldvl-kyli-a02.idc.dream.com
192.168.1.131 ldvl-kyli-a03 ldvl-kyli-a03.idc.dream.com
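The configuration files later in this post refer to the nodes by hostname, so each node has to be able to resolve these names, typically through /etc/hosts. As a minimal sketch (the function name missing_host_entries is my own, not part of any tool), the following prints whichever of the three entries a given hosts file still lacks:

```shell
#!/usr/bin/env bash
# Print the cluster host entries that are not yet present in a hosts file.
# The file argument is an assumption for illustration; on a real node it
# would normally be /etc/hosts, edited as root.
missing_host_entries() {
    local hosts_file="$1"
    local entry name
    while IFS= read -r entry; do
        # The short hostname is the second whitespace-separated field.
        name=$(printf '%s\n' "$entry" | awk '{print $2}')
        if ! grep -qw -- "$name" "$hosts_file" 2>/dev/null; then
            printf '%s\n' "$entry"
        fi
    done <<'EOF'
192.168.1.129 ldvl-kyli-a01 ldvl-kyli-a01.idc.dream.com
192.168.1.130 ldvl-kyli-a02 ldvl-kyli-a02.idc.dream.com
192.168.1.131 ldvl-kyli-a03 ldvl-kyli-a03.idc.dream.com
EOF
}
```

Running missing_host_entries /etc/hosts on each node shows what still needs to be added. I would review the output before appending it to the file.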
Base component deployment
JDK setup (all 3 nodes)
Install from the RPM package:
rpm -ivh jdk-8u111-linux-x64.rpm
Configure the environment variables:
vi /etc/profile
export JAVA_HOME=/usr/java/default
export JRE_HOME=/usr/java/default/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
source /etc/profile
Verify:
java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
Zookeeper setup (all 3 nodes)
Install:
tar -zxvf zookeeper-3.4.6.tar.gz -C /usr/local/
cd /usr/local/
ln -s zookeeper-3.4.6 zookeeper
Create the data and log directories:
mkdir /usr/local/zookeeper/zkdata
mkdir /usr/local/zookeeper/zkdatalog
Configure the Zookeeper parameters:
cd /usr/local/zookeeper/conf
cp zoo_sample.cfg zoo.cfg
The modified configuration file looks like this:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/usr/local/zookeeper/zkdata
dataLogDir=/usr/local/zookeeper/zkdatalog
clientPort=2181
server.1=ldvl-kyli-a01:2888:3888
server.2=ldvl-kyli-a02:2888:3888
server.3=ldvl-kyli-a03:2888:3888
Create myid:
cd /usr/local/zookeeper/zkdata
echo 1 > myid # on each node, write the id that matches its server.x entry in the configuration above
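Writing the wrong id on a node is an easy mistake to make by hand. As a small sketch, the id can be derived from zoo.cfg instead. The helper name myid_for_host is hypothetical, and I assume the hostname passed in matches the one used in the server.x lines:

```shell
#!/usr/bin/env bash
# Look up the numeric id N from the "server.N=host:2888:3888" line that
# matches the given hostname. Prints nothing and returns 1 if no line matches.
myid_for_host() {
    local cfg="$1" host="$2" id
    id=$(sed -n "s/^server\.\([0-9][0-9]*\)=${host}:.*/\1/p" "$cfg" | head -n 1)
    [ -n "$id" ] && printf '%s\n' "$id"
}
```

On a node this could be used as myid_for_host /usr/local/zookeeper/conf/zoo.cfg "$(hostname -s)" > /usr/local/zookeeper/zkdata/myid, provided hostname -s returns the short name used in zoo.cfg.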
Start Zookeeper:
zkServer.sh start
Check the status:
Node 192.168.1.129:
zkServer.sh status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower
Node 192.168.1.130:
zkServer.sh status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: leader
Node 192.168.1.131:
zkServer.sh status
JMX enabled by default
Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
Mode: follower
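A healthy three-node ensemble should report exactly one leader and two followers, as in the output above. The following sketch checks that from the collected status output. The function name quorum_is_healthy is my own, and it only looks at the "Mode:" lines:

```shell
#!/usr/bin/env bash
# Read the combined output of `zkServer.sh status` from several nodes on stdin.
# Succeeds only if there is exactly one leader and (expected_nodes - 1) followers.
quorum_is_healthy() {
    local expected_nodes="$1" modes leaders followers
    modes=$(grep '^Mode:' | awk '{print $2}')
    leaders=$(printf '%s\n' "$modes" | grep -c '^leader$')
    followers=$(printf '%s\n' "$modes" | grep -c '^follower$')
    [ "$leaders" -eq 1 ] && [ "$followers" -eq $((expected_nodes - 1)) ]
}

# Possible usage, assuming passwordless SSH to each node:
#   for h in ldvl-kyli-a01 ldvl-kyli-a02 ldvl-kyli-a03; do
#       ssh -n "$h" /usr/local/zookeeper/bin/zkServer.sh status
#   done | quorum_is_healthy 3 && echo "quorum OK"
```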
MariaDB database
Install:
yum install MariaDB-server MariaDB-client
Start:
systemctl start mariadb
Set the root password, harden security, and so on:
mysql_secure_installation
Disable the firewall:
systemctl disable firewalld
systemctl stop firewalld
SELinux also needs to be disabled: edit /etc/selinux/config and change SELINUX=enforcing to SELINUX=disabled.
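The same edit can be scripted so that it is identical on all three nodes. This is only a sketch: disable_selinux_in is a name I made up, and it takes the file path as an argument so it can be tried on a copy first.

```shell
#!/usr/bin/env bash
# Rewrite SELINUX=enforcing to SELINUX=disabled in the given config file.
# The result is built in a temporary file and written back only if sed succeeds,
# so the original file keeps its ownership and permissions.
disable_selinux_in() {
    local cfg="$1" tmp
    tmp=$(mktemp) || return 1
    if sed 's/^SELINUX=enforcing[[:space:]]*$/SELINUX=disabled/' "$cfg" > "$tmp"; then
        cat "$tmp" > "$cfg"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

As root, disable_selinux_in /etc/selinux/config makes the change, which takes effect after a reboot. Running setenforce 0 switches the current boot to permissive mode in the meantime.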
Keep the clocks of the three nodes synchronized
This can be set up with the NTP service.
Hadoop component deployment
Hadoop
Create the group and user:
groupadd hadoop
useradd -s /bin/bash -d /app/hadoop -m hadoop -g hadoop
passwd hadoop
All of the following operations are performed as the hadoop user.
Switch to the hadoop user and set up SSH trust between the nodes:
ssh-keygen -t rsa
ssh-copy-id -p 22 hadoop@192.168.1.129
ssh-copy-id -p 22 hadoop@192.168.1.130
ssh-copy-id -p 22 hadoop@192.168.1.131
Extract the archive:
$ tar -zxvf hadoop-2.7.2.tar.gz
Create a symbolic link:
$ ln -s hadoop-2.7.2 hadoop
Configure:
$ cd /app/hadoop/hadoop/etc/hadoop
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ldvl-kyli-a01:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/app/hadoop/hadoop/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131702</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/app/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/app/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.http.address</name>
<value>ldvl-kyli-a01:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>ldvl-kyli-a01:50090</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
<description>HDFS blocksize of 256MB for large file-systems.</description>
</property>
<property>
<name>dfs.datanode.max.xcievers</name>
<value>4096</value>
</property>
</configuration>
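The dfs.blocksize value in hdfs-site.xml is given in bytes. A quick shell check of the arithmetic behind the 256MB figure (mb_to_bytes is a helper name I made up for illustration):

```shell
#!/usr/bin/env bash
# Convert a size in megabytes to bytes (1 MB = 1024 * 1024 bytes here).
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

mb_to_bytes 256   # prints 268435456
```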
yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>ldvl-kyli-a01:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>ldvl-kyli-a01:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>ldvl-kyli-a01:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>ldvl-kyli-a01:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>ldvl-kyli-a01:8088</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>Configuration to enable or disable log aggregation. Shuffle service that needs to be set for Map Reduce applications.</description>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>ldvl-kyli-a01:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>ldvl-kyli-a01:19888</value>
</property>
</configuration>
slaves:
ldvl-kyli-a01
ldvl-kyli-a02
ldvl-kyli-a03
hadoop-env.sh, mapred-env.sh, and yarn-env.sh:
export JAVA_HOME=/usr/java/default
Environment variable configuration (I configure the variables for all components here, so I will not repeat this for each component later):
$ cat .bashrc
export JAVA_HOME=/usr/java/default
export JRE_HOME=/usr/java/default/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export HIVE_HOME=/app/hadoop/hive
export HADOOP_HOME=/app/hadoop/hadoop
export HBASE_HOME=/app/hadoop/hbase
# added by HCAT
export HCAT_HOME=/app/hadoop/hive/hcatalog
# added by Kylin
export KYLIN_HOME=/app/hadoop/kylin
export KYLIN_CONF=/app/hadoop/kylin/conf
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$HBASE_HOME/bin:${KYLIN_HOME}/bin:$PATH
Create the HDFS data directories:
$ mkdir -p /app/hadoop/hdfs/data
$ mkdir -p /app/hadoop/hdfs/name
$ mkdir -p /app/hadoop/tmp
Once all of the Hadoop configuration above is complete, you can copy everything to the other nodes.
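One way to do the copy is with rsync over the SSH trust set up earlier. This is a sketch under my own assumptions: rsync is installed on every node, the paths are the ones used in this post, and sync_hadoop_to_nodes is a hypothetical helper. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
# Print (DRY_RUN=1, the default) or run the rsync commands that copy the
# Hadoop directory, its symlink, and .bashrc from this node to the given nodes.
# rsync -a copies the /app/hadoop/hadoop symlink as a symlink.
sync_hadoop_to_nodes() {
    local node
    for node in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "rsync -a /app/hadoop/hadoop-2.7.2 /app/hadoop/hadoop /app/hadoop/.bashrc hadoop@${node}:/app/hadoop/"
        else
            rsync -a /app/hadoop/hadoop-2.7.2 /app/hadoop/hadoop /app/hadoop/.bashrc "hadoop@${node}:/app/hadoop/"
        fi
    done
}

# Dry run for the two remaining nodes:
sync_hadoop_to_nodes ldvl-kyli-a02 ldvl-kyli-a03
```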
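Format HDFS and start the services:
$ hdfs namenode -format
$ start-dfs.sh
$ start-yarn.sh
$ mr-jobhistory-daemon.sh start historyserver
Then verify the installation, for example by checking the processes with jps, visiting the HDFS and YARN web pages, and running the wordcount test program.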
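The jps check can be made less error-prone with a small helper. Going by the configuration in this post, I would expect NameNode, SecondaryNameNode, ResourceManager and JobHistoryServer on ldvl-kyli-a01, plus DataNode and NodeManager on all three nodes. The function name missing_daemons is my own:

```shell
#!/usr/bin/env bash
# Read `jps` output on stdin and report every expected daemon that is absent.
# Returns 0 only when nothing is missing.
missing_daemons() {
    local jps_output missing daemon
    jps_output=$(cat)
    missing=""
    for daemon in "$@"; do
        # Match the whole process name so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx -- "$daemon"; then
            missing="$missing $daemon"
        fi
    done
    [ -z "$missing" ] && return 0
    echo "missing:$missing"
    return 1
}

# Possible usage on ldvl-kyli-a01:
#   jps | missing_daemons NameNode SecondaryNameNode DataNode \
#       ResourceManager NodeManager JobHistoryServer
```

For the wordcount test, Hadoop 2.7.2 ships an examples jar at $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar, which can be run with hadoop jar and the wordcount argument followed by an HDFS input and output path of your choice.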
Hive component deployment
Install:
$ tar -zxvf apache-hive-1.2.1-bin.tar.gz
$ ln -s apache-hive-1.2.1-bin hive
Configure:
$ cd /app/hadoop/hive/conf
hive-env.sh:
export HIVE_HOME=/app/hadoop/hive
HADOOP_HOME=/app/hadoop/hadoop
export HIVE_CONF_DIR=/app/hadoop/hive/conf
hive-site.xml:
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://ldvl-kyli-a01:9000/user/hive/warehouse</value>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>hdfs://ldvl-kyli-a01:9000/user/hive/scratchdir</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://ldvl-kyli-a01:3306/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<property>
<name>hive.metastore.local</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://ldvl-kyli-a01:9083</value>
</property>
</configuration>
hive-log4j.properties:
hive.log.dir=/app/hadoop/hive/log
hive.log.file=hive.log
Place mysql-connector-java-5.1.38-bin.jar in Hive's lib directory:
$ cp mysql-connector-java-5.1.38-bin.jar /app/hadoop/hive/lib/
Create the Hive metastore database:
MariaDB [(none)]> create database metastore character set latin1;
grant all on metastore.* to hive@"%" identified by "123456" with grant option;
flush privileges;
Start the service:
nohup hive --service metastore -v &
$ tailf nohup.out
Starting Hive Metastore Server
17/03/16 14:10:29 WARN conf.HiveConf: HiveConf of name hive.metastore.local does not exist
Starting hive metastore on port 9083
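The log line only says that the metastore is starting. To confirm that something is actually listening on port 9083, a simple TCP probe can help. The sketch below assumes bash and the coreutils timeout command are available, and port_open is a name I chose:

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device, so the probe runs in a bash child.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Possible usage:
#   port_open ldvl-kyli-a01 9083 && echo "metastore is listening"
```

The same probe can be pointed at the other ports configured in this post, such as 16010 for the HBase master web UI.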
HBase component deployment
Install:
$ tar -zxvf hbase-1.1.9-bin.tar.gz
$ ln -s hbase-1.1.9 hbase
Configure:
hbase-site.xml:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ldvl-kyli-a01:9000/hbaseforkylin</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master.port</name>
<value>16000</value>
</property>
<property>
<name>hbase.master.info.port</name>
<value>16010</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>ldvl-kyli-a01,ldvl-kyli-a02,ldvl-kyli-a03</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/usr/local/zookeeper/zkdata</value>
</property>
</configuration>
regionservers:
ldvl-kyli-a02
ldvl-kyli-a03
hbase-env.sh:
export JAVA_HOME=/usr/java/latest
export HBASE_OPTS="-Xmx268435456 -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -Djava.net.preferIPv4Stack=true $HBASE_OPTS"
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m"
export HBASE_LOG_DIR=${HBASE_HOME}/logs
export HBASE_PID_DIR=${HBASE_HOME}/logs
export HBASE_MANAGES_ZK=false
If the log directory does not exist, create it in advance.
Start the HBase service:
$ start-hbase.sh
Kylin deployment (installed only on the first node, for testing only)
Install:
$ tar -zxvf apache-kylin-1.5.4.1-hbase1.x-bin.tar.gz
$ ln -s apache-kylin-1.5.4.1-hbase1.x-bin kylin
Configure:
$ cd kylin/conf/
kylin.properties (mostly default values):
kylin.server.mode=all
kylin.rest.servers=192.168.1.129:7070
kylin.rest.timezone=GMT+8
kylin.hive.client=cli
kylin.hive.keep.flat.table=false
kylin.metadata.url=kylin_metadata@hbase
kylin.storage.url=hbase
kylin.storage.cleanup.time.threshold=172800000
kylin.hdfs.working.dir=/kylin
kylin.hbase.default.compression.codec=none
kylin.hbase.region.cut=5
kylin.hbase.hfile.size.gb=2
kylin.hbase.region.count.min=1
kylin.hbase.region.count.max=50
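It is worth reading the values back from the file before starting Kylin, since a mistyped key is easy to miss by eye. A minimal sketch for simple key=value lines (get_property is a hypothetical helper, not a Kylin tool):

```shell
#!/usr/bin/env bash
# Print the value of a key from a Java-style .properties file.
# Handles plain key=value lines only; comment lines starting with # are skipped.
get_property() {
    local file="$1" key="$2"
    awk -F= -v k="$key" '
        /^[ \t]*#/ { next }
        $1 == k { sub(/^[^=]*=/, ""); print; exit }
    ' "$file"
}

# Possible usage:
#   get_property "$KYLIN_HOME/conf/kylin.properties" kylin.server.mode
#   ms=$(get_property "$KYLIN_HOME/conf/kylin.properties" kylin.storage.cleanup.time.threshold)
#   echo "$(( ms / 86400000 )) days"
```

With the value shown above, 172800000 milliseconds works out to 2 days.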
Environment variable configuration:
$ cat .bashrc
export JAVA_HOME=/usr/java/default
export JRE_HOME=/usr/java/default/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export HIVE_HOME=/app/hadoop/hive
export HADOOP_HOME=/app/hadoop/hadoop
export HBASE_HOME=/app/hadoop/hbase
# added by HCAT
export HCAT_HOME=/app/hadoop/hive/hcatalog
# added by Kylin
export KYLIN_HOME=/app/hadoop/kylin
export KYLIN_CONF=/app/hadoop/kylin/conf
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$HBASE_HOME/bin:${KYLIN_HOME}/bin:$PATH
Check the environment that Kylin depends on:
$ ${KYLIN_HOME}/bin/check-env.sh
KYLIN_HOME is set to /app/hadoop/kylin
$ kylin/bin/find-hbase-dependency.sh
hbase dependency: /app/hadoop/hbase/lib/hbase-common-1.1.9.jar
$ kylin/bin/find-hive-dependency.sh
Logging initialized using configuration in file:/app/hadoop/apache-hive-1.2.1-bin/conf/hive-log4j.properties
HCAT_HOME is set to: /app/hadoop/hive/hcatalog, use it to find hcatalog path:
hive dependency:/app/hadoop/hive/conf:/app/hadoop/hive/lib/jcommander-1.32.jar:/app/hadoop/hive/lib/stringtemplate-3.2.1.jar:/app/hadoop/hive/lib/hive-shims-0.23-1.2.1.jar:/app/hadoop/hive/lib/hive-jdbc-1.2.1-standalone.jar:/app/hadoop/hive/lib/hamcrest-core-1.1.jar:/app/hadoop/hive/lib/commons-compress-1.4.1.jar:/app/hadoop/hive/lib/xz-1.0.jar:/app/hadoop/hive/lib/hive-common-1.2.1.jar:/app/hadoop/hive/lib/guava-14.0.1.jar:/app/hadoop/hive/lib/commons-collections-3.2.1.jar:/app/hadoop/hive/lib/jta-1.1.jar:/app/hadoop/hive/lib/antlr-2.7.7.jar:/app/hadoop/hive/lib/maven-scm-provider-svn-commons-1.4.jar:/app/hadoop/hive/lib/hive-metastore-1.2.1.jar:/app/hadoop/hive/lib/hive-jdbc-1.2.1.jar:/app/hadoop/hive/lib/commons-httpclient-3.0.1.jar:/app/hadoop/hive/lib/ivy-2.4.0.jar:/app/hadoop/hive/lib/geronimo-annotation_1.0_spec-1.1.1.jar:/app/hadoop/hive/lib/commons-pool-1.5.4.jar:/app/hadoop/hive/lib/maven-scm-api-1.4.jar:/app/hadoop/hive/lib/mysql-connector-java-5.1.38-bin.jar:/app/hadoop/hive/lib/commons-configuration-1.6.jar:/app/hadoop/hive/lib/accumulo-start-1.6.0.jar:/app/hadoop/hive/lib/asm-commons-3.1.jar:/app/hadoop/hive/lib/libfb303-0.9.2.jar:/app/hadoop/hive/lib/commons-dbcp-1.4.jar:/app/hadoop/hive/lib/log4j-1.2.16.jar:/app/hadoop/hive/lib/hive-shims-common-1.2.1.jar:/app/hadoop/hive/lib/junit-4.11.jar:/app/hadoop/hive/lib/antlr-runtime-3.4.jar:/app/hadoop/hive/lib/commons-cli-1.2.jar:/app/hadoop/hive/lib/commons-logging-1.1.3.jar:/app/hadoop/hive/lib/ant-1.9.1.jar:/app/hadoop/hive/lib/hive-contrib-1.2.1.jar:/app/hadoop/hive/lib/httpcore-4.4.jar:/app/hadoop/hive/lib/datanucleus-api-jdo-3.2.6.jar:/app/hadoop/hive/lib/commons-beanutils-1.7.0.jar:/app/hadoop/hive/lib/curator-recipes-2.6.0.jar:/app/hadoop/hive/lib/netty-3.7.0.Final.jar:/app/hadoop/hive/lib/accumulo-trace-1.6.0.jar:/app/hadoop/hive/lib/jetty-all-server-7.6.0.v20120127.jar:/app/hadoop/hive/lib/servlet-api-2.5.jar:/app/hadoop/hive/lib/curator-client-2.6.0.jar:/app/hadoop/hive/lib/hive-shims-scheduler-1
.2.1.jar:/app/hadoop/hive/lib/commons-lang-2.6.jar:/app/hadoop/hive/lib/geronimo-jaspic_1.0_spec-1.0.jar:/app/hadoop/hive/lib/curator-framework-2.6.0.jar:/app/hadoop/hive/lib/asm-tree-3.1.jar:/app/hadoop/hive/lib/hive-beeline-1.2.1.jar:/app/hadoop/hive/lib/velocity-1.5.jar:/app/hadoop/hive/lib/maven-scm-provider-svnexe-1.4.jar:/app/hadoop/hive/lib/commons-io-2.4.jar:/app/hadoop/hive/lib/ant-launcher-1.9.1.jar:/app/hadoop/hive/lib/mail-1.4.1.jar:/app/hadoop/hive/lib/accumulo-core-1.6.0.jar:/app/hadoop/hive/lib/geronimo-jta_1.1_spec-1.1.1.jar:/app/hadoop/hive/lib/oro-2.0.8.jar:/app/hadoop/hive/lib/eigenbase-properties-1.1.5.jar:/app/hadoop/hive/lib/commons-math-2.1.jar:/app/hadoop/hive/lib/apache-log4j-extras-1.2.17.jar:/app/hadoop/hive/lib/commons-compiler-2.7.6.jar:/app/hadoop/hive/lib/commons-digester-1.8.jar:/app/hadoop/hive/lib/ST4-4.0.4.jar:/app/hadoop/hive/lib/parquet-hadoop-bundle-1.6.0.jar:/app/hadoop/hive/lib/datanucleus-core-3.2.10.jar:/app/hadoop/hive/lib/json-20090211.jar:/app/hadoop/hive/lib/bonecp-0.8.0.RELEASE.jar:/app/hadoop/hive/lib/hive-service-1.2.1.jar:/app/hadoop/hive/lib/snappy-java-1.0.5.jar:/app/hadoop/hive/lib/stax-api-1.0.1.jar:/app/hadoop/hive/lib/jetty-all-7.6.0.v20120127.jar:/app/hadoop/hive/lib/jline-2.12.jar:/app/hadoop/hive/lib/libthrift-0.9.2.jar:/app/hadoop/hive/lib/hive-testutils-1.2.1.jar:/app/hadoop/hive/lib/accumulo-fate-1.6.0.jar:/app/hadoop/hive/lib/hive-cli-1.2.1.jar:/app/hadoop/hive/lib/hive-accumulo-handler-1.2.1.jar:/app/hadoop/hive/lib/jpam-1.1.jar:/app/hadoop/hive/lib/groovy-all-2.1.6.jar:/app/hadoop/hive/lib/httpclient-4.4.jar:/app/hadoop/hive/lib/avro-1.7.5.jar:/app/hadoop/hive/lib/zookeeper-3.4.6.jar:/app/hadoop/hive/lib/hive-hwi-1.2.1.jar:/app/hadoop/hive/lib/hive-exec-1.2.1.jar:/app/hadoop/hive/lib/hive-shims-0.20S-1.2.1.jar:/app/hadoop/hive/lib/super-csv-2.2.0.jar:/app/hadoop/hive/lib/opencsv-2.3.jar:/app/hadoop/hive/lib/commons-vfs2-2.0.jar:/app/hadoop/hive/lib/hive-serde-1.2.1.jar:/app/hadoop/hive/lib/commons-bean
utils-core-1.8.0.jar:/app/hadoop/hive/lib/derby-10.10.2.0.jar:/app/hadoop/hive/lib/plexus-utils-1.5.6.jar:/app/hadoop/hive/lib/datanucleus-rdbms-3.2.9.jar:/app/hadoop/hive/lib/jdo-api-3.0.1.jar:/app/hadoop/hive/lib/joda-time-2.5.jar:/app/hadoop/hive/lib/activation-1.1.jar:/app/hadoop/hive/lib/janino-2.7.6.jar:/app/hadoop/hive/lib/regexp-1.3.jar:/app/hadoop/hive/lib/hive-shims-1.2.1.jar:/app/hadoop/hive/lib/paranamer-2.3.jar:/app/hadoop/hive/lib/hive-hbase-handler-1.2.1.jar:/app/hadoop/hive/lib/tempus-fugit-1.1.jar:/app/hadoop/hive/lib/commons-codec-1.4.jar:/app/hadoop/hive/lib/hive-ant-1.2.1.jar:/app/hadoop/hive/lib/jsr305-3.0.0.jar:/app/hadoop/hive/lib/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar:/app/hadoop/hive/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar
The environment check passes, so start the Kylin service:
kylin.sh start
Import the sample data:
$ sample.sh
Then reload the metadata from the Kylin web page and build the Cube. After that it can be queried:
Query:
select part_dt, sum(price) as total_selled, count(distinct seller_id) as sellers from kylin_sales group by part_dt order by part_dt
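Besides the web page, a query like this can also be sent to Kylin's REST query endpoint. The sketch below rests on several assumptions that are not from this post: the default ADMIN/KYLIN account is still in place, the sample data lives in a project named learn_kylin, and the helper names are my own. The payload builder does no JSON escaping, so the SQL must not contain double quotes or backslashes.

```shell
#!/usr/bin/env bash
# Build the JSON body for a Kylin query request.
# $1 = SQL text, $2 = project name, $3 = row limit (defaults to 100).
build_query_payload() {
    printf '{"sql":"%s","project":"%s","offset":0,"limit":%s}' "$1" "$2" "${3:-100}"
}

# POST the query to the Kylin server with HTTP basic authentication.
# KYLIN_HOST, KYLIN_USER and KYLIN_PASSWORD are optional overrides.
run_kylin_query() {
    curl -s -X POST \
        -u "${KYLIN_USER:-ADMIN}:${KYLIN_PASSWORD:-KYLIN}" \
        -H 'Content-Type: application/json' \
        -d "$(build_query_payload "$1" "${2:-learn_kylin}")" \
        "http://${KYLIN_HOST:-192.168.1.129:7070}/kylin/api/query"
}

# Possible usage:
#   run_kylin_query 'select part_dt, sum(price) as total_selled from kylin_sales group by part_dt order by part_dt'
```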
Reposted from http://blog.csdn.net/jiangshouzhuang/article/details/64151586