I've been digging into big data lately, so here's a record of how I built a Docker container cluster and used Ansible to set up a Hadoop platform on it. It largely follows a fairly thorough article I found online, but I hit quite a few pitfalls along the way, so I've reworked that article, adding and removing steps, so that at the very least this Hadoop cluster can be reproduced reliably.
Basic Environment
CentOS 7 ; DaoCloud accelerator ; 163 yum mirror ; a single VM or cloud host
No need to sacrifice a programmer to the heavens, but you do need the resolve to see this through to the end.
Cluster Architecture
The cluster consists of four containers created with Docker; no extra virtual machines are needed.
OS | hostname | IP
---|---|---
CentOS 7 | cluster-master | 172.18.0.2
CentOS 7 | cluster-slave1 | 172.18.0.3
CentOS 7 | cluster-slave2 | 172.18.0.4
CentOS 7 | cluster-slave3 | 172.18.0.5
Installing Docker
For installing Docker, see my earlier article on quickly installing Docker and switching registry mirrors:
curl -sSL https://get.daocloud.io/docker | sh
##Switch to a registry mirror
###For reference, see http://www.reibang.com/p/34d3b4568059
curl -sSL https://get.daocloud.io/daotools/set_mirror.sh | sh -s http://67e93489.m.daocloud.io
systemctl restart docker
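Not part of the original steps, but a quick way to confirm the daemon restarted cleanly and the mirror took effect (Docker versions of this era list it under "Registry Mirrors" in docker info):

```bash
# Check the Docker daemon is up
systemctl status docker --no-pager

# Look for the configured mirror (the field name can vary slightly by Docker version)
docker info | grep -A 1 -i "registry mirrors"
```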
Pulling the centos:latest Image
At the time of writing, the latest tag pointed to CentOS 7.4.
docker pull daocloud.io/library/centos:latest
Once the pull completes, check that the image downloaded successfully with:
docker images
Creating the Containers
Per the cluster architecture, the containers need fixed IPs, so first create a fixed-IP subnet in Docker with:
docker network create --subnet=172.18.0.0/16 netgroup
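A quick sanity check of the new subnet (my addition, not in the original):

```bash
# Confirm the network exists and carries the 172.18.0.0/16 subnet
docker network ls
docker network inspect netgroup | grep -i subnet
```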
With the subnet in place, you can create the fixed-IP containers:
#cluster-master
docker run -d --privileged -ti -v /sys/fs/cgroup:/sys/fs/cgroup --name cluster-master -h cluster-master --net netgroup --ip 172.18.0.2 daocloud.io/library/centos /usr/sbin/init
#cluster-slaves
docker run -d --privileged -ti -v /sys/fs/cgroup:/sys/fs/cgroup --name cluster-slave1 -h cluster-slave1 --net netgroup --ip 172.18.0.3 daocloud.io/library/centos /usr/sbin/init
docker run -d --privileged -ti -v /sys/fs/cgroup:/sys/fs/cgroup --name cluster-slave2 -h cluster-slave2 --net netgroup --ip 172.18.0.4 daocloud.io/library/centos /usr/sbin/init
docker run -d --privileged -ti -v /sys/fs/cgroup:/sys/fs/cgroup --name cluster-slave3 -h cluster-slave3 --net netgroup --ip 172.18.0.5 daocloud.io/library/centos /usr/sbin/init
The original author helpfully notes that containers created the simple way on CentOS 7 run into sshd startup failures; hence the --privileged flag, the -v /sys/fs/cgroup:/sys/fs/cgroup mount, and running /usr/sbin/init as the container's entry command.
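To confirm all four containers are up on their assigned addresses, something like this works (my addition; the template path assumes the netgroup network created above):

```bash
# Show running containers
docker ps

# Print each container's IP on the netgroup network
for c in cluster-master cluster-slave1 cluster-slave2 cluster-slave3; do
    echo -n "$c: "
    docker inspect -f '{{.NetworkSettings.Networks.netgroup.IPAddress}}' "$c"
done
```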
Deploying OpenSSH on Every Container
#cluster-master needs its config file modified (a special case)
#cluster-master
#Switch the yum mirror
[root@cluster-master /]# yum -y install wget
[root@cluster-master /]# mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
[root@cluster-master /]# wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS7-Base-163.repo
[root@cluster-master /]# yum makecache
#Install OpenSSH
[root@cluster-master /]# yum -y install openssh openssh-server openssh-clients
[root@cluster-master /]# systemctl start sshd
####Make ssh auto-accept new host keys
####Configure the master so ssh logins automatically append to known_hosts
[root@cluster-master /]# vi /etc/ssh/ssh_config
Change the existing StrictHostKeyChecking ask to StrictHostKeyChecking no, then save.
[root@cluster-master /]# systemctl restart sshd
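If you'd rather not open an editor, an equivalent sketch (assuming the stock CentOS ssh_config, where the option is only present commented out):

```bash
# ssh takes the first value it finds per option; the stock file only has
# "#   StrictHostKeyChecking ask" commented out, so appending is safe
echo "StrictHostKeyChecking no" >> /etc/ssh/ssh_config
```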
Next, install OpenSSH on each of the slaves:
[root@cluster-slave1 /]# yum -y install wget
[root@cluster-slave1 /]# mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
[root@cluster-slave1 /]# wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS7-Base-163.repo
[root@cluster-slave1 /]# yum makecache
#Install OpenSSH
[root@cluster-slave1 /]# yum -y install openssh openssh-server openssh-clients
[root@cluster-slave1 /]# systemctl start sshd
Repeat the same steps on cluster-slave2 and cluster-slave3; the commands are pasted below anyway:
#cluster-slave2
[root@cluster-slave2 /]# yum -y install wget
[root@cluster-slave2 /]# mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
[root@cluster-slave2 /]# wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS7-Base-163.repo
[root@cluster-slave2 /]# yum makecache
#Install OpenSSH
[root@cluster-slave2 /]# yum -y install openssh openssh-server openssh-clients
[root@cluster-slave2 /]# systemctl start sshd
[root@cluster-slave3 /]# yum -y install wget
[root@cluster-slave3 /]# mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
[root@cluster-slave3 /]# wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS7-Base-163.repo
[root@cluster-slave3 /]# yum makecache
#Install OpenSSH
[root@cluster-slave3 /]# yum -y install openssh openssh-server openssh-clients
[root@cluster-slave3 /]# systemctl start sshd
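Rather than attaching to each slave and retyping the same six commands, you could also drive all three from the host with docker exec; a sketch using the same mirror URL as above:

```bash
# Run the mirror swap and OpenSSH install on every slave from the host
for c in cluster-slave1 cluster-slave2 cluster-slave3; do
    docker exec "$c" bash -c '
        yum -y install wget &&
        mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup &&
        wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.163.com/.help/CentOS7-Base-163.repo &&
        yum makecache &&
        yum -y install openssh openssh-server openssh-clients &&
        systemctl start sshd'
done
```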
Some of you may ask: why not do all this with Ansible?
It's worth explaining: Ansible drives the slave hosts over the sshd service, and without passwordless OpenSSH logins it can do nothing. If this feels tedious, you can docker pull a centos-sshd image that already ships sshd, or write your own Dockerfile to build one (and swap the yum mirror while you're at it); here I'm only covering how to build the Hadoop cluster from the vanilla image.
Distributing the cluster-master Public Key
On the master, run ssh-keygen -t rsa and hit Enter through every prompt. This creates the ~/.ssh directory containing id_rsa (the private key) and id_rsa.pub (the public key); then redirect id_rsa.pub into the authorized_keys file:
ssh-keygen -t rsa
#press Enter at every prompt
[root@cluster-master /]# cat ~/.ssh/id_rsa.pub > ~/.ssh/authorized_keys
Once the file is generated, use scp to distribute the public key to the cluster slaves:
[root@cluster-master /]# ssh root@cluster-slave1 'mkdir ~/.ssh'
[root@cluster-master /]# scp ~/.ssh/authorized_keys root@cluster-slave1:~/.ssh
[root@cluster-master /]# ssh root@cluster-slave2 'mkdir ~/.ssh'
[root@cluster-master /]# scp ~/.ssh/authorized_keys root@cluster-slave2:~/.ssh
[root@cluster-master /]# ssh root@cluster-slave3 'mkdir ~/.ssh'
[root@cluster-master /]# scp ~/.ssh/authorized_keys root@cluster-slave3:~/.ssh
After distribution, test that passwordless login works (ssh root@cluster-slave1). Note this walkthrough runs as root; if you set up passwordless login under another user, make sure that user has the proper permissions on ~/.ssh/authorized_keys.
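A small loop makes the test less tedious (my addition):

```bash
# Each line should print the slave's hostname without a password prompt
for h in cluster-slave1 cluster-slave2 cluster-slave3; do
    ssh root@"$h" hostname
done
```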
Installing Ansible
Why not install from source?
The original author installed Ansible from source under /opt/, but never gave a working procedure and left out many files Ansible needs (only hosts was mentioned): for example, the inventory file needs a [default] section configured and the hosts file location needs to be set.
Here we install from the official repositories instead:
[root@cluster-master /]# yum -y install epel-release
[root@cluster-master /]# yum -y install ansible
#Ansible's configuration now lives under /etc/ansible
Next, edit Ansible's hosts file:
vi /etc/ansible/hosts
The hosts file content is as follows:
[cluster]
cluster-master
cluster-slave1
cluster-slave2
cluster-slave3
[master]
cluster-master
[slaves]
cluster-slave1
cluster-slave2
cluster-slave3
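With the inventory written, Ansible's built-in ping module gives an end-to-end connectivity check (standard Ansible usage, not from the original article):

```bash
# Every host should answer "pong"
ansible cluster -m ping
```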
Configuring the Container hosts Files
Because /etc/hosts is rewritten whenever a container starts, direct edits don't survive a restart. So that the containers pick up the cluster hosts after every restart, we use a trick: rewrite /etc/hosts each time a shell starts in the container.
Append the following to ~/.bashrc:
:>/etc/hosts
cat >>/etc/hosts<<EOF
127.0.0.1 localhost
172.18.0.2 cluster-master
172.18.0.3 cluster-slave1
172.18.0.4 cluster-slave2
172.18.0.5 cluster-slave3
EOF
source ~/.bashrc
Sourcing the file applies the change, and you can see /etc/hosts has been rewritten with the entries we need:
[root@cluster-master ansible]# cat /etc/hosts
127.0.0.1 localhost
172.18.0.2 cluster-master
172.18.0.3 cluster-slave1
172.18.0.4 cluster-slave2
172.18.0.5 cluster-slave3
Distribute .bashrc to the cluster slaves with Ansible:
ansible cluster -m copy -a "src=~/.bashrc dest=~/"
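To confirm the copy landed and that the rewrite really takes effect on the slaves, a quick check (the tail count just matches the 5-line hosts block written above):

```bash
# Re-run the rewrite in a fresh shell on each slave and show the result
ansible slaves -m shell -a ". ~/.bashrc && tail -n 5 /etc/hosts"
```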
It took a while, but we've finally reached the Hadoop setup itself. The original author clearly has a solid grasp of a lot, from Docker all the way through ansible-playbook; no small feat.
Hadoop
Installing OpenJDK across the Cluster
Use Ansible to install OpenJDK on every node in the cluster:
[root@cluster-master ansible]# ansible cluster -m yum -a "name=java-1.8.0-openjdk,java-1.8.0-openjdk-devel state=latest"
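A one-liner to verify the JDK landed everywhere (my addition):

```bash
# java -version writes to stderr, hence the redirect
ansible cluster -m shell -a "java -version 2>&1"
```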
Installing Hadoop on cluster-master
Download the Hadoop tarball into /opt.
Here we take 2.7.4, the most stable release of the Hadoop 2.x line; you're free to move up to a hadoop-3.x beta or hadoop 2.8.2, as long as you can handle it.
[root@cluster-master opt]# wget http://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.7.4/hadoop-2.7.4.tar.gz
After the download, extract the archive and create a symlink:
[root@cluster-master opt]# tar -xzvf hadoop-2.7.4.tar.gz
[root@cluster-master opt]# ln -s hadoop-2.7.4 hadoop
Set the Java and Hadoop environment variables (in .bashrc):
# hadoop
export HADOOP_HOME=/opt/hadoop
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
#java
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.151-1.b12.el7_4.x86_64/
export PATH=$JAVA_HOME/bin:$PATH
Note that the JAVA_HOME version string java-1.8.0-openjdk-1.8.0.151-1.b12.el7_4.x86_64/ must match what's actually installed; check your /usr/lib/jvm/ directory to confirm before setting it.
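If you'd rather not hard-code the build number at all, a common alternative (not from the original) is to resolve it from the javac binary, which the -devel package installed above provides:

```bash
# readlink -f follows the /usr/bin/javac -> /etc/alternatives/javac symlinks;
# javac sits in $JAVA_HOME/bin, so two dirnames give the JDK root
export JAVA_HOME=$(dirname $(dirname $(readlink -f $(which javac))))
echo $JAVA_HOME
```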
Editing the Hadoop Configuration Files
As anyone familiar with Hadoop knows, building a cluster means editing a handful of required configuration files and XML parameters; we won't dig into every setting here.
[root@cluster-master opt]# cd $HADOOP_HOME/etc/hadoop/
First, edit **core-site.xml**: vi core-site.xml
```xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<!-- file system properties -->
<property>
<name>fs.default.name</name>
<value>hdfs://cluster-master:9000</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>4320</value>
</property>
</configuration>
```

Next, **hdfs-site.xml** (the properties go inside the usual <configuration> element):

```xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.superusergroup</name>
<value>staff</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
```

Then **mapred-site.xml**:

```xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>cluster-master:9001</value>
</property>
<property>
<name>mapreduce.jobtracker.http.address</name>
<value>cluster-master:50030</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>cluster-master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>cluster-master:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/jobhistory/done</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/jobhistory/done_intermediate</value>
</property>
<property>
<name>mapreduce.job.ubertask.enable</name>
<value>true</value>
</property>
</configuration>
```

Finally, **yarn-site.xml**:

```xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>cluster-master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>cluster-master:18040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>cluster-master:18030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>cluster-master:18025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>cluster-master:18141</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>cluster-master:18088</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>86400</value>
</property>
<property>
<name>yarn.log-aggregation.retain-check-interval-seconds</name>
<value>86400</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir-suffix</name>
<value>logs</value>
</property>
</configuration>
```
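The configs above point hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir under /home/hadoop. Hadoop can usually create these itself, but pre-creating them on every node is a cheap precaution (my addition):

```bash
# Create the local directories referenced in core-site.xml and hdfs-site.xml
ansible cluster -m file -a "path=/home/hadoop/tmp state=directory"
ansible cluster -m file -a "path=/home/hadoop/data state=directory"
```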
Packaging the Hadoop Files
Pack the hadoop symlink and hadoop-2.7.4 into a single archive so Ansible can distribute it to the slave hosts in one go:
[root@cluster-master opt]# tar -cvf hadoop-dis.tar hadoop hadoop-2.7.4
Use ansible-playbook to Distribute .bashrc and hadoop-dis.tar to the Slaves
```yaml
---
- hosts: cluster
  tasks:
    - name: copy .bashrc to slaves
      copy: src=~/.bashrc dest=~/
      notify:
        - exec source
    - name: copy hadoop-dis.tar to slaves
      unarchive: src=/opt/hadoop-dis.tar dest=/opt
  handlers:
    - name: exec source
      shell: source ~/.bashrc
```
I have to say, the original author's YAML here is genuinely well written.
Save the above as hadoop-dis.yaml and run:
[root@cluster-master opt]# ansible-playbook hadoop-dis.yaml
hadoop-dis.tar is automatically unpacked into /opt on each slave host.
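A quick check that the archive really unpacked on each slave (my addition):

```bash
# Expect both paths on every slave
ansible slaves -m shell -a "ls -d /opt/hadoop /opt/hadoop-2.7.4"
```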
Formatting the NameNode
[root@cluster-master opt]# hadoop namenode -format
If the output contains return 0 and Successfully formatted, the HDFS format succeeded. If it fails, troubleshoot on your own first; failing that, leave a comment below.
Starting the Hadoop Cluster
From here the Hadoop journey can begin. Startup is straightforward; $HADOOP_HOME/sbin contains the start and stop scripts:
[root@cluster-master opt]# cd $HADOOP_HOME/sbin
[root@cluster-master sbin]# ls -l
total 120
-rwxr-xr-x. 1 20415 101 2752 Aug 1 00:35 distribute-exclude.sh
-rwxr-xr-x. 1 20415 101 6452 Aug 1 00:35 hadoop-daemon.sh
-rwxr-xr-x. 1 20415 101 1360 Aug 1 00:35 hadoop-daemons.sh
-rwxr-xr-x. 1 20415 101 1640 Aug 1 00:35 hdfs-config.cmd
-rwxr-xr-x. 1 20415 101 1427 Aug 1 00:35 hdfs-config.sh
-rwxr-xr-x. 1 20415 101 2291 Aug 1 00:35 httpfs.sh
-rwxr-xr-x. 1 20415 101 3128 Aug 1 00:35 kms.sh
-rwxr-xr-x. 1 20415 101 4080 Aug 1 00:35 mr-jobhistory-daemon.sh
-rwxr-xr-x. 1 20415 101 1648 Aug 1 00:35 refresh-namenodes.sh
-rwxr-xr-x. 1 20415 101 2145 Aug 1 00:35 slaves.sh
-rwxr-xr-x. 1 20415 101 1779 Aug 1 00:35 start-all.cmd
-rwxr-xr-x. 1 20415 101 1471 Aug 1 00:35 start-all.sh
-rwxr-xr-x. 1 20415 101 1128 Aug 1 00:35 start-balancer.sh
-rwxr-xr-x. 1 20415 101 1401 Aug 1 00:35 start-dfs.cmd
-rwxr-xr-x. 1 20415 101 3734 Aug 1 00:35 start-dfs.sh
-rwxr-xr-x. 1 20415 101 1357 Aug 1 00:35 start-secure-dns.sh
-rwxr-xr-x. 1 20415 101 1571 Aug 1 00:35 start-yarn.cmd
-rwxr-xr-x. 1 20415 101 1347 Aug 1 00:35 start-yarn.sh
-rwxr-xr-x. 1 20415 101 1770 Aug 1 00:35 stop-all.cmd
-rwxr-xr-x. 1 20415 101 1462 Aug 1 00:35 stop-all.sh
-rwxr-xr-x. 1 20415 101 1179 Aug 1 00:35 stop-balancer.sh
-rwxr-xr-x. 1 20415 101 1455 Aug 1 00:35 stop-dfs.cmd
-rwxr-xr-x. 1 20415 101 3206 Aug 1 00:35 stop-dfs.sh
-rwxr-xr-x. 1 20415 101 1340 Aug 1 00:35 stop-secure-dns.sh
-rwxr-xr-x. 1 20415 101 1642 Aug 1 00:35 stop-yarn.cmd
-rwxr-xr-x. 1 20415 101 1340 Aug 1 00:35 stop-yarn.sh
-rwxr-xr-x. 1 20415 101 4295 Aug 1 00:35 yarn-daemon.sh
-rwxr-xr-x. 1 20415 101 1353 Aug 1 00:35 yarn-daemons.sh
At this point the cluster is built. Try the ./start-all.sh script to bring the whole cluster up, then use jps to check that every required daemon is running; I won't belabor it here.
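One thing the walkthrough never touches is $HADOOP_HOME/etc/hadoop/slaves, which defaults to localhost; the start scripts on the master read it to decide where to launch DataNodes and NodeManagers. A sketch that sets it and brings the cluster up (with the configs above you'd expect NameNode, SecondaryNameNode and ResourceManager on the master, and DataNode plus NodeManager on each slave):

```bash
# List the workers so start-dfs.sh/start-yarn.sh know where to ssh
cat > $HADOOP_HOME/etc/hadoop/slaves <<EOF
cluster-slave1
cluster-slave2
cluster-slave3
EOF

# Bring up HDFS and YARN from the master, then list Java daemons everywhere
$HADOOP_HOME/sbin/start-all.sh
ansible cluster -m shell -a "jps"
```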
I've also pushed ready-made images to Docker Hub; feel free to pull them:
docker pull awedocker/hadoop-master
docker pull awedocker/hadoop-slave1
docker pull awedocker/hadoop-slave2
docker pull awedocker/hadoop-slave3
Just remember to create the subnet as described above.
Adapted from http://www.reibang.com/p/0c7b6de487ce with substantial changes. Be kind in the comments.