Deploying a CDH Pseudo-Distributed Hadoop Cluster on CentOS 7 / RHEL 7

Cloudera provides a scalable, flexible, integrated platform that makes it easy to manage the rapidly growing volume and variety of data in an enterprise. Cloudera products and solutions let you deploy and manage Apache Hadoop and related projects, manipulate and analyze your data, and keep that data secure and protected.

Prerequisites
One CentOS 7.x host

Target
Deploy a pseudo-distributed CDH Hadoop cluster.

The version after deployment:

[root@localhost ~]# hadoop version
Hadoop 2.6.0-cdh5.13.1
Subversion http://github.com/cloudera/hadoop -r 0061e3eb8ab164e415630bca11d299a7c2ec74fd
Compiled by jenkins on 2017-11-09T16:34Z
Compiled with protoc 2.5.0
From source with checksum 16d5272b34af2d8a4b4b7ee8f7c4cbe
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.13.1.jar

I happened to come across the pseudo-distributed installation guide on the official CDH site, so I am writing up my notes on the process here.


Getting started
(CentOS 7.x is used as the example.)

1. Java environment

#Download JDK 8u161 from oracle.com
$ wget http://download.oracle.com/otn-pub/java/jdk/8u161-b12/2f38c3b165be4555a1fa6e98c45e0808/jdk-8u161-linux-x64.rpm?AuthParam=1516458261_e7574995a6546eeecbe0e4e901bc61a8

#The URL above may stop working once the download session expires;
#if so, just grab a fresh link from the official site
$ rpm -ivh jdk-8u161-linux-x64.rpm

Set JAVA_HOME

$ vim ~/.bashrc
#Add JAVA_HOME
export JAVA_HOME=/usr/java/jdk1.8.0_161
#save and exit
$ source ~/.bashrc
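
The `.bashrc` edit above can also be scripted. Below is a minimal sketch that appends an export line only when it is not already present, so re-running the setup stays idempotent; the `ensure_env_line` helper is my own invention, not part of any standard tooling:

```shell
# ensure_env_line FILE LINE: append LINE to FILE unless an identical
# line already exists (hypothetical helper for repeatable setup).
ensure_env_line() {
  local file=$1 line=$2
  touch "$file"
  # -q quiet, -x whole-line match, -F fixed string (no regex)
  grep -qxF "$line" "$file" || echo "$line" >> "$file"
}

# usage on a real host:
#   ensure_env_line ~/.bashrc 'export JAVA_HOME=/usr/java/jdk1.8.0_161'
#   source ~/.bashrc
```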

2.Download the CDH 5 Package

#Use the redhat/7 one-click-install path to match the CentOS 7 host
$ wget http://archive.cloudera.com/cdh5/one-click-install/redhat/7/x86_64/cloudera-cdh-5-0.x86_64.rpm

$ yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm
#For instructions on how to add a CDH 5 yum repository or build your own CDH 5 yum repository

3.Install CDH 5

#Add a repository key
$ rpm --import http://archive.cloudera.com/cdh5/redhat/7/x86_64/cdh/RPM-GPG-KEY-cloudera
#Install Hadoop in pseudo-distributed mode: To install Hadoop with YARN:
$ yum install hadoop-conf-pseudo -y  

4.Starting Hadoop

Check where the installed package placed its files by default:

[root@localhost ~]# rpm -ql hadoop-conf-pseudo
/etc/hadoop/conf.pseudo
/etc/hadoop/conf.pseudo/README
/etc/hadoop/conf.pseudo/core-site.xml
/etc/hadoop/conf.pseudo/hadoop-env.sh
/etc/hadoop/conf.pseudo/hadoop-metrics.properties
/etc/hadoop/conf.pseudo/hdfs-site.xml
/etc/hadoop/conf.pseudo/log4j.properties
/etc/hadoop/conf.pseudo/mapred-site.xml
/etc/hadoop/conf.pseudo/yarn-site.xml

No configuration changes are needed; start the deployment.

Step 1: Format the NameNode: hdfs namenode -format

[root@localhost ~]# hdfs namenode -format
18/01/21 00:13:39 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   user = root
..........................................
18/01/21 00:13:41 INFO common.Storage: Storage directory /var/lib/hadoop-hdfs/cache/root/dfs/name has been successfully formatted.
...........................................
18/01/21 00:13:41 INFO util.ExitUtil: Exiting with status 0
18/01/21 00:13:41 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.0.1
************************************************************/

Step 2: Start the HDFS cluster
for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done

[root@localhost ~]# for x in `cd /etc/init.d ; ls hadoop-hdfs-*` ; do sudo service $x start ; done
starting datanode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-datanode-localhost.out
Started Hadoop datanode (hadoop-hdfs-datanode):            [  OK  ]
starting namenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-namenode-localhost.out
Started Hadoop namenode:                                   [  OK  ]
starting secondarynamenode, logging to /var/log/hadoop-hdfs/hadoop-hdfs-secondarynamenode-localhost.out
Started Hadoop secondarynamenode:                          [  OK  ]
#To confirm the services are running, use the jps command or the web UI: http://localhost:50070
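
Instead of opening the web UI by hand, reachability of the NameNode HTTP port can be probed from the shell. A rough sketch using bash's built-in /dev/tcp redirection (port 50070 is the CDH 5 NameNode web UI default; the `wait_for_port` helper name is my own):

```shell
# wait_for_port HOST PORT TRIES: poll once per second until a TCP
# connect succeeds or TRIES attempts are used up (bash /dev/tcp).
wait_for_port() {
  local host=$1 port=$2 tries=${3:-10}
  while [ "$tries" -gt 0 ]; do
    if (echo > "/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# usage on a real host:
#   wait_for_port localhost 50070 30 && echo "NameNode web UI is up"
```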

Step 3: Create the directories needed for Hadoop processes.

/usr/lib/hadoop/libexec/init-hdfs.sh

[root@localhost ~]#   /usr/lib/hadoop/libexec/init-hdfs.sh
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /tmp'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 1777 /tmp'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /var'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /var/log'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 1775 /var/log'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown yarn:mapred /var/log'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /tmp/hadoop-yarn'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown -R mapred:mapred /tmp/hadoop-yarn'
....................................
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/oozie/share/lib/sqoop'
+ ls '/usr/lib/hive/lib/*.jar'
+ ls /usr/lib/hadoop-mapreduce/hadoop-streaming-2.6.0-cdh5.13.1.jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -put /usr/lib/hadoop-mapreduce/hadoop-streaming*.jar /user/oozie/share/lib/mapreduce-streaming'
+ ls /usr/lib/hadoop-mapreduce/hadoop-distcp-2.6.0-cdh5.13.1.jar /usr/lib/hadoop-mapreduce/hadoop-distcp.jar
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -put /usr/lib/hadoop-mapreduce/hadoop-distcp*.jar /user/oozie/share/lib/distcp'
+ ls '/usr/lib/pig/lib/*.jar' '/usr/lib/pig/*.jar'
+ ls '/usr/lib/sqoop/lib/*.jar' '/usr/lib/sqoop/*.jar'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod -R 777 /user/oozie'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown -R oozie /user/oozie'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /var/lib/hadoop-hdfs/cache/mapred/mapred/staging'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chmod 1777 /var/lib/hadoop-hdfs/cache/mapred/mapred/staging'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown -R mapred /var/lib/hadoop-hdfs/cache/mapred'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -mkdir -p /user/spark/applicationHistory'
+ su -s /bin/bash hdfs -c '/usr/bin/hadoop fs -chown spark /user/spark/applicationHistory'

Step 4: Verify the HDFS file structure:

hadoop fs -ls -R /

[root@localhost ~]#  sudo -u hdfs hadoop fs -ls -R /
drwxrwxrwx   - hdfs  supergroup          0 2018-01-20 16:42 /benchmarks
drwxr-xr-x   - hbase supergroup          0 2018-01-20 16:42 /hbase
drwxrwxrwt   - hdfs  supergroup          0 2018-01-20 16:41 /tmp
drwxrwxrwt   - mapred mapred              0 2018-01-20 16:42 /tmp/hadoop-yarn
drwxrwxrwt   - mapred mapred              0 2018-01-20 16:42 /tmp/hadoop-yarn/staging
drwxrwxrwt   - mapred mapred              0 2018-01-20 16:42 /tmp/hadoop-yarn/staging/history
drwxrwxrwt   - mapred mapred              0 2018-01-20 16:42 /tmp/hadoop-yarn/staging/history/done_intermediate
drwxr-xr-x   - hdfs   supergroup          0 2018-01-20 16:44 /user
drwxr-xr-x   - mapred  supergroup          0 2018-01-20 16:42 /user/history
drwxrwxrwx   - hive    supergroup          0 2018-01-20 16:42 /user/hive
drwxrwxrwx   - hue     supergroup          0 2018-01-20 16:43 /user/hue
drwxrwxrwx   - jenkins supergroup          0 2018-01-20 16:42 /user/jenkins
drwxrwxrwx   - oozie   supergroup          0 2018-01-20 16:43 /user/oozie
................

Step 5: Start YARN

Start the YARN daemons:

  • service hadoop-yarn-resourcemanager start
  • service hadoop-yarn-nodemanager start
  • service hadoop-mapreduce-historyserver start
[root@localhost ~]# service hadoop-yarn-resourcemanager start
starting resourcemanager, logging to /var/log/hadoop-yarn/yarn-yarn-resourcemanager-localhost.out
Started Hadoop resourcemanager:                            [  OK  ]
[root@localhost ~]# service hadoop-yarn-nodemanager start
starting nodemanager, logging to /var/log/hadoop-yarn/yarn-yarn-nodemanager-localhost.out
Started Hadoop nodemanager:                                [  OK  ]
[root@localhost ~]# service hadoop-mapreduce-historyserver start
starting historyserver, logging to /var/log/hadoop-mapreduce/mapred-mapred-historyserver-localhost.out
STARTUP_MSG:   java = 1.8.0_161
Started Hadoop historyserver:                              [  OK  ]

Use jps to check whether the services are running.

[root@localhost ~]# jps
5232 ResourceManager
3425 SecondaryNameNode
5906 Jps
5827 JobHistoryServer
3286 NameNode
5574 NodeManager
3162 DataNode
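
The jps listing above can also be checked mechanically. A sketch of a hypothetical `check_daemons` helper that reads a jps-style listing on stdin and reports any expected daemon that is missing:

```shell
# check_daemons NAME...: read a jps-style listing on stdin and fail
# if any of the given daemon names is absent from it.
check_daemons() {
  local listing status=0 d
  listing=$(cat)
  for d in "$@"; do
    # -w matches whole words, so NameNode won't match SecondaryNameNode
    if ! printf '%s\n' "$listing" | grep -qw "$d"; then
      echo "missing: $d"
      status=1
    fi
  done
  return $status
}

# usage on a real host:
#   jps | check_daemons NameNode DataNode SecondaryNameNode \
#                       ResourceManager NodeManager JobHistoryServer
```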

Step 6: Create a user directory

[root@localhost ~]# sudo -u hdfs hadoop fs -mkdir /taroballs/
[root@localhost ~]# hadoop fs -ls /
Found 6 items
drwxrwxrwx   - hdfs  supergroup          0 2018-01-20 16:42 /benchmarks
drwxr-xr-x   - hbase supergroup          0 2018-01-20 16:42 /hbase
drwxr-xr-x   - hdfs  supergroup          0 2018-01-20 16:48 /taroballs
drwxrwxrwt   - hdfs  supergroup          0 2018-01-20 16:41 /tmp
drwxr-xr-x   - hdfs  supergroup          0 2018-01-20 16:44 /user
drwxr-xr-x   - hdfs  supergroup          0 2018-01-20 16:44 /var

Run a simple example on YARN

#First, create an input directory as the root user
[root@localhost ~]# hadoop fs -mkdir input
[root@localhost ~]# hadoop fs -ls /user/root/
Found 1 items
drwxr-xr-x   - root supergroup          0 2018-01-20 17:51 /user/root/input
#Then put some files into it
[root@localhost ~]# hadoop fs -put /etc/hadoop/conf/*.xml input/
[root@localhost ~]# hadoop fs -ls input/
Found 4 items
-rw-r--r--   1 root supergroup       2133 2018-01-20 17:54 input/core-site.xml
-rw-r--r--   1 root supergroup       2324 2018-01-20 17:54 input/hdfs-site.xml
-rw-r--r--   1 root supergroup       1549 2018-01-20 17:54 input/mapred-site.xml
-rw-r--r--   1 root supergroup       2375 2018-01-20 17:54 input/yarn-site.xml

Set HADOOP_MAPRED_HOME

#Set HADOOP_MAPRED_HOME
[root@localhost ~]# vim ~/.bashrc
#Add the HADOOP_MAPRED_HOME
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
#save and exit
[root@localhost ~]# source ~/.bashrc

Run the Hadoop MapReduce example

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input outputroot23 'dfs[a-z.]+'

#Run the Hadoop example job
[root@localhost ~]# hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar grep input outputroot23 'dfs[a-z.]+' 
18/01/20 17:55:54 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
18/01/20 17:55:55 WARN mapreduce.JobResourceUploader: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
18/01/20 17:55:55 INFO input.FileInputFormat: Total input paths to process : 4
18/01/20 17:56:44 INFO mapreduce.Job: Job job_1516438047064_0004 running in uber mode : false
18/01/20 17:56:44 INFO mapreduce.Job:  map 0% reduce 0%
18/01/20 17:56:51 INFO mapreduce.Job:  map 100% reduce 0%
18/01/20 17:56:59 INFO mapreduce.Job:  map 100% reduce 100%
18/01/20 17:56:59 INFO mapreduce.Job: Job job_1516438047064_0004 completed successfully
18/01/20 17:56:59 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=330
        FILE: Number of bytes written=287357
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=599
        HDFS: Number of bytes written=244
        HDFS: Number of read operations=7
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters 
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=4358
        Total time spent by all reduces in occupied slots (ms)=4738
        Total time spent by all map tasks (ms)=4358
        Total time spent by all reduce tasks (ms)=4738
        Total vcore-milliseconds taken by all map tasks=4358
        Total vcore-milliseconds taken by all reduce tasks=4738
        Total megabyte-milliseconds taken by all map tasks=4462592
        Total megabyte-milliseconds taken by all reduce tasks=4851712
    Map-Reduce Framework
        Map input records=10
        Map output records=10
        Map output bytes=304
        Map output materialized bytes=330
        Input split bytes=129
        Combine input records=0
        Combine output records=0
        Reduce input groups=1
        Reduce shuffle bytes=330
        Reduce input records=10
        Reduce output records=10
        Spilled Records=20
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=161
        CPU time spent (ms)=1320
        Physical memory (bytes) snapshot=328933376
        Virtual memory (bytes) snapshot=5055086592
        Total committed heap usage (bytes)=170004480
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters 
        Bytes Read=470
    File Output Format Counters 
        Bytes Written=244

Result

[root@localhost ~]# hadoop fs -ls 
Found 2 items
drwxr-xr-x   - root supergroup          0 2018-01-20 17:54 input
drwxr-xr-x   - root supergroup          0 2018-01-20 17:56 outputroot23
[root@localhost ~]# hadoop fs -ls outputroot23
Found 2 items
-rw-r--r--   1 root supergroup          0 2018-01-20 17:56 outputroot23/_SUCCESS
-rw-r--r--   1 root supergroup        244 2018-01-20 17:56 outputroot23/part-r-00000
[root@localhost ~]# hadoop fs -cat outputroot23/part-r-00000
1   dfs.safemode.min.datanodes
1   dfs.safemode.extension
1   dfs.replication
1   dfs.namenode.name.dir
1   dfs.namenode.checkpoint.dir
1   dfs.domain.socket.path
1   dfs.datanode.hdfs
1   dfs.datanode.data.dir
1   dfs.client.read.shortcircuit
1   dfs.client.file
[root@localhost ~]# 
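
What the example job computed can be reproduced locally with plain shell tools: the map phase extracts every match of the regex dfs[a-z.]+ and the reduce phase counts occurrences per distinct match, which is equivalent to grep -o piped through sort and uniq -c. A small self-contained sketch on made-up sample data (not the actual config files from the run above):

```shell
# Local analogue of the grep MapReduce example: extract every match of
# dfs[a-z.]+ (map) and count occurrences per distinct match (reduce).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<property><name>dfs.replication</name><value>1</value></property>
<property><name>dfs.namenode.name.dir</name><value>/data/nn</value></property>
<property><name>dfs.replication</name><value>1</value></property>
EOF
# -o prints only the matched text, -h suppresses filenames, -E enables ERE
result=$(grep -ohE 'dfs[a-z.]+' "$sample" | sort | uniq -c | sort -rn)
printf '%s\n' "$result"
rm -f "$sample"
```

On this sample it reports dfs.replication twice and dfs.namenode.name.dir once, mirroring the count-per-key shape of part-r-00000 above.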

All done: the CDH pseudo-distributed Hadoop cluster is up and running. If you spot any mistakes, corrections are welcome.
