Installation prerequisites
- oozie-4.0.0-cdh5.3.6 http://archive.cloudera.com/cdh5/cdh/5/oozie-4.0.0-cdh5.3.6.tar.gz
- ext-2.2.zip http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
1. Extract the archive
[hadoop@hadoop131 software]$ tar zxvf oozie-4.0.0-cdh5.3.6.tar.gz -C ../bigdata/hadoop/
[hadoop@hadoop131 software]$ cd ../bigdata/hadoop/oozie-4.0.0-cdh5.3.6/
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ ls
bin lib NOTICE.txt oozie-hadooplibs-4.0.0-cdh5.3.6.tar.gz oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz src
conf libtools oozie-core oozie-server oozie.war
docs LICENSE.txt oozie-examples.tar.gz oozie-sharelib-4.0.0-cdh5.3.6.tar.gz release-log.txt
Extract oozie-hadooplibs-4.0.0-cdh5.3.6.tar.gz, create a libext directory, and copy all the extracted jars into it.
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ mkdir libext
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ tar zxvf oozie-hadooplibs-4.0.0-cdh5.3.6.tar.gz
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ cp oozie-4.0.0-cdh5.3.6/hadooplibs/hadooplib-2.5.0-cdh5.3.6.oozie-4.0.0-cdh5.3.6/* libext/
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ cp oozie-4.0.0-cdh5.3.6/hadooplibs/hadooplib-2.5.0-mr1-cdh5.3.6.oozie-4.0.0-cdh5.3.6/* libext/
Put ext-2.2.zip into libext:
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ cp /opt/software/ext-2.2.zip libext/
Copy the MySQL driver jar into the libext directory:
[hadoop@hadoop131 etc]$ cp /opt/software/mysql-connector-java-5.1.47.jar /opt/bigdata/hadoop/oozie-4.0.0-cdh5.3.6/libext/
2. Edit the configuration files
In hadoop-2.7.3/etc/hadoop:
core-site.xml
<!-- Hosts from which Oozie is allowed to proxy the "hadoop" user to operate Hadoop; replace "hadoop" in the property names with your own user name -->
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<!-- User groups that Oozie is allowed to proxy -->
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
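To confirm that the proxy-user settings are actually picked up, they can be queried from the loaded configuration (a quick sanity check; `hdfs getconf` reads core-site.xml, and "hadoop" in the keys below is the user name configured above):

```shell
# Query the proxy-user properties from the Hadoop configuration.
# Replace "hadoop" in the key names with your own user name if you changed it.
hdfs getconf -confKey hadoop.proxyuser.hadoop.hosts
hdfs getconf -confKey hadoop.proxyuser.hadoop.groups
# Both should print "*" if the settings took effect.
```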
mapred-site.xml
<!-- MapReduce JobHistory Server IPC address; the default port is 10020 -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop131:10020</value>
</property>
<!-- MapReduce JobHistory Server web UI address; the default port is 19888 -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop131:19888</value>
</property>
yarn-site.xml
<!-- Job history log server URL -->
<property>
<name>yarn.log.server.url</name>
<value>http://hadoop131:19888/jobhistory/logs/</value>
</property>
Distribute the configuration to every node (use scp instead if you do not have xsync):
[hadoop@hadoop131 hadoop]$ cd ..
[hadoop@hadoop131 etc]$ ls
hadoop
[hadoop@hadoop131 etc]$ xsync hadoop/
In oozie-4.0.0-cdh5.3.6/conf:
oozie-site.xml
<property>
<name>oozie.service.JPAService.jdbc.driver</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.url</name>
<value>jdbc:mysql://hadoop131:3306/oozie</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.username</name>
<value>pcadmin</value>
</property>
<property>
<name>oozie.service.JPAService.jdbc.password</name>
<value>000000</value>
</property>
<property>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=/opt/bigdata/hadoop/hadoop-2.7.3/etc/hadoop</value>
<description>Points Oozie at Hadoop's configuration directory; the "*=" prefix must not be removed</description>
</property>
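Before going further, it is worth verifying that the JDBC account configured above can actually reach MySQL (a quick check, assuming the pcadmin/000000 credentials and hadoop131 host used in oozie-site.xml):

```shell
# Verify the Oozie JDBC credentials against MySQL before running the setup scripts.
# The database itself is created in a later step; this only tests connectivity.
mysql -h hadoop131 -upcadmin -p000000 -e "SELECT VERSION();"
```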
3. Start the cluster
[hadoop@hadoop131 etc]$ start-dfs.sh
[hadoop@hadoop132 zkData]$ start-yarn.sh
[hadoop@hadoop131 etc]$ mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /opt/bigdata/hadoop/hadoop-2.7.3/logs/mapred-hadoop-historyserver-hadoop131.out
[hadoop@hadoop131 etc]$ jps
9696 Jps
9141 DataNode
9030 NameNode
9655 JobHistoryServer
9436 NodeManager
4655 QuorumPeerMain
4. Create the Oozie database
[hadoop@hadoop131 hadoop]$ mysql -uroot -p000000
mysql> create database oozie DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
mysql> quit;
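Note that oozie-site.xml connects as pcadmin, not root, so that account needs privileges on the new database. A grant along these lines may be required (the '%' host pattern is an assumption; narrow it to your hosts as appropriate):

```shell
# Grant the Oozie JDBC user access to the oozie database (run as root).
# MySQL 5.x syntax; '%' allows connections from any host and is an assumption here.
mysql -uroot -p000000 <<'SQL'
GRANT ALL PRIVILEGES ON oozie.* TO 'pcadmin'@'%' IDENTIFIED BY '000000';
FLUSH PRIVILEGES;
SQL
```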
5. Initialize Oozie
[hadoop@hadoop131 hadoop]$ cd /opt/bigdata/hadoop/oozie-4.0.0-cdh5.3.6/
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ bin/oozie-setup.sh sharelib create -fs hdfs://hadoop131:9000 -locallib oozie-sharelib-4.0.0-cdh5.3.6-yarn.tar.gz
Open the HDFS file browser and you can see that the sharelib files have been created.
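The upload can also be verified from the command line (assuming the sharelib landed under the submitting user's home directory, the default behavior):

```shell
# List the sharelib that oozie-setup.sh just uploaded to HDFS.
hdfs dfs -ls /user/hadoop/share/lib
```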
6. Create oozie.sql (initialize the database schema)
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ bin/oozie-setup.sh db create -run -sqlfile oozie.sql
Package the project and generate the WAR file:
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ bin/oozie-setup.sh prepare-war
This may fail with an error.
Cause: unzip is missing; running it again then reports that zip is missing. Install both:
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ sudo yum -y install unzip
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ sudo yum -y install zip
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ bin/oozie-setup.sh prepare-war
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
If you see the following message, everything is OK:
New Oozie WAR file with added 'ExtJS library, JARs' at /opt/bigdata/hadoop/oozie-4.0.0-cdh5.3.6/oozie-server/webapps/oozie.war
INFO: Oozie is ready to be started
7. Start Oozie
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ bin/oozie-start.sh
Open hadoop131:11000 to confirm that Oozie is running.
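Besides the web UI, the Oozie CLI can confirm the server is healthy (run from the Oozie home directory):

```shell
# Ask the Oozie server for its status; a healthy server reports "System mode: NORMAL".
bin/oozie admin -oozie http://hadoop131:11000/oozie -status
```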
======================================================
If Oozie fails to start and then cannot be stopped, with the message "PID file found but no matching process was found. Stop aborted.":
Delete oozie-server/temp/oozie.pid under the Oozie directory.
======================================================
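A clean restart after such a failure might look like this (a sketch; run from the Oozie home directory):

```shell
# Remove the stale PID file left behind by a failed start, then start Oozie again.
rm -f oozie-server/temp/oozie.pid
bin/oozie-start.sh
```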
8. Schedule the wordcount MapReduce job with Oozie
Create an oozie-apps directory under the Oozie directory:
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ mkdir oozie-apps
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ tar zxvf oozie-examples.tar.gz
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ cp -r examples/apps/map-reduce/ oozie-apps/
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ cd oozie-apps/
[hadoop@hadoop131 oozie-apps]$ mv map-reduce mr-wordcount
[hadoop@hadoop131 oozie-apps]$ cd mr-wordcount
[hadoop@hadoop131 mr-wordcount]$ vim job.properties
[hadoop@hadoop131 mr-wordcount]$ vim workflow.xml
[hadoop@hadoop131 mr-wordcount]$ mkdir lib
[hadoop@hadoop131 mr-wordcount]$ mkdir input
9. Delete the existing jars under lib
job.properties
#hdfs
nameNode=hdfs://hadoop131:9000
#yarn
jobTracker=hadoop132:8032
queueName=default
examplesRoot=oozie-apps
oozie.wf.application.path=${nameNode}/user/hadoop/${examplesRoot}/mr-wordcount/workflow.xml
outputDir=map-reduce
workflow.xml
<workflow-app xmlns="uri:oozie:workflow:0.5" name="mr-wordcount">
<start to="mr-node"/>
<action name="mr-node">
<map-reduce>
<job-tracker>${jobTracker}</job-tracker>
<name-node>${nameNode}</name-node>
<prepare>
<delete path="${nameNode}/user/hadoop/${examplesRoot}/mr-wordcount/output"/>
</prepare>
<configuration>
<property>
<name>mapred.job.queue.name</name>
<value>${queueName}</value>
</property>
<!--New API-->
<property>
<name>mapred.mapper.new-api</name>
<value>true</value>
</property>
<property>
<name>mapred.reducer.new-api</name>
<value>true</value>
</property>
<!--mapper class-->
<property>
<name>mapreduce.job.map.class</name>
<value>org.apache.hadoop.examples.WordCount$TokenizerMapper</value>
</property>
<property>
<name>mapreduce.map.output.key.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
<property>
<name>mapreduce.map.output.value.class</name>
<value>org.apache.hadoop.io.IntWritable</value>
</property>
<!--reducer class-->
<property>
<name>mapreduce.job.reduce.class</name>
<value>org.apache.hadoop.examples.WordCount$IntSumReducer</value>
</property>
<property>
<name>mapreduce.job.output.key.class</name>
<value>org.apache.hadoop.io.Text</value>
</property>
<property>
<name>mapreduce.job.output.value.class</name>
<value>org.apache.hadoop.io.IntWritable</value>
</property>
<!--INPUT-->
<property>
<name>mapred.input.dir</name>
<value>${nameNode}/user/hadoop/${examplesRoot}/mr-wordcount/input</value>
</property>
<!--OUTPUT-->
<property>
<name>mapred.output.dir</name>
<value>${nameNode}/user/hadoop/${examplesRoot}/mr-wordcount/output</value>
</property>
</configuration>
</map-reduce>
<ok to="end"/>
<error to="fail"/>
</action>
<kill name="fail">
<message>Map/Reduce failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
</kill>
<end name="end"/>
</workflow-app>
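Before submitting, the workflow definition can be checked against the Oozie schema with the CLI (run from the Oozie home directory):

```shell
# Validate the workflow XML locally; prints an error if the definition is malformed.
bin/oozie validate oozie-apps/mr-wordcount/workflow.xml
```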
Copy the test data into the input directory (omitted here).
Copy Hadoop's example wordcount jar into the lib directory:
[hadoop@hadoop131 mr-wordcount]$ cp /opt/bigdata/hadoop/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar lib/
# Upload to HDFS
[hadoop@hadoop131 mr-wordcount]$ cd ..
[hadoop@hadoop131 oozie-apps]$ cd ..
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ hdfs dfs -put oozie-apps/ /user/hadoop/
# Run the job
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ bin/oozie job -oozie http://hadoop131:11000/oozie -config oozie-apps/mr-wordcount/job.properties -run
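The submit command prints a workflow job ID; that ID can be used to poll the job from the CLI (the ID below is a placeholder, not a real one):

```shell
# Check the state of a submitted workflow.
# Replace the job ID with the one printed by the -run command above.
bin/oozie job -oozie http://hadoop131:11000/oozie -info 0000000-200101000000000-oozie-hado-W
```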
Open hadoop131:11000 to see the running job; click "job log" to view its logs. If an error occurs during the MapReduce phase, trace it there (did you start the JobHistoryServer?).
On success, we can see the output in HDFS.
# Inspect it directly instead of downloading it
[hadoop@hadoop131 oozie-4.0.0-cdh5.3.6]$ hdfs dfs -cat /user/hadoop/oozie-apps/mr-wordcount/output/* | tail -10
yours, 1
yours; 1
yourself 15
yourself, 7
yourself,' 1
yourself.' 3
yourself; 3
yourself?' 1
youth, 2
youth. 1
======================================================
1. At run time the Oozie logs report: JA009: Unknown rpc kind in rpc header RPC_WRITABLE
Cause: the MR1 and MR2 jars conflict.
Fix: delete the MR1 jars from the libext directory.
2. Oozie needs the same directory to exist on HDFS and on the local machine; you can xsync/scp the oozie-apps directory.
3. Carefully double-check the usernames, passwords, and configured paths on every node.