Setting up a Flink on YARN test environment
Software versions
hadoop 2.7.3
flink 0.8.0
jdk-8u171-linux-x64
tar -zxvf jdk-8u171-linux-x64.tar.gz
mv jdk1.8.0_171/ /opt/jdk
tar -zxvf hadoop-2.7.3.tar.gz
mv hadoop-2.7.3/ /opt/hadoop
tar -zxvf apache-maven-3.6.3-bin.tar.gz
mv apache-maven-3.6.3/ /opt/maven
vi /opt/maven/conf/settings.xml
Add the following inside the <mirrors> element:
    <mirror>
        <id>aliyunmaven</id>
        <mirrorOf>*</mirrorOf>
        <name>Aliyun public repository</name>
        <url>https://maven.aliyun.com/repository/public</url>
    </mirror>
vi .bashrc
export JAVA_HOME=/opt/jdk
export M2_HOME=/opt/maven
export HADOOP_HOME=/opt/hadoop
export FLINK_HOME=/opt/flink
export PATH=$PATH:$JAVA_HOME/bin:$M2_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$FLINK_HOME/bin
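A quick sanity check after sourcing .bashrc is to echo the four variables back; a minimal sketch (the exported values below mirror the paths used in this walkthrough — on the real machine run `source ~/.bashrc` instead of the `export` line):

```shell
# Values as laid out in this walkthrough; replace the export with
# `source ~/.bashrc` on the actual host.
export JAVA_HOME=/opt/jdk M2_HOME=/opt/maven HADOOP_HOME=/opt/hadoop FLINK_HOME=/opt/flink

# Print each variable so a typo in .bashrc shows up immediately.
for v in JAVA_HOME M2_HOME HADOOP_HOME FLINK_HOME; do
  eval "val=\$$v"
  echo "$v -> $val"
done
```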
Build Flink
tar -zxvf flink-0.8.0-src.tgz
cd flink-0.8.0
mvn clean install -DskipTests -Dhadoop.version=2.7.3
Copy the Flink-on-YARN build to /opt
cp -r flink-dist/target/flink-0.8.0-bin/flink-yarn-0.8.0/ /opt/flink
Configure HDFS
mkdir /opt/namenode
mkdir /opt/datanode
vi /opt/hadoop/etc/hadoop/core-site.xml
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>file.blocksize</name>
        <value>134217728</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/tmp/hadoop</value>
    </property>
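The blocksize value is given in bytes and works out to 128 MiB (note: `file.blocksize` configures the local file system; the corresponding HDFS key is `dfs.blocksize`). The arithmetic can be confirmed in the shell:

```shell
# 128 MiB expressed in bytes, matching the 134217728 value above.
echo $((128 * 1024 * 1024))
```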
vi /opt/hadoop/etc/hadoop/hdfs-site.xml
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.datanode.handler.count</name>
        <value>5</value>
    </property>
    <property>
        <name>dfs.namenode.handler.count</name>
        <value>5</value>
    </property>
    <property>
        <name>dfs.namenode.service.handler.count</name>
        <value>5</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///opt/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///opt/datanode</value>
    </property>
    <property>
        <name>dfs.image.compress</name>
        <value>true</value>
    </property>
Format HDFS
hdfs namenode -format
Start the NameNode, SecondaryNameNode, and DataNode daemons
hadoop-daemon.sh start namenode
hadoop-daemon.sh start secondarynamenode
hadoop-daemon.sh start datanode
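After the three start commands, `jps` should list all three daemons. A minimal sketch of that check — the `jps_output` sample below is illustrative (the PIDs are made up); on the real host pipe `jps` output in instead:

```shell
# Illustrative jps output; replace with: jps_output=$(jps)
jps_output='12345 NameNode
12346 SecondaryNameNode
12347 DataNode
12348 Jps'

# Report whether each expected daemon appears in the listing.
for d in NameNode SecondaryNameNode DataNode; do
  if echo "$jps_output" | grep -q " $d\$"; then
    echo "$d running"
  else
    echo "$d NOT running"
  fi
done
```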
Configure YARN
cp /opt/hadoop/etc/hadoop/mapred-site.xml.template /opt/hadoop/etc/hadoop/mapred-site.xml
vi /opt/hadoop/etc/hadoop/mapred-site.xml
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
vi /opt/hadoop/etc/hadoop/yarn-site.xml
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>192.168.14.3</value>
    </property>
Start a YARN session
yarn-session.sh -jm 2048 -tm 2048 -n 1
15:00:08,071 WARN  org.apache.hadoop.util.NativeCodeLoader                      - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15:00:08,214 INFO  org.apache.flink.yarn.Utils                                  - Could not find HADOOP_CONF_PATH, using HADOOP_HOME.
15:00:08,214 INFO  org.apache.flink.yarn.Utils                                  - Found configuration using hadoop home.
15:00:08,260 INFO  org.apache.flink.yarn.Client                                 - Copy App Master jar from local filesystem and add to local environment
15:00:08,852 INFO  org.apache.hadoop.yarn.client.RMProxy                        - Connecting to ResourceManager at /192.168.14.3:8032
Using values:
Container Count = 1
Jar Path = /opt/flink/lib/flink-dist-0.8.0-yarn-uberjar.jar
Configuration file = /opt/flink/conf/flink-conf.yaml
JobManager memory = 2048
TaskManager memory = 2048
TaskManager cores = 1
amCommand=$JAVA_HOME/bin/java -Xmx1638M -Dlog.file="<LOG_DIR>/jobmanager-main.log" -Dlogback.configurationFile=file:logback.xml -Dlog4j.configuration=file:log4j.properties org.apache.flink.yarn.appMaster.ApplicationMaster 1><LOG_DIR>/jobmanager-stdout.log 2><LOG_DIR>/jobmanager-stderr.log
15:00:09,107 INFO  org.apache.flink.yarn.Utils                                  - Copying from file:/opt/flink/lib/flink-dist-0.8.0-yarn-uberjar.jar to hdfs://localhost:9000/user/root/.flink/application_1588744655039_0008/flink-dist-0.8.0-yarn-uberjar.jar
15:00:10,173 INFO  org.apache.flink.yarn.Utils                                  - Copying from /opt/flink/conf/flink-conf.yaml to hdfs://localhost:9000/user/root/.flink/application_1588744655039_0008/flink-conf.yaml
15:00:10,193 INFO  org.apache.flink.yarn.Utils                                  - Copying from file:/opt/flink/conf/logback.xml to hdfs://localhost:9000/user/root/.flink/application_1588744655039_0008/logback.xml
15:00:10,219 INFO  org.apache.flink.yarn.Utils                                  - Copying from file:/opt/flink/conf/log4j.properties to hdfs://localhost:9000/user/root/.flink/application_1588744655039_0008/log4j.properties
15:00:10,249 INFO  org.apache.flink.yarn.Client                                 - Submitting application master application_1588744655039_0008
15:00:10,282 INFO  org.apache.hadoop.yarn.client.api.impl.YarnClientImpl        - Submitted application application_1588744655039_0008
Flink JobManager is now running on centos1:6131
JobManager Web Interface: http://centos1:8088/proxy/application_1588744655039_0008/
Number of connected TaskManagers changed to 1. Slots available: 1
Submit a job
hadoop fs -mkdir /input
hadoop fs -mkdir /output
hadoop fs -put /opt/flink/NOTICE /input
flink run -c org.apache.flink.examples.java.wordcount.WordCount /opt/flink/examples/flink-java-examples-0.8.0-WordCount.jar hdfs:///input/NOTICE hdfs:///output/20200506e
Found a yarn properties file (.yarn-properties) file, using "centos1:6131" to connect to the JobManager
05/06/2020 15:37:29: Job execution switched to status RUNNING
05/06/2020 15:37:29: CHAIN DataSource (at getTextDataSet(WordCount.java:141) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:69)) -> Combine(SUM(1), at main(WordCount.java:72) (1/1) switched to SCHEDULED
05/06/2020 15:37:29: CHAIN DataSource (at getTextDataSet(WordCount.java:141) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:69)) -> Combine(SUM(1), at main(WordCount.java:72) (1/1) switched to DEPLOYING
05/06/2020 15:37:29: CHAIN DataSource (at getTextDataSet(WordCount.java:141) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:69)) -> Combine(SUM(1), at main(WordCount.java:72) (1/1) switched to RUNNING
05/06/2020 15:37:29: Reduce (SUM(1), at main(WordCount.java:72) (1/1) switched to SCHEDULED
05/06/2020 15:37:29: Reduce (SUM(1), at main(WordCount.java:72) (1/1) switched to DEPLOYING
05/06/2020 15:37:29: Reduce (SUM(1), at main(WordCount.java:72) (1/1) switched to RUNNING
05/06/2020 15:37:29: CHAIN DataSource (at getTextDataSet(WordCount.java:141) (org.apache.flink.api.java.io.TextInputFormat)) -> FlatMap (FlatMap at main(WordCount.java:69)) -> Combine(SUM(1), at main(WordCount.java:72) (1/1) switched to FINISHED
05/06/2020 15:37:29: DataSink(CsvOutputFormat (path: hdfs:/output/20200506e, delimiter:  )) (1/1) switched to SCHEDULED
05/06/2020 15:37:29: DataSink(CsvOutputFormat (path: hdfs:/output/20200506e, delimiter:  )) (1/1) switched to DEPLOYING
05/06/2020 15:37:29: DataSink(CsvOutputFormat (path: hdfs:/output/20200506e, delimiter:  )) (1/1) switched to RUNNING
05/06/2020 15:37:29: Reduce (SUM(1), at main(WordCount.java:72) (1/1) switched to FINISHED
05/06/2020 15:37:29: DataSink(CsvOutputFormat (path: hdfs:/output/20200506e, delimiter:? )) (1/1) switched to FINISHED
05/06/2020 15:37:29: Job execution switched to status FINISHED
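The result can be read back with `hadoop fs -cat /output/20200506e/*`. For a quick plausibility check on a small input, the same word-count logic can be reproduced in plain shell (inline sample text here instead of the NOTICE file; the `word count` output shape matches the job's space-delimited CSV sink):

```shell
# Tokenize on whitespace, count occurrences, print "word count" pairs.
printf 'apache flink\napache hadoop\n' \
  | tr -s '[:space:]' '\n' \
  | sort \
  | uniq -c \
  | awk '{print $2, $1}'
# → apache 2 / flink 1 / hadoop 1
```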