1. Create the Maven project
Choose a working directory for the project beforehand.
Then run the following command in that directory:
mvn archetype:generate
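When the prompt shown in the figure below appears, press Enter.
The prompts that follow the Enter key are explained below:
groupId: maven
artifactId: WordCount (the project name)
version: 1.0
package: org.example (the prompt reads 'package' maven: : because the default value is the groupId)
Y: : press Enter to confirm, and the project is created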
2. Edit the project files
2.1 Create WordCount.java locally
Add the following code to the file:
package org.example;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
public class WordCount {

    // Mapper: emits (word, 1) for every word in an input line.
    // It must be a static nested class so that Hadoop can instantiate it by reflection.
    public static class WordCountMap extends Mapper<LongWritable, Text, Text, IntWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            // Split on runs of whitespace and skip empty tokens.
            String[] strs = value.toString().split("\\s+");
            for (String str : strs) {
                if (!str.isEmpty()) {
                    context.write(new Text(str), new IntWritable(1));
                }
            }
        }
    }

    // Reducer: sums the counts collected for each word.
    public static class WordCountReduce extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        job.setJarByClass(WordCount.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setMapperClass(WordCountMap.class);
        job.setReducerClass(WordCountReduce.class);
        // args[0]: input path, args[1]: output path (must not exist yet)
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // Exit with a non-zero status if the job fails.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
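Before packaging, it can help to check the counting logic without a cluster. The following is only a rough sketch in plain Java that imitates the map and reduce steps in memory; it does not use any Hadoop classes. The class name LocalWordCount, the sample lines, and the choice to split on runs of whitespace are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the MapReduce flow; no Hadoop classes are involved.
class LocalWordCount {

    // "Map" step: split one line on whitespace and emit each non-empty word.
    static List<String> map(String line) {
        List<String> words = new ArrayList<>();
        for (String word : line.split("\\s+")) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // "Shuffle + reduce" step: group identical words and add 1 per occurrence.
    // A TreeMap keeps the keys sorted, similar to the sorted keys a reducer sees.
    static Map<String, Integer> countWords(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : map(line)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample lines standing in for the real input files.
        List<String> lines = Arrays.asList("hello hadoop", "hello maven hello");
        countWords(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

With the sample lines above, the sketch prints hadoop 1, hello 3 and maven 1, one tab-separated pair per line.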
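After editing, upload the file to the node with Xftp.
Xftp is recommended for the upload because a Maven project has many nested directories and it is easy to put the file in the wrong place.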
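In the standard Maven layout, a source file lives under src/main/java followed by one directory per package segment. As a small illustration of where WordCount.java should end up inside the project directory, the sketch below derives that relative path from the package and class name; the helper name MavenSourcePath is hypothetical.

```java
// Derives the conventional Maven source path for a class; illustration only.
class MavenSourcePath {

    // Maven convention: src/main/java/<package with dots as slashes>/<Class>.java
    static String sourcePath(String packageName, String className) {
        return "src/main/java/" + packageName.replace('.', '/') + "/" + className + ".java";
    }

    public static void main(String[] args) {
        // Package and class name taken from the code above.
        System.out.println(sourcePath("org.example", "WordCount"));
    }
}
```

For this project the result is src/main/java/org/example/WordCount.java, relative to the project root that contains pom.xml.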
2.2 Edit the dependency file
Open pom.xml.
The figure below shows the initial content of the file.
Modify it as follows.
Change
<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
to
<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <maven.compiler.source>8</maven.compiler.source>
  <maven.compiler.target>8</maven.compiler.target>
  <java.version>1.8</java.version>
  <hadoop.version>3.1.3</hadoop.version>
  <log4j.version>1.2.14</log4j.version>
  <junit.version>4.8.2</junit.version>
</properties>
Delete the following block:
<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>3.8.1</version>
    <scope>test</scope>
  </dependency>
</dependencies>
Then copy the following into the file in its place:
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
  <dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>${log4j.version}</version>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>${junit.version}</version>
  </dependency>
</dependencies>
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <configuration>
        <archive>
          <manifest>
            <mainClass>org.example.WordCount</mainClass>
          </manifest>
        </archive>
      </configuration>
    </plugin>
  </plugins>
</build>
Note: all of the additions above must be placed before the closing </project> tag!
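3. Package the project
Go to the project root directory (the one that contains pom.xml), as shown in the figure.
Run the command: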
mvn package
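If the screen shown in the figure above appears, with Maven reporting BUILD SUCCESS, the packaging succeeded.
(If errors are reported, check and fix the program file and the dependency file according to the error messages.)
The project folder now contains a new target folder, and our jar is inside it.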
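The maven-jar-plugin section added to pom.xml asks Maven to record org.example.WordCount as the Main-Class in the jar manifest, which is why the jar can be run without naming a class. As a quick sanity check, something like the following sketch could read that entry back with the JDK's java.util.jar API. The class name JarMainClass and the default path target/WordCount-1.0.jar are assumptions for illustration; pass a different jar path as the first argument if yours differs.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Reads the Main-Class entry of a jar manifest; a sanity-check sketch only.
class JarMainClass {

    // Returns the Main-Class attribute, or null if the manifest has none.
    static String readMainClass(Manifest manifest) {
        if (manifest == null) {
            return null;
        }
        return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }

    // Opens the jar at the given path and reads Main-Class from its manifest.
    static String mainClassOfJar(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return readMainClass(jar.getManifest());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location, relative to the project root.
        String path = args.length > 0 ? args[0] : "target/WordCount-1.0.jar";
        if (!new File(path).isFile()) {
            System.out.println("Jar not found: " + path);
            return;
        }
        System.out.println("Main-Class: " + mainClassOfJar(path));
    }
}
```

If the printed value is null or differs from org.example.WordCount, the mainClass element in pom.xml is the first place to look.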
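4. Run and test
This part works the same way as before.
Start Hadoop
Start Yarn
Under the hadoop directory, create a test folder, and inside it create the input and output subdirectories. Note that a MapReduce job fails at startup if its output directory already exists, so /test/output must not exist in HDFS when the job is launched; remove it first if it does.
Put input1.txt and input2.txt into the input folder, then run: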
hadoop jar WordCount-1.0.jar /test/input /test/output
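Note: this command must be run from the directory that contains the jar!
The run output and the way to view the results are the same as before.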
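Because the job does not set an output format, Hadoop's default text output is used, so each line of the result file (typically named part-r-00000 under the output directory) has the form word, a tab, then the count. If the results need to be checked programmatically after copying them out of HDFS, a sketch along these lines could parse them; the class name WordCountOutputParser and the sample lines are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Parses "word<TAB>count" lines as written by the default text output format.
class WordCountOutputParser {

    // Returns the counts in the order the lines appear; blank lines are skipped.
    static Map<String, Integer> parse(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue;
            }
            // The count follows the last tab on the line.
            int tab = line.lastIndexOf('\t');
            if (tab < 0) {
                throw new IllegalArgumentException("Not a word<TAB>count line: " + line);
            }
            String word = line.substring(0, tab);
            int count = Integer.parseInt(line.substring(tab + 1).trim());
            counts.put(word, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical result lines; real ones would be read from the output file.
        List<String> sample = Arrays.asList("hadoop\t1", "hello\t3", "maven\t1");
        parse(sample).forEach((word, count) -> System.out.println(word + " -> " + count));
    }
}
```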