Software Preparation
1. Build the Source Package
(1) Extract the source package and edit the pom file in the project root directory
Update each component's version number to match the versions you are actually using. Here is my modified pom file:
<groupId>com.hortonworks</groupId>
<artifactId>shc</artifactId>
<version>spark-2.3.0-hbase-1.2.6</version>
<packaging>pom</packaging>
<properties>
<spark.version>2.3.0</spark.version>
<hbase.version>1.2.6</hbase.version>
<phoenix.version>4.14.0-HBase-1.2</phoenix.version>
<java.version>1.8</java.version>
</properties>
Notes:
- The above is only the part I modified; unchanged parts are not shown.
- Since I changed the version, the poms of the child modules must be updated to the same version.
Below are the modified pom files of the two child modules, core and examples; in both, only the version was changed:
<parent>
<groupId>com.hortonworks</groupId>
<artifactId>shc</artifactId>
<version>spark-2.3.0-hbase-1.2.6</version>
<relativePath>../pom.xml</relativePath>
</parent>
<artifactId>shc-core</artifactId>
<version>spark-2.3.0-hbase-1.2.6</version>
<packaging>jar</packaging>
<name>HBase Spark Connector Project Core</name>
<parent>
<groupId>com.hortonworks</groupId>
<artifactId>shc</artifactId>
<version>spark-2.3.0-hbase-1.2.6</version>
<relativePath>../pom.xml</relativePath>
</parent>
<artifactId>shc-examples</artifactId>
<version>spark-2.3.0-hbase-1.2.6</version>
<packaging>jar</packaging>
<name>HBase Spark Connector Project Examples</name>
(2) Build the source
Run the following mvn command in the root directory of the source package:
mvn install -DskipTests
Once the build succeeds, the project's jars are installed in your local Maven repository (under ~/.m2/repository/com/hortonworks/).
2. Create a Maven Project for Testing shc
(1) Create a new Maven project and add a dependency on the shc-core we just built
Note that only the shc-core dependency is needed:
<dependency>
<groupId>com.hortonworks</groupId>
<artifactId>shc-core</artifactId>
<version>spark-2.3.0-hbase-1.2.6</version>
</dependency>
(2) Add the Spark dependencies and resolve dependency conflicts
<!-- The following Spark dependencies exclude hadoop-client because it conflicts with the version pulled in by shc-core -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.3.0</version>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.3.0</version>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.3.0</version>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>2.3.0</version>
</dependency>
<!-- Manually add the hadoop-client dependency -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.7.6</version>
</dependency>
hadoop-client 2.7.6 is chosen here because Hadoop 2.7 is compatible with every HBase release; see the Hadoop/HBase version compatibility matrix on the official HBase website.
(3) Add the HBase dependencies and resolve dependency conflicts
Here it is enough to exclude the conflicting transitive dependencies:
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>1.2.6</version>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-mapreduce-client-core</artifactId>
</exclusion>
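<!-- Assumption on my part: netty-all pulled in by HBase clashes with the netty version Spark 2.3 ships, hence the exclusion below -->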
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>1.2.6</version>
<exclusions>
<exclusion>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
</exclusion>
</exclusions>
</dependency>
(4) Put hdfs-site.xml, core-site.xml, and hbase-site.xml in the project's resources directory, so the Spark program can pick up the cluster configuration from the classpath
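As a quick check that these files actually end up in the build output, here is a small sketch of my own (not part of the original setup) that prints where each file resolves from on the classpath:

object ConfigCheck {
  // Prints the classpath location of each cluster config file, or a warning if it is missing
  def main(args: Array[String]): Unit = {
    Seq("hdfs-site.xml", "core-site.xml", "hbase-site.xml").foreach { name =>
      val url = getClass.getResource("/" + name)
      println(s"$name -> ${Option(url).getOrElse("MISSING from classpath")}")
    }
  }
}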
(5) Miscellaneous
- I created this Maven project in IDEA and added the Scala SDK; I won't repeat those steps here.
- Adjusted the project layout, adding main and test source folders for Scala.
- Configured the Maven plugins, adding the Scala compiler plugin:
<build>
<plugins>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.4.2</version>
<executions>
<execution>
<id>scala-compile-first</id>
<phase>process-resources</phase>
<goals>
<goal>add-source</goal>
<goal>compile</goal>
</goals>
</execution>
</executions>
<configuration>
<scalaVersion>2.11.8</scalaVersion>
<recompileMode>incremental</recompileMode>
<useZincServer>true</useZincServer>
<args>
<arg>-unchecked</arg>
<arg>-deprecation</arg>
<arg>-feature</arg>
</args>
<javacArgs>
<javacArg>-source</javacArg>
<javacArg>1.8</javacArg>
<javacArg>-target</javacArg>
<javacArg>1.8</javacArg>
</javacArgs>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.0</version>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
<configuration>
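<!-- Skipped here because the scala-maven-plugin above already compiles both the Scala and Java sources -->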
<skip>true</skip>
<source>1.8</source>
<target>1.8</target>
<encoding>UTF-8</encoding>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-resources-plugin</artifactId>
<version>3.1.0</version>
<configuration>
<encoding>UTF-8</encoding>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.1</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<!-- Set the main class of the jar -->
<!--
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.bonc.rdpe.spark.hbase.SparkToHBase</mainClass>
</transformer>
</transformers>
-->
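<!-- Strip signature files from dependencies; otherwise the shaded jar can fail at runtime with "Invalid signature file digest for Manifest main attributes" -->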
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
With that, all the preparation work before testing shc is done!
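To confirm the whole setup end to end, below is a minimal write-then-read sketch against shc's DataFrame API as documented in the shc README. The table name shc_test, the column family cf1, and the TestRecord schema are placeholders of mine, not anything from this article; adjust them to your cluster before running.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

// Placeholder record type for the test; replace with your own schema
case class TestRecord(key: String, col1: String)

object ShcSmokeTest {

  // Catalog mapping DataFrame columns to HBase columns.
  // Table "shc_test" and column family "cf1" are assumptions, not from the article.
  val catalog: String =
    """{
      |"table":{"namespace":"default", "name":"shc_test"},
      |"rowkey":"key",
      |"columns":{
      |"key":{"cf":"rowkey", "col":"key", "type":"string"},
      |"col1":{"cf":"cf1", "col":"col1", "type":"string"}
      |}
      |}""".stripMargin

  def main(args: Array[String]): Unit = {
    // hbase-site.xml and friends are read from the classpath (the resources directory from step (4))
    val spark = SparkSession.builder()
      .appName("ShcSmokeTest")
      .master("local[*]") // for a quick IDE run; drop this when using spark-submit
      .getOrCreate()
    import spark.implicits._

    // Write two rows; HBaseTableCatalog.newTable lets shc create the table (with 5 regions) if it does not exist
    Seq(TestRecord("r1", "v1"), TestRecord("r2", "v2")).toDF()
      .write
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog, HBaseTableCatalog.newTable -> "5"))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .save()

    // Read the rows back through the same catalog
    spark.read
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load()
      .show()

    spark.stop()
  }
}

Run it directly from IDEA, or package the project with the shade plugin above and submit the jar with spark-submit.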