1. Install the JDK
1.1 JDK installation steps
- Download the JDK package (the .tar.gz archive for Linux)
https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html
- Update the Ubuntu package index
sudo apt-get update
- Extract the JDK archive into /usr/local/ on the Ubuntu system
sudo tar -zxvf jdk-8u251-linux-x64.tar.gz -C /usr/local/
- Rename the extracted directory to jdk8
cd /usr/local/
sudo mv jdk1.8.0_251/ jdk8
- Add the JDK to the environment variables
cd /home/tarena/
sudo gedit .bashrc
Append the following at the end of the file:
export JAVA_HOME=/usr/local/jdk8
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=.:$JAVA_HOME/bin:$PATH
source .bashrc
- Verify the installation
java -version
If a Java version such as java version "1.8.0_251" is printed, the JDK is installed and the environment variables are set correctly.
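As an optional sanity check (a minimal sketch, assuming the current shell has sourced the .bashrc edits above), confirm the variables resolve and the compiler is on the PATH:
echo $JAVA_HOME     # should print /usr/local/jdk8
which java          # should print /usr/local/jdk8/bin/java
javac -version      # should print javac 1.8.0_251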
2. Install Hadoop and configure pseudo-distributed mode
2.1 Hadoop installation and configuration steps
- Install SSH
sudo apt-get install ssh
- Configure passwordless SSH authentication to avoid permission problems when running Hadoop
ssh-keygen -t rsa (press Enter at every prompt)
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
ssh localhost (should now connect without asking for a password)
exit (leave the SSH session)
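If ssh localhost still prompts for a password, a common cause is over-permissive key file permissions, since sshd ignores authorized_keys unless the permissions are restrictive; a hedged fix:
chmod 700 ~/.ssh                   # directory must not be group/world writable
chmod 600 ~/.ssh/authorized_keys   # key file readable/writable by owner only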
- Download Hadoop 2.10 (374 MB)
https://archive.apache.org/dist/hadoop/common/hadoop-2.10.0/hadoop-2.10.0.tar.gz
Extract it into /usr/local, rename the directory to hadoop2.10, then set ownership:
sudo tar -zxvf hadoop-2.10.0.tar.gz -C /usr/local/
cd /usr/local
sudo mv hadoop-2.10.0/ hadoop2.10
sudo chown -R tarena hadoop2.10/
- Verify Hadoop
cd /usr/local/hadoop2.10/bin
./hadoop version (prints the Hadoop version)
- Set the JAVA_HOME environment variable
sudo gedit /usr/local/hadoop2.10/etc/hadoop/hadoop-env.sh
Change the original export JAVA_HOME=${JAVA_HOME} to
export JAVA_HOME=/usr/local/jdk8
- Set the Hadoop environment variables
sudo gedit /home/tarena/.bashrc
Append at the end:
export HADOOP_HOME=/usr/local/hadoop2.10
export CLASSPATH=.:${JAVA_HOME}/lib:${HADOOP_HOME}/lib:$CLASSPATH
export PATH=.:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
source /home/tarena/.bashrc
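After sourcing .bashrc, the hadoop command should resolve from any directory; a quick check, assuming the paths above:
cd ~
hadoop version    # should print Hadoop 2.10.0 without needing the full path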
- Pseudo-distributed configuration: modify two configuration files (core-site.xml and hdfs-site.xml)
- Edit core-site.xml
sudo gedit /usr/local/hadoop2.10/etc/hadoop/core-site.xml
Add the following content:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/local/hadoop2.10/tmp</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
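Here fs.defaultFS tells clients where to reach the NameNode RPC endpoint, and hadoop.tmp.dir is the base directory for the HDFS data created below. Once Hadoop is on the PATH, a configured key can be read back with hdfs getconf, for example:
hdfs getconf -confKey fs.defaultFS    # expected output: hdfs://localhost:9000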
- Edit hdfs-site.xml
sudo gedit /usr/local/hadoop2.10/etc/hadoop/hdfs-site.xml
Add the following content:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop2.10/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop2.10/tmp/dfs/data</value>
</property>
</configuration>
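The name and data directories do not have to exist in advance (the format step below creates the name directory), but pre-creating them under the already-chowned hadoop2.10 tree avoids permission surprises; an optional sketch, assuming the tarena user from earlier:
mkdir -p /usr/local/hadoop2.10/tmp/dfs/name /usr/local/hadoop2.10/tmp/dfs/data
sudo chown -R tarena /usr/local/hadoop2.10/tmp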
- Configure YARN (1): mapred-site.xml
cd /usr/local/hadoop2.10/etc/hadoop
cp mapred-site.xml.template mapred-site.xml
sudo gedit mapred-site.xml
Add the following configuration:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
- Configure YARN (2): yarn-site.xml
sudo gedit yarn-site.xml
Add the following configuration:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
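The mapreduce_shuffle value registers the shuffle handler as a NodeManager auxiliary service so that reducers can fetch map output. Once YARN is up (after the start step below), you can confirm the NodeManager registered:
yarn node -list    # should show one RUNNING node after startup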
- Format the NameNode
cd /usr/local/hadoop2.10/bin
./hdfs namenode -format
If the output contains "Storage directory /usr/local/hadoop2.10/tmp/dfs/name has been successfully formatted",
the format succeeded.
- Start all Hadoop components
cd /usr/local/hadoop2.10/sbin
./start-all.sh
Warnings may appear during startup; they can be ignored and do not affect normal use.
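Note that start-all.sh is deprecated in favor of starting HDFS and YARN separately; an equivalent sequence, if you prefer finer control:
./start-dfs.sh     # NameNode, DataNode, SecondaryNameNode
./start-yarn.sh    # ResourceManager, NodeManager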
- After a successful start, open the web UI to view NameNode and DataNode information and browse the files in HDFS
http://localhost:50070
- Check the Hadoop component processes
jps
You should see the following processes (web UI ports listed where applicable):
NameNode --- 50070
DataNode --- 50075
SecondaryNameNode --- 50090
ResourceManager --- 8088
NodeManager
- Test: upload a local file to HDFS
hadoop fs -put <any local file> /
hadoop fs -ls /
You can also browse it in the web UI via Utilities -> Browse the file system
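A concrete end-to-end example (test.txt is just an illustrative placeholder name):
echo "hello hdfs" > ~/test.txt
hadoop fs -put ~/test.txt /
hadoop fs -ls /               # test.txt should appear in the listing
hadoop fs -cat /test.txt      # should print: hello hdfs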
3. Install Hive
3.1 Detailed installation steps
- Download the Hive package (version 2.3.7)
http://us.mirrors.quenda.co/apache/hive/
- Extract it into /usr/local/
sudo tar -zxvf apache-hive-2.3.7-bin.tar.gz -C /usr/local
- Rename the directory
sudo mv /usr/local/apache-hive-2.3.7-bin /usr/local/hive2.3.7
- Set environment variables
sudo gedit /home/tarena/.bashrc
Append the following at the end:
export HIVE_HOME=/usr/local/hive2.3.7
export PATH=.:${HIVE_HOME}/bin:$PATH
- Reload the environment variables
source /home/tarena/.bashrc
- Download and add the MySQL Connector/J jar (8.0.19, Ubuntu Linux 18.04)
Download link: https://downloads.mysql.com/archives/c-j/
After extracting, locate mysql-connector-java-8.0.19.jar
and copy it to /usr/local/hive2.3.7/lib
sudo cp -p mysql-connector-java-8.0.19.jar /usr/local/hive2.3.7/lib/
- Create the hive-site.xml configuration file
sudo touch /usr/local/hive2.3.7/conf/hive-site.xml
sudo gedit /usr/local/hive2.3.7/conf/hive-site.xml
Add the following content:
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
</property>
</configuration>
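The ConnectionUserName/ConnectionPassword above must match a MySQL account that can log in with a password over TCP. On Ubuntu 18.04 the MySQL root account often authenticates via auth_socket rather than a password, which would make this JDBC login fail; one hedged workaround, assuming you really want root/123456 as configured above:
sudo mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '123456';"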
- Add the Hadoop path to the Hive configuration file
cd /usr/local/hive2.3.7/conf
sudo cp -p hive-env.sh.template hive-env.sh
sudo gedit /usr/local/hive2.3.7/conf/hive-env.sh
Add the following:
HADOOP_HOME=/usr/local/hadoop2.10
export HIVE_CONF_DIR=/usr/local/hive2.3.7/conf
- Initialize the Hive metastore schema
schematool -dbType mysql -initSchema
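On success, schematool reports that the schema initialization completed; you can also query the recorded schema version afterwards:
schematool -dbType mysql -info    # prints the Hive schema version stored in MySQL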
- Test Hive
hive
hive>show databases;
If the databases are listed normally, Hive is installed and configured successfully.
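A slightly fuller smoke test (test_db and t are arbitrary illustrative names):
hive> create database test_db;
hive> use test_db;
hive> create table t (id int);
hive> show tables;                      -- should list: t
hive> drop database test_db cascade;    -- clean up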