SparkSQL: Installing and Configuring the Hive Data Warehouse

I. Installation Overview

The plan is to use the SparkSQL component to read data from Hive. Building on the previous three articles, I already have Hadoop, Spark, and MySQL installed; all three are required for reading data on HDFS through SparkSQL. To set up the Hive data warehouse, I still need to download the Hive release and the MySQL JDBC driver.

II. MySQL Driver Download

  1. Download page: https://downloads.mysql.com/archives/c-j/
    My MySQL server is version 5.7.17, so any 5.x Connector/J release from the archive will do.
  2. Upload the driver archive to the hadoop user's home directory with xftp or xsecure (or with scp, as sketched below).
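    If you prefer the command line, scp does the same job; a minimal sketch, assuming the archive was downloaded to your local machine and the server address 172.16.100.26 used throughout this post:
$ scp mysql-connector-java-5.1.49.tar.gz hadoop@172.16.100.26:~/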

III. Hive Download

  1. The Hive release must be chosen to match your Hadoop version; check the compatibility notes on the official page: https://hive.apache.org/downloads.html
  2. Download page: https://dlcdn.apache.org/hive/. I chose version 2.3.9.
  3. Upload the Hive tarball to the hadoop user's home directory with xftp or xsecure (or scp, as in section II).

IV. Hive Installation

  1. Extract Hive into the hadoop user's home directory
[hadoop@hadoop01 ~]$ tar -zxvf apache-hive-2.3.9-bin.tar.gz
  2. Extract the MySQL connector archive into the hadoop user's home directory
[hadoop@hadoop01 ~]$ tar -zxvf mysql-connector-java-5.1.49.tar.gz
  3. Copy the MySQL driver jar into Hive's lib directory
[hadoop@hadoop01 ~]$ cp mysql-connector-java-5.1.49/mysql-connector-java-5.1.49.jar apache-hive-2.3.9-bin/lib/
  4. Edit hive-env.sh
[hadoop@hadoop01 ~]$ echo $HADOOP_HOME
/home/hadoop/hadoop-2.10.1
[hadoop@hadoop01 ~]$ cd apache-hive-2.3.9-bin/conf/
[hadoop@hadoop01 conf]$ cp hive-env.sh.template hive-env.sh
[hadoop@hadoop01 conf]$ vim hive-env.sh

HADOOP_HOME=/home/hadoop/hadoop-2.10.1
  5. Create hive-site.xml. hive-default.xml.template is the shipped example for hive-site.xml, but it carries far too many default settings to be practical, so I wrote the file from scratch:
[hadoop@hadoop01 conf]$ vim hive-site.xml

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://172.16.100.26:3306/hive?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>datanucleus.schema.autoCreateAll</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/usr/local/warehouse</value>
    </property>
    <property>
         <name>hive.metastore.uris</name>
         <value>thrift://172.16.100.26:9083</value>
    </property>
</configuration>
  6. Add the Hive environment variables to .bashrc
[hadoop@hadoop01 conf]$ vim ~/.bashrc

export JAVA_HOME=/usr/local/java
export HADOOP_HOME=/home/hadoop/hadoop-2.10.1
export SCALA_HOME=/home/hadoop/scala-2.12.2
export SPARK_HOME=/home/hadoop/spark-3.0.3-bin-hadoop2.7
export HIVE_HOME=/home/hadoop/apache-hive-2.3.9-bin
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SCALA_HOME/bin:$SPARK_HOME/bin:$HIVE_HOME/bin

[hadoop@hadoop01 conf]$ source ~/.bashrc
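    As an optional sanity check, confirm that the hive command now resolves from the updated PATH; it should print something like the path below:
[hadoop@hadoop01 conf]$ which hive
/home/hadoop/apache-hive-2.3.9-bin/bin/hive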
  7. Initialize the Hive metastore schema
[hadoop@hadoop01 lib]$ cd ../bin
[hadoop@hadoop01 bin]$ ./schematool -initSchema -dbType mysql
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apache-hive-2.3.9-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.10.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:    jdbc:mysql://172.16.100.26:3306/hive?createDatabaseIfNotExist=true
Metastore Connection Driver :    com.mysql.jdbc.Driver
Metastore connection User:   root
Mon Dec 06 09:17:23 CST 2021 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Mon Dec 06 09:17:23 CST 2021 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Initialization script completed
Mon Dec 06 09:17:24 CST 2021 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
schemaTool completed
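    Incidentally, the repeated SSL warnings above can be silenced in a development setup, as the warning itself suggests, by appending useSSL=false to the JDBC URL in hive-site.xml (inside the XML value the & must be escaped as &amp;):
        <!-- optional: suppresses the Connector/J SSL warnings in this dev setup -->
        <value>jdbc:mysql://172.16.100.26:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>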
  8. Log in to MySQL to check that the Hive metastore database was created from the hive-site.xml settings
[hadoop@hadoop01 lib]$ mysql -uhive -p123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 12
Server version: 5.7.36 MySQL Community Server (GPL)

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
mysql> use hive;
mysql> show tables;
+---------------------------+
| Tables_in_hive            |
+---------------------------+
| AUX_TABLE                 |
| BUCKETING_COLS            |
| CDS                       |
......
| TYPES                     |
| TYPE_FIELDS               |
| VERSION                   |
| WRITE_SET                 |
+---------------------------+
57 rows in set (0.01 sec)
  9. Start the Hadoop services
mysql> exit
Bye
[hadoop@hadoop01 lib]$ cd $HADOOP_HOME/sbin
[hadoop@hadoop01 sbin]$ ./stop-all.sh
[hadoop@hadoop01 sbin]$ ./start-dfs.sh
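    An optional quick check that HDFS came back up is jps; on this single-node setup the NameNode, DataNode, and SecondaryNameNode processes should all be listed:
[hadoop@hadoop01 sbin]$ jps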
  10. Start the HiveServer2 service
[hadoop@hadoop01 sbin]$ nohup hive --service hiveserver2 &
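    HiveServer2 takes a short while to start. An optional way to verify it is up is to connect with beeline, assuming the default Thrift port 10000 and no extra authentication configured; a successful connection drops you at a 0: jdbc:hive2://...> prompt:
[hadoop@hadoop01 sbin]$ beeline -u jdbc:hive2://172.16.100.26:10000 -n hadoop   # assumes the default HiveServer2 port 10000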
  11. Enter the Hive CLI
[hadoop@hadoop01 sbin]$ hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/java/jdk1.8.0_311/bin:/usr/local/mysql/bin:/root/bin:/home/hadoop/hadoop-2.10.1/bin:/home/hadoop/hadoop-2.10.1/sbin:/home/hadoop/scala-2.12.2/bin:/home/hadoop/spark-3.0.3-bin-hadoop2.7/bin:/home/hadoop/hadoop-2.10.1/bin:/home/hadoop/hadoop-2.10.1/sbin:/home/hadoop/scala-2.12.2/bin:/home/hadoop/spark-3.0.3-bin-hadoop2.7/bin:/home/hadoop/apache-hive-2.3.9-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/apache-hive-2.3.9-bin/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.10.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/home/hadoop/apache-hive-2.3.9-bin/lib/hive-common-2.3.9.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
  12. Test the Hive installation
hive> create database test;
OK
Time taken: 0.19 seconds
hive> show databases;
OK
default
test
Time taken: 0.017 seconds, Fetched: 2 row(s)
  13. Verify in the HDFS web UI: http://172.16.100.26:50070/explorer.html#/usr/local/warehouse (the same check from the Hive CLI is shown below)
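    The check can also be done without leaving the Hive CLI, since it can run HDFS commands directly; the test.db directory created in the previous step should be listed:
hive> dfs -ls /usr/local/warehouse;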
  14. Start the metastore service, so that Spark can communicate with Hive over the Thrift protocol
hive> exit
[hadoop@hadoop01 sbin]$ nohup hive --service metastore &
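    One quick way to confirm the metastore is listening on the port configured in hive.metastore.uris:
[hadoop@hadoop01 sbin]$ ss -lntp | grep 9083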
  15. Configure Spark so it can read the Hive databases: copy hive-site.xml into Spark's conf directory
[hadoop@hadoop01 sbin]$ cd $HIVE_HOME/conf
[hadoop@hadoop01 conf]$ cp hive-site.xml $SPARK_HOME/conf
  16. From the spark-sql interactive shell, check whether the Hive databases and tables are accessible (a further end-to-end check is sketched after the output)
[hadoop@hadoop01 conf]$ cd $SPARK_HOME/bin
[hadoop@hadoop01 bin]$ ./spark-sql 
...
spark-sql> show databases;
21/12/06 10:03:38 INFO CodeGenerator: Code generated in 154.101485 ms
21/12/06 10:03:38 INFO CodeGenerator: Code generated in 5.977354 ms
default
test
Time taken: 1.688 seconds, Fetched 2 row(s)
21/12/06 10:03:38 INFO SparkSQLCLIDriver: Time taken: 1.688 seconds, Fetched 2 row(s)
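    As an optional end-to-end check, you can create and query a table in the test database straight from spark-sql; the data should land under /usr/local/warehouse/test.db on HDFS. A minimal sketch, using a hypothetical throwaway table name t1:
spark-sql> -- t1 is just a temporary table for this check
spark-sql> create table test.t1 (id int);
spark-sql> insert into test.t1 values (1);
spark-sql> select * from test.t1;
spark-sql> drop table test.t1;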

V. Problems I Encountered

  1. I had already initialized the metastore database in MySQL when I decided to change some of the database settings in hive-site.xml. After the change, running ./schematool -initSchema -dbType mysql again failed.
    Solution: log in to MySQL as the root user, drop the database that was initialized earlier, and run the initialization again, as sketched below.
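    A minimal sketch of that recovery, assuming the metastore database is named hive as in my hive-site.xml:
[hadoop@hadoop01 ~]$ mysql -uroot -p
mysql> drop database hive;  -- the metastore database created by schematool
mysql> exit
[hadoop@hadoop01 ~]$ schematool -initSchema -dbType mysql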
  2. Logging in to MySQL with the password in plain text, i.e. written directly on the command line, triggers a warning that can be ignored:
    mysql: [Warning] Using a password on the command line interface can be insecure.
    The proper way is to let mysql prompt for the password:
[hadoop@hadoop01 ~]$ mysql -uhive -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 31
Server version: 5.7.36 MySQL Community Server (GPL)

Copyright (c) 2000, 2021, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 

That said, it is no big deal in a development environment.
