Hive Installation and Usage: Embedded Mode

1. Introduction to Hive

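What is Hive
- Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables and provides SQL-like querying.
- In essence, it translates SQL into MapReduce programs.
- Hive does not store data itself; it relies entirely on HDFS for storage and MapReduce for computation. A Hive table is purely logical: it is just table metadata laid over structured data files. HBase, by contrast, stores physical tables and is positioned as a NoSQL database.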
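Why use Hive
- Its interface uses SQL-like syntax, which enables rapid development.
- It removes the need to write MapReduce by hand, lowering the learning cost for developers.
- It is easy to extend.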
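Features of Hive
- Scalability
  The cluster behind Hive can be scaled freely, usually without restarting the service.
- Extensibility
  Hive supports user-defined functions, so users can implement functions to suit their own needs.
- Fault tolerance
  Hive has good fault tolerance: a SQL query can still complete when a node fails.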
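Hive run modes
- Embedded mode
  Metadata is stored in a local embedded Derby database. This is the simplest way to use Hive, but it has an obvious drawback: an embedded Derby database can be opened by only one process at a time, so multiple concurrent sessions are not supported.
- Local mode
  Metadata is stored in a separate database on the local machine (usually MySQL), which supports multiple sessions and multiple users.
- Remote mode
  This mode suits deployments with many Hive clients. The MySQL database is hosted independently and the metadata is stored in that remote MySQL service, which avoids the redundancy of installing MySQL on every client.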

2. Installation and Configuration

Install Hadoop first
Omitted here.

Download Hive
URL: http://hive.apache.org/downloads.html

[hadoop@master ~]$ wget http://www-eu.apache.org/dist/hive/stable-2/apache-hive-2.1.1-bin.tar.gz
[hadoop@master ~]$ tar -xvf apache-hive-2.1.1-bin.tar.gz 
[hadoop@master ~]$ cd apache-hive-2.1.1-bin
[hadoop@master apache-hive-2.1.1-bin]$ ls
bin  conf  examples  hcatalog  jdbc  lib  LICENSE  NOTICE  README.txt  RELEASE_NOTES.txt  scripts
[hadoop@master apache-hive-2.1.1-bin]$ pwd
/home/hadoop/apache-hive-2.1.1-bin

Set environment variables

[hadoop@master apache-hive-2.1.1-bin]$ vim ~/.bash_profile 
export HIVE_HOME=/home/hadoop/apache-hive-2.1.1-bin
export PATH=$HIVE_HOME/bin:$PATH
[hadoop@master apache-hive-2.1.1-bin]$ . ~/.bash_profile
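
After sourcing the profile, it can help to confirm that the variables took effect before moving on. The following is a minimal sketch; `check_hive_env` is a hypothetical helper name introduced here, not something that ships with Hive.

```shell
# Sketch: sanity-check the Hive environment variables set in ~/.bash_profile.
# check_hive_env is a hypothetical helper, not part of Hive.
check_hive_env() {
  if [ -z "${HIVE_HOME:-}" ]; then
    echo "HIVE_HOME is not set"
    return 1
  fi
  if [ ! -d "$HIVE_HOME/bin" ]; then
    echo "directory not found: $HIVE_HOME/bin"
    return 1
  fi
  # Look for $HIVE_HOME/bin as a whole PATH entry.
  case ":$PATH:" in
    *":$HIVE_HOME/bin:"*) echo "ok: $HIVE_HOME/bin is on PATH" ;;
    *) echo "$HIVE_HOME/bin is not on PATH"; return 1 ;;
  esac
}

check_hive_env || echo "fix ~/.bash_profile and source it again"
```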

Embedded mode

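Edit the Hive configuration file
$HIVE_HOME/conf is Hive's configuration directory, and hive-site.xml in that directory is Hive's configuration file. It does not exist by default, so create it by copying the template: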

[hadoop@master conf]$ cp hive-default.xml.template hive-site.xml

The main settings in hive-site.xml are:

  <!-- Sets Hive's data storage directory; the default location is /user/hive/warehouse on HDFS. -->
 <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
    <description>location of default database for the warehouse</description>
  </property>
  <!-- Sets Hive's temporary (scratch) directory; the default location is /tmp/hive on HDFS. -->
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>
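
Because the copied template is very long, it can be handy to read a single property back from the file rather than scrolling through it. The sketch below uses a hypothetical helper, `get_hive_prop`, and is a line-based approximation rather than a real XML parser: it assumes `<name>` and `<value>` sit on separate lines, as they do in the excerpt above. A small sample file stands in for the real hive-site.xml.

```shell
# get_hive_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Hypothetical helper; assumes <name> and <value> are on separate lines.
get_hive_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")
      sub(/<\/value>.*/, "")
      print
      exit
    }
  ' "$1"
}

# Demonstration on a small sample file (a stand-in for hive-site.xml).
site=$(mktemp)
cat > "$site" <<'EOF'
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
  </property>
</configuration>
EOF

get_hive_prop "$site" hive.metastore.warehouse.dir   # prints /user/hive/warehouse
get_hive_prop "$site" hive.exec.scratchdir           # prints /tmp/hive
```

Against the real file, the call would be `get_hive_prop "$HIVE_HOME/conf/hive-site.xml" hive.exec.scratchdir`.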

Edit conf/hive-env.sh under the Hive directory

[hadoop@master conf]$ cp hive-env.sh.template hive-env.sh
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/hadoop/hadoop-2.7.3

# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/home/hadoop/apache-hive-2.1.1-bin/conf

# Folder containing extra ibraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/home/hadoop/apache-hive-2.1.1-bin/lib

Create the required directories

[hadoop@master ~]$ hdfs dfs -ls /
Found 3 items
drwx------   - hadoop supergroup          0 2017-04-06 18:01 /tmp
drwxr-xr-x   - hadoop supergroup          0 2017-04-06 17:58 /user
drwxr-xr-x   - hadoop supergroup          0 2017-04-06 17:58 /usr
[hadoop@master ~]$ hdfs dfs -ls /user
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2017-04-08 11:00 /user/hadoop
#Create the directories
[hadoop@master ~]$ hdfs dfs -mkdir -p /user/hive/warehouse
[hadoop@master ~]$ hdfs dfs -mkdir -p /tmp/hive
#Grant write permission
[hadoop@master ~]$ hdfs dfs -chmod a+w /tmp/hive
[hadoop@master ~]$ hdfs dfs -chmod a+w /user/hive/warehouse
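
The same steps can be wrapped in a small script so they are safe to re-run. This is only a sketch: `ensure_hdfs_dir` is a hypothetical helper name, and the script assumes the `hdfs` client is on PATH and HDFS is running. It uses the same `a+w` permission as the commands above.

```shell
# ensure_hdfs_dir PATH: create PATH on HDFS if it is missing, then make it writable.
# Hypothetical helper; relies only on hdfs dfs -test, -mkdir and -chmod.
ensure_hdfs_dir() {
  if hdfs dfs -test -d "$1"; then
    echo "exists: $1"
  else
    hdfs dfs -mkdir -p "$1" && echo "created: $1"
  fi
  hdfs dfs -chmod a+w "$1"
}

if command -v hdfs >/dev/null 2>&1; then
  ensure_hdfs_dir /user/hive/warehouse
  ensure_hdfs_dir /tmp/hive
else
  echo "hdfs command not found; run this on a Hadoop node"
fi
```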

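Change the io.tmpdir path
You also need to change every value in hive-site.xml that contains ${system:java.io.tmpdir}. Create a directory of your own to replace it, for example /home/hadoop/apache-hive-2.1.1-bin/iotmp, then use vim's global substitute command to replace all occurrences.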

#This is a local path, not an HDFS path
[hadoop@master conf]$ mkdir /home/hadoop/apache-hive-2.1.1-bin/iotmp
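#vim global substitution (the replacement path must match the directory created above)
:%s#${system:java.io.tmpdir}#/home/hadoop/apache-hive-2.1.1-bin/iotmp#g
#Also delete the "system:" prefix in values like the following, so that ${system:user.name} becomes ${user.name}
 ${system:java.io.tmpdir}/${system:user.name}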
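
If you prefer not to edit the file interactively, the same two substitutions can be done with sed. The sketch below runs against a two-line sample file so it is safe to try; pointing `SITE` at `$HIVE_HOME/conf/hive-site.xml` is left to the reader. `IOTMP` is assumed to be the local directory created above, and `-i.bak` keeps a backup copy next to the edited file.

```shell
# Local directory that replaces ${system:java.io.tmpdir} (created earlier with mkdir).
IOTMP=/home/hadoop/apache-hive-2.1.1-bin/iotmp

# Sample file standing in for hive-site.xml.
SITE=$(mktemp)
cat > "$SITE" <<'EOF'
<value>${system:java.io.tmpdir}/${system:user.name}</value>
<value>${system:java.io.tmpdir}/${hive.session.id}_resources</value>
EOF

# 1) replace the tmpdir placeholder with the local path
# 2) drop the "system:" prefix from ${system:user.name}
sed -i.bak \
  -e 's|\${system:java\.io\.tmpdir}|'"$IOTMP"'|g' \
  -e 's|\${system:user\.name}|${user.name}|g' \
  "$SITE"

cat "$SITE"
# <value>/home/hadoop/apache-hive-2.1.1-bin/iotmp/${user.name}</value>
# <value>/home/hadoop/apache-hive-2.1.1-bin/iotmp/${hive.session.id}_resources</value>
```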

3. Running Hive

Initialization

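#Start the metastore service first (with embedded Derby the metastore also runs inside the Hive CLI process, so this step can usually be skipped)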
[hadoop@master apache-hive-2.1.1-bin]$ hive --service metastore
#Initialize the Derby schema
[hadoop@master apache-hive-2.1.1-bin]$ schematool -initSchema -dbType derby
#Start hive
[hadoop@master apache-hive-2.1.1-bin]$ hive
hive>

To re-initialize Derby, first delete the metastore_db directory:

[hadoop@master apache-hive-2.1.1-bin]$ rm -rf metastore_db/
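
With the default embedded Derby settings, `metastore_db` is created in whatever directory the command was started from, which is why the transcript above always works from the same directory. A guarded wrapper can make that explicit. This is a sketch only; `init_derby_metastore` is a hypothetical helper name.

```shell
# init_derby_metastore DIR: initialize the embedded Derby metastore in DIR
# unless a metastore_db directory is already there. Hypothetical helper.
init_derby_metastore() {
  if [ -d "$1/metastore_db" ]; then
    echo "metastore_db already exists in $1; remove it to re-initialize"
    return 1
  fi
  if ! command -v schematool >/dev/null 2>&1; then
    echo "schematool not found; is \$HIVE_HOME/bin on PATH?"
    return 1
  fi
  # Run from DIR so that metastore_db is created there.
  (cd "$1" && schematool -initSchema -dbType derby)
}

init_derby_metastore "${HIVE_HOME:-.}" || true
```

Start `hive` from the same directory afterwards, otherwise Derby will look for `metastore_db` somewhere else.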

Create a database

hive> create database db_hive_test;
OK
Time taken: 0.282 seconds

Switch to the new database and list databases

hive> use db_hive_test;
OK
Time taken: 0.016 seconds
hive> show databases;
OK
db_hive_test
default
Time taken: 0.013 seconds, Fetched: 2 row(s)

Create a test table

hive> create table student(id int,name string) row format delimited fields terminated by '\t';
OK
Time taken: 0.4 seconds
hive> desc student;
OK
id                      int                                         
name                    string                                      
Time taken: 0.052 seconds, Fetched: 2 row(s)
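
The statements typed at the `hive>` prompt so far can also be kept in a script file and run non-interactively with `hive -f`. The sketch below writes such a file; the temporary file location and the added `IF NOT EXISTS` guards (which make the script safe to re-run) are choices made here, not part of the original session.

```shell
# Collect the DDL from the interactive session into a script file.
HQL=$(mktemp)
cat > "$HQL" <<'EOF'
CREATE DATABASE IF NOT EXISTS db_hive_test;
USE db_hive_test;
CREATE TABLE IF NOT EXISTS student (id INT, name STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
DESC student;
EOF

# Run it with the Hive CLI if available; otherwise just report where it is.
if command -v hive >/dev/null 2>&1; then
  hive -f "$HQL"
else
  echo "hive not found; script written to $HQL"
fi
```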

Load local data into the Hive test table

#First create the test file student.txt locally
[hadoop@master hive]$ cat student.txt 
1   zhangsan
2   baiqio
333 aaadf
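#Load the test file into the Hive table (as the errors show, ~ is not expanded and a relative path is resolved against the directory where hive was started)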
hive> load data local inpath '~/hive/student.txt' into table db_hive_test.student;
FAILED: SemanticException Line 1:23 Invalid path ''~/hive/student.txt'': No files matching path file:/home/hadoop/apache-hive-2.1.1-bin/~/hive/student.txt
hive> load data local inpath 'hive/student.txt' into table db_hive_test.student;
FAILED: SemanticException Line 1:23 Invalid path ''hive/student.txt'': No files matching path file:/home/hadoop/apache-hive-2.1.1-bin/hive/student.txt
hive> load data local inpath '../hive/student.txt' into table db_hive_test.student;
Loading data to table db_hive_test.student
OK
Time taken: 0.675 seconds
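
Two things commonly go wrong at this step: the file is separated by spaces rather than real tabs, and the path in `load data local inpath` does not resolve. The sketch below creates the file with explicit tabs, checks it, and prints a statement with an absolute path. The temporary directory is a stand-in for wherever you keep the file.

```shell
# Create student.txt with real tab characters between the two columns.
dir=$(mktemp -d)
printf '1\tzhangsan\n2\tbaiqio\n333\taaadf\n' > "$dir/student.txt"

# Count lines that do not have exactly two tab-separated fields.
bad=$(awk -F'\t' 'NF != 2 { n++ } END { print n + 0 }' "$dir/student.txt")
echo "malformed lines: $bad"

# Byte count of the file.
bytes=$(wc -c < "$dir/student.txt")
echo "bytes: $((bytes))"

# An absolute path avoids the "No files matching path" errors shown above.
echo "load data local inpath '$dir/student.txt' into table db_hive_test.student;"
```

With tabs as separators the file is 30 bytes, which matches the size HDFS reports for student.txt in the warehouse directory later in this walkthrough.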

Query the student table

hive> select * from student;
OK
1   zhangsan
2   baiqio
333 aaadf
Time taken: 1.006 seconds, Fetched: 3 row(s)
hive> select * from student where id=1;
OK
1   zhangsan
Time taken: 0.445 seconds, Fetched: 1 row(s)

Where the local file student.txt is stored on HDFS

[hadoop@master hive]$ hdfs dfs -ls /user/hive/warehouse
Found 1 items
drwxrwxrwx   - hadoop supergroup          0 2017-04-08 14:37 /user/hive/warehouse/db_hive_test.db
[hadoop@master hive]$ hdfs dfs -ls /user/hive/warehouse/db_hive_test.db
Found 1 items
drwxrwxrwx   - hadoop supergroup          0 2017-04-08 14:47 /user/hive/warehouse/db_hive_test.db/student
[hadoop@master hive]$ hdfs dfs -ls /user/hive/warehouse/db_hive_test.db/student
Found 1 items
-rwxrwxrwx   2 hadoop supergroup         30 2017-04-08 14:47 /user/hive/warehouse/db_hive_test.db/student/student.txt

Load data into Hive from a file on HDFS

#Upload the file first
[hadoop@master hive]$ cat student.txt 
4   zhangsan
5   baiqio
6   aaadf
[hadoop@master hive]$ hdfs dfs -mkdir hive
[hadoop@master hive]$ hdfs dfs -put student.txt hive/student.txt2
#Then load the data
hive> load data inpath 'hive/student.txt2' into table student;
Loading data to table db_hive_test.student
OK
Time taken: 0.324 seconds
hive> select * from student;
OK
1   zhangsan
2   baiqio
333 aaadf
4   zhangsan
5   baiqio
6   aaadf
Time taken: 0.21 seconds, Fetched: 6 row(s)

Export Hive query results to a file

#Note: the first line has no semicolon
hive> insert overwrite local directory '/home/wyp/Documents/result'
    > select * from test;
#You can also export to the HDFS file system
hive> insert overwrite directory '/home/wyp/Documents/result'
    > select * from test;
#It is best to specify the column delimiter
hive> insert overwrite local directory '/home/wyp/Documents/result'
    > row format delimited
    > fields terminated by '\t'
    > select * from test;
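
The reason a delimiter is worth specifying is that, without one, Hive separates columns in the exported text files with the Ctrl-A character (octal 001), which is awkward to read. If an export was already written that way, it can be converted afterwards. The sketch below fabricates a small file in that format purely for demonstration; the file name `000000_0` and the sample rows are stand-ins, not real export output.

```shell
# Stand-in for an export directory written without "row format delimited".
out=$(mktemp -d)
printf '1\001zhangsan\n2\001baiqio\n' > "$out/000000_0"

# Concatenate the part files and turn Ctrl-A separators into tabs.
cat "$out"/* | tr '\001' '\t' > "$out.tsv"

cat "$out.tsv"
```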

Once the data has been abstracted into database tables, querying and aggregating it becomes very convenient.

© Copyright belongs to the author. For reprints or content partnerships, please contact the author.
