Installing Hive
1. Extract apache-hive-1.2.1-bin.tar.gz into the target directory and rename the extracted directory to hive. In Hive's conf directory, rename hive-env.sh.template to hive-env.sh: mv hive-env.sh.template hive-env.sh
2. Configure hive-env.sh: set the HADOOP_HOME path and the HIVE_CONF_DIR path.
3. Before starting Hive, make sure Hadoop is already running.
4. Basic operations:
  Start Hive: bin/hive
  List the databases: show databases;
  Switch to the default database: use default;
  List the tables in the default database: show tables;
  Create a table: create table student(id int, name string);
  Show the structure of the table: desc student;
  Insert data into the table: insert into student values(1000,"ss");
  Query the data in the table: select * from student;
  Exit Hive: quit;
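Steps 1 and 2 above can be sketched as a small shell function. This is only an illustration: the function name install_hive and its three arguments are hypothetical, and it assumes the archive unpacks into a directory named after the tarball (apache-hive-1.2.1-bin).

```shell
# Sketch of steps 1 and 2. All arguments are hypothetical, reader-chosen paths.
install_hive() {
    local tarball="$1"      # e.g. apache-hive-1.2.1-bin.tar.gz
    local dest="$2"         # directory to extract into
    local hadoop_home="$3"  # existing Hadoop installation directory
    local name
    name="$(basename "$tarball" .tar.gz)"   # assumed name of the extracted directory

    mkdir -p "$dest" || return 1
    tar -xzf "$tarball" -C "$dest" || return 1
    # Step 1: rename the extracted directory to "hive" and create hive-env.sh.
    mv "$dest/$name" "$dest/hive" || return 1
    mv "$dest/hive/conf/hive-env.sh.template" "$dest/hive/conf/hive-env.sh" || return 1
    # Step 2: point Hive at Hadoop and at its own conf directory.
    {
        echo "export HADOOP_HOME=$hadoop_home"
        echo "export HIVE_CONF_DIR=$dest/hive/conf"
    } >> "$dest/hive/conf/hive-env.sh"
}
```

A call might look like install_hive apache-hive-1.2.1-bin.tar.gz /opt/module /opt/module/hadoop, where both /opt/module paths are placeholders for your own layout.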
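Installing MySQL
1. Switch to the root account and check whether MySQL is already installed. If it is, uninstall it.
2. If it is not installed, remove MariaDB before installing MySQL. List the installed MariaDB rpm packages with rpm -qa | grep mariadb, then force-remove each one with rpm -e --nodeps followed by the package name.
3. Install the MySQL server: rpm -ivh MySQL-server-5.6.24-1.el6.x86_64.rpm
  Show the generated random password: cat /root/.mysql_secret
  Check the MySQL status: service mysql status
  Start MySQL: service mysql start
4. Install the MySQL client: rpm -ivh MySQL-client-5.6.24-1.el6.x86_64.rpm
  Connect to MySQL: mysql -uroot -p, followed by the random password generated earlier
  Change the password: SET PASSWORD=PASSWORD('xxxxxx');
  Exit MySQL: exit
5. Configure the host entries in MySQL's user table so that the root user, with its password, can log in to MySQL from any host.
  1. Log in to MySQL: mysql -uroot -p, followed by the password
  2. List the databases: show databases;
  3. Switch to the mysql database: use mysql;
  4. List all tables in the mysql database: show tables;
  5. Show the structure of the user table: desc user;
  6. Query the user table: select User, Host, Password from user;
  7. Update the user table, changing the Host value to %: update user set host='%' where host='localhost';
  8. Delete the root user's other Host entries, running the statement once for each remaining Host value returned by the query in step 6, with that value between the quotes: delete from user where Host='';
  9. Reload the privileges: flush privileges;
  10. Exit: quit;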
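The sub-steps of step 5 can also be collected into a SQL script and run in one go. The sketch below is an illustration only: the function name write_remote_root_sql and the output file name are hypothetical, and the single delete statement is a swapped-in alternative to the per-host deletes above, not what the steps themselves use.

```shell
# Hypothetical helper: writes the statements from step 5 to a SQL file.
write_remote_root_sql() {
    local out="$1"   # path of the SQL file to create
    cat > "$out" <<'EOF'
use mysql;
-- Let root connect from any host.
update user set host='%' where host='localhost';
-- Assumption: remove every other root entry in one statement
-- instead of naming each remaining host individually.
delete from user where User='root' and Host<>'%';
flush privileges;
EOF
}
```

After write_remote_root_sql remote_root.sql, the script could be applied with mysql -uroot -p < remote_root.sql. Review the user table first (sub-step 6) to confirm which rows the delete would touch.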
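6. Store the Hive metadata in MySQL
Extract the MySQL JDBC driver package and copy the driver jar into Hive's lib directory. Log out of the root account and change the owner of the driver jar to your regular user.
7. Configure the Metastore to use MySQL. In the conf directory, create hive-site.xml (vi hive-site.xml) and copy in the parameters given in the official hive-site documentation.
The contents are as follows: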
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://hostname:port/metastore?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>MySQL username</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>MySQL password</value>
  <description>password to use against metastore database</description>
</property>
</configuration>
After this configuration, if Hive fails to start, try rebooting the virtual machine. (After the reboot, remember to start the Hadoop cluster again.)
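As an alternative to editing the file by hand in vi, the hive-site.xml from step 7 can be generated from a few variables. This is a minimal sketch: the function name write_hive_site and its parameters are hypothetical, and it assumes the values contain no XML special characters such as & or <.

```shell
# Hypothetical helper: writes a hive-site.xml for a MySQL-backed metastore.
# Assumes host, user and password contain no XML special characters.
write_hive_site() {
    local conf_dir="$1" host="$2" port="$3" user="$4" password="$5"
    cat > "$conf_dir/hive-site.xml" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://$host:$port/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>$user</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>$password</value>
</property>
</configuration>
EOF
}
```

For example, write_hive_site /opt/module/hive/conf db.example.com 3306 root secret, where every argument is a placeholder; 3306 is MySQL's default port.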
8. Display settings for query output. Add the following to hive-site.xml to show the current database in the prompt and to print column headers in query results.
<property>
  <name>hive.cli.print.header</name>
  <value>true</value>
</property>
<property>
  <name>hive.cli.print.current.db</name>
  <value>true</value>
</property>
To turn off metastore schema verification, add the following to hive-site.xml:
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
</property>
3. Hive execution engine: Tez
1. Extract apache-tez-0.9.1-bin.tar.gz and rename the directory: mv apache-tez-0.9.1-bin/ tez-0.9.1
2. Configure Tez in Hive. In hive-env.sh, add the Tez environment variable and the dependency-jar environment variables:
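# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/path/to/hadoop        # your Hadoop installation directory
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=/path/to/hive/conf   # your Hive conf directory
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export TEZ_HOME=/path/to/tez-0.9.1        # the directory where you extracted Tez
export TEZ_JARS=""
# Collect the jars in the top level of TEZ_HOME.
for jar in `ls $TEZ_HOME |grep jar`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/$jar
done
# Collect everything in TEZ_HOME/lib.
for jar in `ls $TEZ_HOME/lib`; do
    export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/lib/$jar
done
# Replace /path/to/lzo-jar with the location of the LZO jar in Hadoop's common directory.
# TEZ_JARS already starts with ":", so it is appended directly.
export HIVE_AUX_JARS_PATH=/path/to/lzo-jar$TEZ_JARS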
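To see what the two loops above produce, the jar collection can be pulled into a standalone function and tried on any directory. This sketch uses shell globbing instead of parsing ls output, so unlike the snippet above it only picks up files ending in .jar; the function name build_tez_jars is hypothetical.

```shell
# Hypothetical helper: prints a colon-prefixed list of the jars found
# directly under the given directory and under its lib subdirectory.
build_tez_jars() {
    local tez_home="$1"
    local jars=""
    local jar
    for jar in "$tez_home"/*.jar "$tez_home"/lib/*.jar; do
        # When a pattern matches nothing it is left unexpanded; skip it.
        [ -e "$jar" ] || continue
        jars="$jars:$jar"
    done
    printf '%s\n' "$jars"
}
```

In hive-env.sh this could be used as export TEZ_JARS="$(build_tez_jars "$TEZ_HOME")". The leading colon is kept on purpose so the result can be appended directly after the LZO jar path in HIVE_AUX_JARS_PATH.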
3. Add the following to hive-site.xml to switch Hive's execution engine:
<property>
  <name>hive.execution.engine</name>
  <value>tez</value>
</property>
4. Configure Tez: in Hive's conf directory, create a tez-site.xml file with the following contents:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>tez.lib.uris</name>
  <value>${fs.defaultFS}/tez/tez-0.9.1,${fs.defaultFS}/tez/tez-0.9.1/lib</value>
</property>
<property>
  <name>tez.lib.uris.classpath</name>
  <value>${fs.defaultFS}/tez/tez-0.9.1,${fs.defaultFS}/tez/tez-0.9.1/lib</value>
</property>
<property>
  <name>tez.use.cluster.hadoop-libs</name>
  <value>true</value>
</property>
<property>
  <name>tez.history.logging.service.class</name>
  <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
</configuration>
5. Upload Tez to the cluster, at the HDFS location referenced by tez.lib.uris (/tez/tez-0.9.1).
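The upload in step 5 is not spelled out, so the following is a sketch under assumptions: it uses the standard hadoop fs -mkdir and hadoop fs -put commands and targets /tez so that the result matches the tez.lib.uris value in tez-site.xml. The function name upload_tez and its optional second argument (a command prefix, used for dry runs) are hypothetical.

```shell
# Hypothetical helper: uploads the extracted Tez directory to /tez on HDFS.
# Pass "echo" as the second argument to print the commands instead of running them.
upload_tez() {
    local tez_dir="$1"   # local extracted directory, e.g. /path/to/tez-0.9.1
    local run="${2:-}"   # optional command prefix for a dry run
    # Create the parent directory, then copy the whole tez-0.9.1 directory into it.
    $run hadoop fs -mkdir -p /tez &&
    $run hadoop fs -put "$tez_dir" /tez
}
```

Running upload_tez /path/to/tez-0.9.1 echo first shows the two commands without touching the cluster; dropping the second argument performs the upload, after which hadoop fs -ls /tez should list tez-0.9.1.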
6. Test: start Hive, create a table, and insert data into it. If no error is reported, the setup works. If the following error is reported:
The NodeManager is killing your container. It sounds like you are trying to use hadoop streaming which is running as a child process of the map-reduce task. The NodeManager monitors the entire process tree of the task and if it eats up more memory than the maximum set in mapreduce.map.memory.mb or mapreduce.reduce.memory.mb respectively, we would expect the Nodemanager to kill the task, otherwise your task is stealing memory belonging to other containers, which you don't want.
Fix: edit Hadoop's yarn-site.xml and add:
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
This turns off the NodeManager's virtual memory check.