Hive cannot read data in subdirectories by default
After migrating our data from HDP to CDH, we found that for some Hive tables partitioned by day, the day partition directories themselves contained subdirectories such as 1/ or 2/. SQL queries filtered on the partition column returned no data at all. This can be fixed with the settings
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;
which ensure that Hive queries can read data sitting in subdirectories below the partition directories.
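As a sketch, the two settings can also be applied per session right before the query; the table and partition names here are hypothetical, not from the migration above:

```sql
-- Enable recursive input listing for this session only
SET hive.mapred.supports.subdirectories=true;
SET mapred.input.dir.recursive=true;

-- Query a day partition whose data lives in subdirectories
-- such as .../day=20171124/1/ (hypothetical table name)
SELECT COUNT(*)
FROM user_behavior_log
WHERE day = '20171124';
```

Setting them in hive-site.xml instead makes the behavior permanent for all sessions.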
Sentry-related permissions
An article on installing Sentry:
http://blog.xiaoxiaomo.com/2016/10/19/Sentry-%E9%80%9A%E8%BF%87Cloudera-Manager%E9%85%8D%E7%BD%AESentry/
We also use Sentry to manage permissions in Hadoop. The Sentry integration settings for HDFS, Hive, and Hue need to be enabled, after which Hive table and HDFS permissions can be configured through Hue. However, if permissions on the Linux file system are not granted as well, loading a local Linux file fails with an error like:
The required privileges: Server=server1->URI=file:///opt/logfiles/userlogsup/muserbehaviorlog/1020210137/20171124/201711241200.txt->action=*;
In Hue, choose the URI scope, manually enter file:///opt/logfiles/,
and grant ALL on it; this resolves the error above. So besides Hive tables and HDFS files, Sentry can also manage file paths on the Linux machines themselves.
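The equivalent grant can also be issued directly from Hive/beeline as a Sentry admin; this is a sketch, and the role name etl_role is hypothetical:

```sql
-- Grant ALL on the local-file URI so LOAD DATA LOCAL INPATH
-- from under /opt/logfiles/ is allowed (hypothetical role name)
GRANT ALL ON URI 'file:///opt/logfiles/' TO ROLE etl_role;
```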
Spark2 jar not found
After installing Spark2 on CDH, running hive produces the error:
ls: cannot access /opt/cloudera/parcels/SPARK2/lib/spark2/lib/spark-assembly-*.jar: No such file or directory
This is because Spark2 moved its libraries from lib to jars. Edit the hive startup script and change lib/spark-assembly-*.jar
to jars/*
and the error no longer appears on startup.
HiveServer2 deadlock
While importing data today, the import went wrong and I killed the importing process directly. Afterwards, attempts to drop or query the affected table simply hung. The logs showed a deadlock. This can be resolved by editing hive-site.xml:
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
Change the value to false to disable lock support, which resolves the problem.
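Before disabling concurrency outright, the stale lock left by the killed import can sometimes be inspected and released from the Hive CLI. A sketch, assuming hive.support.concurrency=true with the ZooKeeper lock manager; the table name is hypothetical:

```sql
-- List current locks on the stuck table (hypothetical name)
SHOW LOCKS muserbehaviorlog;

-- Explicitly release the stale lock so DROP/SELECT can proceed
UNLOCK TABLE muserbehaviorlog;
```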
ArrayIndexOutOfBoundsException when querying Hive
A Hive query failed with:
http://cdh-master-244:8088/taskdetails.jsp?jobid=job_1511834313005_0092&tipid=task_1511834313005_0092_m_000000
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 8 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 9
at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluateLong(ConstantVectorExpression.java:99)
at org.apache.hadoop.hive.ql.exec.vector.expressions.ConstantVectorExpression.evaluate(ConstantVectorExpression.java:147)
at org.apache.hadoop.hive.ql.exec.vector.expressions.aggregates.VectorUDAFCount.aggregateInputSelection(VectorUDAFCount.java:96)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeBase.processAggregators(VectorGroupByOperator.java:148)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:322)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:866)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.processOp(VectorFilterOperator.java:111)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:98)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
... 9 more
This can be worked around by adding the setting
set hive.vectorized.execution.enabled=false;
which disables vectorized execution. The related issue is at
https://issues.apache.org/jira/browse/HIVE-11933