Ambari 2.7.8 Hive on Spark configuration troubleshooting


Since Ambari's default execution engine is Tez, the recommendation is to just use Tez.

The Hive on Spark integration bundled with Ambari is not particularly well adapted; this post records the problems encountered and how they were solved.


Versions

Hadoop 3.1.1

Hive 3.1.0

Spark2 2.3.0


Consolidated fixes

# Fix for Hive failing immediately when the spark engine is selected: copy the spark jars into the Hive lib directory so Hive can use the spark engine
cp /usr/hdp/current/spark2-client/jars/spark-core_*.jar /usr/hdp/current/hive-server2-hive/lib/
cp /usr/hdp/current/spark2-client/jars/scala-library*.jar /usr/hdp/current/hive-server2-hive/lib/
cp /usr/hdp/current/spark2-client/jars/spark-network-common*.jar /usr/hdp/current/hive-server2-hive/lib/
cp /usr/hdp/current/spark2-client/jars/spark-unsafe*.jar /usr/hdp/current/hive-server2-hive/lib/

cp /usr/hdp/current/spark2-client/jars/scala-reflect-*.jar /usr/hdp/current/hive-server2-hive/lib/
cp /usr/hdp/current/spark2-client/jars/spark-launcher*.jar /usr/hdp/current/hive-server2-hive/lib/
cp /usr/hdp/current/spark2-client/jars/spark-yarn*.jar /usr/hdp/current/hive-server2-hive/lib/

# Custom hive-site configuration (without this you would have to modify the spark lib directory on every machine, which is even more troublesome)
spark.yarn.jars=hdfs://dwh-test01:8020/spark-jars/*

# Upload the spark jars to HDFS
sudo -u hdfs hdfs dfs -mkdir /spark-jars
sudo -u hdfs hdfs dfs -chmod 777 /spark-jars
hadoop fs -put /usr/hdp/current/spark2-client/jars/*.jar /spark-jars/
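# Optional sanity check that the jars landed under the path spark.yarn.jars points at
# (a quick sketch; the NameNode address is the same one used above)
hadoop fs -ls hdfs://dwh-test01:8020/spark-jars/ | head
hadoop fs -count /spark-jars/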

# Run hdp-select versions and put the result into Custom mapred-site. Some people online configure this in the yarn, hive, or spark configs instead, but none of those took effect for me.
hdp.version=3.1.5.0-152
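For reference, the value comes straight from the hdp-select CLI; a minimal sketch (the version string will differ on other clusters):

# Print the installed stack versions; use the active one as hdp.version
hdp-select versions
# Example output on this cluster:
# 3.1.5.0-152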

# Remove the incompatible hive jars bundled with spark
hadoop fs -rm /spark-jars/hive*.jar
hadoop fs -rm /spark-jars/spark-hive*.jar
# Note: (affects the INSERT syntax of HIVE ON YARN)
hadoop fs -rm /spark-jars/hive-exec-1.21.2.3.1.5.0-152.jar
# Note: (affects the GROUP BY syntax of HIVE ON YARN)
hadoop fs -rm /spark-jars/orc-core-1.4.4-nohive.jar

# Uncheck this in hive-site (affects the JOIN ON syntax of HIVE ON YARN)
hive.mapjoin.optimized.hashtable=false;

# TODO: make spark the default execution engine (does not take effect; for now it can only be set manually in SQL)
hive.execution.engine=spark
# In Advanced hive-interactive-site, remove the restriction on hive.execution.engine
Restricted session configs=hive.execution.mode

# Test: set these manually in a SQL session
set hive.execution.engine=spark;
set hive.mapjoin.optimized.hashtable=false;
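For an end-to-end smoke test from the shell, something like the following beeline call can be used (a sketch: port 10000 is the HiveServer2 default, and test_db.fact / test_db.dim are placeholder tables to swap for your own):

# Hypothetical smoke test: force the spark engine and run a small join
beeline -u "jdbc:hive2://dwh-test01:10000" -e "
  set hive.execution.engine=spark;
  set hive.mapjoin.optimized.hashtable=false;
  select count(*) from test_db.fact f join test_db.dim d on f.id = d.id;"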


Error while processing statement: FAILED: Execution Error, return code 3

[42000][3] Error while processing statement: FAILED: 
Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. 
Spark job failed during runtime. Please check stacktrace for the root cause.


Solution

  1. If the job is already running on YARN, find a way to pull the spark-history logs and dig in from there (see the sketch after this list).
  2. If Hive cannot even get the job onto YARN yet, check the default log path of the Ambari-managed service, /var/log/hive/.., and dig in from there.
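For case 1, when the Spark History Server UI is hard to reach, the aggregated logs can usually be pulled with the YARN CLI; a sketch (the application id is the example one from the logs further down, and the exact HiveServer2 log file name may differ on your install):

# Pull aggregated container logs for the failed Spark application
yarn logs -applicationId application_1709716929078_0004 | less
# HiveServer2 log under the Ambari-managed default path
tail -n 200 /var/log/hive/hiveserver2.log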


Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session b1d48791-b28a-446c-9900-2dc48e2c751a)'

2024-03-05T18:02:35,272 ERROR [HiveServer2-Background-Pool: Thread-177]: operation.Operation (:()) - Error running hive query: 
2024-03-05T18:07:09,694 ERROR [HiveServer2-Background-Pool: Thread-104]: client.SparkClientImpl (:()) - Error while waiting for client to connect.
2024-03-05T18:07:09,715 ERROR [HiveServer2-Background-Pool: Thread-104]: spark.SparkTask (:()) - Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session b1d48791-b28a-446c-9900-2dc48e2c751a)'
2024-03-05T18:07:09,715 ERROR [HiveServer2-Background-Pool: Thread-104]: spark.SparkTask (:()) - Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session b1d48791-b28a-446c-9900-2dc48e2c751a)'
2024-03-05T18:07:09,715 ERROR [HiveServer2-Background-Pool: Thread-104]: ql.Driver (:()) - FAILED: command has been interrupted: during query execution: 
2024-03-05T18:09:07,775 ERROR [HiveServer2-Background-Pool: Thread-136]: client.SparkClientImpl (:()) - Timed out waiting for client to connect.
2024-03-05T18:09:07,779 ERROR [HiveServer2-Background-Pool: Thread-136]: spark.SparkTask (:()) - Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session 7e9ae27f-5a9a-4200-b8de-b6fff293612f)'
2024-03-05T18:09:07,779 ERROR [HiveServer2-Background-Pool: Thread-136]: spark.SparkTask (:()) - Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session 7e9ae27f-5a9a-4200-b8de-b6fff293612f)'
2024-03-05T18:09:07,780 ERROR [HiveServer2-Background-Pool: Thread-136]: ql.Driver (:()) - FAILED: Execution Error, return code 30041 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Failed to create Spark client for Spark session 7e9ae27f-5a9a-4200-b8de-b6fff293612f
2024-03-05T18:09:07,792 ERROR [HiveServer2-Background-Pool: Thread-136]: operation.Operation (:()) - Error running hive query: 
2024-03-05T18:21:49,052 ERROR [HiveServer2-Background-Pool: Thread-109]: client.SparkClientImpl (:()) - Timed out waiting for client to connect.
2024-03-05T18:21:49,072 ERROR [HiveServer2-Background-Pool: Thread-109]: spark.SparkTask (:()) - Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session b62a3365-ca4a-4ac8-aece-0a8db5a90cdf)'
2024-03-05T18:21:49,072 ERROR [HiveServer2-Background-Pool: Thread-109]: spark.SparkTask (:()) - Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create Spark client for Spark session b62a3365-ca4a-4ac8-aece-0a8db5a90cdf)'


Solution

# Fix for Hive failing immediately when the spark engine is selected: copy the spark jars into the Hive lib directory so Hive can use the spark engine
cp /usr/hdp/current/spark2-client/jars/spark-core_*.jar /usr/hdp/current/hive-server2-hive/lib/
cp /usr/hdp/current/spark2-client/jars/scala-library*.jar /usr/hdp/current/hive-server2-hive/lib/
cp /usr/hdp/current/spark2-client/jars/spark-network-common*.jar /usr/hdp/current/hive-server2-hive/lib/
cp /usr/hdp/current/spark2-client/jars/spark-unsafe*.jar /usr/hdp/current/hive-server2-hive/lib/



java.lang.NoSuchFieldError: SPARK_RPC_SERVER_ADDRESS

24/03/06 17:54:33 ERROR ApplicationMaster: Uncaught exception: 
org.apache.spark.SparkException: Exception thrown in awaitResult: 
    at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:345)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:815)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:814)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:839)
    at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: java.util.concurrent.ExecutionException: Boxed Error
    at scala.concurrent.impl.Promise$.resolver(Promise.scala:59)
    at scala.concurrent.impl.Promise$.scala$concurrent$impl$Promise$$resolveTry(Promise.scala:51)
    at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
    at scala.concurrent.Promise$class.tryFailure(Promise.scala:112)
    at scala.concurrent.impl.Promise$DefaultPromise.tryFailure(Promise.scala:157)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:739)
Caused by: java.lang.NoSuchFieldError: SPARK_RPC_SERVER_ADDRESS
    at org.apache.hive.spark.client.rpc.RpcConfiguration.<clinit>(RpcConfiguration.java:48)
    at org.apache.hive.spark.client.RemoteDriver.<init>(RemoteDriver.java:138)
    at org.apache.hive.spark.client.RemoteDriver.main(RemoteDriver.java:536)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$4.run(ApplicationMaster.scala:721)
24/03/06 17:54:33 INFO ApplicationMaster: Deleting staging directory hdfs://HA-Namespace/user/hive/.sparkStaging/application_1709716929078_0004
24/03/06 17:54:33 INFO ShutdownHookManager: Shutdown hook called


Solution

# In the Ambari web UI, add the Custom hive-site property so Hive on Spark uses the jars on HDFS
spark.yarn.jars=hdfs://dwh-test01:8020/spark-jars/*

# Upload the spark2 jars to HDFS
hadoop fs -put /usr/hdp/current/spark2-client/jars/*.jar /spark-jars/

# Remove the incompatible hive jars bundled with spark2
hadoop fs -rm /spark-jars/hive*.jar
hadoop fs -rm /spark-jars/spark-hive*.jar


java.lang.NoSuchMethodError: org.apache.orc.OrcFile

24/03/07 12:00:33 ERROR RemoteDriver: Failed to run job 3ecd93be-704b-4f42-aa50-6fa7aec5d9cd
java.lang.NoSuchMethodError: org.apache.orc.OrcFile$ReaderOptions.useUTCTimestamp(Z)Lorg/apache/orc/OrcFile$ReaderOptions;
    at org.apache.hadoop.hive.ql.io.orc.OrcFile$ReaderOptions.useUTCTimestamp(OrcFile.java:94)
    at org.apache.hadoop.hive.ql.io.orc.OrcFile$ReaderOptions.<init>(OrcFile.java:70)
    at org.apache.hadoop.hive.ql.io.orc.OrcFile.readerOptions(OrcFile.java:100)
    at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.isRawFormatFile(AcidUtils.java:2344)
    at org.apache.hadoop.hive.ql.io.AcidUtils$MetaDataFile.isRawFormat(AcidUtils.java:2339)
    at org.apache.hadoop.hive.ql.io.AcidUtils.parsedDelta(AcidUtils.java:1037)
    at org.apache.hadoop.hive.ql.io.AcidUtils.parseDelta(AcidUtils.java:1028)
    at org.apache.hadoop.hive.ql.io.AcidUtils.getChildState(AcidUtils.java:1347)
    at org.apache.hadoop.hive.ql.io.AcidUtils.getAcidState(AcidUtils.java:1163)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.processForWriteIds(HiveInputFormat.java:641)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.processPathsForMmRead(HiveInputFormat.java:605)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:495)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:789)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:552)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:200)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
    at org.apache.spark.rdd.RDD.getNumPartitions(RDD.scala:267)
    at org.apache.spark.api.java.JavaRDDLike$class.getNumPartitions(JavaRDDLike.scala:65)
    at org.apache.spark.api.java.AbstractJavaRDDLike.getNumPartitions(JavaRDDLike.scala:45)
    at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateMapInput(SparkPlanGenerator.java:215)
    at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateParentTran(SparkPlanGenerator.java:142)
    at org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:114)
    at org.apache.hadoop.hive.ql.exec.spark.RemoteHiveSparkClient$JobStatusJob.call(RemoteHiveSparkClient.java:359)
    at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:378)
    at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:343)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)


Solution

The fix is the same as in the previous step.

Root cause: the wrong-version hive jars bundled with spark2 create a jar conflict.
Simply delete orc-core-1.4.4-nohive.jar from HDFS (provided spark.yarn.jars has already been configured in hive-site).
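A minimal sketch of locating and removing the conflicting jar, assuming /spark-jars is the directory referenced by spark.yarn.jars:

# List any hive/orc jars that shipped with spark2 and may conflict
hadoop fs -ls /spark-jars/ | grep -E 'hive|orc'
# Remove the nohive orc jar that clashes with Hive 3's ORC classes
hadoop fs -rm /spark-jars/orc-core-1.4.4-nohive.jar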



org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException

java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer cannot be cast to org.apache.hadoop.hive.ql.exec.vector.mapjoin.hashtable.VectorMapJoinTableContainer

24/03/07 10:31:32 INFO DAGScheduler: ResultStage 9 (Map 1) failed in 0.178 s due to Job aborted due to stage failure: Task 0 in stage 9.0 failed 4 times, most recent failure: Lost task 0.3 in stage 9.0 (TID 27, dwh-test03, executor 2): java.lang.IllegalStateException: Hit error while closing operators - failing tree: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:203)
    at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunctionResultList.closeRecordProcessor(HiveMapFunctionResultList.java:58)
    at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:96)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:42)
    at scala.collection.Iterator$class.foreach(Iterator.scala:891)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
    at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
    at org.apache.spark.rdd.AsyncRDDActions$$anonfun$foreachAsync$1$$anonfun$apply$12.apply(AsyncRDDActions.scala:127)
    at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2190)
    at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerStringOperator.process(VectorMapJoinInnerStringOperator.java:384)
    at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.process(VectorSelectOperator.java:158)
    at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFilterOperator.process(VectorFilterOperator.java:136)
    at org.apache.hadoop.hive.ql.exec.Operator.vectorForward(Operator.java:965)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:938)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.closeOp(VectorMapOperator.java:990)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:732)
    at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.close(SparkMapRecordHandler.java:180)
    ... 15 more
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerGenerateResultOperator.commonSetup(VectorMapJoinInnerGenerateResultOperator.java:119)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinInnerStringOperator.process(VectorMapJoinInnerStringOperator.java:109)
    ... 27 more


24/03/07 10:31:32 WARN TaskSetManager: Lost task 1.0 in stage 9.0 (TID 22, dwh-test02, executor 1): java.lang.RuntimeException: Map operator initialization failed: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer cannot be cast to org.apache.hadoop.hive.ql.exec.vector.mapjoin.hashtable.VectorMapJoinTableContainer
    at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:124)
    at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:55)
    at org.apache.hadoop.hive.ql.exec.spark.HiveMapFunction.call(HiveMapFunction.java:30)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
    at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$7$1.apply(JavaRDDLike.scala:186)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:801)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.exec.persistence.MapJoinBytesTableContainer cannot be cast to org.apache.hadoop.hive.ql.exec.vector.mapjoin.hashtable.VectorMapJoinTableContainer
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.setUpHashTable(VectorMapJoinCommonOperator.java:493)
    at org.apache.hadoop.hive.ql.exec.vector.mapjoin.VectorMapJoinCommonOperator.completeInitializationOp(VectorMapJoinCommonOperator.java:462)
    at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:469)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:399)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:572)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:524)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:385)
    at org.apache.hadoop.hive.ql.exec.spark.SparkMapRecordHandler.init(SparkMapRecordHandler.java:115)


Solution

The code that throws this error is in: src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/VectorMapJoinCommonOperator.java

private void setUpHashTable() {

  HashTableImplementationType hashTableImplementationType = vectorDesc.getHashTableImplementationType();
  switch (vectorDesc.getHashTableImplementationType()) {
  case OPTIMIZED:
    {
      // Create our vector map join optimized hash table variation *above* the
      // map join table container.
      vectorMapJoinHashTable = VectorMapJoinOptimizedCreateHashTable.createHashTable(conf,
              mapJoinTables[posSingleVectorMapJoinSmallTable]);
    }
    break;

  case FAST:
    {
      // Get our vector map join fast hash table variation from the
      // vector map join table container.
      VectorMapJoinTableContainer vectorMapJoinTableContainer =
              (VectorMapJoinTableContainer) mapJoinTables[posSingleVectorMapJoinSmallTable];
      vectorMapJoinHashTable = vectorMapJoinTableContainer.vectorMapJoinHashTable();
    }
    break;
  default:
    throw new RuntimeException("Unknown vector map join hash table implementation type " + hashTableImplementationType.name());
  }
  LOG.info("Using " + vectorMapJoinHashTable.getClass().getSimpleName() + " from " + this.getClass().getSimpleName());
}

The code in the case FAST branch is what breaks here; it turns out the configuration can be changed so that the FAST path is not taken:

set hive.mapjoin.optimized.hashtable=false;

Looking through the parameters in the Ambari web UI, this setting is exposed there.

Its description reads: hive.mapjoin.optimized.hashtable
Whether Hive should use memory-optimized hash table for MapJoin. Only works on Tez,
because memory-optimized hashtable cannot be serialized.

In other words, it is meant specifically for the Tez engine, so simply uncheck it and restart Hive.
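After restarting HiveServer2, the effective value can be double-checked from a client session (a sketch; port 10000 is assumed to be the HiveServer2 default):

# Confirm the new default took effect after the restart
beeline -u "jdbc:hive2://dwh-test01:10000" -e "set hive.mapjoin.optimized.hashtable;"
# Expected output: hive.mapjoin.optimized.hashtable=false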


References

Hive on Spark official documentation

It offers some guidance, but it does not fully solve these problems.
