The Spark SQL user impersonation feature requires setting hive.server2.enable.doAs to true in hive-site.xml.
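For reference, the corresponding hive-site.xml entry looks like this:

<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>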
The versions we use are:
spark 2.4
hadoop 2.7.2
hive 1.2.1
Let's reproduce the problem.
- First, start the STS (Spark Thrift Server):
./sbin/start-thriftserver.sh --driver-memory 1G \
--executor-memory 1G --num-executors 1 \
--master yarn --deploy-mode client \
--conf "spark.executor.extraJavaOptions=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,a
ddress=4002" \
--driver-java-options "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=4001"
We start the STS in YARN client mode and open a remote-debugging port on both the driver (4001) and the executor (4002).
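You can then attach a debugger to these ports, for example with jdb or an IDE "Remote JVM Debug" configuration (4001 is the driver port set above):

jdb -attach localhost:4001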
- Connect to the STS with beeline, create a table, and insert one row into it:
./beeline -u jdbc:hive2://localhost:10000 -n james.xu
0: jdbc:hive2://localhost:10000> create table tmp.ttt ( name string , age int);
0: jdbc:hive2://localhost:10000> insert into tmp.ttt values ('rrr', 66) ;
The insert into the test table tmp.ttt fails immediately with:
org.apache.hadoop.security.AccessControlException: Permission denied: user=james.xu, access=WRITE, inode="/user/hadoop/warehouse/tmp.db/ttt/.hive-staging_hive_2019-02-28_15-26-26_543_1953699866921090093-2/-ext-10000/_temporary/0/task_20190228152645_0001_m_000000/part-00000-c688ee6c-6d01-47d9-a4f4-c3bad434e3fe-c000":hadoop:supergroup:drwxrwxr-x
Problem analysis
- When Spark SQL writes to a target table, it first writes the data into a temporary .hive-staging directory under the table's location. Once the temporary files are written, the driver, while committing the job, moves the data files from the .hive-staging directory into tmp.db/ttt. The stack trace above is exactly this final commit-job step failing (a rough layout of the paths involved is sketched below).
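For orientation, the paths involved look roughly like this (abbreviated from the error message above):

/user/hadoop/warehouse/tmp.db/ttt/.hive-staging_.../-ext-10000/_temporary/0/task_.../part-...   <- temporary files written by the executors
/user/hadoop/warehouse/tmp.db/ttt/part-...                                                      <- final location after the driver's commit job moves them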
Looking at the permissions of all the files under the ttt directory, the first few temporary directories look fine, but the owner of the later temporary files is, surprisingly, the hadoop superuser.
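The ownership can be checked with a recursive listing of the table directory, e.g.:

hdfs dfs -ls -R /user/hadoop/warehouse/tmp.db/ttt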
This looks like where the problem is. So let's first set a breakpoint at the place where the driver creates the temporary directories.
Those familiar with Hadoop will know that directory creation goes through org.apache.hadoop.hdfs.DFSClient#primitiveMkdir, so we set a breakpoint in that method and attach the remote debugger to the STS driver.
In the call stack at that breakpoint we can see doAs: the first level of the staging directory is created with the proxy user's UserGroupInformation, so that part is fine.
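As a minimal sketch of what that doAs means (this is not Spark's actual code; the path and names are illustrative): the server builds a proxy UGI for the connected beeline user and issues the HDFS call inside doAs, so the NameNode sees james.xu as the caller.

import java.security.PrivilegedExceptionAction
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

// Proxy UGI for the connected beeline user, backed by the service (login) user.
val proxyUgi = UserGroupInformation.createProxyUser(
  "james.xu", UserGroupInformation.getLoginUser)

proxyUgi.doAs(new PrivilegedExceptionAction[Boolean] {
  override def run(): Boolean = {
    val fs = FileSystem.get(new Configuration())
    // Issued inside doAs, so the directory is created (and owned) as james.xu.
    fs.mkdirs(new Path("/user/hadoop/warehouse/tmp.db/ttt/.hive-staging_demo"))
  }
})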
Next, we set a breakpoint at the place where the executor creates files and keep going.
The full call stack is as follows:
Breakpoint reached
at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:80)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:261)
at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:246)
at org.apache.spark.sql.hive.execution.HiveOutputWriter.<init>(HiveFileFormat.scala:123)
at org.apache.spark.sql.hive.execution.HiveFileFormat$$anon$1.newInstance(HiveFileFormat.scala:103)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.newOutputWriter(FileFormatDataWriter.scala:120)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.<init>(FileFormatDataWriter.scala:108)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:233)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:169)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:168)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
On the executor side, the temporary files (_temporary/0/_temporary/...) are created in HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter, and there is no doAs anywhere in this call stack, hence no UserGroupInformation for the proxy user.
The temporary data files are therefore written as the hadoop superuser. When the driver later runs the commit job as james.xu, we get the Permission denied error shown at the beginning.
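To make the contrast concrete, here is a minimal sketch of the executor-side situation (the path is an illustrative assumption): outside of any doAs block, HDFS calls run as the executor JVM's login user.

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

val fs = FileSystem.get(new Configuration())
// Not wrapped in any proxy UGI's doAs, so this is the cluster user that
// launched the executor (hadoop), not the beeline user james.xu.
println(UserGroupInformation.getCurrentUser.getShortUserName)
// Any file created here ends up owned by that user on HDFS.
val out = fs.create(new Path("/tmp/ugi-demo/part-00000-demo"))
out.close()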
Analysis of the Hive MR engine
So why does Hive's MR engine not have this problem? Let's set a breakpoint in a Hive MR job and take a look.
To debug the reduce side through Hive's thrift server you need to add the following parameter. Note suspend=y: the reduce task is blocked until the remote debugger attaches.
set mapreduce.reduce.java.opts=-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=4002;
insert into tmp.ttt select "222", count(1) from dev.af_student;
Again we set a breakpoint where the file is created; the detailed call stack is below:
Breakpoint reached
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:591)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:566)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:675)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1016)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1043)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1092)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:616)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.close(ExecReducer.java:278)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:453)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(AccessController.java:-1)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
We can see that YarnChild wraps the task execution in UserGroupInformation.doAs, so Hive's MR engine does not have this problem.
Summary
The reason Hive's MR engine works with user impersonation enabled is that the code running on the reduce side is launched by YarnChild already inside UserGroupInformation.doAs, so file operations are performed with the correct permissions. In Spark's thrift server, on the other hand, the executor writes the temporary data files without doAs and therefore without the proxy user's UGI; it reads and writes HDFS as the user that launched the executor process. The driver, meanwhile, reads and writes HDFS as the submitting (proxy) user. The mismatch between these two users is what produces the Permission denied error at the beginning of this article. Similar issues have also been reported in the Spark community.