Some Issues with Zeppelin Installation and Configuration
1. Version issues
Starting with 0.6.1, Zeppelin is built against Spark 2.x and Scala 2.11 by default. I have verified that Zeppelin 0.6.2 is incompatible with Spark 1.6.x: the Spark interpreters fail to run. If you need to run against an older Spark, build Zeppelin from source and specify the Spark, Hadoop, and other version parameters; see [http://zeppelin.apache.org/docs/snapshot/install/build.html](http://zeppelin.apache.org/docs/snapshot/install/build.html). Version 0.6.0 still runs against Spark 1.6.x and earlier.
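For example, a source build that pins older Spark and Hadoop versions looks roughly like the following; the profile and version values here are illustrative, so pick the ones matching your cluster from the build docs above:

```bash
# Build Zeppelin from source against an older Spark/Hadoop.
# Profile and version values are examples; see the build docs above.
git clone https://github.com/apache/zeppelin.git
cd zeppelin
mvn clean package -DskipTests \
  -Pspark-1.6 -Dspark.version=1.6.2 \
  -Phadoop-2.6 -Dhadoop.version=2.6.0
```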
2. Phoenix thin-client connection issue
***Update (January 3, 2017):***
The Phoenix for Spark 2.x integration patch has landed, so you can now load Phoenix tables directly as DataFrames instead of going through JDBC, which is more efficient. For the Phoenix-on-Spark-2.x issue, see [https://issues.apache.org/jira/browse/PHOENIX-3333](https://issues.apache.org/jira/browse/PHOENIX-3333); for usage, see [Configuring Spark to connect to Phoenix](http://dequn.github.io/2016/11/08/phoenix-spark-setting/).
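For reference, here is a minimal sketch of the DataFrame route, assuming the patched phoenix-spark jar is on the classpath; the table name and ZooKeeper URL below are placeholders:

```scala
// Read a Phoenix table directly as a DataFrame through phoenix-spark,
// bypassing JDBC. "BIGJOY.IMOS" and the zkUrl value are placeholders.
val phoenixDf = spark.read
  .format("org.apache.phoenix.spark")
  .option("table", "BIGJOY.IMOS")
  .option("zkUrl", "localhost:2181")
  .load()

phoenixDf.printSchema()
```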
Zeppelin supports Phoenix connections starting with 0.6.0. [Phoenix](http://www.phoenix.apache.org/) is configured in the jdbc interpreter by default; the setup is described at [https://zeppelin.apache.org/docs/0.6.2/interpreter/jdbc.html#phoenix](https://zeppelin.apache.org/docs/0.6.2/interpreter/jdbc.html#phoenix). **Be sure to add the artifact dependency under Dependencies. If downloading from the remote Maven repository is too slow, you can instead enter the local path to the phoenix-<version>-thin-client.jar file, or copy the jar into ZEPPELIN_HOME/interpreter/jdbc.**
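For the thin client, the jdbc interpreter properties end up looking roughly like this (a sketch: the host and port are whatever your Phoenix Query Server listens on, and the driver class and URL form are the same ones used in the Scala snippet later in this post):

```
phoenix.driver   org.apache.phoenix.queryserver.client.Driver
phoenix.url      jdbc:phoenix:thin:url=http://localhost:8765;serialization=PROTOBUF
```

Queries can then be run in a paragraph prefixed with `%jdbc(phoenix)`.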
However, if you use the phoenix-thin connection, it fails with the error `No suitable driver found for http://localhost:8765`.
The cause is explained in [https://github.com/apache/zeppelin/pull/1442](https://github.com/apache/zeppelin/pull/1442). I provide a pre-built [zeppelin-jdbc-0.6.2.jar](http://obqjd695a.bkt.clouddn.com/zeppelin-jdbc-0.6.2.jar); just replace the file of the same name under **ZEPPELIN_HOME/interpreter/jdbc** with it.
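The swap itself is just a file replacement plus a restart, something like the following (assuming ZEPPELIN_HOME is set; `zeppelin-daemon.sh` is the stock control script):

```bash
# Back up the shipped jar, drop in the patched one, restart Zeppelin.
cd $ZEPPELIN_HOME/interpreter/jdbc
mv zeppelin-jdbc-0.6.2.jar zeppelin-jdbc-0.6.2.jar.bak
cp /path/to/downloaded/zeppelin-jdbc-0.6.2.jar .
$ZEPPELIN_HOME/bin/zeppelin-daemon.sh restart
```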
Download: [zeppelin-jdbc-0.6.2.jar](http://obqjd695a.bkt.clouddn.com/zeppelin-jdbc-0.6.2.jar)
3. Loading JDBC data with Scala in Zeppelin
***Update (January 3, 2017):***
After not touching this setup for a while, I went through it again and found that the **org.apache.hadoop.tracing.SpanReceiverHost.get(xxx)** error is caused by a mismatch between the Hadoop version Zeppelin ships with and the Hadoop version Spark was built against. The fix is simply to replace the corresponding files under $ZEPPELIN_HOME/lib with $SPARK_HOME/jars/hadoop-annotations-2.7.3.jar, hadoop-auth-2.7.3.jar, and hadoop-common-2.7.3.jar. See [Error handling for Zeppelin 0.6.2 with Spark 2.x](http://blog.csdn.net/lsshlsw/article/details/53768756) for details.
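Concretely, the replacement looks like this (a sketch; the versions are the ones from my setup, so match them to your own Spark distribution):

```bash
# Swap Zeppelin's bundled Hadoop jars for the ones Spark was built against.
cd $ZEPPELIN_HOME/lib
rm -f hadoop-annotations-*.jar hadoop-auth-*.jar hadoop-common-*.jar
cp $SPARK_HOME/jars/hadoop-annotations-2.7.3.jar \
   $SPARK_HOME/jars/hadoop-auth-2.7.3.jar \
   $SPARK_HOME/jars/hadoop-common-2.7.3.jar .
```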
I was initially on Spark 2.0.1, reading from the database over JDBC with the code below, and it kept failing. The first error, about xxx.hive.ql.xxx, goes away once you set the zeppelin.spark.useHiveContext property to false in the interpreter settings. ~~If org.apache.hadoop.tracing.SpanReceiverHost.get(xxx) still fails after that, try **upgrading to Spark 2.0.2**; I stumbled on this by chance while using Spark 2.0.2 on my laptop.~~
```scala
val jdbcDf = spark.read
  .format("jdbc")
  .option("driver", "org.apache.phoenix.queryserver.client.Driver")
  .option("url", "jdbc:phoenix:thin:url=http://localhost:8765;serialization=PROTOBUF")
  .option("dbtable", "bigjoy.imos")
  .load()
```
```
java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
at org.apache.spark.sql.hive.client.HiveClientImpl.<init>(HiveClientImpl.scala:189)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.sql.hive.client.IsolatedClientLoader.createClient(IsolatedClientLoader.scala:258)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:359)
at org.apache.spark.sql.hive.HiveUtils$.newClientForMetadata(HiveUtils.scala:263)
at org.apache.spark.sql.hive.HiveSharedState.metadataHive$lzycompute(HiveSharedState.scala:39)
at org.apache.spark.sql.hive.HiveSharedState.metadataHive(HiveSharedState.scala:38)
at org.apache.spark.sql.hive.HiveSharedState.externalCatalog$lzycompute(HiveSharedState.scala:46)
at org.apache.spark.sql.hive.HiveSharedState.externalCatalog(HiveSharedState.scala:45)
at org.apache.spark.sql.hive.HiveSessionState.catalog$lzycompute(HiveSessionState.scala:50)
at org.apache.spark.sql.hive.HiveSessionState.catalog(HiveSessionState.scala:48)
at org.apache.spark.sql.hive.HiveSessionState$$anon$1.<init>(HiveSessionState.scala:63)
at org.apache.spark.sql.hive.HiveSessionState.analyzer$lzycompute(HiveSessionState.scala:63)
at org.apache.spark.sql.hive.HiveSessionState.analyzer(HiveSessionState.scala:62)
at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:49)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:64)
at org.apache.spark.sql.SparkSession.baseRelationToDataFrame(SparkSession.scala:382)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:143)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
... 47 elided
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
... 69 more
Caused by: java.lang.reflect.InvocationTargetException: java.lang.NoSuchMethodError: org.apache.hadoop.tracing.SpanReceiverHost.get(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String;)Lorg/apache/hadoop/tracing/SpanReceiverHost;
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
... 75 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.tracing.SpanReceiverHost.get(Lorg/apache/hadoop/conf/Configuration;Ljava/lang/String;)Lorg/apache/hadoop/tracing/SpanReceiverHost;
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:634)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:619)
at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:149)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:169)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:354)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:104)
at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:140)
at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:146)
at org.apache.hadoop.hive.metastore.Warehouse.getWhRoot(Warehouse.java:159)
at org.apache.hadoop.hive.metastore.Warehouse.getDefaultDatabasePath(Warehouse.java:177)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB_core(HiveMetaStore.java:600)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:620)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:461)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:66)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:72)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:5762)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:199)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
... 80 more
```
References
[1]: [Building Zeppelin from source](http://zeppelin.apache.org/docs/snapshot/install/build.html)
[2]: [Zeppelin Phoenix interpreter configuration](https://zeppelin.apache.org/docs/0.6.2/interpreter/jdbc.html#phoenix)
[3]: [ZEPPELIN-1459: Zeppelin JDBC URL properties mangled](https://github.com/apache/zeppelin/pull/1442)
[4]: [Error handling for Zeppelin 0.6.2 with Spark 2.x](http://blog.csdn.net/lsshlsw/article/details/53768756)