Updated from time to time.
A collection of odd problems and fixes.
After installing Ambari, starting the Hive Metastore fails with the following error:
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 293, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of 'export HIVE_CONF_DIR=/usr/hdp/current/hive-metastore/conf/conf.server ; /usr/hdp/current/hive-metastore/bin/schematool -initSchema -dbType mysql -userName hive -passWord [PROTECTED]' returned 1.
WARNING: Use "yarn jar" to launch YARN applications.
Metastore connection URL: jdbc:mysql://c6405.ambari.apache.org/hive?createDatabaseIfNotExist=true
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: hive
org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
*** schemaTool failed ***
Solution:
The MySQL password configured in Hive does not match the connection password set for the hive user in MySQL. Change either the MySQL password or the Hive configuration so that the two match.
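To pin down which side is wrong, try logging in to MySQL with the exact credentials from hive-site; a minimal sketch, assuming the metastore host from the log above and a placeholder password:
# Try the credentials Hive is configured with (the password here is a placeholder)
mysql -u hive -phivepassword -h c6405.ambari.apache.org -e 'SELECT 1'
# If the login fails, make MySQL match the Hive config (run as a MySQL admin):
mysql -u root -p -e "SET PASSWORD FOR 'hive'@'%' = PASSWORD('hivepassword');"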
Spark 2.0 on YARN
1. Jersey NoClassDefFoundError
bin/spark-sql --driver-memory 10g --verbose --master yarn --packages com.databricks:spark-csv_2.10:1.3.0 --executor-memory 4g --num-executors 20 --executor-cores 2
16/05/09 13:15:21 INFO server.Server: jetty-8.y.z-SNAPSHOT
16/05/09 13:15:21 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4041
16/05/09 13:15:21 INFO util.Utils: Successfully started service 'SparkUI' on port 4041.
16/05/09 13:15:21 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://bigaperf116.svl.ibm.com:4041
Exception in thread "main" java.lang.NoClassDefFoundError: com/sun/jersey/api/client/config/ClientConfig
at org.apache.hadoop.yarn.client.api.TimelineClient.createTimelineClient(TimelineClient.java:45)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:163)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
This is a known issue; see:
http://apache-spark-developers-list.1001551.n3.nabble.com/spark-2-0-issue-with-yarn-td17440.html
A temporary solution:
Set yarn.timeline-service.enabled to false to turn off ATS.
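If editing yarn-site.xml for the whole cluster is not an option, the same property can be overridden per job through Spark's spark.hadoop.* passthrough; a minimal sketch:
# Disable the ATS client only for this submission
bin/spark-sql --master yarn --conf spark.hadoop.yarn.timeline-service.enabled=false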
2. bad substitution
diagnostics: Application application_1441066518301_0013 failed 2 times due to AM Container for appattempt_1441066518301_0013_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://localhost:8088/cluster/app/application_1441066518301_0013Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e03_1441066518301_0013_02_000001
Exit code: 1
Exception message: /mnt/yarn/nm/local/usercache/stack/appcache/
application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/
launch_container.sh: line 24:$PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:
/usr/hdp/current/hadoop-client/*::$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:
/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:
/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /mnt/yarn/nm/local/usercache/stack/appcache/application_1441066518301_0013/container_e03_1441066518301_0013_02_000001/launch_container.sh: line 24: $PWD:$PWD/__hadoop_conf__:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Solution:
This problem is usually caused by manually installed components in which the ${hdp.version} variable is never substituted.
Fix it by replacing ${hdp.version} in the MapReduce2 configuration property mapreduce.application.classpath with the version segment of the absolute HDP install path, e.g. 2.4.0.0-169.
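To find the concrete version segment, list what hdp-select knows about; the output below is an example:
# Use the printed version wherever ${hdp.version} appears in mapreduce.application.classpath
hdp-select versions
# 2.4.0.0-169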
Service startup fails on ulimit -c unlimited
resource_management.core.exceptions.Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'ulimit -c unlimited ; /usr/hdp/current/hadoop-client/sbin/hadoop-daemon.sh --config /usr/hdp/current/hadoop-client/conf start namenode'' returned 1. -bash: line 0: ulimit: core file size: cannot modify limit: Operation not permitted
starting namenode, logging to /var/log/hadoop/hdfs/hadoop-hdfs-namenode-wy1.jcloud.local.out
Solution:
On CentOS 7.1, when HDFS starts the NameNode or DataNode as a non-root user, the start script su's to the hdfs account and then runs ulimit -c unlimited. Because the hdfs account is not permitted to raise the core file size limit, the command fails and the NameNode or DataNode does not start. One way to handle this is to change the Ambari code so that the HDFS start sequence does not run ulimit -c unlimited.
The code to modify:
Edit the file:
/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py
Find this line:
cmd = format("{ulimit_cmd} {hadoop_daemon} --config {hadoop_conf_dir} {action} {name}")
and delete {ulimit_cmd} from it; then restart ambari-agent.
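The same edit as a one-liner, handy when many hosts need patching (a sketch; sed keeps a .bak backup so the change can be reverted):
sed -i.bak 's/{ulimit_cmd} //' /var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/utils.py
ambari-agent restart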
Host registration fails
ERROR 2016-08-01 13:33:38,932 main.py:309 - Fatal exception occurred:
Traceback (most recent call last):
File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 306, in
main(heartbeat_stop_callback)
File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 242, in main
stop_agent()
File "/usr/lib/python2.6/site-packages/ambari_agent/main.py", line 189, in stop_agent
sys.exit(1)
SystemExit: 1
Solution:
This is because ambari-agent defaults to ASCII encoding. If your operating system is a Chinese-language edition, add the following at the top of /usr/lib/python2.6/site-packages/ambari_agent/main.py:
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
then click Retry Failed and the registration should succeed.
How to delete an existing service from Ambari
After adding a custom service SAMPLE, there is no way to delete it from the web UI on port 8080.
Solution:
- Stop the service
curl -u admin:admin -H "X-Requested-By: ambari" -X PUT -d '{"RequestInfo": {"context":"Stop Service"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' http://localhost:8080/api/v1/clusters/hadoop/services/SAMPLE
Because the SAMPLE service does not actually do anything, it may start itself again after a short while, so be quick with the next step.
- Delete the service (run immediately after the stop)
curl -u admin:admin -H "X-Requested-By: ambari" -X DELETE http://localhost:8080/api/v1/clusters/hadoop/services/SAMPLE
If the service has not stopped, you will get:
{
"status" : 500,
"message" : "org.apache.ambari.server.controller.spi.SystemException: An internal system exception occurred: Cannot remove hadoop/SAMPLE. MYMASTER is in anon-removable state."
}
No problem; just run the DELETE again.
- Verify
Reload the web UI on port 8080; the SAMPLE service is gone.
A few more examples:
Remove a host component from a host:
curl -u admin:admin -i -H 'X-Requested-By: ambari' -X DELETE 'localhost:8080/api/v1/clusters/blueCluster/hosts/elk2.jcloud.local/host_components/FLUME_HANDLER'
curl -u admin:admin -i -H 'X-Requested-By: ambari' -X DELETE 'localhost:8080/api/v1/clusters/cluster/hosts/ochadoop10/host_components/NAMENODE'
curl -u admin:admin -i -H 'X-Requested-By: ambari' -X DELETE 'localhost:8080/api/v1/clusters/hbcm_ocdp/hosts/hbom-if-58/host_components/YARN_CLIENT'
Install a component:
curl -u admin:admin -i -H "X-Requested-By:ambari" -X POST 'localhost:8080/api/v1/clusters/hbcm_ocdp/hosts/hbbdc-dn-09/host_components/PHOENIX_QUERY_SERVER'
curl -u admin:admin -i -H "X-Requested-By:ambari" -X PUT 'localhost:8080/api/v1/clusters/hbcm_ocdp/hosts/hbbdc-dn-09/host_components/PHOENIX_QUERY_SERVER' -d '{"HostRoles": {"state": "INSTALLED"}}'
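Before a DELETE, it can help to confirm the component's current state with the same API; a sketch reusing the example names above:
curl -u admin:admin -H 'X-Requested-By: ambari' 'localhost:8080/api/v1/clusters/hbcm_ocdp/hosts/hbbdc-dn-09/host_components/PHOENIX_QUERY_SERVER?fields=HostRoles/state'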
How to reset the Ambari admin password
To log in as the Ambari admin again, reset the admin password with the following steps:
- Stop Ambari server
- Log on to ambari server host shell
- Run 'psql -U ambari ambari'
- Enter password **** (this is the password Ambari uses to connect to its database; the default is bigdata, surprisingly stored in plain text in /etc/ambari-server/conf/password.dat)
- In psql:
update ambari.users set
user_password='538916f8943ec225d97a9a86a2c6ec0818c1cd400e09e03b660fdaaec4af29ddbb6f2b1033b81b00'
where user_name='admin';
- Quit psql: Ctrl+D
- Run 'ambari-server restart'
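The steps above condensed into shell commands (a sketch; psql will prompt for the database password from password.dat):
ambari-server stop
psql -U ambari ambari -c "update ambari.users set user_password='538916f8943ec225d97a9a86a2c6ec0818c1cd400e09e03b660fdaaec4af29ddbb6f2b1033b81b00' where user_name='admin';"
ambari-server restart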
User [dr.who] is not authorized to view the logs for application
After enabling access control on the Hadoop cluster, the job log UI can no longer be accessed: User [dr.who] is not authorized to view the logs for application
Reason:
The default user dr.who of the ResourceManager UI does not have the required permissions.
Solution:
If the cluster is managed by Ambari, go to HDFS > Configs > Custom core-site > Add Property:
hadoop.http.staticuser.user=yarn
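The same property can also be set from the shell with configs.sh (described in the next tip); a sketch, assuming a cluster named hdp_cluster:
/var/lib/ambari-server/resources/scripts/configs.sh set localhost hdp_cluster core-site hadoop.http.staticuser.user yarn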
后臺(tái)腳本修改配置:
獲取配置信息:
/var/lib/ambari-server/resources/scripts/configs.sh get localhost hdp_cluster hive-site|grep hive.server2.authenticatio
"hive.server2.authentication" : "NONE",
"hive.server2.authentication.spnego.keytab" : "HTTP/_HOST@EXAMPLE.COM",
"hive.server2.authentication.spnego.principal" : "/etc/security/keytabs/spnego.service.keytab",
Set a configuration value:
/var/lib/ambari-server/resources/scripts/configs.sh set localhost hdp_cluster hive-site hive.server2.authentication LDAP
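Reading the value back confirms that a new configuration version was created (the affected service still needs a restart to pick it up):
/var/lib/ambari-server/resources/scripts/configs.sh get localhost hdp_cluster hive-site | grep hive.server2.authentication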
ambari-sudo.sh /usr/bin/hdp-select error
ambari-sudo.sh /usr/bin/hdp-select set all `ambari-python-wrap /usr/bin/hdp-select versions | grep ^2.4.0.0-169 | tail -1`'] {'only_if': 'ls -d /usr/hdp/2.4.0.0-169*
Solution:
- What happens when you run "hdp-select versions" from the command line, as root? Does it return your current 2.4 version number? If not, inspect your /usr/hdp and make sure you have only "current" and the directories named after your versions (2.4 and older ones if you did an upgrade) there. If you have any other file there, delete it, and retry, first "hdp-select versions" and then ATS.
- go to /usr/bin/
vi hdp-select
def printVersions():
......
......
- if f not in [".", "..", "current", "share", "lost+found"]:
+ if f not in [".", "..", "current", "share", "lost+found","hadoop"]:
......
- Symlink conflict: delete the redundant symlinks and retry.
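A quick way to check for stray entries, following the advice above (the listing is an example):
ls /usr/hdp
# expect only version directories and "current", e.g.: 2.4.0.0-169  current
hdp-select versions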
HiveMetaStore or HiveServer2 fails to come up
SYMPTOM: HiveServer2 fails to come up and an error similar to the following is reported in the hiveserver2.log file:
2015-11-18 20:47:19,965 WARN [main]: server.HiveServer2 (HiveServer2.java:startHiveServer2(442)) - Error starting HiveServer2 on attempt 4, will retry in 60 seconds
org.apache.hive.service.ServiceException: Failed to Start HiveServer2
at org.apache.hive.service.CompositeService.start(CompositeService.java:80)
at org.apache.hive.service.server.HiveServer2.start(HiveServer2.java:366)
at org.apache.hive.service.server.HiveServer2.startHiveServer2(HiveServer2.java:412)
at org.apache.hive.service.server.HiveServer2.access$700(HiveServer2.java:78)
at org.apache.hive.service.server.HiveServer2$StartOptionExecutor.execute(HiveServer2.java:654)
at org.apache.hive.service.server.HiveServer2.main(HiveServer2.java:527)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.hive.service.ServiceException: Unable to connect to MetaStore!
at org.apache.hive.service.cli.CLIService.start(CLIService.java:154)
at org.apache.hive.service.CompositeService.start(CompositeService.java:70) ... 11 more
Caused by: MetaException(message:Got exception: org.apache.hadoop.hive.metastore.api.MetaException javax.jdo.JDOException: Exception thrown when executing query
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:596)
at org.datanucleus.api.jdo.JDOQuery.execute(JDOQuery.java:230)
at org.apache.hadoop.hive.metastore.ObjectStore.getDatabases(ObjectStore.java:701)
at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
at com.sun.proxy.$Proxy7.getDatabases(Unknown Source)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_databases(HiveMetaStore.java:1158)
at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
HiveMetaStore fails to come up
2017-02-27 14:45:05,361 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:main(5908)) - Starting hive metastore on port 9083
2017-02-27 14:45:05,472 INFO [main]: metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(590)) - 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2017-02-27 14:45:05,497 INFO [main]: metastore.ObjectStore (ObjectStore.java:initialize(294)) - ObjectStore, initialize called
2017-02-27 14:45:06,193 ERROR [main]: DataNucleus.Datastore (Log4JLogger.java:error(115)) - Error : An error occurred trying to instantiate an instance of the adapter "org.datanucleus.store.rdbms.adapter.SQLAnywhereAdapter" for this JDBC driver : Class "org.datanucleus.store.rdbms.adapter.SQLAnywhereAdapter" was not found in the CLASSPATH. Please check your specification and your CLASSPATH.
Class "org.datanucleus.store.rdbms.adapter.SQLAnywhereAdapter" was not found in the CLASSPATH. Please check your specification and your CLASSPATH.
org.datanucleus.exceptions.ClassNotResolvedException: Class "org.datanucleus.store.rdbms.adapter.SQLAnywhereAdapter" was not found in the CLASSPATH. Please check your specification and your CLASSPATH.
at org.datanucleus.ClassLoaderResolverImpl.classForName(ClassLoaderResolverImpl.java:216)
at org.datanucleus.ClassLoaderResolverImpl.classForName(ClassLoaderResolverImpl.java:368)
at org.datanucleus.ClassLoaderResolverImpl.classForName(ClassLoaderResolverImpl.java:391)
at org.datanucleus.store.rdbms.adapter.DatastoreAdapterFactory.getAdapterClass(DatastoreAdapterFactory.java:226)
at org.datanucleus.store.rdbms.adapter.DatastoreAdapterFactory.getNewDatastoreAdapter(DatastoreAdapterFactory.java:144)
at org.datanucleus.store.rdbms.adapter.DatastoreAdapterFactory.getDatastoreAdapter(DatastoreAdapterFactory.java:92)
at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:309)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConst
ROOT CAUSE
AMBARI-12947, BUG-44352
After Ambari 2.1 and up to Ambari 2.1.2, it is mandatory to initialize datanucleus.rdbms.datastoreAdapterClassName in the Hive configs. This is required only if a SQL Anywhere database is used. There is no option in Ambari to delete this parameter.
RESOLUTION
Upgrade to Ambari 2.1.2.
WORKAROUND
Remove Hive configuration parameter 'datanucleus.rdbms.datastoreAdapterClassName' from hive-site using configs.sh
For example:
- Dump the hive-site parameters to a file:
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin get Ambari_Hostname Ambari_ClusterName hive-site > /tmp/hive-site.txt
This dumps all of Ambari's Hive configuration parameters to /tmp/hive-site.txt.
- Edit the /tmp/hive-site.txt template file created above and remove 'datanucleus.rdbms.datastoreAdapterClassName'. Also remove the lines before the 'properties' tag.
- Set the hive-site parameters using /tmp/hive-site.txt:
/var/lib/ambari-server/resources/scripts/configs.sh -u admin -p admin set Ambari_Hostname Ambari_ClusterName hive-site /tmp/hive-site.txt
- Start Hive Services (see the sketch below)
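Hive can then be started from the UI, or with the same REST pattern used for stopping a service earlier (a sketch; hostname and cluster name are placeholders):
curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT -d '{"RequestInfo":{"context":"Start Hive"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' http://localhost:8080/api/v1/clusters/Ambari_ClusterName/services/HIVE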
This article was created by Hortonworks Support (Article 000003468) on 2015-11-25.
https://issues.apache.org/jira/browse/AMBARI-13114