1. Overview
A database named sitelight has already been created in MySQL and loaded with data; the table is t_site_formal and holds 168992 rows. In Hive, a database with the same name, sitelight, was created; no field delimiter was specified, so the default '\001' applies. Before running the import command, the Hadoop cluster, the HiveServer2 service, and the Hive metastore service all need to be up; Sqoop itself has no daemon and can be invoked directly.
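For reference, the pre-existing setup can be double-checked with two one-liners. This is a minimal sketch: the host, port, and queryuser credentials are copied from the import script in section 3, and it assumes the mysql and hive CLIs are on the PATH.
# Confirm the source table and its row count in MySQL (prompts for the password)
mysql -h 172.16.100.19 -P 3306 -u queryuser -p \
  -e "SELECT COUNT(*) FROM sitelight.t_site_formal;"
# Create the target database in Hive; since no delimiter is specified,
# imported tables fall back to Hive's default field terminator '\001'
hive -e "CREATE DATABASE IF NOT EXISTS sitelight;"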
2. Starting the services
- Start the HDFS service
[hadoop@hadoop01 sbin]$ pwd
/home/hadoop/hadoop-2.10.1/sbin
[hadoop@hadoop01 sbin]$ ./start-dfs.sh
- Start HiveServer2 (it does not matter which directory you run this from)
[hadoop@hadoop01 sbin]$ nohup hive --service hiveserver2 &
- Start the Hive metastore service (again, any directory works)
[hadoop@hadoop01 sbin]$ nohup hive --service metastore &
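- Optional sanity check before importing: jps should now list the HDFS daemons (NameNode, DataNode, SecondaryNameNode) plus two RunJar processes, one for HiveServer2 and one for the metastore
[hadoop@hadoop01 sbin]$ jps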
3. The import script
sqoop import \
--connect jdbc:mysql://172.16.100.19:3306/sitelight \
--username queryuser \
--password abcde12345 \
--table t_site_formal \
--hive-import \
--hive-overwrite \
--create-hive-table \
--delete-target-dir \
--hive-database sitelight \
--hive-table t_site_formal \
-m 1
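A few notes on the flags: --hive-import loads the transferred data into Hive, --create-hive-table derives the Hive table definition from the MySQL schema, --hive-overwrite plus --delete-target-dir make the job safe to re-run, and -m 1 uses a single map task, which also sidesteps the need for a --split-by column. Once the job succeeds, a quick count is a reasonable sanity check (a sketch, assuming the hive CLI is on the PATH):
# Should match the record count Sqoop reports as retrieved
hive -e "SELECT COUNT(*) FROM sitelight.t_site_formal;"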
四构诚、執(zhí)行導(dǎo)入腳本遇見的錯(cuò)誤
- Error 1
[hadoop@hadoop01 sbin]$ sqoop import \
> --connect jdbc:mysql://172.16.100.19:3306/sitelight \
> --username queryuser \
> --password abcde12345 \
> --table t_site_formal \
> --hive-import \
> -m 1
Warning: /home/hadoop/sqoop-1.4.7/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/hadoop/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/hadoop/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /home/hadoop/sqoop-1.4.7/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
21/12/07 15:32:53 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
21/12/07 15:32:53 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
21/12/07 15:32:53 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
21/12/07 15:32:53 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
21/12/07 15:32:53 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
21/12/07 15:32:53 INFO tool.CodeGenTool: Beginning code generation
21/12/07 15:32:53 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_site_formal` AS t LIMIT 1
21/12/07 15:32:53 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_site_formal` AS t LIMIT 1
21/12/07 15:32:53 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop-2.10.1
Note: /tmp/sqoop-hadoop/compile/672d74685e64a0fe1025ce9b2b875e46/t_site_formal.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
21/12/07 15:32:54 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/672d74685e64a0fe1025ce9b2b875e46/t_site_formal.jar
21/12/07 15:32:54 WARN manager.MySQLManager: It looks like you are importing from mysql.
21/12/07 15:32:54 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
21/12/07 15:32:54 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
21/12/07 15:32:54 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
21/12/07 15:32:54 INFO mapreduce.ImportJobBase: Beginning import of t_site_formal
21/12/07 15:32:54 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
21/12/07 15:32:54 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
21/12/07 15:32:54 INFO client.RMProxy: Connecting to ResourceManager at hadoop01/172.16.100.26:8032
21/12/07 15:32:56 INFO ipc.Client: Retrying connect to server: hadoop01/172.16.100.26:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
21/12/07 15:32:57 INFO ipc.Client: Retrying connect to server: hadoop01/172.16.100.26:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
21/12/07 15:32:58 INFO ipc.Client: Retrying connect to server: hadoop01/172.16.100.26:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
- Cause: the job keeps retrying at "INFO client.RMProxy: Connecting to ResourceManager at hadoop01/172.16.100.26:8032". The ResourceManager is the master daemon of YARN; the Hadoop configuration names YARN as the MapReduce framework, but the YARN services were never started.
- Fix, two options (commands sketched below): 1) if you only need Hadoop's HDFS storage, stop declaring YARN: the setting lives in mapred-site.xml, i.e. /home/hadoop/hadoop-2.10.1/etc/hadoop/mapred-site.xml, so rename that file and restart the Hadoop cluster; 2) simply start the YARN services from the sbin directory, i.e. /home/hadoop/hadoop-2.10.1/sbin/start-yarn.sh
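The two options boil down to the following commands (a sketch; paths match the installation shown above):
# Option 1: stop declaring the YARN framework by renaming mapred-site.xml,
# then restart the cluster so the change takes effect
mv /home/hadoop/hadoop-2.10.1/etc/hadoop/mapred-site.xml \
   /home/hadoop/hadoop-2.10.1/etc/hadoop/mapred-site.xml.bak
/home/hadoop/hadoop-2.10.1/sbin/stop-dfs.sh
/home/hadoop/hadoop-2.10.1/sbin/start-dfs.sh
# Option 2: start YARN so the ResourceManager actually listens on port 8032
/home/hadoop/hadoop-2.10.1/sbin/start-yarn.sh
jps   # ResourceManager and NodeManager should now be listed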
- Error 2
21/12/07 15:39:42 INFO hive.HiveImport: Loading uploaded data into Hive
21/12/07 15:39:42 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
21/12/07 15:39:42 ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:537)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:355)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
... 12 more
- Cause: "Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly." is explicit: Hadoop cannot find Hive. More precisely, the HiveConf class ships in Hive's jars, and those jars are not on Hadoop's classpath.
- Fix: append Hive's libraries to the Hadoop classpath in the current user's environment
[hadoop@hadoop01 sbin]$ vim ~/.bashrc
export HADOOP_CLASSPATH=$HIVE_HOME/lib/*
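For completeness, the relevant block of ~/.bashrc ends up looking roughly like the following; the HIVE_HOME value here is an assumption, inferred from the hive-site.xml path that shows up in the logs further down. Reload the file afterwards so the current shell picks up the change.
# Assumed install location, based on the conf path seen in the later log output
export HIVE_HOME=/home/hadoop/apache-hive-2.3.9-bin
# Put Hive's jars (including hive-common, which provides HiveConf) on Hadoop's classpath
export HADOOP_CLASSPATH=$HIVE_HOME/lib/*
[hadoop@hadoop01 sbin]$ source ~/.bashrc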
- Error 3
21/12/07 17:25:05 INFO mapreduce.ImportJobBase: Transferred 40.7139 MB in 11.4171 seconds (3.566 MB/sec)
21/12/07 17:25:05 INFO mapreduce.ImportJobBase: Retrieved 168237 records.
21/12/07 17:25:05 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners for table t_site_formal
21/12/07 17:25:05 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `t_site_formal` AS t LIMIT 1
21/12/07 17:25:05 WARN hive.TableDefWriter: Column create_date had to be cast to a less precise type in Hive
21/12/07 17:25:05 WARN hive.TableDefWriter: Column audit_date had to be cast to a less precise type in Hive
21/12/07 17:25:05 INFO hive.HiveImport: Loading uploaded data into Hive
21/12/07 17:25:05 INFO conf.HiveConf: Found configuration file file:/home/hadoop/apache-hive-2.3.9-bin/conf/hive-site.xml
2021-12-07 17:25:06,664 main ERROR Could not register mbeans java.security.AccessControlException: access denied ("javax.management.MBeanTrustPermission" "register")
at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
at java.lang.SecurityManager.checkPermission(SecurityManager.java:585)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.checkMBeanTrustPermission(DefaultMBeanServerInterceptor.java:1848)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:322)
at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)
at org.apache.logging.log4j.core.jmx.Server.register(Server.java:380)
at org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:165)
at org.apache.logging.log4j.core.jmx.Server.reregisterMBeansAfterReconfigure(Server.java:138)
at org.apache.logging.log4j.core.LoggerContext.setConfiguration(LoggerContext.java:507)
at org.apache.logging.log4j.core.LoggerContext.start(LoggerContext.java:249)
at org.apache.logging.log4j.core.async.AsyncLoggerContext.start(AsyncLoggerContext.java:86)
at org.apache.logging.log4j.core.impl.Log4jContextFactory.getContext(Log4jContextFactory.java:239)
at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:157)
at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:130)
at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:100)
at org.apache.logging.log4j.core.config.Configurator.initialize(Configurator.java:187)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jDefault(LogUtils.java:154)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:90)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:82)
at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:65)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:702)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:686)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:331)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:537)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
- Cause: "ERROR Could not register mbeans java.security.AccessControlException: access denied ("javax.management.MBeanTrustPermission" "register")" states it plainly: the JVM's security policy refuses the MBean registration, so the JDK's security configuration needs to be changed.
- Fix: locate the JDK installation directory and edit its security policy file. My JDK lives in a global location, so I switch to the root user to make the change:
[hadoop@hadoop01 sbin]$ echo $JAVA_HOME
/usr/local/java
[hadoop@hadoop01 sbin]$ exit
exit
[root@hadoop01 hadoop-2.10.1]$ cd /usr/local/java/jre/lib/security/
[root@hadoop01 security]$ vim java.policy
// add this line inside the grant { } block
permission javax.management.MBeanTrustPermission "register";
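In context, the edited grant block of java.policy then looks like this (only the permission line is new; all existing entries stay as they were):
grant {
    // ... existing default permissions, left untouched ...

    // allow log4j2 to register its MBeans with the platform MBeanServer
    permission javax.management.MBeanTrustPermission "register";
};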
五也物、成功運(yùn)行
LOAD DATA INPATH 'hdfs://hadoop01:8082/user/hadoop/t_site_formal' OVERWRITE INTO TABLE `sitelight`.`t_site_formal`
21/12/07 18:14:47 INFO ql.Driver: Semantic Analysis Completed
21/12/07 18:14:47 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
21/12/07 18:14:47 INFO ql.Driver: Completed compiling command(queryId=hadoop_20211207181447_5c3a828a-2830-4fa6-b852-e4731b760303); Time taken: 0.115 seconds
21/12/07 18:14:47 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
21/12/07 18:14:47 INFO ql.Driver: Executing command(queryId=hadoop_20211207181447_5c3a828a-2830-4fa6-b852-e4731b760303):
LOAD DATA INPATH 'hdfs://hadoop01:8082/user/hadoop/t_site_formal' OVERWRITE INTO TABLE `sitelight`.`t_site_formal`
21/12/07 18:14:47 INFO ql.Driver: Starting task [Stage-0:MOVE] in serial mode
21/12/07 18:14:47 INFO hive.metastore: Closed a connection to metastore, current connections: 0
Loading data to table sitelight.t_site_formal
21/12/07 18:14:47 INFO exec.Task: Loading data to table sitelight.t_site_formal from hdfs://hadoop01:8082/user/hadoop/t_site_formal
21/12/07 18:14:47 INFO hive.metastore: Trying to connect to metastore with URI thrift://172.16.100.26:9083
21/12/07 18:14:47 INFO hive.metastore: Opened a connection to metastore, current connections: 1
21/12/07 18:14:47 INFO hive.metastore: Connected to metastore.
21/12/07 18:14:47 INFO common.FileUtils: Creating directory if it doesn't exist: hdfs://hadoop01:8082/usr/local/warehouse/sitelight.db/t_site_formal
21/12/07 18:14:47 INFO ql.Driver: Starting task [Stage-1:STATS] in serial mode
21/12/07 18:14:47 INFO exec.StatsTask: Executing stats task
21/12/07 18:14:47 INFO hive.metastore: Closed a connection to metastore, current connections: 0
21/12/07 18:14:47 INFO hive.metastore: Trying to connect to metastore with URI thrift://172.16.100.26:9083
21/12/07 18:14:47 INFO hive.metastore: Opened a connection to metastore, current connections: 1
21/12/07 18:14:47 INFO hive.metastore: Connected to metastore.
21/12/07 18:14:47 INFO hive.metastore: Closed a connection to metastore, current connections: 0
21/12/07 18:14:47 INFO hive.metastore: Trying to connect to metastore with URI thrift://172.16.100.26:9083
21/12/07 18:14:47 INFO hive.metastore: Opened a connection to metastore, current connections: 1
21/12/07 18:14:47 INFO hive.metastore: Connected to metastore.
21/12/07 18:14:47 INFO exec.StatsTask: Table sitelight.t_site_formal stats: [numFiles=1, numRows=0, totalSize=42691574, rawDataSize=0]
21/12/07 18:14:47 INFO ql.Driver: Completed executing command(queryId=hadoop_20211207181447_5c3a828a-2830-4fa6-b852-e4731b760303); Time taken: 0.291 seconds
OK
21/12/07 18:14:47 INFO ql.Driver: OK
Time taken: 0.406 seconds
21/12/07 18:14:47 INFO CliDriver: Time taken: 0.406 seconds
21/12/07 18:14:47 INFO conf.HiveConf: Using the default value passed in for log id: bb608b6a-b9af-4129-bae0-739683648882
21/12/07 18:14:47 INFO session.SessionState: Resetting thread name to main
21/12/07 18:14:47 INFO conf.HiveConf: Using the default value passed in for log id: bb608b6a-b9af-4129-bae0-739683648882
21/12/07 18:14:47 INFO session.SessionState: Deleted directory: /tmp/hive/hadoop/bb608b6a-b9af-4129-bae0-739683648882 on fs with scheme hdfs
21/12/07 18:14:47 INFO session.SessionState: Deleted directory: /tmp/hadoop/bb608b6a-b9af-4129-bae0-739683648882 on fs with scheme file
21/12/07 18:14:47 INFO hive.metastore: Closed a connection to metastore, current connections: 0
21/12/07 18:14:47 INFO hive.HiveImport: Hive import complete.
These are just the problems I personally ran into, written down while they were still fresh. The same error message can stem from different root causes, so if these fixes don't work for you, keep digging!