sqoop相关整理记录,sqoop相关整理


生产背景:

在从mysql导入到hive,遇到如下问题:

 1) 源mysql和集群机器不在同一个网段中,导致执行导入命令,网络连接失败。

 2) 某些字符导入到hive中,出现报错终止。 

    2.1  sqoop使用的JDBC-connector 版本太低(更换版本)。

hive导出到mysql中,遇到如下问题:

 1)某些字符插入mysql,出现报错终止。

   1.1 可能mysql本身编码的限制,某些字符不支持,比如uft8utf8mb4

   1.2  sqoop使用的JDBC-connector 版本太低(更换版本)。

 

sqoop 缺点:

 1  基于命令行的操作方式,易出错,且不安全。

 2  数据传输和数据格式是紧耦合的,这使得connector无法支持所有的数据格式

 3  用户名和密码暴漏出来

 4  sqoop安装需要root权限

Sqoop优点:

 

1   高效可控的利用资源,任务并行度,超时时间。

2   数据类型映射与转化,可自动进行,用户也可自定义 .


3   支持多种主流数据库,MySQL,OracleSQL ServerDB2等等 

 

 

Sqoop原理:

 

Sqoopinport原理:

    Sqoopimport时,需要制定split-by参数。Sqoop根据不同的split-by参数值来进行切分,然后将切分出来的区域分配到不同map中。每个map中再处理数据库中获取的一行一行的值,写入到HDFS中。同时split-by根据不同的参数类型有不同的切分方法,如比较简单的int型,Sqoop会取最大和最小split-by字段值,然后根据传入的num-mappers来确定划分几个区域。 比如select max(split_by),min(split-by) from得到的max(split-by)min(split-by)分别为10001,而num-mappers2的话,则会分成两个区域(1,500)(501-100),同时也会分成2sql2map去进行导入操作,分别为select XXX from table where split-by>=1 and split-by<500select XXX from table where split-by>=501 and split-by<=1000。最后每个map各自获取各自SQL中的数据进行导入工作。

Sqoopexport原理:根据mysql表名称,生成一个以表名称命名的Java类,该类继承了sqoopRecord的,是一个只有MapMR,且自定义了输出字段。

    

 

 

 

 

sqoop export --connect jdbc:mysql://$url:3306/$3?characterEncoding=utf8 --username $username --password $password --table $1 --export-dir $2 --input-fields-terminated-by '|' --null-non-string '0' --null-string '0';

验证sqoop的各种报错:

 

 

mysql表

create  table dm_trlog (

plat     varchar(20),

user_id  varchar(20),

click_time   varchar(20),

click_url    varchar(200)

)

hive 表

CREATE TABLE TRLOG

(

PLATFORM string,

USER_ID int,

CLICK_TIME string,

CLICK_URL string

)

row format delimited

fields terminated by '\t';

 

sqoop list-databases –connect jdbc:mysql://192.168.119.129:3306/ –username li72 –password 123

1 mysql字段太短

 

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

 

 

Warning: $HADOOP_HOME is deprecated.

 

14/11/06 01:42:32 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 01:42:32 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 01:42:32 INFO tool.CodeGenTool: Beginning code generation

14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:42:34 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:42:34 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop

Note: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

14/11/06 01:42:38 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/d5e37c20a9231b3253c97fc27d16d8a9/dm_trlog.jar

14/11/06 01:42:38 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog

14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:42:43 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:42:43 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/11/06 01:42:43 WARN snappy.LoadSnappy: Snappy native library not loaded

14/11/06 01:42:44 INFO mapred.JobClient: Running job: job_201411060114_0001

14/11/06 01:42:45 INFO mapred.JobClient:  map 0% reduce 0%

14/11/06 01:43:15 INFO mapred.JobClient: Task Id : attempt_201411060114_0001_m_000000_0, Status : FAILED

java.io.IOException: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'click_time' at row 1

        at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:192)

        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:651)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:766)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Data too long for column 'click_time' at row 1

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4118)

        at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4052)

        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2503)

        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2664)

        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2794)

        at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2155)

        at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1379)

        at org.apache.sqoop.mapreduce.AsyncSqlOutputFormat$AsyncSqlExecThread.run(AsyncSqlOutputFormat.java:233)

 

mysql字段太短了

 

drop table  dm_trlog;

create  table dm_trlog (

plat     varchar(20),

user_id  varchar(20),

click_time   varchar(200),

click_url    varchar(200)

)

 

2 hive的空字段转换 

由于Hive的NULL用\N来表示,字段用\001来分割,换行用\n来换行,导出分隔符一定要和hive表保持一致,如果为空可以指定转换为0,有些mysql数字字段不能插入\N

加上两个参数:--input-null-string '\\N' --input-null-non-string '\\N'多加一个'\',是为转义

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

14/10/23 04:53:47 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/915d24128af59c9e517580e0f07411d4/dm_pc_play_kpi.jar 
14/10/23 04:53:47 INFO mapreduce.ExportJobBase: Beginning export of dm_pc_play_kpi 
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1 
14/10/23 04:53:48 INFO input.FileInputFormat: Total input paths to process : 1 
14/10/23 04:53:48 WARN snappy.LoadSnappy: Snappy native library is available 
14/10/23 04:53:48 INFO util.NativeCodeLoader: Loaded the native-hadoop library 
14/10/23 04:53:48 INFO snappy.LoadSnappy: Snappy native library loaded 
14/10/23 04:53:48 INFO mapred.JobClient: Running job: job_201408301703_84117 
14/10/23 04:53:49 INFO mapred.JobClient: map 0% reduce 0% 
14/10/23 04:55:45 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_0, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.lang.NumberFormatException: For input string: "N" 
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241) 
at java.lang.Float.valueOf(Float.java:417) 
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335) 
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) 
... 10 more 

14/10/23 04:55:53 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_1, Status : FAILED 
14/10/23 04:55:58 INFO mapred.JobClient: Task Id : attempt_201408301703_84117_m_000000_2, Status : FAILED 
java.io.IOException: Can't export data, please check task tracker logs 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) 
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764) 
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370) 
at org.apache.hadoop.mapred.Child$4.run(Child.java:255) 
at java.security.AccessController.doPrivileged(Native Method) 
at javax.security.auth.Subject.doAs(Subject.java:415) 
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149) 
at org.apache.hadoop.mapred.Child.main(Child.java:249) 
Caused by: java.lang.NumberFormatException: For input string: "N" 
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1241) 
at java.lang.Float.valueOf(Float.java:417) 
at dm_pc_play_kpi.__loadFromFields(dm_pc_play_kpi.java:335) 
at dm_pc_play_kpi.parse(dm_pc_play_kpi.java:282) 
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83) 
... 10 more 

 

3 分隔符错误

Hive中的分隔符是’\t’ 但是导出写成’|’

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '|' --null-non-string '0' --null-string '0';

 为了测试,特意把分隔符改成 "|"

 

 

14/11/06 01:50:19 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 01:50:19 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 01:50:19 INFO tool.CodeGenTool: Beginning code generation

14/11/06 01:50:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:50:21 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `dm_trlog` AS t LIMIT 1

14/11/06 01:50:21 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/bigdata/hadoop

Note: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

14/11/06 01:50:25 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/e474b3f8292f91dd4134b302ae35df19/dm_trlog.jar

14/11/06 01:50:25 INFO mapreduce.ExportJobBase: Beginning export of dm_trlog

14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:50:45 INFO input.FileInputFormat: Total input paths to process : 1

14/11/06 01:50:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/11/06 01:50:45 WARN snappy.LoadSnappy: Snappy native library not loaded

14/11/06 01:50:51 INFO mapred.JobClient: Running job: job_201411060114_0003

14/11/06 01:50:52 INFO mapred.JobClient:  map 0% reduce 0%

14/11/06 01:51:20 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000000_0, Status : FAILED

java.io.IOException: Can't export data, please check task tracker logs

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)

        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.util.NoSuchElementException

        at java.util.AbstractList$Itr.next(AbstractList.java:350)

        at dm_trlog.__loadFromFields(dm_trlog.java:252)

        at dm_trlog.parse(dm_trlog.java:201)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)

        ... 10 more

 

14/11/06 01:51:21 INFO mapred.JobClient: Task Id : attempt_201411060114_0003_m_000001_0, Status : FAILED

java.io.IOException: Can't export data, please check task tracker logs

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)

        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)

        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)

        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)

        at org.apache.hadoop.mapred.Child.main(Child.java:249)

Caused by: java.util.NoSuchElementException

        at java.util.AbstractList$Itr.next(AbstractList.java:350)

        at dm_trlog.__loadFromFields(dm_trlog.java:252)

        at dm_trlog.parse(dm_trlog.java:201)

        at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)

        ... 10 more

 

这样的报错应该是分隔符的问题

 

那怎么解决呢?

 

4  mysql的网络不在集群网络中

 

sqoop export --connect jdbc:mysql://192.168.119.1:3306/student?characterEncoding=utf8 --username li72 --password 123 --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

 

 

 

Warning: $HADOOP_HOME is deprecated.

 

14/11/06 02:04:29 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 02:04:30 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 02:04:30 INFO tool.CodeGenTool: Beginning code generation

14/11/06 02:07:40 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)

        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)

        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)

        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)

        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection timed out

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        ... 32 more

 

5  mysql停止服务

14/11/06 04:55:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.

14/11/06 04:55:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory

14/11/06 04:55:25 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:

14/11/06 04:55:25 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

14/11/06 04:55:25 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.MySQLManager@8a0d5d

14/11/06 04:55:25 INFO tool.CodeGenTool: Beginning code generation

14/11/06 04:55:26 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.

14/11/06 04:55:27 ERROR manager.SqlManager: Error executing statement: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:605)

        at org.apache.sqoop.manager.SqlManager.execute(SqlManager.java:628)

        at org.apache.sqoop.manager.SqlManager.getColumnTypesForRawQuery(SqlManager.java:235)

        at org.apache.sqoop.manager.SqlManager.getColumnTypes(SqlManager.java:219)

        at org.apache.sqoop.manager.ConnManager.getColumnTypes(ConnManager.java:283)

        at org.apache.sqoop.orm.ClassWriter.getColumnTypes(ClassWriter.java:1255)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1072)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection refused

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        ... 32 more

14/11/06 04:55:27 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.

14/11/06 04:55:27 ERROR manager.CatalogQueryManager: Failed to list columns from query: SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA = (SELECT SCHEMA())   AND TABLE_NAME = 'dm_trlog' 

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1117)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:355)

        at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2461)

        at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2498)

        at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2283)

        at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:822)

        at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)

        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)

        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)

        at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)

        at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:404)

        at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:317)

        at java.sql.DriverManager.getConnection(DriverManager.java:582)

        at java.sql.DriverManager.getConnection(DriverManager.java:185)

        at org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:745)

        at org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)

        at org.apache.sqoop.manager.CatalogQueryManager.getColumnNames(CatalogQueryManager.java:147)

        at org.apache.sqoop.orm.ClassWriter.getColumnNames(ClassWriter.java:1222)

        at org.apache.sqoop.orm.ClassWriter.generate(ClassWriter.java:1074)

        at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:82)

        at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:64)

        at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)

        at org.apache.sqoop.Sqoop.run(Sqoop.java:145)

        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)

        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:238)

Caused by: java.net.ConnectException: Connection refused

        at java.net.PlainSocketImpl.socketConnect(Native Method)

        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)

        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)

        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)

        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

        at java.net.Socket.connect(Socket.java:529)

        at java.net.Socket.connect(Socket.java:478)

        at java.net.Socket.<init>(Socket.java:375)

        at java.net.Socket.<init>(Socket.java:218)

        at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:259)

        at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:305)

        ... 28 more

14/11/06 04:55:27 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.

java.lang.RuntimeException: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 

6  Sqoop调式信息

 增加关键字--verbose

sqoop export --connect jdbc:mysql://192.168.119.129:3306/student?characterEncoding=utf8 --username li72 --password 123 --verbose --table dm_trlog --export-dir /home/bigdata/hive/data/db1.db/trlog --input-fields-terminated-by '\t' --null-non-string '0' --null-string '0';

7 修改生成的Java类,重新打包。

每次通过sqoop导入MySql的时,都会生成一个以MySql表命名的.java文件,然后打成JAR包,给sqoop提交给hadoop MR来解析Hive表中的数据。那可以根据报的错误,找到对应的行,改写该文件,编译,重新打包,sqoop可以通过 -jar-file --class-name 组合让我们指定运行自己的jar包中的某个class。来解析该hive表中的每行数据。脚本如下:一个完整的例子如下: 
./bin/sqoop export --connect "jdbc:mysql://cmg3.chamago.com/ehub?useUnicode=true&characterEncoding=utf-8" 
--username gavin.peng --password gavin.peng --table bi_weekly_sales_item 
--export-dir /hive/warehouse/hive_bi_weekly_sales_item --input-fields-terminated-by '\t' 
--input-null-string '\\N' --input-null-non-string '\\N' 
--class-name com.chamago.sqoop.codegen.bi_weekly_sales_item 
--jar-file /tmp/sqoop-chamago/bi_weekly_sales_item.jar 
上面--jar-file 参数指定jar包的路径。--class-name 指定jar包中的class 
这样就可以解决所有解析异常了。 

Sqoop命令行说明

Common arguments:
   --connect <jdbc-uri> Specify JDBC connect
                                                string
   --connection-manager <class-name> Specify connection manager
                                                class name
   --connection-param-file <properties-file> Specify connection
                                                parameters file
   --driver <class-name> Manually specify JDBC
                                                driver class to use
   --hadoop-home <hdir> Override
                                                $HADOOP_MAPRED_HOME_ARG
   --hadoop-mapred-home <dir> Override
                                                $HADOOP_MAPRED_HOME_ARG
   --help Print usage instructions
-P Read password from console
   --password <password> Set authentication
                                                password
   --password-file <password-file> Set authentication
                                                password file path
   --relaxed-isolation Use read-uncommitted
                                                isolation for imports
   --skip-dist-cache Skip copying jars to
                                                distributed cache
   --username <username> Set authentication
                                                username
   --verbose Print more information
                                                while working
 
Import control arguments:
   --append Imports data
                                                              in append
                                                              mode
   --as-avrodatafile Imports data
                                                              to Avro data
                                                              files
   --as-sequencefile Imports data
                                                              to
                                                              SequenceFile
                                                              s
   --as-textfile Imports data
                                                              as plain
                                                              text
                                                              (default)
   --boundary-query <statement> Set boundary
                                                              query for
                                                              retrieving
                                                              max and min
                                                              value of the
                                                              primary key
   --columns <col,col,col...> Columns to
                                                              import from
                                                              table
   --compression-codec <codec> Compression
                                                              codec to use
                                                              for import
   --delete-target-dir Imports data
                                                              in delete
                                                              mode
   --direct Use direct
                                                              import fast
                                                              path
   --direct-split-size <n> Split the
                                                              input stream
                                                              every 'n'
                                                              bytes when
                                                              importing in
                                                              direct mode
-e,--query <statement> Import
                                                              results of
                                                              SQL
                                                              'statement'
   --fetch-size <n> Set number
                                                              'n' of rows
                                                              to fetch
                                                              from the
                                                              database
                                                              when more
                                                              rows are
                                                              needed
   --inline-lob-limit <n> Set the
                                                              maximum size
                                                              for an
                                                              inline LOB
-m,--num-mappers <n> Use 'n' map
                                                              tasks to
                                                              import in
                                                              parallel
   --mapreduce-job-name <name> Set name for
                                                              generated
                                                              mapreduce
                                                              job
   --split-by <column-name> Column of
                                                              the table
                                                              used to
                                                              split work
                                                              units
   --table <table-name> Table to
                                                              read
   --target-dir <dir> HDFS plain
                                                              table
                                                              destination
   --validate Validate the
                                                              copy using
                                                              the
                                                              configured
                                                              validator
   --validation-failurehandler <validation-failurehandler> Fully
                                                              qualified
                                                              class name
                                                              for
                                                              ValidationFa
                                                              ilureHandler
   --validation-threshold <validation-threshold> Fully
                                                              qualified
                                                              class name
                                                              for
                                                              ValidationTh
                                                              reshold
   --validator <validator> Fully
                                                              qualified
                                                              class name
                                                              for the
                                                              Validator
   --warehouse-dir <dir> HDFS parent
                                                              for table
                                                              destination
   --where <where clause> WHERE clause
                                                              to use
                                                              during
                                                              import
-z,--compress Enable
                                                              compression
 
Incremental import arguments:
   --check-column <column> Source column to check for incremental
                                  change
   --incremental <import-type> Define an incremental import of type
                                  'append' or 'lastmodified'
   --last-value <value> Last imported value in the incremental
                                  check column
 
Output line formatting arguments:
   --enclosed-by <char> Sets a required field enclosing
                                      character
   --escaped-by <char> Sets the escape character
   --fields-terminated-by <char> Sets the field separator character
   --lines-terminated-by <char> Sets the end-of-line character
   --mysql-delimiters Uses MySQL's default delimiter set:
                                      fields: , lines: \n escaped-by: \
                                      optionally-enclosed-by: '
   --optionally-enclosed-by <char> Sets a field enclosing character
 
Input parsing arguments:
   --input-enclosed-by <char> Sets a required field encloser
   --input-escaped-by <char> Sets the input escape
                                            character
   --input-fields-terminated-by <char> Sets the input field separator
   --input-lines-terminated-by <char> Sets the input end-of-line
                                            char
   --input-optionally-enclosed-by <char> Sets a field enclosing
                                            character
 
Hive arguments:
   --create-hive-table Fail if the target hive
                                               table exists
   --hive-database <database-name> Sets the database name to
                                               use when importing to hive
   --hive-delims-replacement <arg> Replace Hive record \0x01
                                               and row delimiters (\n\r)
                                               from imported string fields
                                               with user-defined string
   --hive-drop-import-delims Drop Hive record \0x01 and
                                               row delimiters (\n\r) from
                                               imported string fields
   --hive-home <dir> Override $HIVE_HOME
   --hive-import Import tables into Hive
                                               (Uses Hive's default
                                               delimiters if none are
                                               set.)
   --hive-overwrite Overwrite existing data in
                                               the Hive table
   --hive-partition-key <partition-key> Sets the partition key to
                                               use when importing to hive
   --hive-partition-value <partition-value> Sets the partition value to
                                               use when importing to hive
   --hive-table <table-name> Sets the table name to use
                                               when importing to hive
   --map-column-hive <arg> Override mapping for
                                               specific column to hive
                                               types.
 
HBase arguments:
   --column-family <family> Sets the target column family for the
                               import
   --hbase-bulkload Enables HBase bulk loading
   --hbase-create-table If specified, create missing HBase tables
   --hbase-row-key <col> Specifies which input column to use as the
                               row key
   --hbase-table <table> Import to <table> in HBase
 
HCatalog arguments:
   --hcatalog-database <arg> HCatalog database name
   --hcatalog-home <hdir> Override $HCAT_HOME
   --hcatalog-table <arg> HCatalog table name
   --hive-home <dir> Override $HIVE_HOME
   --hive-partition-key <partition-key> Sets the partition key to
                                               use when importing to hive
   --hive-partition-value <partition-value> Sets the partition value to
                                               use when importing to hive
   --map-column-hive <arg> Override mapping for
                                               specific column to hive
                                               types.
 
HCatalog import specific options:
   --create-hcatalog-table Create HCatalog before import
   --hcatalog-storage-stanza <arg> HCatalog storage stanza for table
                                      creation
 
Accumulo arguments:
   --accumulo-batch-size <size> Batch size in bytes
   --accumulo-column-family <family> Sets the target column family for
                                         the import
   --accumulo-create-table If specified, create missing
                                         Accumulo tables
   --accumulo-instance <instance> Accumulo instance name.
   --accumulo-max-latency <latency> Max write latency in milliseconds
   --accumulo-password <password> Accumulo password.
   --accumulo-row-key <col> Specifies which input column to
                                         use as the row key
   --accumulo-table <table> Import to <table> in Accumulo
   --accumulo-user <user> Accumulo user name.
   --accumulo-visibility <vis> Visibility token to be applied to
                                         all rows imported
   --accumulo-zookeepers <zookeepers> Comma-separated list of
                                         zookeepers (host:port)
 
Code generation arguments:
   --bindir <dir> Output directory for compiled
                                         objects
   --class-name <name> Sets the generated class name.
                                         This overrides --package-name.
                                         When combined with --jar-file,
                                         sets the input class.
   --input-null-non-string <null-str> Input null non-string
                                         representation
   --input-null-string <null-str> Input null string representation
   --jar-file <file> Disable code generation; use
                                         specified jar
   --map-column-java <arg> Override mapping for specific
                                         columns to java types
   --null-non-string <null-str> Null non-string representation
   --null-string <null-str> Null string representation
   --outdir <dir> Output directory for generated
                                         code
   --package-name <name> Put auto-generated classes in
                                         this package


 

 


linux下用sqoop导入mysql有个表最多只可以导入10580条记录,怎更改才可以让它放多点

可以和我讨论mysql数据库的,你现在能手动插入数据记录吗?不能导入的时候有什么提示呢?
 

一个sqoop 导入数据库的问题 不知道什原因 我把权限都给了 怎提示我权限不对

应该是访问mysql权限的问题。
 

相关内容

    暂无相关文章