[Gandalf] Importing Oracle 11g Data into HBase 0.96 with sqoop-1.4.4.bin__hadoop-2.0.4-alpha


Environment: Hadoop 2.2.0, HBase 0.96, sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz, Oracle 11g, JDK 1.7, Ubuntu 14 Server
One gripe about the environment: the newest Sqoop 1.99.3 is far too limited. It only supports importing data into HDFS, with no other target options at all. Pretty crude! (If you disagree, discussion and solutions are welcome.)
Command: sqoop import --connect jdbc:oracle:thin:@192.168.0.147:1521:ORCLGBK --username ZHAOBIAO -P --table CMS_NEWS_0625 --hbase-create-table --hbase-table 147patents --column-family patentinfo
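After the job finishes, a quick sanity check is to scan a few rows of the target table from the HBase shell. This is a minimal sketch using the table and column-family names from the command above; it assumes the `hbase` binary is on the PATH of a node that can reach the cluster.

```shell
# Scan the first 3 rows of the imported table to confirm the data landed.
# The LIMIT option keeps the output short; drop it to see everything.
echo "scan '147patents', {LIMIT => 3}" | hbase shell
```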

A few things to note:
1. The Oracle table name must be uppercase.
2. The username must be uppercase.
3. I originally intended to build a composite row key with the following option: --hbase-row-key create_time,publish_time,operate_time,title
but it always failed with the following error:

Error: java.io.IOException: Could not insert row with null value for row-key column: OPERATE_TIME
    at org.apache.sqoop.hbase.ToStringPutTransformer.getPutCommand(ToStringPutTransformer.java:125)
    at org.apache.sqoop.hbase.HBasePutProcessor.accept(HBasePutProcessor.java:142)
    at org.apache.sqoop.mapreduce.DelegatingOutputFormat$DelegatingRecordWriter.write(DelegatingOutputFormat.java:128)
    at org.apache.sqoop.mapreduce.DelegatingOutputFormat$DelegatingRecordWriter.write(DelegatingOutputFormat.java:92)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:634)
    at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
    at org.apache.sqoop.mapreduce.HBaseImportMapper.map(HBaseImportMapper.java:38)
    at org.apache.sqoop.mapreduce.HBaseImportMapper.map(HBaseImportMapper.java:31)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:339)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
However, after removing that option the import ran normally, with the row key being the original table's primary key id. The root cause is that Sqoop refuses to build a row key from a column containing NULLs (here OPERATE_TIME), since every component of an HBase row key must be non-null. This issue remains to be solved!
