Date: 2014-05-16

How to delete HBase table data using mapred export and import

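Background:

HBase's delete support is fairly limited: rows can only be deleted one at a time, and each delete must specify the rowkey.

The problem:

Today a requirement came up: a user had imported a large amount of bad data. The rowkeys of the bad records all start with 110102, and these junk records need to be removed. Deleting them one by one in the hbase shell is not practical.

Solution:

Use HBase's MapReduce tools to export and then import the table, filtering out the unwanted rows during the export.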

First, the table's schema:

{NAME => 'freeway.service', FAMILIES => [{NAME => 'service_span_colfam', BLOOMFILTER => 'ROW', VERSIONS => '1', MIN_VERSIONS => '0', TTL => '604800', IN_MEMORY => 'true'}]}

We use HBase's Export tool and filter out the unwanted rows while exporting. Export accepts a regular expression as its filter. Here is Export's usage:

Usage: Export [-D <property=value>]* <tablename> <outputdir> [<versions> [<starttime> [<endtime>]] [^[regex pattern] or [Prefix] to filter]]

  Note: -D properties will be applied to the conf used. 
  For example: 
   -D mapred.output.compress=true
   -D mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec
   -D mapred.output.compression.type=BLOCK
  Additionally, the following SCAN properties can be specified
  to control/limit what is exported..
   -D hbase.mapreduce.scan.column.family=<familyName>

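tablename and outputdir are required. They are followed by the number of versions, starttime, endtime, and the filter (a regular expression or a prefix).

This table keeps only one version, so versions is 1. starttime is set to 0 and endtime to a very large number so that all data is covered. The regular expression must be wrapped in single quotes so that bash does not interpret the question mark and the other special characters in it. Note the doubled ^: Export strips the first one, which only marks the argument as a regular expression, and the second one is the regex's own start-of-string anchor.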

hbase org.apache.hadoop.hbase.mapreduce.Driver export freeway.service hdfs://ns/usr/op1/freeway.service 1 0 999999999999999 '^^(?!110102)'
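To see why this pattern keeps the good rows, the negative lookahead can be tried out with plain `java.util.regex`. The following is only a sketch: it assumes the regex comparator treats a row as matching when the pattern is found in the rowkey (`find()` semantics), and the sample rowkeys and the class name `RowKeyRegexDemo` are made up for illustration.

```java
import java.util.regex.Pattern;

final class RowKeyRegexDemo {
    // The regex left over after Export strips the leading '^' marker from '^^(?!110102)'.
    static final Pattern KEEP = Pattern.compile("^(?!110102)");

    // Assumption: a row is exported when the pattern is found in its rowkey.
    static boolean isExported(String rowKey) {
        return KEEP.matcher(rowKey).find();
    }

    public static void main(String[] args) {
        // Hypothetical rowkeys; only the first starts with the bad prefix.
        String[] sampleKeys = {"110102-0001", "110103-0001", "210102-0001"};
        for (String key : sampleKeys) {
            System.out.println(key + " -> " + (isExported(key) ? "exported" : "dropped"));
        }
    }
}
```

The anchor `^` pins the match to the start of the rowkey, and `(?!110102)` succeeds only when the next six characters are not 110102, so rowkeys with the bad prefix never match and are left out of the export.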

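The table's data, minus the filtered rows, is now stored in a SequenceFile on HDFS.

Next, drop the original table and create it again with the same schema.

Then import the filtered data into the new table: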

hbase org.apache.hadoop.hbase.mapreduce.Driver import freeway.service hdfs://ns/usr/op1/freeway.service/part-m-00000

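Run a scan to check: the bad rows are gone.

Finally, a look at the source code:

The filter argument must start with ^ to be treated as a regular expression, and the rows that satisfy the filter are the ones kept (CompareOp.EQUAL). That is why our regular expression matches rowkeys that do not start with 110102.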

    Filter exportFilter = null;
    // The filter criteria is the optional sixth argument.
    String filterCriteria = (args.length > 5) ? args[5] : null;
    if (filterCriteria == null) return null;
    if (filterCriteria.startsWith("^")) {
      // A leading ^ selects regex mode; it is stripped before the pattern is used.
      String regexPattern = filterCriteria.substring(1, filterCriteria.length());
      exportFilter = new RowFilter(CompareOp.EQUAL, new RegexStringComparator(regexPattern));
    } else {
      // Otherwise the argument is treated as a plain rowkey prefix.
      exportFilter = new PrefixFilter(Bytes.toBytes(filterCriteria));
    }
    return exportFilter;
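The same selection logic can be mimicked without any HBase dependency, which makes it easy to check a filter argument before launching the MapReduce job. This is a rough sketch, not HBase code: `ExportFilterSketch` and `fromCriteria` are hypothetical names, a `Predicate<String>` stands in for the HBase `Filter`, and the regex branch assumes `find()` semantics.

```java
import java.util.function.Predicate;
import java.util.regex.Pattern;

final class ExportFilterSketch {
    // Hypothetical stand-in for Export's filter selection.
    // Returns null when no criteria is given, mirroring the excerpt above.
    static Predicate<String> fromCriteria(String criteria) {
        if (criteria == null) return null;
        if (criteria.startsWith("^")) {
            // Regex mode: drop the marker character, keep rows where the pattern is found.
            Pattern pattern = Pattern.compile(criteria.substring(1));
            return rowKey -> pattern.matcher(rowKey).find();
        }
        // Prefix mode: keep rows whose key starts with the criteria.
        return rowKey -> rowKey.startsWith(criteria);
    }

    public static void main(String[] args) {
        Predicate<String> regexMode = fromCriteria("^^(?!110102)");
        Predicate<String> prefixMode = fromCriteria("110102");
        String badKey = "110102-0001"; // hypothetical rowkey
        System.out.println("regex mode keeps bad row:  " + regexMode.test(badKey));
        System.out.println("prefix mode keeps bad row: " + prefixMode.test(badKey));
    }
}
```

Passing plain 110102 would select prefix mode and export only the bad rows, the opposite of what is wanted here, which is why the regex form with the negative lookahead is used.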
