Date: 2014-05-16

Fast, Efficient Queries in HBase with Filters
This post covers how to use HBase filters to query quickly and efficiently. I will fill it in gradually.

Main filter categories
1. Comparison Filters
     1.1 RowFilter
     1.2 FamilyFilter
     1.3 QualifierFilter
     1.4 ValueFilter
     1.5 DependentColumnFilter
2. Dedicated Filters
     2.1 SingleColumnValueFilter
     2.2 SingleColumnValueExcludeFilter
     2.3 PrefixFilter
     2.4 PageFilter
     2.5 KeyOnlyFilter
     2.6 FirstKeyOnlyFilter
     2.7 TimestampsFilter
     2.8 RandomRowFilter
3. Decorating Filters
     3.1 SkipFilter
     3.2 WhileMatchFilter

 

A simple example: SingleColumnValueFilter

 

 public static void selectByFilter(String tablename, List<String> arr) throws IOException {
        HTable table = new HTable(hbaseConfig, tablename);
        // FilterList defaults to MUST_PASS_ALL, so the conditions are combined with AND
        FilterList filterList = new FilterList();
        Scan s1 = new Scan();
        for (String v : arr) { // each entry has the form "family,qualifier,value"
            String[] s = v.split(",");
            filterList.addFilter(new SingleColumnValueFilter(Bytes.toBytes(s[0]),
                                                             Bytes.toBytes(s[1]),
                                                             CompareOp.EQUAL, Bytes.toBytes(s[2])
                                                             )
            );
            // Uncommenting the next line returns only the specified cell; other cells in the same row are not returned
//          s1.addColumn(Bytes.toBytes(s[0]), Bytes.toBytes(s[1]));
        }
        s1.setFilter(filterList);
        ResultScanner scanner = table.getScanner(s1);
        try {
            for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
                for (KeyValue kv : rr.list()) {
                    // Bytes.toString decodes as UTF-8 instead of the platform default charset
                    System.out.println("row : " + Bytes.toString(kv.getRow()));
                    System.out.println("column : " + Bytes.toString(kv.getColumn()));
                    System.out.println("value : " + Bytes.toString(kv.getValue()));
                }
            }
        } finally {
            // Release the scanner and the table even if the scan fails
            scanner.close();
            table.close();
        }
    }
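
To make the AND semantics of the method above concrete, here is a minimal in-memory sketch in plain Java. It does not touch HBase: the class AndConditionsSketch, its methods, and the sample rows are all hypothetical names introduced for illustration. One assumption to note: the model treats a row that lacks the column as a mismatch, which corresponds to calling setFilterIfMissing(true) on SingleColumnValueFilter. By default the real filter lets rows without the column through.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model only; no HBase classes are involved.
class AndConditionsSketch {

    // True when the row holds exactly the expected value under "family:qualifier".
    // Assumption: a missing column counts as a mismatch.
    static boolean matches(Map<String, String> row, String spec) {
        String[] s = spec.split(","); // same "family,qualifier,value" format as above
        if (s.length != 3) {
            throw new IllegalArgumentException("expected family,qualifier,value: " + spec);
        }
        return s[2].equals(row.get(s[0] + ":" + s[1]));
    }

    // Returns the keys of the rows that satisfy every condition (logical AND).
    static List<String> select(Map<String, Map<String, String>> table, List<String> specs) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : table.entrySet()) {
            boolean all = true;
            for (String spec : specs) {
                if (!matches(entry.getValue(), spec)) {
                    all = false;
                    break;
                }
            }
            if (all) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    // Builds one row from alternating column/value arguments.
    static Map<String, String> row(String... columnValuePairs) {
        Map<String, String> cells = new LinkedHashMap<>();
        for (int i = 0; i + 1 < columnValuePairs.length; i += 2) {
            cells.put(columnValuePairs[i], columnValuePairs[i + 1]);
        }
        return cells;
    }

    // Hypothetical sample data, keyed by row key.
    static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("row1", row("info:city", "Beijing", "info:status", "active"));
        table.put("row2", row("info:city", "Beijing", "info:status", "inactive"));
        table.put("row3", row("info:status", "active")); // no info:city cell
        return table;
    }

    public static void main(String[] args) {
        List<String> specs = Arrays.asList("info,city,Beijing", "info,status,active");
        System.out.println(select(sampleTable(), specs)); // prints [row1]
    }
}
```

Only row1 satisfies both conditions. row2 fails the status check, and row3 is dropped because it has no info:city cell, which is where this model and the default HBase behavior differ.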


MultipleColumnPrefixFilter

The API documentation describes it as follows:

This filter is used for selecting only those keys with columns that matches a particular prefix. For example, if prefix is 'an', it will pass keys will columns like 'and', 'anti' but not keys with columns like 'ball', 'act'. 
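
The prefix rule quoted above can be sketched in plain Java as a byte-wise comparison on column qualifiers. This is only an illustration of the matching rule, not the HBase implementation; ColumnPrefixSketch and its methods are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration of the matching rule only; not the HBase implementation.
class ColumnPrefixSketch {

    // True when the qualifier starts with the given prefix, compared byte by byte.
    static boolean hasPrefix(byte[] qualifier, byte[] prefix) {
        if (qualifier.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (qualifier[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // Keeps the qualifiers that match at least one of the prefixes.
    static List<String> keep(List<String> qualifiers, String... prefixes) {
        List<String> kept = new ArrayList<>();
        for (String q : qualifiers) {
            byte[] qBytes = q.getBytes(StandardCharsets.UTF_8);
            for (String p : prefixes) {
                if (hasPrefix(qBytes, p.getBytes(StandardCharsets.UTF_8))) {
                    kept.add(q);
                    break; // one matching prefix is enough
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("and", "anti", "ball", "act");
        System.out.println(keep(columns, "an"));       // prints [and, anti]
        System.out.println(keep(columns, "an", "ba")); // prints [and, anti, ball]
    }
}
```

With several prefixes, a column passes as soon as any one of them matches, which is the difference between the multiple-prefix filter and a single-prefix one.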

The constructor is:

public MultipleColumnPrefixFilter(byte[][] prefixes)

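It takes multiple prefixes. The source code shows that the prefixes must be distinct; a duplicate causes an IllegalArgumentException: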

public MultipleColumnPrefixFilter(final byte [][] prefixes) {
     if (prefixes != null) {
       for (int i = 0; i < prefixes.length; i++) {
         if (!sortedPrefixes.add(prefixes[i]))
           throw new IllegalArgumentException ("prefixes must be distinct");
       }
     }
   }
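
The duplicate check in the constructor relies on add returning false for an element that is already in the set. The sketch below reproduces that behavior in plain Java, assuming sortedPrefixes is a sorted set ordered by byte-wise comparison. DistinctPrefixesSketch and collect are hypothetical names, and Arrays.compareUnsigned (Java 9+) stands in for whatever comparator HBase actually uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.TreeSet;

// Stand-in for the constructor logic; not the HBase class itself.
class DistinctPrefixesSketch {

    // Collects the prefixes into a sorted set and rejects duplicates.
    static TreeSet<byte[]> collect(byte[][] prefixes) {
        // byte[] has identity-based equals, so content comparison needs an explicit comparator.
        Comparator<byte[]> byteOrder = Arrays::compareUnsigned;
        TreeSet<byte[]> sorted = new TreeSet<>(byteOrder);
        if (prefixes != null) {
            for (byte[] prefix : prefixes) {
                // add returns false when an equal element (per the comparator) already exists
                if (!sorted.add(prefix)) {
                    throw new IllegalArgumentException("prefixes must be distinct");
                }
            }
        }
        return sorted;
    }

    // Small helper to turn a string into UTF-8 bytes.
    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(collect(new byte[][] {bytes("an"), bytes("ba")}).size()); // prints 2
        try {
            collect(new byte[][] {bytes("an"), bytes("an")});
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints prefixes must be distinct
        }
    }
}
```

A hash-based set of byte[] would not catch the duplicate here, because two arrays with the same content are still different objects. Ordering by content is what makes the check work.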

Sample code follows. I found it online, and it is straightforward to read.

+public class TestMultipleColumnPrefixFilter {
+
+  private final static HBaseTestingUtility TEST_UTIL = new
+      HBaseTestingUtility();
+
+  @Test
+  public void testMultipleColumnPrefixFilter() throws IOException {
+    String family = "Family";
+    HTableDescriptor htd = new HTableDescriptor("TestMultipleColumnPrefixFilter");
+    htd.addFamily(new HColumnDescriptor(family));
+    // HRegionInfo info = new HRegionInfo(htd, null, null, false);
+    HRegionInfo info = new HRegionInfo(htd.getName(), null, null, false);
+    HRegion region = HRegion.createHRegion(info, HBaseTestingUtility.
+        getTestDir(), TEST_UTIL.getConfiguration(), htd);
+
+    List<String> rows = generateRandomWords(100