Date: 2014-05-16

HBase source code analysis: the full Get flow on the RegionServer
When the RegionServer receives a Get request from a client, it calls the following interface:
public Result get(byte[] regionName, Get get)
{
    ...
    HRegion region = getRegion(regionName);
    return region.get(get, getLockFromId(get.getLockId()));
    ...
}

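Looking at HRegion.get: it first checks the column families, making sure that the families named in the Get match those of the table, and then returns the result through RegionScanner.next.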
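The family check described above can be sketched roughly as follows. This is a simplified illustration, not HBase's code: the class name FamilyCheckSketch and the method checkFamilies are hypothetical, families are modelled as String instead of byte[], and a plain IllegalArgumentException stands in for whatever exception HBase actually raises.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: reject a request that names a family the table does not have.
final class FamilyCheckSketch {

    // Assumption: families are Strings here; HBase itself works with byte[] names.
    static void checkFamilies(Set<String> tableFamilies, List<String> requestedFamilies) {
        for (String family : requestedFamilies) {
            if (!tableFamilies.contains(family)) {
                throw new IllegalArgumentException(
                    "Column family " + family + " does not exist in table");
            }
        }
    }

    public static void main(String[] args) {
        Set<String> tableFamilies = new HashSet<>(Arrays.asList("cf1", "cf2"));
        checkFamilies(tableFamilies, Arrays.asList("cf1"));   // passes silently
        try {
            checkFamilies(tableFamilies, Arrays.asList("cf3"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Doing the check before any scanner is built means an invalid request fails fast, without touching the memstore or any store file.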

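Scanner is the central class in the HBase read path, so here is a rough overview first:
By scan scope there are four kinds of scanner: RegionScanner, StoreScanner, MemstoreScanner, and HFileScanner. Their roles are easy to infer from the names. As for how they relate: a RegionScanner is made up of one or more StoreScanners, and a StoreScanner is made up of a MemstoreScanner and HFileScanners.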

Next, look at how a RegionScanner is constructed:
List<KeyValueScanner> scanners = new ArrayList<KeyValueScanner>();
for (Map.Entry<byte[], NavigableSet<byte[]>> entry :
        scan.getFamilyMap().entrySet())
{
    Store store = stores.get(entry.getKey());
    scanners.add(store.getScanner(scan, entry.getValue()));
}
this.storeHeap = new KeyValueHeap(scanners, comparator);

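This code initializes storeHeap, an internal field of RegionScanner. It holds one StoreScanner for each column family requested by the scan, that is, the StoreScanners of the Stores under the Region. storeHeap is a KeyValueHeap, and as the name suggests, this is where results are pulled from.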
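To illustrate the idea of pulling results from a heap of scanners, here is a minimal k-way merge sketch. It is not HBase's KeyValueHeap: the names HeapMergeSketch, Source, and mergeSorted are hypothetical, and plain sorted String lists stand in for scanners over sorted KeyValues.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch of merging several sorted sources through a heap.
final class HeapMergeSketch {

    // Wraps one sorted source and remembers its current head element.
    static final class Source {
        final Iterator<String> it;
        String head;

        Source(List<String> sorted) {
            this.it = sorted.iterator();
            advance();
        }

        void advance() {
            head = it.hasNext() ? it.next() : null;
        }
    }

    // Repeatedly takes the source with the smallest head, emits it, and re-inserts the source.
    static List<String> mergeSorted(List<List<String>> sortedInputs) {
        PriorityQueue<Source> heap =
            new PriorityQueue<>((a, b) -> a.head.compareTo(b.head));
        for (List<String> input : sortedInputs) {
            Source s = new Source(input);
            if (s.head != null) {
                heap.add(s);            // exhausted sources never enter the heap
            }
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Source s = heap.poll();
            out.add(s.head);
            s.advance();
            if (s.head != null) {
                heap.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> inputs = Arrays.asList(
            Arrays.asList("row1/cf1", "row3/cf1"),
            Arrays.asList("row1/cf2", "row2/cf2"));
        // prints [row1/cf1, row1/cf2, row2/cf2, row3/cf1]
        System.out.println(mergeSorted(inputs));
    }
}
```

Each source only needs to be sorted on its own; the heap yields one globally sorted stream without materializing all the data at once.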

Next, look at store.getScanner(scan, entry.getValue()), that is, how a StoreScanner is constructed:
// StoreScanner is a scanner for both the memstore and the HStore files
List<KeyValueScanner> scanners = new LinkedList<KeyValueScanner>();
// First the store file scanners
if (memOnly == false) {
    List<StoreFileScanner> sfScanners = StoreFileScanner
        .getScannersForStoreFiles(store.getStorefiles(), cacheBlocks, isGet);

    // include only those scan files which pass all filters
    for (StoreFileScanner sfs : sfScanners) {
        if (sfs.shouldSeek(scan, columns)) {
            scanners.add(sfs);
        }
    }
}
// Then the memstore scanners
if ((filesOnly == false) && (this.store.memstore.shouldSeek(scan))) {
    scanners.addAll(this.store.memstore.getScanners());
}
return scanners;

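In the usual case, both the store file scanners (which wrap HFileScanners) and the MemStoreScanner are added to the StoreScanner.
A StoreFileScanner holds an HFileScanner and an HFile.Reader internally. Before a StoreFileScanner is added, shouldSeek filters out some of them based on timestamp, columns, and bloom filter.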
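The pruning step can be pictured with the following sketch. Everything in it is an assumption made for illustration: FileMeta, shouldSeek, and selectFiles are hypothetical names, the time range is treated as inclusive on both ends, and an exact Set of row keys stands in for a bloom filter (a real bloom filter is probabilistic and may report false positives). The column check is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: skip store files that cannot contain the requested data.
final class StoreFileFilterSketch {

    // Assumed per-file metadata: timestamp bounds plus a stand-in for a row bloom filter.
    static final class FileMeta {
        final String name;
        final long minTimestamp;
        final long maxTimestamp;
        final Set<String> rows;

        FileMeta(String name, long minTimestamp, long maxTimestamp, String... rows) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
            this.rows = new HashSet<>(Arrays.asList(rows));
        }
    }

    // A file is worth seeking only if its time range overlaps and it may contain the row.
    static boolean shouldSeek(FileMeta file, String row, long minTs, long maxTs) {
        boolean timeOverlaps = file.minTimestamp <= maxTs && minTs <= file.maxTimestamp;
        return timeOverlaps && file.rows.contains(row);
    }

    static List<String> selectFiles(List<FileMeta> files, String row, long minTs, long maxTs) {
        List<String> selected = new ArrayList<>();
        for (FileMeta file : files) {
            if (shouldSeek(file, row, minTs, maxTs)) {
                selected.add(file.name);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<FileMeta> files = Arrays.asList(
            new FileMeta("hfile-old", 0L, 99L, "row1"),
            new FileMeta("hfile-new", 100L, 199L, "row1", "row2"),
            new FileMeta("hfile-other", 100L, 199L, "row9"));
        // prints [hfile-new]
        System.out.println(selectFiles(files, "row1", 100L, 150L));
    }
}
```

The point of filtering at construction time is that a skipped file costs no block reads at all later in the scan.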

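Once the scanners are built and the top-level RegionScanner.next is called, data is in effect taken from the MemStoreScanner first; if it is not there, or there are not enough versions, it is then taken from the HFileScanner. Strictly speaking, the scanners are merged through the KeyValueHeap rather than queried one after another; newer versions sort first, and the memstore usually holds the newest data, which is why its cells tend to come out first. When reading through the HFileScanner, the block cache is checked first; on a MISS the block is fetched from the underlying HDFS, and configuration decides whether the block is then cached in the LruBlockCache.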
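The cache-then-HDFS lookup can be sketched as a small read-through LRU cache. This is only an illustration under stated assumptions: BlockCacheSketch, getBlock, and the cacheOnMiss flag are hypothetical names, the backing Function stands in for a block read from HDFS, and eviction is by entry count, which is simpler than what HBase's LruBlockCache does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a read-through block cache with LRU eviction.
final class BlockCacheSketch {
    private final int capacity;
    private final Function<String, byte[]> backingStore;   // stand-in for an HDFS read
    private final LinkedHashMap<String, byte[]> lru;
    int hits = 0;
    int misses = 0;

    BlockCacheSketch(int capacity, Function<String, byte[]> backingStore) {
        this.capacity = capacity;
        this.backingStore = backingStore;
        // accessOrder = true keeps the least recently used entry at the head.
        this.lru = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > BlockCacheSketch.this.capacity;
            }
        };
    }

    // Returns the block from the cache if present, otherwise loads it and optionally caches it.
    byte[] getBlock(String blockKey, boolean cacheOnMiss) {
        byte[] cached = lru.get(blockKey);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        byte[] loaded = backingStore.apply(blockKey);
        if (cacheOnMiss) {
            lru.put(blockKey, loaded);
        }
        return loaded;
    }

    boolean isCached(String blockKey) {
        return lru.containsKey(blockKey);
    }

    public static void main(String[] args) {
        BlockCacheSketch cache = new BlockCacheSketch(2, key -> key.getBytes());
        cache.getBlock("block-1", true);    // miss, loaded and cached
        cache.getBlock("block-1", true);    // hit
        cache.getBlock("block-2", false);   // miss, loaded but not cached
        // prints hits=1 misses=2
        System.out.println("hits=" + cache.hits + " misses=" + cache.misses);
    }
}
```

The cacheOnMiss flag mirrors the "configuration decides" step in the text: a one-off read can skip the cache so it does not evict blocks that are read repeatedly.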