日期:2014-05-16  浏览次数:20346 次

同时查询最大最小值的效率问题
有时候我们需要同时查询一部分数据中的最大和最小值,如何才能最快取得我们要的结果呢?

我们知道索引已经对数据进行了排序,所以这时走索引效率是最高的,下面看一个例子

--在一个有将近两百万数据的表中创建日期索引,需要同时取出日期的最大值和最小值:

SQL> select COUNT(*) from USER_OBJECTS_TMP t;
 
  COUNT(*)
----------
   1933312

CREATE INDEX user_objects_create_dt ON user_objects_tmp(created) TABLESPACE tbs_lubinsu_idx;

SQL> desc user_objects_tmp
Name           Type          Nullable Default Comments 
-------------- ------------- -------- ------- -------- 
OBJECT_NAME    VARCHAR2(128) Y                         
SUBOBJECT_NAME VARCHAR2(30)  Y                         
OBJECT_ID      NUMBER        Y                         
DATA_OBJECT_ID NUMBER        Y                         
OBJECT_TYPE    VARCHAR2(19)  Y                         
CREATED        DATE          Y                         
LAST_DDL_TIME  DATE          Y                         
TIMESTAMP      VARCHAR2(19)  Y                         
STATUS         VARCHAR2(7)   Y                         
TEMPORARY      VARCHAR2(1)   Y                         
GENERATED      VARCHAR2(1)   Y                         
SECONDARY      VARCHAR2(1)   Y                         ;

--在CREATED字段创建了索引,如果直接使用MAX和MIN函数来获取最大最小值,如下所示:
--消耗时间:Elapsed: 00:00:00.54,从执行计划我们可以看出改语句并没有走索引,因为这里没有指定谓词。

SQL> set timing on
SQL> set autotrace traceonly
SQL> set linesize 200  
SQL> SELECT MIN(created), MAX(created) AS MIN FROM user_objects_tmp;

Elapsed: 00:00:00.54

Execution Plan
----------------------
Plan hash value: 3066201300

---------------------------------------------------
| Id  | Operation          | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------
|   0 | SELECT STATEMENT   |                  |     1 |     9 |  1718   (4)| 00:00:21 |
|   1 |  SORT AGGREGATE    |                  |     1 |     9 |            |          |
|   2 |   TABLE ACCESS FULL| USER_OBJECTS_TMP |  1600K|    13M|  1718   (4)| 00:00:21 |
---------------------------------------------------

Note
-----
   - dynamic sampling used for this statement


Statistics
----------------------
          5  recursive calls
          0  db block gets
       8639  consistent gets
        659  physical reads
          0  redo size
        481  bytes sent via SQL*Net to client
        385  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

--如果指定谓词之后,则结果如下,并没有因此而提高效率,
--因为同时指定最大MAX最小MIN值并不能走索引的 INDEX FULL SCAN (MIN/MAX) 扫描,
--对于INDEX FAST FULL SCAN的读取方式是:从段头开始,读取包含位图块,ROOT BLOCK,所有的BRANCH BLOCK,LEAF BLOCK,
--读取的顺序完全有物理存储位置决定,并采取多块读,每次读取DB_FILE_MULTIBLOCK_READ_COUNT个块。
--查询某个表记录总数的时候,往往基于PRIMARY KEY的INDEX FAST FULL SCAN是最有效的,但是在这里并不是最合适的。
--INDEX FULL SCAN的读取方式是:先定位到索引树的根节点,然后分支,再读取叶子节点,根据叶子节点的双向链表顺序读取,
--这种读取方式是顺序的并且也是经过排序的,所以我们希望走这种方式:

SQL> SELECT MIN(created), MAX(created) AS MIN FROM user_objects_tmp a WHERE a.created IS NOT NULL;


Elapsed: 00:00:00.62

Execution Plan
----------------------
Plan hash value: 3784617757

------------------------------------------------------------
| Id  | Operation             | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------
|   0 | SELECT STATEMENT      |                        |     1 |     9 |  1661   (1)| 00:00:20 |
|   1 |  SORT AGGREGATE       |                        |     1 |     9 |            |          |
|*  2 |   INDEX FAST FULL SCAN| USER_OBJECTS_CREATE_DT |  1600K|    13M|  1661   (1)| 00:00:20 |
------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("A"."CREATED" IS NO