日期:2014-05-16  浏览次数:20479 次

count(*)和sum(1) 的效率
count(*)和sum(*)的结果有时候是一样的,所以有时候开放在写存储过程的时候会用到sum当count使用

这样有不有问题呢?我们来讨论下这2个的效率。

SQL> oradebug event 10046 trace name context forever,level 12;
Statement processed.
SQL>  select count(*) from business.PRPLCERTIFYIMGTEMPBAK;

  COUNT(*)
----------
 103287157

Elapsed: 00:00:10.01
SQL> select sum(1) from business.PRPLCERTIFYIMGTEMPBAK;

    SUM(1)
----------
 103287157

Elapsed: 00:00:13.27
SQL> oradebug event 10046 trace name context off;
Statement processed.
SQL> oradebug tracefile_name

从10046的信息,我们很容易就发现。sum的效率不足体现在cpu上,I/O完全一样。

select count(*) 
from
 business.PRPLCERTIFYIMGTEMPBAK


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.02          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2      9.98       9.75     203828     203874          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4      9.98       9.77     203828     203874          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=203874 pr=203828 pw=0 time=9750091 us)
103287157   INDEX FAST FULL SCAN IDX_SERIALNO (cr=203874 pr=203828 pw=0 time=309861672 us)(object id 155232)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file scattered read                      12791        0.00          0.65
  SQL*Net message from client                     2        3.47          3.47
********************************************************************************

select sum(1) 
from
 business.PRPLCERTIFYIMGTEMPBAK


call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        2     13.26      12.95     203828     203874          0           1
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        4     13.26      12.95     203828     203874          0           1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: SYS

Rows     Row Source Operation
-------  ---------------------------------------------------
      1  SORT AGGREGATE (cr=203874 pr=203828 pw=0 time=12957830 us)
103287157   INDEX FAST FULL SCAN IDX_SERIALNO (cr=203874 pr=203828 pw=0 time=206574428 us)(object id 155232)


Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       2        0.00          0.00
  db file scattered read                      12791        0.00          0.63
  SQL*Net message from client                     2       10.56         10.56

为什么sum的效率不如count呢。那就要了解count和sum的算法了
举例说明他的算法:


在排序统计的时候
sum的算法为1+1+1+1+1+1=6,sum是未知的,需要额外的CPU每一步都需要计算
count是1+2+3=6,就是说count的计算公式是定的,发现一行,加1,那么就是1,下一次是从2开始加
显然sum的效率不如count。

所以开发童鞋们。在遇到类似的时候尽量使用count,而不用sum。