日期:2014-05-18  浏览次数:20468 次

求数据结构算法
如下表记录
序号 面值
1 100.0000
2 100.0000
3 100.0000
4 100.0000
7 100.0000
8 100.0000
19 100.0000
22 100.0000
25 100.0000
10 200.0000
11 200.0000
13 200.0000
14 300.0000
15 300.0000
17 300.0000
想生成
起始序号 终止序号 面值
1 4 100
7 8 100
19 19 100
22 22 100
25 25 100
10 11 200
13 13 200
14 15 300
17 17 300
在SQLSERVER中有什么好的算法?我现在用存储过程,如果有上万条记录的话很慢

------解决方案--------------------
SQL code

孤岛问题解决方案1:使用子查询和排名计算
step 1:找出间断之后的点,为他们分配行号(这是孤岛的起点)
step 2:找出间断之前的点,为他们分配行号(这是孤岛的终点)
step 3:以行号相等作为条件,匹配孤岛的起点和终点

--实现代码:
    with startpoints as
    (
      select id,row_number()over(order by id) as rownum
           from tbl as a where not exists(
        select 1 from tbl as b where b.id=a.id-1) 
     /*
     此查询语句单独运行的结果:
     id    rownum
     2    1
     11    2
     27    3
     33    4
     42    5
     */
    ),
    endpoinds as
    (
      select id,row_number()over(order by id) as rownum
          from tbl as a where not exists(
        select 1 from tbl as b where b.id=a.id+1)
   /*
     此查询语句单独运行的结果:
     id    rownum
     3    1
     13    2
     27    3
     35    4
     42    5
    */
    )
    select s.id as start_range,e.id as end_range
    from startpoints as s
    inner join endpoinds as e
    on e.rownum=s.rownum
--运行结果:   
/*
 start_range    end_range
 2    3
 11    13
 27    27
 33    35
 42    42
*/

孤岛问题解决方案2:使用基于子查询的组标识符

--直接给出代码:

with d as
(
  select id,(select min(b.id) from tbl b where b.id>=a.id
      and not exists (select * from tbl c where c.id=b.id+1)) as grp
  from tbl a
)
select min(id) as start_range,max(id) as end_range
from d group by grp
/*
start_range    end_range
2    3
11    13
27    27
33    35
42    42
*/


孤岛问题解决方案3:使用基于子查询的组标识符:

step 1:按照id顺序计算行号:
   select id ,row_number()over(order by id) as rownum from tbl
/*
id    rownum
2    1
3    2
11    3
12    4
13    5
27    6
33    7
34    8
35    9
42    10
*/
step 2:生成id和行号的差:
   select id,id-row_number()over(order by id) as diff from tbl
/*
id    diff
2    1
3    1
11    8
12    8
13    8
27    21
33    26
34    26
35    26
42    32
*/
这里解释一下这样做的原因;
   因为在孤岛范围内,这两个序列都以相同的时间间隔来保持增长,所以
   这时他们的差值保持不变。只要遇到一个新的孤岛,他们之间的差值就
   会增加。这样做的目的为何,第三步将为你说明。
step 3:分别取出第二个查询中生成的相同的diff的值的最大id和最小id
    with t as(
      select id,id-row_number()over(order by id) as diff from tbl
    )
    select min(id) as start_range,max(id) as end_range from t
       group by diff
/*
start_range    end_range
2    3
11    13
27    27
33    35
42    42
*/

求孤岛问题,低三种方法效率较前两种较高,具有比较强的技巧性
希望在实际运用中采纳。
*/


孤岛问题,参考我的博客内容

------解决方案--------------------
SQL code

;with ach as
(
    select 1 as id,100.0000 as num union all
    select 2 ,100.0000 union all
    select 3 ,100.0000 union all
    select 4 ,100.0000 union all
    select 7 ,100.0000 union all
    select 8 ,100.0000 union all
    select 19 ,100.0000 union all
    select 22 ,100.0000 union all
    select 25 ,100.0000 union all
    select 10 ,200.0000 union all
    select 11 ,200.0000 union all
    select 13 ,200.0000 union all
    select 14 ,300.0000 union all
    select 15 ,300.0000 union all
    select 17 ,300.0000
),art as
(
    select row_number() over (order by getdate()) as rnt,
        id,num
    from ach
),cte as
(
    select *,rno=row_number() over (partition by num order by rnt)
    from art
)

select min(id) minid,max(id) maxid,num
from cte
group by num,id-rno

/**************************

minid       maxid       num
----------- ----------- ---------------------------------------
1           4           100.0000
7           8           100.0000
10          11          200.0000
13          13          200.0000
19          19          100.0000
14          15          300.0000
22          22          100.0000
17          17          300.0000
25          25