利用Oracle分析函数实现多行数据合并为一行
demo场景,以oracle自带库中的表emp为例:
select ename,deptno from emp order by deptno;
ENAME DEPTNO
CLARK 10
KING 10
MILLER 10
SMITH 20
ADAMS 20
FORD 20
SCOTT 20
JONES 20
ALLEN 30
BLAKE 30
MARTIN 30
JAMES 30
TURNER 30
WARD 30
现在想要将同一部门的人给合并成一行记录,如何做呢?如下:
ENAME DEPTNO
CLARK,KING,MILLER 10
ADAMS,FORD,JONES,SCOTT,SMITH 20
ALLEN,BLAKE,JAMES,MARTIN,TURNER,WARD 30
通常我们都是自己写函数或在程序中处理,这里我们利用oracle自带的分析函数row_number()和sys_connect_by_path来进行sql语句层面的多行到单行的合并,并且效率会非常高。
基本思路:
1、对deptno进行row_number()按ename排位并打上排位号
select deptno,ename,row_number() over(partition by deptno order by deptno,ename) rank
from emp order by deptno,ename;
DEPTNO ENAME RANK
10 CLARK 1
10 KING 2
10 MILLER 3
20 ADAMS 1
20 FORD 2
20 JONES 3
20 SCOTT 4
20 SMITH 5
30 ALLEN 1
30 BLAKE 2
30 JAMES 3
30 MARTIN 4
30 TURNER 5
30 WARD 6
可看出,经过row_number()后,部门人已经按部门和人名进行了排序,并打上了一个位置字段rank
2、利用oracle的递归查询connect by进行表内递归,并通过sys_connect_by_path进行父子数据追溯串的构造,这里要针对ename字段进行构造,使之合并在一个字段内(数据很多,只截取部分)
select deptno,ename,rank,level as curr_level,
ltrim(sys_connect_by_path(ename,','),',') ename_path from (
select deptno,ename,row_number() over(partition by deptno order by deptno,ename) rank
from emp order by deptno,ename) connect by deptno = prior deptno and rank-1 = prior rank;
各部门递归后的数据量都是:(1+n)/2 * n 即:deptno=10 数据量:(1+3)/2 * 3 = 6;
deptno=20 数据量:(1+5)/2 * 5 = 15; deptno=30 数据量:(1+6)/2 * 6 = 21;
DEPTNO ENAME RANK CURR_LEVEL ENAME_PATH
10 CLARK 1 1 CLARK
10 KING 2 2 CLARK,KING
10 MILLER 3 3 CLARK,KING,MILLER
10 KING 2 1 KING
10 MILLER 3 2 KING,MILLER
10 MILLER 3 1 MILLER
DEPTNO ENAME RANK CURR_LEVEL ENAME_PATH
20 ADAMS 1 1 ADAMS
20 FORD 2 2 ADAMS,FORD
20 JONES 3 3 ADAMS,FORD,JONES
20 SCOTT 4 4 ADAMS,FORD,JONES,SCOTT
20 SMITH 5 5 ADAMS,FORD,JONES,SCOTT,SMITH
20 FORD 2 1 FORD
20 JONES 3 2 FORD,JONES
20 SCOTT 4 3 FORD,JONES,SCOTT
20 SMITH 5 4 FORD,JONES,SCOTT,SMITH
20 JONES 3 1 JONES
20 SCOTT 4 2 JONES,SCOTT
20 SMITH 5 3 JONES,SCOTT,SMITH
20 SCOTT 4 1 SCOTT
20 SMITH 5 2 SCOTT,SMITH
20 SMITH 5 1 SMITH
这里我们仅列出deptno=10、20的,至此我们应该能否发现一些线索了,即每个部门中,curr_level最高的那行,有我们所需要的数据。那后面该怎么办,取出那个数据?对了,继续用row_number()进行排位标记,然后再按排位标记取出即可。
3、 对deptno继续进行row_number()按curr_level排位
select deptno,ename_path,row_number() over(partition by deptno order by deptno,curr_level desc) ename_path_rank from (select deptno,ename,rank,level as curr_level,
ltrim(sys_connect_by_path(ename,','),',') ename_path from (
select deptno,ename,row_number() over(partition by deptno order by deptno,ename) rank
from emp order by deptno,ename) connect by deptno = prior deptno and rank-1 = prior rank);
DEPTNO ENAME_PATH ENAME_PATH_RANK
10 CLARK,KING,MILLER 1
10 CLARK,KING 2
10 KING,MILLER 3
10 CLARK 4
10 KING 5
10 MILLER 6
DEPTNO ENAME_PATH ENAME_PATH_RANK
20 ADAMS,FORD,JONES,SCOTT,SMITH 1
20 ADAMS,FORD,JONES,SCOTT 2
20 FORD,JONES,SCOTT,SMITH 3
20 ADAMS,FORD,JONES 4
20 FORD,JONES,SCOTT 5
20 JONES,SCOTT,SMITH 6
20 ADAMS,FORD 7
20 FORD,JONES 8
20 SCOTT,SMITH 9
20 JONES,SCOTT 10
20 ADAMS 11
20 JONES 12
20 SMITH 13
20 SCOTT 14
20 FORD 15
这里还是仅列出deptno为10、20的,至此应该很明了了,在进行一次查询,取ename_path_rank为1的即可获得我们想要的结果。
4、获取想要排位的数据,即得