日期:2014-05-16  浏览次数:21175 次

如何不用hash比较实现数据仓库拉链算法
数据仓库中有种拉链算法来实现历史数据的存储,一般是通过hash比较来实现的,现在实际问题如下,请教各位,如果不用传统的拉链算法,还有什么别的办法实现这个功能。
  member表,字段 member_id,member_status,start_date,end_date
  A,enabled,2010-03-04,2010-03-05
  A,diabled,2010-03-06,2010-03-08
  A,tbd, 2010-03-09,2010-04-15
  A,delete, 2010-04-16,2010-06-07
  A,enabled,2010-06-08,3000-12-31
  B,diabled,2010-03-02,2010-03-08
  B,enabled,2010-03-09,2010-04-01
  B,unknown,2010-06-08,2010-04-10
  B,enabled,2010-04-11,2010-04-30
  B,tbd, 2010-05-01,3000-12-31
  product表,字段 product_id,member_id,product_status,start_date,end_date
  001,A,online, 2010-03-04,2010-03-21
  001,A,wait, 2010-03-22,2010-03-26
  001,A,enabled,2010-03-26,2010-04-11
  001,A,online, 2010-04-12,2010-06-05
  001,A,online, 2010-06-06,3000-12-31
  002,B,delete, 2010-03-02,2010-03-19
  002,B,online, 2010-03-20,2010-04-12
  002,B,delete, 2010-04-13,2010-06-03
  002,B,enabled,2010-06-04,3000-12-31

意思就是每个会员每段时间的状态不一样,产品每段时间状态也不一样,现在通过member_id关联得到产品所属会员的状态信息,同样以时间段来区分product_status和member_status,就像两条时间轴交叉。结果要求如下:
  product_id,product_status,member_id,member_status,start_date,end_date
  001,online,A,enabled,2010-03-04,2010-03-05
  001,online,A,diabled,2010-03-06,2010-03-08
  001,online,A,tbd, 2010-03-09,2010-03-21
  001,wait, A,tbd, 2010-03-22,2010-03-26
  ……

请问这个该如何实现?
  建表脚本如下:
create table tab_member(member_id varchar2(20),member_status varchar2(20),start_date date,end_date date);
create table tab_product(product_id varchar2(20),member_id varchar2(20),product_status varchar2(20),start_date date,end_date date);
   
insert into tab_member values('A','enabled',to_date('2010-03-04','yyyy-mm-dd'),to_date('2010-03-05','yyyy-mm-dd'));
insert into tab_member values('A','diabled',to_date('2010-03-06','yyyy-mm-dd'),to_date('2010-03-08','yyyy-mm-dd'));
insert into tab_member values('A','tbd' ,to_date('2010-03-09','yyyy-mm-dd'),to_date('2010-04-15','yyyy-mm-dd'));
insert into tab_member values('A','delete' ,to_date('2010-04-16','yyyy-mm-dd'),to_date('2010-06-07','yyyy-mm-dd'));
insert into tab_member values('A','enabled',to_date('2010-06-08','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
insert into tab_member values('B','diabled',to_date('2010-03-02','yyyy-mm-dd'),to_date('2010-03-08','yyyy-mm-dd'));
insert into tab_member values('B','enabled',to_date('2010-03-09','yyyy-mm-dd'),to_date('2010-04-01','yyyy-mm-dd'));
insert into tab_member values('B','unknown',to_date('2010-06-08','yyyy-mm-dd'),to_date('2010-04-10','yyyy-mm-dd'));
insert into tab_member values('B','enabled',to_date('2010-04-11','yyyy-mm-dd'),to_date('2010-04-30','yyyy-mm-dd'));
insert into tab_member values('B','tbd' ,to_date('2010-05-01','yyyy-mm-dd'),to_date('3000-12-31','yyyy-mm-dd'));
insert into tab_product values('001','A','online' , to_date('2010-03-04','yyyy-mm-dd'),to_date('2010-03-21','yyyy-mm-dd'));
insert into tab_product values('001','A','wait' , to_date('2010-03-22','yyyy-mm-dd'),to_date('2010-03-26','yyyy-mm-dd'));
insert into tab_product values('001','A','enabled', to_date('2010-03-26','yyyy-mm-dd'),to_date('2010-04-11','yyy