日期:2014-05-17  浏览次数:21053 次

关于大数据量的数据库设计问题
从全国的2万个网点汇总数据到中央数据库,每天数据总量大约100亿条左右,每调数据大约为2K。
数据库打算用oracle,要考虑到以后的报表和查询问题,数据库服务器如何设计,数据库如何设计,表如何设计,是要一个网点一个表设计比较好,还是每天一个表好?是否要用到RAC?请各位高手指点。

------解决方案--------------------
用动态语句吧,每天建立一张表。这样虽然麻烦,但后期的维护等的效率都高。
如果一个分区建立一张表,那么表只会的慢慢的增大。这对后期不利
------解决方案--------------------
不导,直接用分区视图?
各个网点两份数据,做读写分离,一份供正常业务用,同步一份供中央服务器做分区视图用。各网点负责自己的数据安全。
若因为网点过多实现不方便,当然还可以再分解任务,网点汇总到市级,市级汇总到省级,中央服务器用分区视图关联省级。
------解决方案--------------------
从时间上计算:1000000万/2万=500000,每天每个网点50万数据,500000/12/3600=11.57,就算11吧,每个网点每秒钟产生11笔数据
从空间上计算: 2K*100亿=20000000000K=20000000M=20000G=20T,每天20T数据
这个数量级上应该不是数据库干的事情,这数据进了数据库就废了,你可以问问Google和Facebook,他们用的什么技术,Hadoop

------解决方案--------------------
-- 用Oracle 11.2.0.3 版本的 按天间隔分区 + 网点子分区加以解决
-- Oracle 11.2.0.1 版本并行查询有问题,经常造成查询结果不正确。所以建议用Oracle 11.2.0.3 版本

-- 类似建表示例如下:
 CREATE TABLE BIEE.DW_ADS_IMP_DAY_WQ
(
DATE_TIME GENERATED ALWAYS AS (TO_DATE(TO_CHAR("DATE_ID"),'YYYYMMDD')) VIRTUAL VISIBLE,
DATE_ID NUMBER(8),
SITE_ID number(1),
CHANNEL_ID number(7),
SUB_CHANNEL_ID number(7),
PROVINCE_ID NUMBER(6),
CITY_ID  number(6),
TYPE_ID number(2),
SLOT_ID number(4),
CREATIVE_ID number,
IMP number,
CLICK number,
IMPOVER number
)
PARTITION BY RANGE (date_time) INTERVAL(NUMTODSINTERVAL(1,'day'))
STORE IN (adv_m01_tbs,adv_m02_tbs,adv_m03_tbs,adv_m04_tbs,adv_m05_tbs,adv_m06_tbs,adv_m07_tbs,adv_m08_tbs,adv_m09_tbs,adv_m10_tbs,adv_m11_tbs,adv_m12_tbs) 
SUBPARTITION BY LIST (SLOT_ID)
  SUBPARTITION TEMPLATE
  ( 
  SUBPARTITION p_01 VALUES(-14,1,16,31,46,61,76,91,106,121,136,151,166,181,196,211,226,241,256,271,286,301,316,331,346,361,376,391,406,421,436,451,466,481,496,511,526,541,556,571,586,601,616,631,646,661,676,691,706,721,736,751,766,781,796,811,826,841,856,871,886,901,916,931,946,961,976,991,1006),
  SUBPARTITION p_02 VALUES(-13,2,17,32,47,62,77,92,107,122,137,152,167,182,197,212,227,242,257,272,287,302,317,332,347,362,377,392,407,422,437,452,467,482,497,512,527,542,557,572,587,602,617,632,647,662,677,692,707,722,737,752,767,782,797,812,827,842,857,872,887,902,917,932,947,962,977,992,1007),
  SUBPARTITION p_03 VALUES(-12,3,18,33,48,63,78,93,108,123,138,153,168,183,198,213,228,243,258,273,288,303,318,333,348,363,378,393,408,423,438,453,468,483,498,513,528,543,558,573,588,603,618,633,648,663,678,693,708,723,738,753,768,783,798,813,828,843,858,873,888,903,918,933,948,963,978,993,1008),
  SUBPARTITION p_04 VALUES(-11,4,19,34,49,64,79,94,109,124,139,154,169,184,199,214,229,244,259,274,289,304,319,334,349,364,379,394,409,424,439,454,469,484,499,514,529,544,559,574,589,604,619,634,649,664,679,694,709,724,739,754,769,784,799,814,829,844,859,874,889,904,919,934,949,964,979,994,1009),
  SUBPARTITION p_05 VALUES(-10,5,20,35,50,65,80,95,110,125,140,155,170,185,200,215,230,245,260,275,290,305,320,335,350,365,380,395,410,425,440,455,470,485,500,515,530,545,560,575,590,605,620,635,650,665,680,695,710,725,740,755,770,785,800,815,830,845,860,875,890,905,920,935,950,965,980,995,1010),
  SUBPARTITION p_06 VALUES(-9,6,21,36,51,66,81,96,111,126,141,156,171,186,201,216,231,246,261,276,291,306,321,336,351,366,381,396,411,426,441,456,471,486,501,516,531,546,561,576,591,606,621,636,651,666,681,696,711,726,741,756,771,786,801,816,831,846,861,