Date: 2014-05-16
Reposted from the official MongoDB site: http://www.mongodb.org/display/DOCS/Schema+Design
Introduction

With Mongo, you do less "normalization" than you would perform designing a relational schema because there are no server-side "joins". Generally, you will want one database collection for each of your top level objects. You do not want a collection for every "class" - instead, embed objects. For example, in the diagram below, we have two collections, students and courses. The student documents embed address documents and the "score" documents, which have references to the courses.

[Diagram: the students and courses collections]

Compare this with a relational schema, where you would almost certainly put the scores in a separate table, and have a foreign-key relationship back to the students.

Embed vs. Reference

The key question in Mongo schema design is "does this object merit its own collection, or rather should it embed in objects in other collections?" In relational databases, each sub-item of interest typically becomes a separate table (unless denormalizing for performance). In Mongo, this is not recommended - embedding objects is much more efficient. Data is then colocated on disk; client-server turnarounds to the database are eliminated. So in general the question to ask is, "why would I not want to embed this object?"

So why are references slow? Let's consider our students example. If we have a student object and perform:

print( student.address.city );

This operation will always be fast as address is an embedded object, and is always in RAM if student is in RAM. However for

print( student.scores[0].for_course.name );

if this is the first access to scores[0], the shell or your driver must execute the query

// pseudocode for driver or framework, not user code
student.scores[0].for_course = db.courses.findOne({_id:_course_id_to_find_});
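
The contrast between the two accesses can be sketched without a running server. The example below is only an illustration under stated assumptions: the `courses` map and the `findCourseById` helper are hypothetical stand-ins for the courses collection and for a driver's `db.courses.findOne` call, and the sample names and values are invented for the example.

```javascript
// In-memory stand-in for the courses collection (hypothetical data).
const courses = new Map([
  ["c1", { _id: "c1", name: "Biology" }],
  ["c2", { _id: "c2", name: "English" }],
]);

let queryCount = 0; // counts simulated round trips to the database

// Hypothetical stand-in for db.courses.findOne({_id: id}); a real driver
// call would cross the network at this point.
function findCourseById(id) {
  queryCount += 1;
  return courses.get(id) || null;
}

// A student document: address and scores are embedded, and each score
// holds only a reference (the course _id) to a course document.
const student = {
  name: "Joe",
  address: { city: "Palo Alto" },
  scores: [
    { for_course: "c1", grade: 4.0 },
    { for_course: "c2", grade: 3.0 },
  ],
};

// Embedded access: the data is already part of the student document.
const city = student.address.city;
const queriesAfterEmbedded = queryCount;

// Reference traversal: resolving the course costs one lookup.
const course = findCourseById(student.scores[0].for_course);
const queriesAfterReference = queryCount;

console.log(city, queriesAfterEmbedded);
console.log(course.name, queriesAfterReference);
```

Reading the embedded address leaves the counter untouched, while resolving the course reference increments it once, which is the cost a driver or framework would pay on first access.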
Thus, each reference traversal is a query to the database. Typically, the collection in question is indexed on _id, so the query will be reasonably fast. However, even if all data is in RAM, there is a certain latency given the client/server communication from the application server to the database. In general, expect about 1ms for such a query on a RAM cache hit. Thus if we were iterating over 1,000 students, looking up one reference per student would be quite slow: over 1 second to perform even if cached. However, if we only need to look up a single item, the time is on the order of 1ms, and completely acceptable for a web page load. (Note that if already in the db cache, pulling the 1,000 students themselves might actually take much less than 1 second, as the results return from the database in large batches.)
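
The arithmetic behind these figures can be written out as a rough estimate. This is a sketch only: `PER_QUERY_MS` takes the 1ms cache-hit figure from the text as an assumption, and `estimateLookupMs` is a hypothetical helper, not part of any driver. It counts query latency alone, so real totals would be somewhat higher.

```javascript
// Assumed figure from the text: about 1 ms per query on a RAM cache hit.
const PER_QUERY_MS = 1;

// Estimated latency when each document triggers its own reference lookup.
function estimateLookupMs(documentCount, perQueryMs = PER_QUERY_MS) {
  return documentCount * perQueryMs;
}

const manyStudentsMs = estimateLookupMs(1000); // one lookup per student
const singleItemMs = estimateLookupMs(1);      // a single reference

console.log(manyStudentsMs, singleItemMs);
```

The cost grows linearly with the number of references traversed, which is why one lookup per page load is fine while one lookup per iterated document adds up quickly.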