Database design: one huge table or separate tables?

Currently I am designing a database for use in our company. We are using SQL Server 2008. The database will hold data gathered from several customers. The goal of the databa

13 answers

野趣味

2020-12-09 03:56

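Splitting one table's rows across several tables for performance reasons is called horizontal partitioning, or sharding when the pieces are spread across separate databases or servers. Separately, a database schema can be more or less normalized: a normalized schema has separate tables with relations between them, so that data is not duplicated.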

孤独总比滥情好

2020-12-09 03:57

Start with one table, and worry about performance later. That assumes you are collecting exactly the same information for each customer. With one table, if you have to add, remove, or modify a column, you only do it in one place.
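As a rough illustration of the single-table approach, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server 2008. The table name `measurement` and its columns are hypothetical, not taken from the question. It shows one shared table keyed by `customer_id`, and a schema change made once for all customers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One shared table (hypothetical schema); customer_id tells customers apart.
conn.execute(
    """
    CREATE TABLE measurement (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        taken_at    TEXT    NOT NULL,
        value       REAL    NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO measurement (customer_id, taken_at, value) VALUES (?, ?, ?)",
    [(1, "2020-01-01", 10.0), (1, "2020-01-02", 12.5), (2, "2020-01-01", 7.0)],
)

# A schema change is a single statement, not one statement per customer table.
conn.execute("ALTER TABLE measurement ADD COLUMN unit TEXT")

columns = [row[1] for row in conn.execute("PRAGMA table_info(measurement)")]
rows_for_customer_1 = conn.execute(
    "SELECT COUNT(*) FROM measurement WHERE customer_id = ?", (1,)
).fetchone()[0]
print(columns, rows_for_customer_1)
```

With one table per customer, the `ALTER TABLE` step would have to be repeated for every customer table.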

悲&欢浪女

2020-12-09 03:58

Partitioning is definitely something to look into. I had a database in which 2 tables were sharded, each containing around 30-35 million records. I have since merged them into one large table and added some good indexes. So far I've not had to partition this table, as it's working a treat, but I'm keeping partitioning in mind. One thing I have noticed, compared to when the data was sharded, is the data import: it is now slower, but I can live with that, as the import tool can be rewritten ;o)
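To make "one large table with good indexes" a little more concrete, here is a small sketch in Python with sqlite3 standing in for SQL Server. The `reading` table and the index name `ix_reading_customer_date` are made up for illustration. It creates a composite index on the columns a typical per-customer query filters on, then asks the engine for its query plan to check that the index is actually used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reading ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, taken_at TEXT, value REAL)"
)
# Synthetic rows for 50 hypothetical customers.
conn.executemany(
    "INSERT INTO reading (customer_id, taken_at, value) VALUES (?, ?, ?)",
    [(i % 50, f"2020-01-{(i % 28) + 1:02d}", float(i)) for i in range(5000)],
)

# Composite index: equality column first, range column second.
conn.execute(
    "CREATE INDEX ix_reading_customer_date ON reading (customer_id, taken_at)"
)

query = "SELECT value FROM reading WHERE customer_id = ? AND taken_at >= ?"
plan = conn.execute("EXPLAIN QUERY PLAN " + query, (7, "2020-01-15")).fetchall()
# The last column of each plan row holds the human-readable description.
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)
```

On SQL Server the equivalent check would be done by inspecting the execution plan. Each extra index also has to be maintained on every insert, which is one plausible reason an import gets slower after adding indexes.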

抹茶落季

2020-12-09 03:59

You can also create supplemental tables that hold already calculated details on historical information if there are common queries.
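A supplemental table of this kind might look like the following sketch, again using Python's sqlite3 as a stand-in for SQL Server. The `reading` and `reading_monthly` tables are hypothetical. The idea is to precompute a per-customer, per-month aggregate once, so that common historical queries read the small summary table instead of scanning the detail rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (customer_id INTEGER, taken_at TEXT, value REAL)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?, ?)",
    [
        (1, "2020-01-03", 10.0),
        (1, "2020-01-20", 30.0),
        (1, "2020-02-01", 5.0),
        (2, "2020-01-05", 8.0),
    ],
)

# Precompute monthly aggregates into a separate (hypothetical) summary table.
conn.execute(
    """
    CREATE TABLE reading_monthly AS
    SELECT customer_id,
           substr(taken_at, 1, 7) AS month,
           COUNT(*)               AS reading_count,
           SUM(value)             AS total_value
    FROM reading
    GROUP BY customer_id, substr(taken_at, 1, 7)
    """
)

monthly = conn.execute(
    "SELECT customer_id, month, reading_count, total_value "
    "FROM reading_monthly ORDER BY customer_id, month"
).fetchall()
print(monthly)
```

The summary has to be refreshed when new data is loaded; for historical periods that no longer change, it can be computed once and left alone.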

感情败类

2020-12-09 04:01

Data warehouses are supposed to be big (the clue is in the name). Twenty million rows is about medium by warehousing standards, although six hundred million can be considered large.

The thing to bear in mind is that such large tables have a different physics, like black holes, so tuning them takes a different set of techniques. The other thing is that users of a data warehouse must understand that they are dealing with huge amounts of data, and so they must not expect sub-second (or indeed sub-minute) response for every query.

Partitioning can be useful, especially if you have clear demarcations such as, as in your case, CUSTOMER. You have to be aware that partitioning can degrade the performance of queries which cut across the grain of the partitioning key. So it is not a silver bullet.
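The "across the grain" effect can be sketched with a toy example. This is not SQL Server's partitioning feature: it emulates partitions by hand with one sqlite3 table per customer (hypothetical names `reading_c1`, `reading_c2`, ...), purely to show which queries touch one partition and which must visit all of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
customers = [1, 2, 3]

# One table per customer stands in for one partition per customer.
for cid in customers:
    conn.execute(f"CREATE TABLE reading_c{cid} (taken_at TEXT, value REAL)")
    conn.executemany(
        f"INSERT INTO reading_c{cid} VALUES (?, ?)",
        [("2020-01-01", 1.0 * cid), ("2020-01-02", 2.0 * cid)],
    )

# With the grain: a per-customer query touches exactly one partition.
total_customer_2 = conn.execute("SELECT SUM(value) FROM reading_c2").fetchone()[0]

# Across the grain: a per-day total has to visit every partition.
union = " UNION ALL ".join(
    f"SELECT taken_at, value FROM reading_c{cid}" for cid in customers
)
daily_totals = conn.execute(
    f"SELECT taken_at, SUM(value) FROM ({union}) GROUP BY taken_at ORDER BY taken_at"
).fetchall()
print(total_customer_2, daily_totals)
```

A real partitioned table hides the union behind a single table name, but a query that does not filter on the partitioning key still has to read every partition.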

陌清茗

2020-12-09 04:04

Since you've tagged your question as 'datawarehouse' as well, I assume you know some things about the subject. Depending on your goals, you could go for a star schema (a multidimensional model with a fact table and dimension tables). Store all fast-changing data in one fact table (per subject) and the slow-changing data in separate dimension ('snowflake') tables.
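A very small star schema could look like the sketch below, written in Python with sqlite3 as a stand-in for SQL Server. All table and column names (`fact_reading`, `dim_customer`, `dim_date`) and the sample rows are invented for illustration. The fact table holds the fast-changing measurements and only keys into the dimensions; descriptive, slow-changing attributes live in the dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: slow-changing descriptive attributes.
    CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, iso_date TEXT, year INTEGER);

    -- Fact table: fast-changing measurements, keyed into the dimensions.
    CREATE TABLE fact_reading (
        customer_key INTEGER REFERENCES dim_customer (customer_key),
        date_key     INTEGER REFERENCES dim_date (date_key),
        value        REAL
    );

    INSERT INTO dim_customer VALUES (1, 'Acme', 'North'), (2, 'Globex', 'South');
    INSERT INTO dim_date VALUES (20200101, '2020-01-01', 2020), (20200102, '2020-01-02', 2020);
    INSERT INTO fact_reading VALUES
        (1, 20200101, 10.0), (1, 20200102, 12.0), (2, 20200101, 7.0);
    """
)

# A typical star-schema query: join the fact to its dimensions, then aggregate.
totals_by_region = conn.execute(
    """
    SELECT c.region, SUM(f.value)
    FROM fact_reading AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    JOIN dim_date     AS d ON d.date_key = f.date_key
    WHERE d.year = 2020
    GROUP BY c.region
    ORDER BY c.region
    """
).fetchall()
print(totals_by_region)
```

Note that this design still keeps all customers in one fact table; the customer is just another dimension to filter or group by.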

Another option is the Data Vault method by Dan Linstedt, which is a bit more complex but provides you with full flexibility.

http://danlinstedt.com/category/datavault/
