Question
Which of the following database designs would be preferable for an internal messaging system?
Three tables:
MessageThread(models.Model):
- subject
- timestamp
- creator
Message(models.Model):
- thread (pk)
- content
- timestamp
- sender
MessageRecipient
- message_id (pk)
- recipient (pk)
- status (read, unread, deleted)
Two tables:
Message
- thread_id
- subject
- content
- timestamp
- sender (fk)
MessageRecipient
- message_id (fk)
- recipient (fk)
- status (read, unread, deleted)
What would be the advantages of one over another? Thank you.
Answer 1:
Strengths of the first
The first schema is better normalized, so it is probably the better choice in most cases.
Having a thread_id that isn't a FK to another table, and so is basically a natural key, is probably asking for trouble. It will be very difficult to enforce that it is unique when you want it to be unique (a new thread), and the same when you want it to be the same (a reply in an existing thread). For this reason, I would encourage the first suggested schema.
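To make the foreign-key point concrete, here is a minimal sketch. It assumes SQLite through Python's sqlite3 module rather than the Django models from the question, and the table names, column names, and the try_insert_message helper are illustrative, not part of the original schemas.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a thread table and a message table."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE message_thread (
            id INTEGER PRIMARY KEY,
            subject TEXT NOT NULL
        );
        CREATE TABLE message (
            id INTEGER PRIMARY KEY,
            thread_id INTEGER NOT NULL REFERENCES message_thread(id),
            content TEXT NOT NULL
        );
    """)
    return conn


def try_insert_message(conn, thread_id, content):
    """Hypothetical helper: True if the row was accepted, False if the FK rejected it."""
    try:
        conn.execute(
            "INSERT INTO message (thread_id, content) VALUES (?, ?)",
            (thread_id, content),
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_db()
conn.execute("INSERT INTO message_thread (id, subject) VALUES (1, 'Lunch?')")

accepted = try_insert_message(conn, 1, "Anyone free at noon?")  # thread 1 exists
rejected = try_insert_message(conn, 999, "Orphaned reply")      # thread 999 does not
```

With a bare thread_id column and no thread table, both inserts would succeed, and nothing in the database could tell a typo from a genuinely new thread.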
Strengths of the second
Your second schema allows the subject to be altered for each message in the thread. If this is a feature you want, you can't use the first option, as you've written it (but see below).
Other options
Message
- id
- parent (fk to Message.id)
- subject
- content
- timestamp
- sender (fk)
MessageRecipient
- message_id (fk)
- recipient (fk)
- status (read, unread, deleted)
Instead of having a thread_id concept, you can have a parent concept. Then every reply will point to the original message's record. This allows threading without a 'thread' table. Another possible advantage is that it allows thread trees as well. Simply put, you can represent much more complicated relationships between messages and replies this way. If you don't care about that, then this won't be a bonus for your application.
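As a rough sketch of how a parent column can be queried, the following again assumes SQLite via Python's sqlite3 module. Here each reply points at the message it answers, so the result is a tree rather than a flat thread. The sample rows and the thread_messages helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message (
        id INTEGER PRIMARY KEY,
        parent INTEGER REFERENCES message(id),  -- NULL for the first message
        content TEXT NOT NULL
    );
    INSERT INTO message (id, parent, content) VALUES
        (1, NULL, 'Original message'),
        (2, 1,    'Reply to original'),
        (3, 1,    'Another reply to original'),
        (4, 2,    'Reply to the first reply'),
        (5, NULL, 'Unrelated conversation');
""")


def thread_messages(conn, root_id):
    """Hypothetical helper: return (id, depth) for the root message and all its descendants."""
    rows = conn.execute("""
        WITH RECURSIVE thread(id, parent, content, depth) AS (
            -- Start from the root message of the conversation.
            SELECT id, parent, content, 0 FROM message WHERE id = ?
            UNION ALL
            -- Then repeatedly pull in messages whose parent is already in the thread.
            SELECT m.id, m.parent, m.content, t.depth + 1
            FROM message AS m JOIN thread AS t ON m.parent = t.id
        )
        SELECT id, depth FROM thread ORDER BY id
    """, (root_id,))
    return rows.fetchall()


tree = thread_messages(conn, 1)
```

The trade-off is that fetching a whole conversation needs a recursive query (or several round trips), whereas with a thread table it is a single filter on the thread column.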
If you don't care about the threading advantages I just mentioned, I would probably recommend a hybrid of your two schemas:
MessageThread(models.Model):
- id
Message(models.Model):
- thread (pk)
- subject
- content
- timestamp
- sender
MessageRecipient
- message_id (pk)
- recipient (pk)
- status (read, unread, deleted)
This is similar to the first schema, except that I moved the 'subject' column from the MessageThread table to the Message table, to allow the subject to change as the thread progresses. I'm simply using the MessageThread table to act as a constraint on the thread ID used in Message (which overcomes the limitations I mentioned at the beginning of my answer). You may have additional metadata you want to include in the MessageThread table as well, but I'll leave that up to you and your application.
Answer 2:
A separate MessageThread table can come in handy if you later want to add thread properties such as 'locked', 'sticky' or 'important'. Choosing a more complicated model just for the sake of possibly adding features in the future is usually not a good idea, though.
The first model (the one with the MessageThread table) guarantees that all the messages in a thread have the same subject; in the second model, every message in the thread can have a different subject. This can be a good thing or a bad thing, depending on how you want the messaging to work.
The first model makes it possible to declare the message.thread_id column as a foreign key, so you cannot insert a message without a valid thread reference. With the second model, you don't have that guarantee. This may cause some bugs later on.
I don't think the MessageThread.timestamp and MessageThread.creator columns in the first model are really needed; aren't those the same as the timestamp and sender of the first message in the thread? Such redundancy may have negative consequences, for example the two copies drifting out of sync.
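One way to avoid storing those two columns is to look them up from the earliest message in the thread. This is only a sketch, assuming SQLite via Python's sqlite3 module. The sample data, the text timestamps, and the thread_origin helper are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message_thread (id INTEGER PRIMARY KEY, subject TEXT NOT NULL);
    CREATE TABLE message (
        id INTEGER PRIMARY KEY,
        thread_id INTEGER NOT NULL REFERENCES message_thread(id),
        sender TEXT NOT NULL,
        timestamp TEXT NOT NULL,  -- ISO-8601 text sorts chronologically
        content TEXT NOT NULL
    );
    INSERT INTO message_thread (id, subject) VALUES (1, 'Release plan');
    INSERT INTO message (thread_id, sender, timestamp, content) VALUES
        (1, 'alice', '2011-10-28 09:00:00', 'Draft attached'),
        (1, 'bob',   '2011-10-28 09:30:00', 'Looks good'),
        (1, 'alice', '2011-10-28 10:15:00', 'Shipping it');
""")


def thread_origin(conn, thread_id):
    """Hypothetical helper: (sender, timestamp) of the earliest message in a thread."""
    return conn.execute("""
        SELECT sender, timestamp FROM message
        WHERE thread_id = ?
        ORDER BY timestamp, id  -- id breaks ties between equal timestamps
        LIMIT 1
    """, (thread_id,)).fetchone()


creator, created_at = thread_origin(conn, 1)
```

If this lookup turns out to be too slow for thread listings, keeping the columns on the thread table as a deliberate cache is still an option, at the cost of having to keep them consistent.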
I'd go with the first model, but I would drop the creator and timestamp fields from MessageThread.
Source: https://stackoverflow.com/questions/7926388/comparing-two-db-designs-for-internal-messaging