Should I divide a table by OneToOneField if the number of columns is too many?

Posted by 风格不统一 on 2020-01-05 07:22:08

Question


I have a Student model that already has too many fields, including the student's name, nationality, address, language, travel history, etc. It looks like this:

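from django.db.models import CASCADE, BooleanField, Model, OneToOneField

class Student(Model):
    user = OneToOneField(CustomUser, on_delete=CASCADE)  # CustomUser: the project's user model
    # Too many other fields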

A student has much more information, which I store in other tables that have a OneToOne relationship with the Student model, such as:

class StudentIelts(Model):
    student = OneToOneField(Student, on_delete=CASCADE)
    has_ielts = BooleanField(default=False)
    # 8 other fields for IELTS including the scores and the date
    # and file field for uploading the IELTS result

# I have other models for TOEFL, GMAT, GRE, etc. that
# are related to the Student model in the same manner through
# a OneToOne relationship, such as:

class StudentIBT(Model):
    student = OneToOneField(Student, on_delete=CASCADE)
    has_ibt = BooleanField(default=False)
    # other fields

Should I merge these tables into one, or is the current database schema fine?

I chose this schema because I was not comfortable working with a table that has so many columns. Note that every student has exactly one row in the IELTS table (and in each of the other tables), so the Student table and the IELTS table, for example, always have the same number of rows.


Answer 1:


This is a hard question to answer, and opinions differ, but I would say you are correct in splitting this data into separate models.

However, there are several considerations to take into account.

From a pure database-design perspective, there is hardly ever a reason to split up a table. When a one-to-one relationship always exists (every row on one side has exactly one matching row on the other), you should generally merge the two into one table. The number of columns hardly matters unless you are optimising your database.

An answer to another question sums up the actual physical reasons to split up a 1-to-1 relationship quite nicely:

  • You might want to cluster or partition the two "endpoint" tables of a 1:1 relationship differently.
  • If your DBMS allows it, you might want to put them on different physical disks (e.g. more performance-critical on an SSD and the other on a cheap HDD).
  • You have measured the effect on caching and you want to make sure the "hot" columns are kept in cache, without "cold" columns "polluting" it.
  • You need a concurrency behavior (such as locking) that is "narrower" than the whole row. This is highly DBMS-specific.
  • You need different security on different columns, but your DBMS does not support column-level permissions.
  • Triggers are typically table-specific. While you can theoretically have just one table and have the trigger ignore the "wrong half" of the row, some databases may impose additional limits on what a trigger can and cannot do. For example, Oracle doesn't let you modify the so-called "mutating" table from a row-level trigger. With separate tables, only one of them may be mutating, so you can still modify the other from your trigger (but there are other ways to work around that).

Databases are very good at manipulating the data, so I wouldn't split the table just for the update performance, unless you have performed the actual benchmarks on representative amounts of data and concluded the performance difference is actually there and significant enough (e.g. to offset the increased need for JOINing).
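To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Django itself. It assumes that a OneToOneField is stored roughly as a foreign key column with a UNIQUE constraint; the table and column names (student, student_ielts, student_merged, overall) are hypothetical and chosen only for illustration. It shows that the split layout needs a JOIN to read both halves, while the merged layout does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables. The split layout mirrors a one-to-one relationship:
# a foreign key column that is also UNIQUE.
conn.executescript("""
    CREATE TABLE student (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE student_ielts (
        id         INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL UNIQUE
                   REFERENCES student(id) ON DELETE CASCADE,
        has_ielts  INTEGER NOT NULL DEFAULT 0,
        overall    REAL
    );
    -- Merged layout: the same data as extra columns on one table.
    CREATE TABLE student_merged (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        has_ielts     INTEGER NOT NULL DEFAULT 0,
        ielts_overall REAL
    );
""")

conn.execute("INSERT INTO student (id, name) VALUES (1, 'Ada')")
conn.execute(
    "INSERT INTO student_ielts (student_id, has_ielts, overall) VALUES (1, 1, 7.5)"
)
conn.execute(
    "INSERT INTO student_merged (id, name, has_ielts, ielts_overall) "
    "VALUES (1, 'Ada', 1, 7.5)"
)

# Split layout: reading both halves requires a JOIN.
split_row = conn.execute("""
    SELECT s.name, i.has_ielts, i.overall
    FROM student AS s
    JOIN student_ielts AS i ON i.student_id = s.id
    WHERE s.id = 1
""").fetchone()

# Merged layout: one table, no JOIN.
merged_row = conn.execute(
    "SELECT name, has_ielts, ielts_overall FROM student_merged WHERE id = 1"
).fetchone()

# The UNIQUE constraint is what keeps the relationship one-to-one:
# a second IELTS row for the same student is rejected.
try:
    conn.execute("INSERT INTO student_ielts (student_id, has_ielts) VALUES (1, 0)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
```

In Django, the JOIN in the split layout is typically what select_related() issues when you follow the one-to-one relation in a single query. Whether that extra JOIN matters is exactly the kind of thing to benchmark on representative data.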

Django's standpoint

If you look at the way Django is designed, there is some merit to splitting up your table into one-to-one relationships. One of Django's design philosophies is "loose coupling", which in Django's ecosystem means that separate applications shouldn't have to know about each other to function properly. In your case, it could be argued that a Student model shouldn't have to know anything about its IELTS tests, because if you separate the two, the Student model could be reused in some other application. Likewise, functionality that performs some kind of analysis over IELTS tests shouldn't have to "know" anything about the student who took the test.

Do use this design pattern with some caution, though. "Do I have too many columns in my model?" is not necessarily the right question to ask yourself, because sometimes there is a good reason to have a lot of data in one model, so answering yes to it does not by itself justify splitting up your tables. A better question is "Do I want to separate the responsibilities/functionality of these two types of data?", which could be for any reason, such as reusability or security.
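As a rough illustration of that loose-coupling argument, the sketch below uses plain Python instead of Django models. IeltsResult is a hypothetical stand-in for a row of the StudentIelts model, and the four section fields and both helper functions are assumptions made for the example, not part of the original schema. The point is only that the analysis code never references a student.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable


@dataclass(frozen=True)
class IeltsResult:
    """Hypothetical stand-in for one row of the StudentIelts model."""
    listening: float
    reading: float
    writing: float
    speaking: float


def mean_band(result: IeltsResult) -> float:
    """Plain average of the four section scores (no official rounding applied)."""
    return mean(
        [result.listening, result.reading, result.writing, result.speaking]
    )


def share_at_or_above(results: Iterable[IeltsResult], threshold: float) -> float:
    """Fraction of results whose mean band reaches the threshold."""
    bands = [mean_band(r) for r in results]
    if not bands:
        return 0.0
    return sum(b >= threshold for b in bands) / len(bands)


# Example data: no Student object is needed to run the analysis.
results = [
    IeltsResult(7.0, 7.0, 6.0, 6.0),
    IeltsResult(8.0, 8.0, 7.0, 7.0),
]
```

Because the functions depend only on the IELTS data, they could live in a separate app and be reused wherever IELTS results exist, which is the kind of reuse the split into separate models is meant to allow.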



Source: https://stackoverflow.com/questions/58180005/should-i-divide-a-table-by-onetoonefield-if-the-number-of-columns-is-too-many
