How to design a generic database whose layout may change over time?

Submitted by 为君一笑 on 2019-12-03 07:00:43

"This all seems very tricky to me, but I am database n00b - maybe it is simple to you gurus?"

Nope, it really is tricky. Fundamentally what you're describing is not a database application, it is a database application builder. In fact, it sounds as if you want to code something like Google App Engine or a web version of MS Access. Writing such a tool will take a lot of time and expertise.

Google has implemented flexible schemas by using its BigTable platform. It allows you to flex the schema pretty much at will. The catch is, this flexibility makes it very hard to write queries like "position = 'senior salesman' and salary > 50k".

So I don't think the NoSQL approach is what you need. You want to build an application which generates and maintains RDBMS schemas. This means you need to design a metadata repository from which you can generate dynamic SQL to build and change the users' schemas and also generate the front end.

Things your metadata schema needs to store

For schema generation:

  • foreign key relationships (an EMPLOYEE works in a DEPARTMENT)
  • unique business keys (there can be only one DEPARTMENT called "Sales")
  • reference data (permitted values of EMPLOYEE.POSITION)
  • column data type, size, etc.
  • whether the column is optional (i.e. NULL or NOT NULL)
  • complex business rules (employee bonuses cannot exceed 15% of their salary)
  • default value for columns
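As a rough illustration of how such metadata could drive schema generation, here is a minimal Python sketch that turns two metadata records into CREATE TABLE statements and runs them against an in-memory SQLite database. The dictionary layout, the `build_create_table` helper and the use of SQLite are all assumptions made for the example; a real generator would also have to validate every name before splicing it into SQL.

```python
import sqlite3

# Hypothetical metadata records; the dictionary layout is an assumption
# made for this sketch, not something the answer prescribes.
DEPARTMENT = {
    "name": "DEPARTMENT",
    "columns": [
        {"name": "DEPARTMENT_ID", "type": "TEXT", "nullable": False},
        {"name": "NAME", "type": "TEXT", "nullable": False},
    ],
    "primary_key": ["DEPARTMENT_ID"],
    "unique_keys": [["NAME"]],  # only one DEPARTMENT called "Sales"
    "foreign_keys": [],
}

EMPLOYEE = {
    "name": "EMPLOYEE",
    "columns": [
        {"name": "EMPLOYEE_ID", "type": "TEXT", "nullable": False},
        {"name": "NAME", "type": "TEXT", "nullable": False},
        {"name": "SALARY", "type": "NUMERIC", "nullable": True, "default": "0"},
        {"name": "DEPARTMENT_ID", "type": "TEXT", "nullable": False},
    ],
    "primary_key": ["EMPLOYEE_ID"],
    "unique_keys": [],
    "foreign_keys": [
        {"columns": ["DEPARTMENT_ID"], "ref_table": "DEPARTMENT",
         "ref_columns": ["DEPARTMENT_ID"]},
    ],
}


def build_create_table(meta):
    """Return a CREATE TABLE statement for one metadata record."""
    parts = []
    for col in meta["columns"]:
        line = col["name"] + " " + col["type"]
        if not col["nullable"]:
            line += " NOT NULL"
        if "default" in col:
            line += " DEFAULT " + col["default"]
        parts.append(line)
    parts.append("PRIMARY KEY (" + ", ".join(meta["primary_key"]) + ")")
    for key in meta["unique_keys"]:
        parts.append("UNIQUE (" + ", ".join(key) + ")")
    for fk in meta["foreign_keys"]:
        parts.append(
            "FOREIGN KEY (" + ", ".join(fk["columns"]) + ") REFERENCES "
            + fk["ref_table"] + " (" + ", ".join(fk["ref_columns"]) + ")"
        )
    return "CREATE TABLE " + meta["name"] + " (\n    " + ",\n    ".join(parts) + "\n)"


# Build the schema; parent table first so the foreign key can resolve.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
for meta in (DEPARTMENT, EMPLOYEE):
    conn.execute(build_create_table(meta))
```

Keeping the generated statement readable (one clause per line, real object names) pays off later when a generated statement fails and has to be debugged by hand.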

For front-end generation:

  • display names or labels ("Wages", "Salary")
  • widget (drop down list, pop-up calendar)
  • hidden fields
  • derived fields
  • help text, tips
  • client-side validation (associated JavaScript, etc)

That last item points to the potential complexity in your proposal: a regular form designer like Joe Soap is not going to be able to formulate the JS to (say) validate that an input value is between X and Y, so you're going to have to derive it using templated rules.

These lists are by no means exhaustive; they are just off the top of my head.
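To make the idea of templated validation rules concrete, the sketch below renders a JavaScript range check from a stored rule, so the form designer only supplies the field name and the bounds. The rule type name `range`, the `validate_<field>` naming scheme and the `render_rule` helper are hypothetical choices for this example.

```python
import math
import re
from string import Template

# Hypothetical rule template: the rule type name "range" and the generated
# function naming scheme are assumptions made for this sketch.
RULE_TEMPLATES = {
    "range": Template(
        "function validate_${field}(value) {\n"
        "  var n = Number(value);\n"
        "  return !isNaN(n) && n >= ${low} && n <= ${high};\n"
        "}"
    ),
}


def render_rule(rule):
    """Render the JavaScript validator for one stored rule."""
    # Only plain identifiers may be spliced into the generated code.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", rule["field"]):
        raise ValueError("unsafe field name: " + repr(rule["field"]))
    low, high = float(rule["low"]), float(rule["high"])
    if not (math.isfinite(low) and math.isfinite(high)) or low > high:
        raise ValueError("invalid bounds")
    return RULE_TEMPLATES[rule["type"]].substitute(
        field=rule["field"], low=repr(low), high=repr(high)
    )


js = render_rule({"type": "range", "field": "salary", "low": 0, "high": 50000})
print(js)
```

The designer never sees the JavaScript; the metadata repository stores only the rule type and its parameters.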

For primary keys I suggest you use a column of GUID datatype. Timestamps aren't guaranteed to be unique, although if you run your database on an OS whose clock resolves timestamps to six decimal places (i.e. not Windows), clashes are unlikely.
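A minimal sketch of GUID keys, assuming the key is generated in application code with Python's `uuid` module and stored as text (SQLite, used here for the demo, has no native GUID type). The table and the `insert_employee` helper are illustrative only.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (EMPLOYEE_ID TEXT PRIMARY KEY, NAME TEXT NOT NULL)"
)


def insert_employee(conn, name):
    """Insert a row keyed by a random (version 4) UUID and return the key."""
    employee_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO EMPLOYEE (EMPLOYEE_ID, NAME) VALUES (?, ?)",
        (employee_id, name),
    )
    return employee_id


first = insert_employee(conn, "Fred")
second = insert_employee(conn, "Bob")
```

Unlike a timestamp, the key does not depend on clock resolution, so two rows inserted in the same instant still get different identifiers.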

Last word

'My first thought was to use the control name to identify each column, then I realized that the user can edit the form and rename, so that maybe "name" becomes "employee" or "wages" becomes "salary". I am leaning towards a unique number for each.'

I have built database schema generators before. They are hard going. One thing which can be tough is debugging the dynamic SQL. So make it easier on yourself: use real names for tables and columns. Just because the app user now wants to see a form titled HEADCOUNT it doesn't mean you have to rename the EMPLOYEES table. Hence the need to separate the displayed label from the schema object name. Otherwise you'll find yourself trying to figure out why this generated SQL statement failed:

update table_11123
set col_55542 = 'HERRING'
where col_55569 = 'Bootle'
/

That way madness lies.
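One way to picture the separation of label and schema object name: keep the physical names fixed in the metadata and let the user edit only the labels. In this sketch the `FORM_META` layout and both helpers are hypothetical.

```python
# Hypothetical metadata: physical names are fixed when the table is created,
# labels are whatever the form designer currently wants to see.
FORM_META = {
    "table": "EMPLOYEES",
    "label": "Employees",
    "columns": {
        "NAME": {"label": "Name"},
        "SALARY": {"label": "Wages"},
    },
}


def rename_label(meta, column, new_label):
    """Change what the user sees; the schema object name is untouched."""
    meta["columns"][column]["label"] = new_label


def build_update(meta, set_column, where_column):
    """Generate a parameterised UPDATE using only physical names."""
    for col in (set_column, where_column):
        if col not in meta["columns"]:
            raise KeyError(col)
    return "UPDATE {t} SET {s} = ? WHERE {w} = ?".format(
        t=meta["table"], s=set_column, w=where_column
    )


before = build_update(FORM_META, "SALARY", "NAME")
FORM_META["label"] = "HEADCOUNT"           # the form gets a new title
rename_label(FORM_META, "SALARY", "Salary")  # and a column gets a new label
after = build_update(FORM_META, "SALARY", "NAME")
```

The generated SQL is identical before and after the renames, and it still reads as something a person can debug.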

In essence, you are asking how to build an application without specifications. Relational databases were not designed so that you can do this effectively. The common approach to this problem is an Entity-Attribute-Value design and for the type of system in which you want to use it, the odds of failure are nearly 100%.

It makes no sense, for example, that the column called "Name" could become "Salary". How would a report where you want the total salary work if the salary values could have "Fred", "Bob", 100K, 1000, "a lot"? Databases were not designed to let anyone put anything anywhere. Successful database schemas require structure, which means effort with respect to specifications on what needs to be stored and why.
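The reporting problem can be reproduced in a few lines. The sketch below stores the values from the paragraph above in a hypothetical Entity-Attribute-Value table (`entity_values`, an assumed name) in SQLite and then asks for a salary total. SQLite happens to convert non-numeric text leniently inside a CAST; other engines may raise an error instead, which is a different flavour of the same problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical EAV table: every value is stored as text, whatever it means.
conn.execute(
    "CREATE TABLE entity_values (entity_id INTEGER, attribute TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO entity_values VALUES (?, 'salary', ?)",
    [(1, "Fred"), (2, "Bob"), (3, "100K"), (4, "1000"), (5, "a lot")],
)

# The "report": SQLite returns a number without complaint, even though
# most of the stored values are not salaries at all.
total = conn.execute(
    "SELECT SUM(CAST(value AS REAL)) FROM entity_values "
    "WHERE attribute = 'salary'"
).fetchone()[0]


def is_number(text):
    """True if the text parses as a plain number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# The values a typed SALARY column would have rejected at insert time.
bad = [
    value
    for (value,) in conn.execute(
        "SELECT value FROM entity_values WHERE attribute = 'salary' "
        "ORDER BY entity_id"
    )
    if not is_number(value)
]
print(total, bad)
```

With a typed column the four bad values would never have been stored; with a free-form value column the error only surfaces, if at all, when someone reads the report.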

Therefore, to answer your question, I would rethink the problem. The entire approach of trying to make an app that can store anything in the universe is not a recipe for success.

Like Thomas said, a relational database is not a good fit for your problem. However, you may want to take a look at NoSQL databases like MongoDB.

See this article: http://www.simple-talk.com/opinion/opinion-pieces/bad-carma/ for someone else's experience of your problem.

This is for A) & B). It is not something I have done myself, but I thought it was an interesting idea that Reddit put to use. See this link (look at Lesson 3):

http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html

I'm not sure about the database, but for the charts I recommend looking into JavaScript instead of PHP (http://www.reynoldsftw.com/2009/02/6-jquery-chart-plugins-reviewed/). The advantages are that some of the chart processing is offloaded to the client side and that the charts can be interactive.

The other respondents are correct that you should be very cautious with this approach because it is more complex and less performant than the traditional relational model - but I've done this type of thing to accommodate departmental differences at work, and it worked fine for the amount of use it got.

Basically I set it up like this. First, a table to store some information about the form the user wants to create (obviously, adjust as you need):

--************************************************************************
-- Create the User_forms table
--************************************************************************
create table User_forms
    (
    form_id            integer identity,
    name               varchar(200),
    status             varchar(1),
    author             varchar(50),
    last_modifiedby    varchar(50),
    create_date        datetime,
    modified_date      datetime
    );

Then a table to define the fields to be presented on the form, including any limits and the order and page in which they are to be presented (my app presented the fields as a multi-page wizard type of flow).

--************************************************************************
-- Create the field configuration table to hold the entry field configuration
--************************************************************************
create table field_configuration
    (
    field_id                integer identity,
    form_id                 integer,
    status                  varchar(1),
    fieldgroup              varchar(20),
    fieldpage               integer,
    fieldseq                integer,
    fieldname               varchar(40),
    fieldwidth              integer,
    description             varchar(50),
    minlength               integer,
    maxlength               integer,
    maxval                  varchar(13),
    minval                  varchar(13),
    valid_varchars          varchar(20),
    empty_ok                varchar(1),
    all_caps                varchar(1),
    value_list              varchar(200),
    ddl_queryfile           varchar(100),
    allownewentry           varchar(1),
    query_params            varchar(50),
    value_default           varchar(20)
    );

Then my Perl code would loop through the fields in order for page 1 and put them on the "wizard form" ... and the "next" button would present the page 2 fields in order, etc.
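The original loop was written in Perl and is not shown; as a stand-in, here is a small Python sketch of the same idea: sort the configured fields by page and sequence, then group them into wizard pages. The sample rows and the `wizard_pages` helper are made up for the example; only the column names come from the field_configuration table above.

```python
from itertools import groupby

# Hypothetical rows as they might be read from field_configuration.
FIELDS = [
    {"fieldpage": 2, "fieldseq": 1, "fieldname": "salary"},
    {"fieldpage": 1, "fieldseq": 2, "fieldname": "position"},
    {"fieldpage": 1, "fieldseq": 1, "fieldname": "name"},
]


def wizard_pages(fields):
    """Return [(page, [fieldname, ...]), ...] in presentation order."""
    ordered = sorted(fields, key=lambda f: (f["fieldpage"], f["fieldseq"]))
    return [
        (page, [f["fieldname"] for f in group])
        for page, group in groupby(ordered, key=lambda f: f["fieldpage"])
    ]


pages = wizard_pages(FIELDS)
for page, names in pages:
    print("page", page, names)
```

In a database-backed version the sort would normally be pushed into the query with ORDER BY fieldpage, fieldseq.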

I had JavaScript functions to enforce the limits specified for each field as well ...

Then a table to hold the values entered by the users:

--************************************************************************
-- Create the table to contain the values entered by the users
--************************************************************************
create table form_field_values
    (
    session_Id        integer identity,
    form_id           integer,
    field_id          integer,
    value             varchar(MAX)
    );

That would be a good starting point for what you want to do, but keep an eye on performance: reports can really slow down if users add 1000 custom fields. :-)
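To see where the reporting cost comes from, here is a sketch of the pivot a report needs in order to turn the one-row-per-value table back into one row per submission. It uses SQLite with simplified versions of the tables above, and it assumes that `session_id` identifies one submission (so it is not an identity column here); the `pivot` helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified versions of the tables above. Assumption: session_id groups
# all values entered in one submission of the form.
conn.executescript("""
CREATE TABLE field_configuration (
    field_id INTEGER PRIMARY KEY, form_id INTEGER, fieldname TEXT);
CREATE TABLE form_field_values (
    session_id INTEGER, form_id INTEGER, field_id INTEGER, value TEXT);
INSERT INTO field_configuration VALUES (1, 1, 'name'), (2, 1, 'wages');
INSERT INTO form_field_values VALUES
    (100, 1, 1, 'Fred'), (100, 1, 2, '1000'),
    (101, 1, 1, 'Bob'),  (101, 1, 2, '2500');
""")


def pivot(conn, form_id):
    """Return (header, rows) with one output column per configured field."""
    fields = conn.execute(
        "SELECT field_id, fieldname FROM field_configuration "
        "WHERE form_id = ? ORDER BY field_id",
        (form_id,),
    ).fetchall()
    header = ["session_id"] + [name for _, name in fields]
    if not fields:
        return header, []
    # One conditional aggregate per field; field_id is an integer read from
    # the database, so formatting it into the SQL text is safe.
    select_cols = ", ".join(
        "MAX(CASE WHEN field_id = {:d} THEN value END)".format(field_id)
        for field_id, _ in fields
    )
    sql = (
        "SELECT session_id, " + select_cols + " FROM form_field_values "
        "WHERE form_id = ? GROUP BY session_id ORDER BY session_id"
    )
    return header, conn.execute(sql, (form_id,)).fetchall()


header, rows = pivot(conn, 1)
print(header)
print(rows)
```

Every custom field adds another conditional aggregate to the generated query, which is why a form with a very large number of fields makes reports slow.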

I agree with Mark: learning from others' experience can prevent many unforeseen mistakes, especially for a "database n00b". See this article from a guy working at a company that specializes in generic back ends.
