How to normalize a SQL Database

霸气de小男生 提交于 2019-12-28 12:13:35

问题


I was wondering if anyone had any suggestions on how to normalize a database. Now, I don't mean designing the structure, I mean how to actually move the database data from an old structure to the new, normalized structure. I know I could write something like a PHP script, but I was wondering if there was a way to do it in SQL. Specifically MySQL.

**EDIT: Has anyone tried something like SwisSQL? It's a migration tool, but i'm not sure if it will do what I'm asking.


回答1:


Here is an Example of Normalizing Tables in a Script. I advise you do something like this

e.g Table: tbl_tmpData
Date, ProductName, ProductCode, ProductType, MarketDescription, Units, Value
2010-01-01, 'Arnotts Biscuits', '01', 'Biscuit', 'Store 1', 20, 20.00
2010-01-02, 'Arnotts Biscuits', '01', 'Biscuit', 'Store 2', 40, 40.00
2010-01-03, 'Arnotts Biscuits', '01', 'Biscuit', 'Store 3', 40, 40.00
2010-01-01, 'Cola', '02', 'Drink', 'Store 1', 40, 80.00
2010-01-02, 'Cola', '02', 'Drink', 'Store 2', 20, 40.00
2010-01-03, 'Cola', '02', 'Drink', 'Store 2', 60, 120.00
2010-01-01, 'Simiri Gum', '03', 'Gum', 'Store 1', 40, 80.00
2010-01-02, 'Simiri Gum', '03', 'Gum', 'Store 2', 20, 40.00
2010-01-03, 'Simiri Gum', '03', 'Gum', 'Store 3', 60, 120.00

You would Create Your Date Table first:

CREATE TABLE tbl_Date
(
DateID int PRIMARY KEY IDENTITY(1,1)
 ,DateValue datetime
)

INSERT INTO tbl_Date (DateValue)
SELECT DISTINCT Date
FROM tbl_Data
WHERE Date NOT IN (SELECT DISTINCT DateValue FROM tbl_Date)

you would then Create your Market Table

CREATE TABLE tbl_Market
(
MarketID int PRIMARY KEY IDENTITY(1,1)
 ,MarketName varchar(200)
)

INSERT INTO tbl_Market (MarketName)
SELECT DISTINCT MarketDescription
FROM tbl_tmpData
WHERE MarketName NOT IN (SELECT DISTINCT MarketDescription FROM tbl_Market)

you would then Create your ProductType Table

CREATE TABLE tbl_ProductType
(
ProductTypeID int PRIMARY KEY IDENTITY(1,1)
 ,ProductType varchar(200)
)

INSERT INTO tbl_ProductType (ProductType)
SELECT DISTINCT ProductType
FROM tbl_tmpData
WHERE ProductType NOT IN (SELECT DISTINCT ProductType FROM tbl_ProductType)

you would then Create your Product Table

CREATE TABLE tbl_Product
(
ProductID int PRIMARY KEY IDENTITY(1,1)
, ProductCode varchar(100)
, ProductDescription varchar(300)
 ,ProductType int
)

INSERT INTO tbl_Product (ProductCode, ProductDescription, ProductType)
SELECT DISTINCT tmp.ProductCode,tmp.ProductName, pt.ProductType
FROM tbl_tmpData tmp
INNER JOIN tbl_ProductType pt ON tmp.ProductType = pt.ProductType
WHERE ProductCode NOT IN (SELECT DISTINCT ProductCode FROM tbl_Product)

you would then Create your Data Table

CREATE TABLE tbl_Data
(
DataID int PRIMARY KEY IDENTITY(1,1)
, DateID varchar(100)
, ProductID varchar(100)
, MarketID varchar(300)
 ,Units decimal(10,5)
 , value decimal(10,5)
)


INSERT INTO tbl_Data (ProductID, MarketID, Units, Value)
SELECT t.DateID
       , p.ProductID
       , m.MarketID
       , SUM(tmp.Units)
       , SUM(tmp.VALUE)
FROM tbl_tmpData tmp
INNER JOIN tbl_Date t ON tmp.Date = t.DateValue
INNER JOIN tbl_Product p ON tmp.ProductCode = p.ProductCode
INNER JOIN tbl_Market m ON tmp.MarketDescription = m.MarketName 
GROUP BY t.DateID, p.ProductID, m.MarketID
ORDER BY t.DateID, p.ProductID, m.MarketID



回答2:


Download the MySQL Workbench from the MySql Website and then connect your MySQL instance to the Workbench UI Utitily.

Once that is completed.

Write a Script that turns your Data into the desired structure that you would like.




回答3:


I have recently done this and have some insights to how the general procedure can be executed.

  1. Start by modelling your data. When you start with a database that is not normalized you need to create a proper model which you want to transfer your data to. This includes identifying atomic objects that should live in its own tables. Identify duplicated data and determine where that should go. Also identify all relationships that exists on you data structure.

    An optional step. The database usually go together with an interface that probably also needs updating. Look at that design also in this step and decide if there are any isolated parts that can wait, both in the data structure and interface program. How much should be included is determined by practical aspects such as time and budget. Maybe some part does not need modification just yet.

    It can also be an option to start completely from scratch, skip backwards compatibility and let there be two parallel systems

  2. Write a script that adds all new columns and tables that the normalized data requires.

  3. Write another script that transfers the non-normalized data to the new normalized data structure. This is the most tricky part I would say, and can be rather messy, depending on in how bad shape the old data is.

  4. Enforce all constraints from the model on the new normalized data by adding constraints to the new tables and columns. This is also best done in a script. Her you will see if your data migration succeeded. If it did, you will be able to add all the constraints. If it failed, some constraint will fail, and you will have to go back and look at what failed.

  5. Finally, make yet another script that deletes all columns and tables that were removed in the new model. By doing this, you will easily identify all places in the interface which needs updating. Anything talking to anything in these columns and tables will have to be updated in the interface.

Some general tips is to do all development against a, maybe reduced, copy of the database. E.g. in MySQL you can do an SQL Dump using for example Workbench and test you scripts on that. You will probably need a few iterations on the database before the migration works. In connection, also do the actual migration on a copy of the database, not to break anything in production.



来源:https://stackoverflow.com/questions/5135889/how-to-normalize-a-sql-database

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!