Database table design with duplication of data

Posted by 主宰稳场 on 2019-12-24 18:43:45

Question


I am having trouble designing my Firebase database structure. Basically, one account can have many receipts, and one receipt can have many items. Here is the JSON:

receipts {
    accountID1 : {
        receiptID1 : {
            date : "07/07/2017"
            store : {
                storeName : "store1"
                storeAddr : "addr1"
            }
            currency : {
                currencyName : "currency1"
                currencySymbol : "$"
            }
            totalAmount : "50.00"
            items : {
                itemID1 : true,
                itemID2 : true,
            }
        }
        receiptID2 : {
            date : "08/07/2017"
            store : {
                storeName : "store1"
                storeAddr : "addr1"
            }
            currency : {
                currencyName : "currency1"
                currencySymbol : "$"
            }
            totalAmount : "20.00"
            items : {
                itemID3 : true,
                itemID4 : true,
            }
        }
    }
},
items {
        itemID1 : {
            type : "food"
            name : "snack"
            unitprice : "10.00"
            quantity : "2"
        }
        itemID2 : { 
            type : "entertainment"
            name : "gaming equipment"
            unitprice : "150.00"
            quantity : "1"
        }
        itemID3 : { 
            type : "food"
            name : "fruit juice"
            unitprice : "4.00"
            quantity : "1"
        } 
        itemID4 : {
            type : "entertainment"
            name : "gaming equipment"
            unitprice : "150.00"
            quantity : "1"
        }
},
itemIDsByType {
    food : {
        itemID1 : true,
        itemID3 : true,
    }
    entertainment: {
        itemID2 : true,
        itemID4 : true,
    }
}

I realized there is a duplication problem under the items child. For instance, account A purchases item A in receipt 1, then purchases the same item again in receipt 2. Under the receipts child, this does not cause any conflict.

However, under the items child, itemID2 and itemID4 are the same item but belong to different receipts. These two records are duplicates, and for a large data set I think this design might cause a problem.
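To make the duplication concrete, here is a minimal sketch in plain Python (not Firebase code) that loads the items node from the first design into a dictionary and groups item IDs whose fields are identical. The helper name find_duplicates is hypothetical and exists only for this illustration.

```python
from collections import defaultdict

# Item records copied from the items node of the first design.
items = {
    "itemID1": {"type": "food", "name": "snack",
                "unitprice": "10.00", "quantity": "2"},
    "itemID2": {"type": "entertainment", "name": "gaming equipment",
                "unitprice": "150.00", "quantity": "1"},
    "itemID3": {"type": "food", "name": "fruit juice",
                "unitprice": "4.00", "quantity": "1"},
    "itemID4": {"type": "entertainment", "name": "gaming equipment",
                "unitprice": "150.00", "quantity": "1"},
}


def find_duplicates(records):
    """Return lists of item IDs whose field values are identical."""
    groups = defaultdict(list)
    for item_id, fields in records.items():
        # A sorted tuple of (field, value) pairs is hashable and
        # does not depend on the order the fields were written in.
        signature = tuple(sorted(fields.items()))
        groups[signature].append(item_id)
    return [ids for ids in groups.values() if len(ids) > 1]


duplicates = find_duplicates(items)
print(duplicates)  # [['itemID2', 'itemID4']]
```

Every repeat purchase adds another full copy of the record, so the number of duplicated entries grows with the number of receipts rather than with the number of distinct products.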

Any ideas on how to restructure the database design in order to remove the duplication problem mentioned above?

I have actually come up with another design, but it is less flat:

receipts {
    accountID1 : {
        receiptID1 : {
            date : "07/07/2017"
            merchantName : "NTUC"
            branch : {
                branchName : "Marsiling"
                branchAddress : "Blk 167, Marsiling"
            }
            currency : {
                currencyName : "currency1"
                currencySymbol : "$"
            }
            totalAmount : "50.00"
        }

        receiptID2 : {
            date : "08/07/2017"
            merchantName : "NTUC"
            branch : {
                branchName : "Marsiling"
                branchAddress : "Blk 167, Marsiling"
            }
            currency : {
                currencyName : "currency1"
                currencySymbol : "$"
            }
            totalAmount : "20.00"
        }
    }
},

itemLists {
    receiptID1 : {
        items : {
            itemID1 : {
                type : "food"
                name : "snack"
                unitprice : "10.00"
                quantity : "2"
            }

            itemID2 : { 
                type : "entertainment"
                name : "gaming equipment"
                unitprice : "150.00"
                quantity : "1"
            }

            itemID3 : { 
                type : "food"
                name : "fruit juice"
                unitprice : "4.00"
                quantity : "1"
            } 
        }
    }

    receiptID2 : { 
        items : {
            itemID4 : {
                type : "entertainment"
                name : "gaming equipment"
                unitprice : "150.00"
                quantity : "1"
            }
        }
    }
},
itemIDsByType {
        food : {
            itemID1 : true,
            itemID3 : true,
        }
        entertainment: {
            itemID2 : true,
            itemID4 : true,
        }
},
merchants {
    merchantID1 : {
        merchantName : "NTUC"
        branches : {
            branchID1 : {
                branchName : "Marsiling"
                branchAddress : "Blk 167, Marsiling"
            }
            branchID2 : {
                branchName : "Woodlands"
                branchAddress : "Blk 161, Woodlands"
            }
        }
    }
}

In this design, the items are grouped under each receiptID, which eliminates the duplication problem mentioned above. But I find it less flat, and I am torn between a flat design with duplicated data and a less flat design with no duplicated data. Which one would be better for a large data set?


Answer 1:


Let's start with a master items list. This list is ALL of the items available for sale.

item_0
  name: "burger"
item_1
  name: "taco"
item_2
  name: "hot dog"
item_3
  name: "fries"
item_4
  name: "refried beans"

Then comes the receipts node, which stores info about each receipt: date, time, customer name, etc. Note there are no references to the items, as they are not directly needed, but they could be added for convenience.

receipt_0
   customer: "Frank"
   timestamp: 170716093623
receipt_1
   customer: "Bill"
   timestamp: 170716094515

And finally, the details of the items on each receipt.

receipt_items:
   -Y89jasjdiasd:
      item_id: item_0
      price: 5.00
      qty: 1
      receipt: receipt_0
   -YHJis9asdasd:
      item_id: item_3
      price: 1.50
      qty: 1
      receipt: receipt_0
   -Yn9kasdpaosd:
      item_id: item_1
      price: 2.00
      qty: 3
      receipt: receipt_1
   -Yllois9040ka:
      item_id: item_4
      price: 1.50
      qty: 1
      receipt: receipt_1

As you can see, Frank got a burger and fries on receipt_0, and Bill got 3 tacos (!) and a side of refried beans on receipt_1.

With this structure, you can get the details of each receipt (customer, date, etc.), or query the receipt_items node on its receipt child to get the details of the items on a given receipt: item, price, qty, etc.
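As a rough illustration of that lookup, the sketch below models the three nodes as plain Python dictionaries and filters receipt_items by its receipt child. This is an in-memory stand-in, not Firebase SDK code, and the helper name items_on_receipt is an assumption made for the example.

```python
# In-memory copies of the nodes shown above.
items = {
    "item_0": {"name": "burger"},
    "item_1": {"name": "taco"},
    "item_2": {"name": "hot dog"},
    "item_3": {"name": "fries"},
    "item_4": {"name": "refried beans"},
}

receipt_items = {
    "-Y89jasjdiasd": {"item_id": "item_0", "price": 5.00, "qty": 1,
                      "receipt": "receipt_0"},
    "-YHJis9asdasd": {"item_id": "item_3", "price": 1.50, "qty": 1,
                      "receipt": "receipt_0"},
    "-Yn9kasdpaosd": {"item_id": "item_1", "price": 2.00, "qty": 3,
                      "receipt": "receipt_1"},
    "-Yllois9040ka": {"item_id": "item_4", "price": 1.50, "qty": 1,
                      "receipt": "receipt_1"},
}


def items_on_receipt(receipt_id):
    """Return the receipt_items entries whose receipt child matches."""
    return [line for line in receipt_items.values()
            if line["receipt"] == receipt_id]


for line in items_on_receipt("receipt_0"):
    # Resolve the item name through the master items list.
    name = items[line["item_id"]]["name"]
    print(name, line["qty"], line["price"])
```

The item name lives only in the master list, so each receipt line carries just the ID plus the values specific to that sale (price and quantity).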

You can also query the receipt_items node for a specific item, then sum up the quantities to find, say, the most popular item, or compute the average selling price.
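A minimal sketch of that aggregation, again over an in-memory copy of receipt_items rather than a live database, might look like this. The function name summarize and the choice of a quantity-weighted average are assumptions for illustration.

```python
from collections import defaultdict

# In-memory copy of the receipt_items node shown above.
receipt_items = {
    "-Y89jasjdiasd": {"item_id": "item_0", "price": 5.00, "qty": 1,
                      "receipt": "receipt_0"},
    "-YHJis9asdasd": {"item_id": "item_3", "price": 1.50, "qty": 1,
                      "receipt": "receipt_0"},
    "-Yn9kasdpaosd": {"item_id": "item_1", "price": 2.00, "qty": 3,
                      "receipt": "receipt_1"},
    "-Yllois9040ka": {"item_id": "item_4", "price": 1.50, "qty": 1,
                      "receipt": "receipt_1"},
}


def summarize(lines):
    """Total quantity and quantity-weighted average price per item."""
    totals = defaultdict(lambda: {"qty": 0, "revenue": 0.0})
    for line in lines.values():
        entry = totals[line["item_id"]]
        entry["qty"] += line["qty"]
        entry["revenue"] += line["price"] * line["qty"]
    return {
        item_id: {"qty": t["qty"], "avg_price": t["revenue"] / t["qty"]}
        for item_id, t in totals.items()
    }


summary = summarize(receipt_items)
# The most popular item is the one with the largest total quantity.
most_popular = max(summary, key=lambda item_id: summary[item_id]["qty"])
print(most_popular, summary[most_popular])
```

With the sample data, the three tacos on receipt_1 make item_1 the most popular item.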

This eliminates duplicate items AND data and provides a queryable, denormalized structure.

As mentioned above, you could add a child node to each receipt to store the keys of its receipt_items, but since receipt_items is queryable it may not be needed. It could be used to order the items on the receipt.
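If that optional child node is wanted, one possible shape is a per-receipt index of receipt_items keys in the same key : true style the question uses. The sketch below derives such an index from an in-memory copy of the data; the helper name build_receipt_index is hypothetical, and in a real app the index would have to be written alongside each new receipt_items entry.

```python
# In-memory copy of the receipt_items node shown above.
receipt_items = {
    "-Y89jasjdiasd": {"item_id": "item_0", "price": 5.00, "qty": 1,
                      "receipt": "receipt_0"},
    "-YHJis9asdasd": {"item_id": "item_3", "price": 1.50, "qty": 1,
                      "receipt": "receipt_0"},
    "-Yn9kasdpaosd": {"item_id": "item_1", "price": 2.00, "qty": 3,
                      "receipt": "receipt_1"},
    "-Yllois9040ka": {"item_id": "item_4", "price": 1.50, "qty": 1,
                      "receipt": "receipt_1"},
}


def build_receipt_index(lines):
    """Map each receipt ID to the keys of its receipt_items entries."""
    index = {}
    for key, line in lines.items():
        # setdefault creates the per-receipt dictionary on first use.
        index.setdefault(line["receipt"], {})[key] = True
    return index


index = build_receipt_index(receipt_items)
print(index["receipt_0"])
```

The trade-off is the usual one for this kind of index: reads by receipt become a direct lookup, at the cost of keeping two places in sync on every write.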

Note: the child node keys in receipt_items are created with childByAutoId.



Source: https://stackoverflow.com/questions/45118731/database-table-design-with-duplication-of-data