Question
I am having trouble designing my Firebase database structure. Basically, one account can have many receipts, and one receipt can have many items. Here is the JSON:
receipts {
  accountID1 : {
    receiptID1 : {
      date : "07/07/2017"
      store : {
        storeName : "store1"
        storeAddr : "addr1"
      }
      currency : {
        currencyName : "currency1"
        currencySymbol : "$"
      }
      totalAmount : "50.00"
      items : {
        itemID1 : true,
        itemID2 : true,
      }
    }
    receiptID2 : {
      date : "08/07/2017"
      store : {
        storeName : "store1"
        storeAddr : "addr1"
      }
      currency : {
        currencyName : "currency1"
        currencySymbol : "$"
      }
      totalAmount : "20.00"
      items : {
        itemID3 : true,
        itemID4 : true,
      }
    }
  }
},
items {
  itemID1 : {
    type : "food"
    name : "snack"
    unitprice : "10.00"
    quantity : "2"
  }
  itemID2 : {
    type : "entertainment"
    name : "gaming equipment"
    unitprice : "150.00"
    quantity : "1"
  }
  itemID3 : {
    type : "food"
    name : "fruit juice"
    unitprice : "4.00"
    quantity : "1"
  }
  itemID4 : {
    type : "entertainment"
    name : "gaming equipment"
    unitprice : "150.00"
    quantity : "1"
  }
},
itemIDsByType {
  food : {
    itemID1 : true,
    itemID3 : true,
  }
  entertainment: {
    itemID2 : true,
    itemID4 : true,
  }
}
I realized there is a duplication problem under the items child. For instance, account A purchases item A in receipt 1, then purchases the same item again in receipt 2. Under the receipts child this does not cause any interference. However, under the items child, itemID2 and itemID4 are the same item but belong to different receipts. These two records are duplicates, and I think this design might cause problems for a large data set.
Any ideas on how to restructure the database design in order to remove the duplication problem mentioned above?
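To make the duplication concrete, here is a small sketch that groups item IDs whose fields are identical. It assumes the items node is mirrored as a plain Python dict; `duplicate_groups` is a hypothetical helper written for illustration, not part of any Firebase SDK.

```python
# In-memory copy of the "items" node from the first design (illustrative only).
items = {
    "itemID1": {"type": "food", "name": "snack",
                "unitprice": "10.00", "quantity": "2"},
    "itemID2": {"type": "entertainment", "name": "gaming equipment",
                "unitprice": "150.00", "quantity": "1"},
    "itemID3": {"type": "food", "name": "fruit juice",
                "unitprice": "4.00", "quantity": "1"},
    "itemID4": {"type": "entertainment", "name": "gaming equipment",
                "unitprice": "150.00", "quantity": "1"},
}


def duplicate_groups(items):
    """Return lists of item IDs whose field values are exactly the same."""
    groups = {}
    for item_id, fields in items.items():
        # Sort the fields so the comparison does not depend on key order.
        signature = tuple(sorted(fields.items()))
        groups.setdefault(signature, []).append(item_id)
    return [ids for ids in groups.values() if len(ids) > 1]


print(duplicate_groups(items))
```

With the sample data, the only group reported is itemID2 together with itemID4, which are the two records called out above.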
I have actually come up with another design, but it is less flat:
receipts {
  accountID1 : {
    receiptID1 : {
      date : "07/07/2017"
      merchantName : "NTUC"
      branch : {
        branchName : "Marsiling"
        branchAddress : "Blk 167, Marsiling"
      }
      currency : {
        currencyName : "currency1"
        currencySymbol : "$"
      }
      totalAmount : "50.00"
    }
    receiptID2 : {
      date : "08/07/2017"
      merchantName : "NTUC"
      branch : {
        branchName : "Marsiling"
        branchAddress : "Blk 167, Marsiling"
      }
      currency : {
        currencyName : "currency1"
        currencySymbol : "$"
      }
      totalAmount : "20.00"
    }
  }
},
itemLists {
  receiptID1 : {
    items : {
      itemID1 : {
        type : "food"
        name : "snack"
        unitprice : "10.00"
        quantity : "2"
      }
      itemID2 : {
        type : "entertainment"
        name : "gaming equipment"
        unitprice : "150.00"
        quantity : "1"
      }
      itemID3 : {
        type : "food"
        name : "fruit juice"
        unitprice : "4.00"
        quantity : "1"
      }
    }
  }
  receiptID2 : {
    items : {
      itemID4 : {
        type : "entertainment"
        name : "gaming equipment"
        unitprice : "150.00"
        quantity : "1"
      }
    }
  }
},
itemIDsByType {
  food : {
    itemID1 : true,
    itemID3 : true,
  }
  entertainment: {
    itemID2 : true,
    itemID4 : true,
  }
},
merchants {
  merchantID1 : {
    merchantName : "NTUC"
    branches : {
      branchID1 : {
        branchName : "Marsiling"
        branchAddress : "Blk 167, Marsiling"
      }
      branchID2 : {
        branchName : "Woodlands"
        branchAddress : "Blk 161, Woodlands"
      }
    }
  }
}
In this design, the items are grouped under each receiptID, which could eliminate the duplication problem mentioned above. But I find it less flat. I am torn between a flat design with duplicated data and a less flat design with no duplicated data. Which one would be better for a large data set?
Answer 1:
Let's start with a master items list. This list is ALL of the items available for sale.
item_0
  name: "burger"
item_1
  name: "taco"
item_2
  name: "hot dog"
item_3
  name: "fries"
item_4
  name: "refried beans"
Then comes the receipts node, which stores info about each receipt: date, time, customer name, etc. Note there are no references to the items, as they are not directly needed, but they could be added for convenience.
receipt_0
  customer: "Frank"
  timestamp: 170716093623
receipt_1
  customer: "Bill"
  timestamp: 170716094515
And finally, the details of the items on each receipt:
receipt_items:
  -Y89jasjdiasd:
    item_id: item_0
    price: 5.00
    qty: 1
    receipt: receipt_0
  -YHJis9asdasd:
    item_id: item_3
    price: 1.50
    qty: 1
    receipt: receipt_0
  -Yn9kasdpaosd:
    item_id: item_1
    price: 2.00
    qty: 3
    receipt: receipt_1
  -Yllois9040ka:
    item_id: item_4
    price: 1.50
    qty: 1
    receipt: receipt_1
As you can see, Frank got a burger and fries on receipt_0, and Bill got 3 tacos (!) and a side of refried beans on receipt_1.
With this structure, you can get the details of each receipt (customer, date, etc.), or query the receipt_items node by its receipt child to get the details of the items on that receipt: item, price, qty, etc.
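A minimal sketch of those two lookups, assuming the three nodes are mirrored as plain Python dicts. This is an in-memory stand-in rather than a Firebase client, and the helper names `lines_for_receipt` and `receipt_total` are made up for illustration. Against the real database, the loop would presumably be replaced by a query ordered on the `receipt` child.

```python
# In-memory copies of the three nodes from the answer (illustrative only).
items = {
    "item_0": {"name": "burger"},
    "item_1": {"name": "taco"},
    "item_2": {"name": "hot dog"},
    "item_3": {"name": "fries"},
    "item_4": {"name": "refried beans"},
}
receipts = {
    "receipt_0": {"customer": "Frank", "timestamp": 170716093623},
    "receipt_1": {"customer": "Bill", "timestamp": 170716094515},
}
receipt_items = {
    "-Y89jasjdiasd": {"item_id": "item_0", "price": 5.00, "qty": 1,
                      "receipt": "receipt_0"},
    "-YHJis9asdasd": {"item_id": "item_3", "price": 1.50, "qty": 1,
                      "receipt": "receipt_0"},
    "-Yn9kasdpaosd": {"item_id": "item_1", "price": 2.00, "qty": 3,
                      "receipt": "receipt_1"},
    "-Yllois9040ka": {"item_id": "item_4", "price": 1.50, "qty": 1,
                      "receipt": "receipt_1"},
}


def lines_for_receipt(receipt_id):
    """Return (item name, qty, price) for each receipt_items child on a receipt."""
    return [
        (items[line["item_id"]]["name"], line["qty"], line["price"])
        for line in receipt_items.values()
        if line["receipt"] == receipt_id
    ]


def receipt_total(receipt_id):
    """Sum qty * price over the lines of one receipt."""
    return sum(qty * price for _, qty, price in lines_for_receipt(receipt_id))


customer = receipts["receipt_0"]["customer"]
print(customer, lines_for_receipt("receipt_0"), receipt_total("receipt_0"))
```

The item name lives only in the master list, so each receipt line carries just the item key plus the price and quantity that applied to that sale.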
You can also query the receipt_items node for a specific item, then sum up the quantities to find, say, the most popular item, or compute the average selling price.
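The aggregation could look roughly like the sketch below. It again assumes receipt_items has been read into a plain Python dict; `stats_by_item` and `most_popular` are hypothetical names, and the average is weighted by quantity, which is one of several reasonable definitions of "average selling price".

```python
from collections import defaultdict

# In-memory copy of the receipt_items node from the answer (illustrative only).
receipt_items = {
    "-Y89jasjdiasd": {"item_id": "item_0", "price": 5.00, "qty": 1,
                      "receipt": "receipt_0"},
    "-YHJis9asdasd": {"item_id": "item_3", "price": 1.50, "qty": 1,
                      "receipt": "receipt_0"},
    "-Yn9kasdpaosd": {"item_id": "item_1", "price": 2.00, "qty": 3,
                      "receipt": "receipt_1"},
    "-Yllois9040ka": {"item_id": "item_4", "price": 1.50, "qty": 1,
                      "receipt": "receipt_1"},
}


def stats_by_item(lines):
    """Return {item_id: {"qty": total units, "avg_price": revenue / units}}."""
    qty = defaultdict(int)
    revenue = defaultdict(float)
    for line in lines.values():
        qty[line["item_id"]] += line["qty"]
        revenue[line["item_id"]] += line["qty"] * line["price"]
    return {
        item_id: {"qty": qty[item_id], "avg_price": revenue[item_id] / qty[item_id]}
        for item_id in qty
    }


def most_popular(lines):
    """Return the item_id with the largest total quantity sold."""
    stats = stats_by_item(lines)
    return max(stats, key=lambda item_id: stats[item_id]["qty"])


print(most_popular(receipt_items), stats_by_item(receipt_items))
```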
This eliminates duplicate items AND data and provides a queryable, denormalized structure.
As mentioned above, you could add a child node to each receipt to store its receipt_items keys, but since receipt_items is queryable, it may not be needed. It could be used to order the items on the receipt.
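One possible shape for that optional child node is a map from each receipt_items key to its position on the receipt. The sketch below builds it from an in-memory dict; the layout and the name `build_item_index` are assumptions for illustration, not something the structure above requires.

```python
# In-memory copy of the receipt_items node from the answer (illustrative only).
receipt_items = {
    "-Y89jasjdiasd": {"item_id": "item_0", "price": 5.00, "qty": 1,
                      "receipt": "receipt_0"},
    "-YHJis9asdasd": {"item_id": "item_3", "price": 1.50, "qty": 1,
                      "receipt": "receipt_0"},
    "-Yn9kasdpaosd": {"item_id": "item_1", "price": 2.00, "qty": 3,
                      "receipt": "receipt_1"},
    "-Yllois9040ka": {"item_id": "item_4", "price": 1.50, "qty": 1,
                      "receipt": "receipt_1"},
}


def build_item_index(lines):
    """Map each receipt id to {receipt_items key: position on that receipt}."""
    index = {}
    for key, line in lines.items():
        positions = index.setdefault(line["receipt"], {})
        # Position is the order in which the line was seen for this receipt.
        positions[key] = len(positions)
    return index


print(build_item_index(receipt_items))
```

Each per-receipt map could then be written under the matching receipt as an extra child. The trade-off is that it has to be kept in sync whenever a line is added to or removed from receipt_items.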
Note: the child node keys in receipt_items are created with childByAutoId.
Source: https://stackoverflow.com/questions/45118731/database-table-design-with-duplication-of-data