So I'm using MongoDB and I'm unsure if I've got the correct / best database collection design for what I'm trying to do.
There can be many items, and a user can
You're on the right track to creating a performant NoSQL schema design, and I think you're asking the right questions as to how to properly lay things out.
Here's my understanding of your application:
It looks like Groups can have both many Followers (mapping users to groups) and many Items, but an Item may not necessarily be in many Groups (although it is possible). From your use-case example, it sounds like retrieving all the Groups an Item is in and all the Items in a Group will be common read operations.
In your current schema design, you've implemented one model mapping users to groups (followers) and another mapping items to groups (item_groups). This works well enough until you get to the more complex query you mention:
I think with the following design a real performance issue could be when I want to get all of the groups that a user is following for a specific item (based off of the user_id and item_id)
I think a few things could help you out in this situation:
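// Compound unique index: at most one follower document per (group, user) pair.
FollowerSchema.index({ group: 1, user: 1 }, { unique: true });
// Compound unique index: at most one item_groups document per (group, item) pair.
Item_GroupsSchema.index({ group: 1, item: 1 }, { unique: true });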
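Indexing these fields adds some overhead on writes to the collection, but since reads sound like the more common interaction, it should be worth it (I'd suggest reading up on index performance). Keep in mind that a compound index serves queries on its leading field, so these indexes help lookups by group, or by group plus user/item; lookups by user or item alone would need an index of their own.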
Since a User probably won't be following thousands of groups, I think it'd be worthwhile to include in the user model an array of the groups the user is following. This will help with that complex query where you want to find all instances of an item in groups that a user currently follows, since you'll have the list of groups right there. You'll still be using $in: groups, but with one less query to the collection.
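As a rough sketch of that query, assuming the user document carries a groups array of group ids and the item_groups documents have group and item fields as in the indexes above, the filter could be built like this. The helper name followedGroupsFilter and the sample ids are made up for illustration; the resulting object is what you would hand to something like Item_Groups.find(...).

```javascript
// Hypothetical helper: builds the filter for "which of the groups this user
// follows contain this item?" to run against the item_groups collection.
// Assumes user.groups is an array of group ids stored on the user document.
function followedGroupsFilter(user, itemId) {
  return {
    item: itemId,
    group: { $in: user.groups },
  };
}

// Sample data (made up); real ids would be ObjectIds rather than strings.
const user = { _id: "u1", groups: ["g1", "g2", "g3"] };
const filter = followedGroupsFilter(user, "i42");

console.log(JSON.stringify(filter));
// With Mongoose this would be used roughly as:
//   Item_Groups.find(filter)
```

Both fields in this filter (group and item) are covered by the { group: 1, item: 1 } index, and the followers collection isn't touched at all because the group list already lives on the user.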
As I mentioned before, it seems like items may not necessarily be in that many groups (just like users won't necessarily be following thousands of groups). If an item is typically in no more than a couple hundred groups, I'd consider adding an array to the item model that holds each group it gets added to. This would speed up reading all the groups an item is in, a query you mentioned would be a common one. Note: You'd still use the Item_Groups model to retrieve all the items in a group by querying on the (now indexed) group_id.
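If both arrays end up on the documents, the "groups this user follows that contain this item" question could arguably be answered in application code by intersecting the two arrays once the user and item are loaded. A minimal sketch, assuming user.groups and item.groups hold group ids as plain strings (real ObjectIds would need to be compared through their string form); sharedGroups and the sample documents are hypothetical:

```javascript
// Hypothetical helper: returns the group ids that appear in both arrays,
// in the order they appear on the item.
// Assumes ids are plain strings; ObjectIds would need String(id) first.
function sharedGroups(user, item) {
  const followed = new Set(user.groups);
  return item.groups.filter((groupId) => followed.has(groupId));
}

// Sample documents (made up).
const sampleUser = { _id: "u1", groups: ["g1", "g2", "g3"] };
const sampleItem = { _id: "i42", groups: ["g2", "g3", "g7"] };

console.log(sharedGroups(sampleUser, sampleItem)); // logs the ids g2 and g3
```

The trade-off is that every follow or add-to-group write now has to keep these arrays in sync with the followers and item_groups collections.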