Select All Posts With All Their Tags

女生的网名这么多〃 submitted on 2019-12-06 12:13:09

This being purely an exercise, let me preface it by saying that the amount of duplicated data most likely isn't a big deal. That said, if the posts are very large and there are many of them, avoiding the duplication starts to make more sense.

Further, if you use LINQ to SQL or Entity Framework from C#, the object relationships are worked out for you, and your Post entity will have a List<Tag> property that you can access.

However, if you want to roll your own, one option that needs only one database round trip and duplicates no data is a stored procedure that returns two recordsets (two separate SELECT statements): one with the post content and one with the tag content.

It would then be pretty simple to create a C# class that represents a Post, give it a List<Tag>, and populate it from the stored procedure results.
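As a rough sketch of that C# side, the loader below reads two consecutive result sets from any IDataReader and attaches each tag row to its post by PostID. The class and method names (Post, Tag, PostLoader, ReadPosts) are made up for illustration; the column names are taken from the procedure in this answer.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public class Tag
{
    public int TagID { get; set; }
    public string TagName { get; set; }
}

public class Post
{
    public int PostID { get; set; }
    public string PostTitle { get; set; }
    public string PostContent { get; set; }
    public List<Tag> Tags { get; } = new List<Tag>();
}

public static class PostLoader
{
    // Expects result set 1 = posts, result set 2 = (PostID, TagID, TagName) rows.
    public static List<Post> ReadPosts(IDataReader reader)
    {
        var posts = new List<Post>();
        var postsById = new Dictionary<int, Post>();

        // First result set: one row per post, no duplication.
        while (reader.Read())
        {
            var post = new Post
            {
                PostID = Convert.ToInt32(reader["PostID"]),
                PostTitle = Convert.ToString(reader["PostTitle"]),
                PostContent = Convert.ToString(reader["PostContent"])
            };
            posts.Add(post);
            postsById[post.PostID] = post;
        }

        // Second result set: one row per post/tag pair, related by PostID.
        if (reader.NextResult())
        {
            while (reader.Read())
            {
                int postId = Convert.ToInt32(reader["PostID"]);
                if (postsById.TryGetValue(postId, out var post))
                {
                    post.Tags.Add(new Tag
                    {
                        TagID = Convert.ToInt32(reader["TagID"]),
                        TagName = Convert.ToString(reader["TagName"])
                    });
                }
            }
        }

        return posts;
    }
}
```

Against SQL Server you would pass in the reader returned by SqlCommand.ExecuteReader() for the stored procedure. For a quick check without a database, DataSet.CreateDataReader() over two DataTables yields the same two-result-set shape.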

Create Procedure GetPostTags
As

-- We will use the GotTags column here to loop through and get tags later
Declare @Posts Table (
    PostID int,
    PostTitle varchar(50), 
    PostContent varchar(max), -- sized to hold a full post body; match your Posts table
    GotTags bit default 0
)

/* Assuming you care about the IDs, this will get you all of 
   the tags without duplicating any post content */
Declare @PostTags Table (
    PostID int,
    TagID int,
    TagName varchar(50)
)

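-- Populate posts from the main table.
-- Select * assumes Posts has exactly these three columns, in this order;
-- list the columns explicitly if it does not.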
Insert Into @Posts (PostID, PostTitle, PostContent)
Select * From Posts

-- Now loop through and get the tags for each post. 
Declare @CurrentPostID int
Set @CurrentPostID = (Select Top 1 PostID From @Posts Where GotTags = 0)
While @CurrentPostID Is Not Null
    Begin
        Insert Into @PostTags (PostID, TagID, TagName)
        Select pt.postid, pt.tagid, t.name
        From Tags t 
            Join PostTags pt
                On t.id = pt.tagid
        Where pt.postid = @CurrentPostID

        -- Set next loop
        Update @Posts Set GotTags = 1 Where PostID = @CurrentPostID
        Set @CurrentPostID = (Select Top 1 PostID From @Posts Where GotTags = 0)
    End

-- Return 2 recordsets, which are related by the PostID column found in both sets
Select * from @Posts
Select * From @PostTags

I prefer this type of solution over concatenating the tags into one string and splitting them later. The data is easier to work with this way, the C# side can be more object oriented, and you keep the tag IDs. If tags need to be added to or removed from a post, you don't have to look a tag up or match it by name, since you already have its ID.
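To illustrate the point about IDs, here is a minimal sketch of working out which tags to add and remove when a post's tag list is edited. TagDiff and Compute are hypothetical names, and the sketch assumes you already hold the current tag IDs from the second recordset.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TagDiff
{
    // Compares the tag IDs a post has now with the tag IDs it should have.
    public static (List<int> ToAdd, List<int> ToRemove) Compute(
        IEnumerable<int> currentTagIds, IEnumerable<int> desiredTagIds)
    {
        var current = new HashSet<int>(currentTagIds);
        var desired = new HashSet<int>(desiredTagIds);

        // IDs wanted but not present yet: rows to insert into PostTags.
        var toAdd = desired.Except(current).OrderBy(id => id).ToList();
        // IDs present but no longer wanted: rows to delete from PostTags.
        var toRemove = current.Except(desired).OrderBy(id => id).ToList();

        return (toAdd, toRemove);
    }
}
```

With current IDs 1, 2, 3 and desired IDs 2, 3, 4, this yields 4 to add and 1 to remove, with no name matching involved.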

Obviously many websites do this, I want to know the best way..

The best way: there isn't one.

and how it should be done in the real world?

Entity Framework would build the query as you suggested and materialize the objects you need. Yes, there is duplicate data, but more often than not, returning the duplicate data alongside the related data is better than trying to relate the information back together afterwards. The advantages are more readable code and easier querying in a C#-like language, with related records and change tracking (by default).
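To make the duplication concrete: a single joined query returns one row per post/tag pair, so the post columns repeat on every row. The sketch below collapses such rows back into one object per post. It is only an illustration of the idea, not Entity Framework's actual materializer, and PostTagRow, PostWithTags and JoinedRows are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one row from Posts joined to PostTags and Tags.
public class PostTagRow
{
    public int PostID { get; set; }
    public string PostTitle { get; set; }
    public int TagID { get; set; }
    public string TagName { get; set; }
}

public class PostWithTags
{
    public int PostID { get; set; }
    public string PostTitle { get; set; }
    public List<string> TagNames { get; set; }
}

public static class JoinedRows
{
    // Groups the flat rows by post; the repeated post columns are read
    // from the first row of each group and the rest are discarded.
    public static List<PostWithTags> Collapse(IEnumerable<PostTagRow> rows)
    {
        return rows
            .GroupBy(r => r.PostID)
            .Select(g => new PostWithTags
            {
                PostID = g.Key,
                PostTitle = g.First().PostTitle,
                TagNames = g.Select(r => r.TagName).ToList()
            })
            .ToList();
    }
}
```

A post with three tags arrives as three rows carrying the same title, which is the overhead being weighed against the simpler single query.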

Dapper can do the same thing: a parent object with its child objects. It's faster, but it doesn't have change tracking, and the statements aren't C#-like. They are (as far as I've seen) direct SQL, which makes writing dynamic queries much harder.

But surely there has to be a better way?

I don't know what "better" means here. Is it more efficient, less memory overhead, fewer or smaller network packets, more maintainable, more readable?

Is there a way of doing this without replicating the post information for each tag row that is returned?

Yes, you could write a stored procedure to return multiple recordsets, materialize your objects, and wire them up manually.

It sounds like you are trying to optimize something that isn't a problem yet.

I'd write a query to return multiple recordsets. I wouldn't worry about over-optimizing until you do some performance testing.

I'm not sure about Dapper's recent support for one-to-many or many-to-many queries, but you might want to check out the new features in Insight.Database 4.0. There's a pre-release on NuGet now.

Check out the pre-release docs. I'd love some feedback.

https://github.com/jonwagner/Insight.Database/wiki/Proposed-4.0-Changes
