How do I avoid multiple queries with :include in Rails?

时光说笑 2020-12-08 17:02

If I do this

post = Post.find_by_id(post_id, :include => :comments)

two queries are performed (one for post data and another for the

2 Answers
  • 2020-12-08 17:28

    If you add a condition that references the eagerly loaded association's table, Rails loads everything with a single (and efficient) JOIN query.

    Here is an example:

    • Say you have the following model (where :user is the foreign reference):

      class Item < ActiveRecord::Base
        attr_accessible :name, :user_id
        belongs_to :user
      end
      
    • Then executing this (note: the where part is crucial, as it tricks Rails into producing that single query):

      @items = Item.includes(:user).where("users.id IS NOT NULL").all
      

      will result in a single SQL query (the syntax below is that of PostgreSQL):

        SELECT "items"."id" AS t0_r0, "items"."user_id" AS t0_r1, 
                "items"."name" AS t0_r2, "items"."created_at" AS t0_r3,
                "items"."updated_at" AS t0_r4, "users"."id" AS t1_r0, 
                "users"."email" AS t1_r1, "users"."created_at" AS t1_r4, 
                "users"."updated_at" AS t1_r5 
        FROM "items" 
        LEFT OUTER JOIN "users" ON "users"."id" = "items"."user_id" 
        WHERE (users.id IS NOT NULL)
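
    To make the t0_rN / t1_rN aliases in that query less mysterious, here is a minimal plain-Ruby sketch of how a flat JOIN result could be folded back into items and users. It is an illustration only, not ActiveRecord's actual implementation: the names LoadedUser, LoadedItem, JOIN_ROWS and build_items are hypothetical, and the rows are made-up sample data.

    ```ruby
    # Hypothetical stand-ins for the model objects (not ActiveRecord classes).
    LoadedUser = Struct.new(:id, :email)
    LoadedItem = Struct.new(:id, :name, :user)

    # Made-up rows shaped like the JOIN result: t0_* columns come from items,
    # t1_* columns come from users. The user columns repeat on every item row.
    JOIN_ROWS = [
      { "t0_r0" => 1, "t0_r1" => 10, "t0_r2" => "Lamp",  "t1_r0" => 10, "t1_r1" => "ann@example.com" },
      { "t0_r0" => 2, "t0_r1" => 10, "t0_r2" => "Chair", "t1_r0" => 10, "t1_r1" => "ann@example.com" },
      { "t0_r0" => 3, "t0_r1" => 11, "t0_r2" => "Desk",  "t1_r0" => 11, "t1_r1" => "bob@example.com" }
    ].freeze

    # Split each flat row into an item and its user, building each user only once.
    def build_items(rows)
      users_by_id = {}
      rows.map do |row|
        user = (users_by_id[row["t1_r0"]] ||= LoadedUser.new(row["t1_r0"], row["t1_r1"]))
        LoadedItem.new(row["t0_r0"], row["t0_r2"], user)
      end
    end

    build_items(JOIN_ROWS).each { |item| puts "#{item.name} -> #{item.user.email}" }
    ```

    The user data arrives once per item row, so the loader has to deduplicate it; that repetition is the cost of the single-query approach.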
    
  • 2020-12-08 17:34

    No, there is not. This is the intended behavior of :include, since the JOIN approach ultimately comes out to be inefficient.

    For example, consider the following scenario: the Post model has 3 fields that you need to select, 2 fields for Comment, and this particular post has 100 comments. Rails could run a single JOIN query along the lines of:

    SELECT post.id, post.title, post.author_id, comment.id, comment.body
    FROM posts AS post
    INNER JOIN comments AS comment ON comment.post_id = post.id
    WHERE post.id = 1
    

    This would return the following table of results:

     post.id | post.title | post.author_id | comment.id | comment.body
    ---------+------------+----------------+------------+--------------
           1 | Hello!     |              1 |          1 | First!
           1 | Hello!     |              1 |          2 | Second!
           1 | Hello!     |              1 |          3 | Third!
           1 | Hello!     |              1 |          4 | Fourth!
    ...96 more...
    

    You can see the problem already. The single-query JOIN approach, though it returns the data you need, returns it redundantly. When the database server sends the result set to Rails, it will send the post's ID, title, and author ID 100 times each. Now, suppose that the Post had 10 fields you were interested in, 8 of which were text blocks. Eww. That's a lot of data. Transferring data from the database to Rails does take work on both sides, both in CPU cycles and RAM, so minimizing that data transfer is important for making the app run faster and leaner.
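
    The redundancy can be put into rough numbers. The sketch below counts how many column values cross the wire in each strategy; it assumes the post has at least one comment and that the separate comments query selects only the comment fields of interest. The method names join_cells and separate_cells are made up for this illustration.

    ```ruby
    # Values transferred by one JOIN: every row repeats the post fields.
    def join_cells(post_fields, comment_fields, comments)
      comments * (post_fields + comment_fields)
    end

    # Values transferred by two queries: post fields once, then the comments.
    def separate_cells(post_fields, comment_fields, comments)
      post_fields + comments * comment_fields
    end

    # The scenario above: 3 post fields, 2 comment fields, 100 comments.
    puts join_cells(3, 2, 100)       # 500
    puts separate_cells(3, 2, 100)   # 203

    # The same post with 10 fields of interest.
    puts join_cells(10, 2, 100)      # 1200
    puts separate_cells(10, 2, 100)  # 210
    ```

    This counts values, not bytes, so it understates the gap when the repeated post fields are large text blocks.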

    The Rails devs crunched the numbers and found that most applications run better with multiple queries that each fetch a piece of data once than with a single query that can become hugely redundant.
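
    As a rough picture of the multiple-query strategy, here is a plain-Ruby simulation in which two arrays stand in for the posts and comments tables. It only mimics the shape of the approach (one lookup for the posts, one for all their comments, then stitching in memory); it is not how ActiveRecord is implemented, and POSTS_TABLE, COMMENTS_TABLE and the method names are hypothetical.

    ```ruby
    # In-memory stand-ins for the two tables (sample data, not a real database).
    POSTS_TABLE = [
      { id: 1, title: "Hello!", author_id: 1 },
      { id: 2, title: "Again",  author_id: 1 }
    ].freeze

    COMMENTS_TABLE = [
      { id: 1, post_id: 1, body: "First!" },
      { id: 2, post_id: 1, body: "Second!" },
      { id: 3, post_id: 2, body: "Hi" }
    ].freeze

    # "Query" 1: fetch the requested posts; each post row is read exactly once.
    def fetch_posts(ids)
      POSTS_TABLE.select { |post| ids.include?(post[:id]) }
    end

    # "Query" 2: fetch every comment for those posts, like WHERE post_id IN (...).
    def fetch_comments(post_ids)
      COMMENTS_TABLE.select { |comment| post_ids.include?(comment[:post_id]) }
    end

    # Stitch the two result sets together in memory by grouping on post_id.
    def preload_posts_with_comments(ids)
      posts = fetch_posts(ids)
      comments_by_post = fetch_comments(posts.map { |post| post[:id] })
                         .group_by { |comment| comment[:post_id] }
      posts.map { |post| post.merge(comments: comments_by_post.fetch(post[:id], [])) }
    end

    preload_posts_with_comments([1, 2]).each do |post|
      bodies = post[:comments].map { |comment| comment[:body] }
      puts "#{post[:title]}: #{bodies.join(', ')}"
    end
    ```

    However many posts are requested, this stays at two lookups, and no post field is ever sent more than once.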

    Of course, there comes a time in every developer's life when a join is necessary in order to run complex conditions, and that can be achieved by replacing :include with :joins. For prefetching relationships, however, the approach Rails takes in :include is much better for performance.
