Difference between size, length and count in complicated ActiveRecord case

Posted by 拜拜、爱过 on 2019-12-14 01:18:47

Question


[10] pry(main)> r.respondents.select(:name).uniq.size

(1.1ms)  SELECT DISTINCT COUNT("respondents"."name") FROM "respondents"
INNER JOIN "values" ON "respondents"."id" = "values"."respondent_id" WHERE
"values"."round_id" = 37
=> 495

[11] pry(main)> r.respondents.select(:name).uniq.length

Respondent Load (1.1ms)  SELECT DISTINCT name FROM "respondents"
INNER JOIN "values" ON "respondents"."id" = "values"."respondent_id" WHERE
"values"."round_id" = 37
=> 6

Why the difference in what each query returns?


Answer 1:


.count #=> always issues a SELECT COUNT query against the database

.size #=> if the collection has already been loaded, returns the length of the loaded records (Array#length); otherwise it falls back to the SELECT COUNT query

.length #=> always loads the collection and then returns Array#length
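
The three rules above can be sketched without a database. FakeRelation below is a hypothetical stand-in written for illustration, not the Rails implementation; the class name, the queries log, and the SQL strings are all assumptions. It only shows which calls would reach the database and which would not.

```ruby
# FakeRelation is a hypothetical stand-in for ActiveRecord::Relation.
# It is NOT the Rails source; it only mirrors the rules listed above.
class FakeRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend these rows live in the database
    @records = nil    # nothing has been loaded yet
    @queries = []     # log of simulated SQL statements
  end

  def loaded?
    !@records.nil?
  end

  # Simulates loading every row; does nothing when already loaded.
  def load
    unless loaded?
      @queries << "SELECT *"
      @records = @rows.dup
    end
    self
  end

  # Always asks the "database" for a count.
  def count
    @queries << "SELECT COUNT(*)"
    @rows.size
  end

  # Uses the loaded records when present, otherwise falls back to count.
  def size
    loaded? ? @records.length : count
  end

  # Always loads first, then measures the in-memory array.
  def length
    load
    @records.length
  end
end

relation = FakeRelation.new(%w[a b c])
relation.size    # not loaded yet, so a COUNT query is simulated
relation.length  # loads the rows
relation.size    # already loaded, no further query
relation.count   # still issues a COUNT query
puts relation.queries.inspect
```

In this sketch the log ends up with three entries: only the second call to size avoids a query, because length has loaded the records by then.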



Answer 2:


r.respondents.select(:name).uniq returns an ActiveRecord::Relation object, which overrides size.

See: http://api.rubyonrails.org/classes/ActiveRecord/Relation.html#method-i-size

Calling size on such an object checks to see if the object is "loaded."

# Returns size of the records.
def size
  loaded? ? @records.length : count
end

If it is "loaded", it returns the length of the @records array. Otherwise, it calls count, which, without arguments, will "return a count of all the rows for the model."

So why this behavior? An AR::Relation is only "loaded" if either to_a or explain is called on it first:

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation.rb

The why is explained in a comment above the load method:

# Causes the records to be loaded from the database if they have not
# been loaded already. You can use this if for some reason you need
# to explicitly load some records before actually using them. The
# return value is the relation itself, not the records.
#
#   Post.where(published: true).load # => #<ActiveRecord::Relation>
def load
  unless loaded?
    # We monitor here the entire execution rather than individual SELECTs
    # because from the point of view of the user fetching the records of a
    # relation is a single unit of work. You want to know if this call takes
    # too long, not if the individual queries take too long.
    #
    # It could be the case that none of the queries involved surpass the
    # threshold, and at the same time the sum of them all does. The user
    # should get a query plan logged in that case.
    logging_query_plan { exec_queries }
  end

  self
end

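So, in the question, size was called on a relation that had not been loaded, and it fell through to count. That produced SELECT DISTINCT COUNT("respondents"."name"), where DISTINCT is applied to the single row returned by COUNT rather than to the names, so every joined row is counted (495). length, by contrast, loads the records with SELECT DISTINCT name first and then counts the returned records (6).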
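
One way to read the two numbers in the question is that the two SQL statements apply DISTINCT at different points. The sketch below imitates that in plain Ruby. Only the figures 495 and 6 come from the question; the names and the way the rows are distributed are invented for illustration.

```ruby
# Hypothetical data: 495 joined rows that carry only 6 different names.
names = %w[Ann Bob Cy Di Ed Flo]
joined_rows = Array.new(495) { |i| names[i % names.size] }

# SELECT DISTINCT COUNT(name): COUNT runs first and yields a single row,
# so DISTINCT has nothing left to deduplicate.
distinct_of_count = [joined_rows.compact.size].uniq.first

# SELECT DISTINCT name, followed by counting the loaded records in Ruby.
# This is also what COUNT(DISTINCT name) would return.
count_of_distinct = joined_rows.uniq.size

puts distinct_of_count  # 495
puts count_of_distinct  # 6
```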




Answer 3:


While upgrading from Rails 3.2 to 4.1, I found that AR::Relation#size behaves differently. Previously it returned the number of rows, whereas (in my case) it now returns a Hash. Switching to #count seems to give the same result as #size did in 3.2. I'm being a bit vague here because tests run in 'rails console' on 4.1 did not give the same results as running via 'rails server' on 4.1.



Source: https://stackoverflow.com/questions/11905364/difference-between-size-length-and-count-in-complicated-activerecord-case
