DEPRECATION WARNING: Dangerous query method: Random Record in ActiveRecord >= 5.2

Asked by 长发绾君心, 2020-11-29 12:13

So far, the \"common\" way to get a random record from the Database has been:

# Postgress
Model.order(\"RANDOM()\").first 

# MySQL
Model.order(\"RAND()\").f         


        
3 Answers
  •  情书的邮戳, 2020-11-29 12:40

    With many records and not many deleted records, this may be more efficient. In my case I have to use .unscoped because the model's default scope uses a join; if your model doesn't have such a default scope, you can omit .unscoped wherever it appears. (A hypothetical example of such a scope is sketched just below.)
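
    For illustration only, a hypothetical default scope of the kind described; the clinic association and the active flag are made up for the sketch:

    # Hypothetical model: the default scope adds a join, so the order/offset/count
    # queries below have to drop it with .unscoped to see every row.
    class Patient < ApplicationRecord
      belongs_to :clinic
      default_scope { joins(:clinic).where(clinics: { active: true }) }
    end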

    Patient.unscoped.count #=> 134049
    
    class Patient
      def self.random
        return nil unless Patient.unscoped.any?
        max_id = Patient.unscoped.last.id
        patient = nil
        until patient
          # find_by returns nil when the random id has been deleted,
          # so the loop simply retries instead of raising RecordNotFound
          patient = Patient.unscoped.find_by(id: rand(1..max_id))
        end
        patient
      end
    end
    
    # Compare with the other solutions offered here, in my use case:
    
    puts Benchmark.measure{10.times{Patient.unscoped.order(Arel.sql('RANDOM()')).first }}
    #=>0.010000   0.000000   0.010000 (  1.222340)
    Patient.unscoped.order(Arel.sql('RANDOM()')).first
    Patient Load (121.1ms)  SELECT  "patients".* FROM "patients"  ORDER BY RANDOM() LIMIT 1
    
    puts Benchmark.measure {10.times {Patient.unscoped.offset(rand(Patient.unscoped.count)).first }}
    #=>0.020000   0.000000   0.020000 (  0.318977)
    Patient.unscoped.offset(rand(Patient.unscoped.count)).first
    (11.7ms)  SELECT COUNT(*) FROM "patients"
    Patient Load (33.4ms)  SELECT  "patients".* FROM "patients"  ORDER BY "patients"."id" ASC LIMIT 1 OFFSET 106284
    
    puts Benchmark.measure{10.times{Patient.random}}
    #=>0.010000   0.000000   0.010000 (  0.148306)
    
    Patient.random
    (14.8ms)  SELECT COUNT(*) FROM "patients"
    # also, the queries behind the random-id lookup itself:
    Patient.unscoped.find_by(id: rand(1..Patient.unscoped.last.id))
    Patient Load (0.3ms)  SELECT  "patients".* FROM "patients"  ORDER BY "patients"."id" DESC LIMIT 1
    Patient Load (0.4ms)  SELECT  "patients".* FROM "patients" WHERE "patients"."id" = $1 LIMIT 1  [["id", 4511]]
    

    The reason this is fast is that we use rand() to pick a random id and then look up just that single record. However, the more deleted rows (skipped ids) there are, the more likely the loop is to run multiple times. It might be overkill, but it could be worth the 62% increase in performance, and even more if you never delete rows. Test whether it's better for your use case; a gap-tolerant variant is sketched below.
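
    If the id space has many gaps, a common alternative (a sketch under the same assumptions, not taken from this answer; fast_random is a made-up name) is to pick a random id and take the first record at or above it, so no retry loop is needed, at the cost of slightly favouring records that sit just after large gaps:

    # Sketch: gap-tolerant random pick, assuming an integer primary key
    class Patient
      def self.fast_random
        max_id = unscoped.maximum(:id)
        return nil if max_id.nil?
        unscoped.where("id >= ?", rand(1..max_id)).order(:id).first
      end
    end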
