ActiveRecord: Alternative to find_in_batches?

Posted by 蓝咒 on 2020-02-20 07:26:38

Question


I have a query that loads thousands of objects and I want to tame it by using find_in_batches:

Car.includes(:member).where(:engine => "123").find_in_batches(batch_size: 500) ...

According to the docs, I can't have a custom sorting order: http://www.rubydoc.info/docs/rails/4.0.0/ActiveRecord/Batches:find_in_batches

However, I need a custom sort order of created_at DESC. Is there another method to run this query in chunks like it does in find_in_batches so that not so many objects live on the heap at once?


Answer 1:


Hm, I've been thinking about a solution for this (I'm the person who asked the question). It makes sense that find_in_batches doesn't allow a custom order. Let's say you sort by created_at DESC and specify a batch_size of 500. The first loop covers rows 1-500, the second loop covers rows 501-1000, and so on. What if someone inserts a new record into the table before the second loop runs? It would land at the top of the query results, every existing row would shift down by one position, and the second loop would repeat the last row of the first batch.

You could argue though that created_at ASC would be safe then, but it's not guaranteed if your app specifies a created_at value.
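The shifting problem described above can be reproduced without a database. The following is a minimal sketch in plain Ruby: Row and page are hypothetical stand-ins for a table row and for an "ORDER BY created_at DESC LIMIT ... OFFSET ..." query, not ActiveRecord APIs.

```ruby
# Hypothetical in-memory row; created_at is a plain integer for simplicity.
Row = Struct.new(:id, :created_at)

# Hypothetical stand-in for: ORDER BY created_at DESC LIMIT size OFFSET offset
def page(rows, offset:, size:)
  rows.sort_by { |r| -r.created_at }[offset, size] || []
end

rows = (1..4).map { |i| Row.new(i, i * 10) }

# First batch: the two newest rows.
first_page = page(rows, offset: 0, size: 2).map(&:id)

# A new record is inserted between the two batch queries.
rows << Row.new(5, 50)

# Second batch: everything has shifted down by one position.
second_page = page(rows, offset: 2, size: 2).map(&:id)

# Ids that were processed twice.
repeated = first_page & second_page
puts "first: #{first_page}, second: #{second_page}, repeated: #{repeated}"
```

Here the row with id 3 appears in both batches, which is exactly the repeat described above.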

UPDATE:

I wrote a gem for this problem: https://github.com/EdmundMai/batched_query

Since using it, the average memory of my application has HALVED. I highly suggest anyone having similar issues to check it out! And contribute if you want!




Answer 2:


The slower, manual way is to do something like this:

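batch_size = 500
scope = Car.where(engine: "123")
# Round up so that a final, partial batch is still processed.
batches = (scope.count.to_f / batch_size).ceil
batches.times do |i|
  # Fetch only the ids for this page, newest first (id breaks created_at ties).
  # An "id > last_id" cursor does not work with created_at DESC ordering,
  # so this pages by offset. Rows inserted mid-run can still shift the pages.
  ids = scope.order(created_at: :desc, id: :desc)
             .limit(batch_size).offset(i * batch_size).ids
  # Load the full records. WHERE id IN (...) does not guarantee order,
  # so sort_by restores the order of the ids.
  cars = Car.includes(:member).where(id: ids).sort_by { |car| ids.index(car.id) }
  # cars.each { |car| ... }
  # do your updating
end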



Answer 3:


Can you imagine how find_in_batches with sorting would work on 1M rows or more? It would sort all the rows again for every batch.

So I think it would be better to reduce the number of sort calls. For example, with a batch size of 500 you can load only the IDs (with sorting) for N * 500 rows, and then load each batch of objects by those IDs. That way the number of sorted queries sent to the DB should drop by a factor of N.
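One way to sketch this idea: run the sorted query once to get only the IDs, then load the full objects one slice at a time. each_sorted_batch below is a hypothetical helper, not part of ActiveRecord, and the Record struct and TABLE array are in-memory stand-ins so the example runs without a database.

```ruby
# Hypothetical helper (not part of ActiveRecord).
# sorted_ids: ids already in the desired order, from ONE sorted query.
# loader:     callable taking an array of ids and returning the matching
#             records in any order; each record must respond to #id.
def each_sorted_batch(sorted_ids, batch_size:, loader:)
  sorted_ids.each_slice(batch_size) do |ids|
    by_id = {}
    loader.call(ids).each { |record| by_id[record.id] = record }
    # Restore the sorted order and drop ids whose rows no longer exist.
    yield ids.map { |id| by_id[id] }.compact
  end
end

# In-memory stand-in for a table (assumption for illustration only).
Record = Struct.new(:id, :created_at)
TABLE = [Record.new(1, 30), Record.new(2, 10), Record.new(3, 20)]

# The single sorted "query": ids only, newest first.
sorted_ids = TABLE.sort_by { |r| -r.created_at }.map(&:id)
# The per-batch "query": load full records for a slice of ids.
loader = ->(ids) { TABLE.select { |r| ids.include?(r.id) } }

batches = []
each_sorted_batch(sorted_ids, batch_size: 2, loader: loader) do |records|
  batches << records.map(&:id)
end
puts batches.inspect
```

With ActiveRecord, sorted_ids could plausibly come from Car.where(engine: "123").order(created_at: :desc).pluck(:id), and the loader could be ->(ids) { Car.includes(:member).where(id: ids).to_a }. The full ID list still lives in memory, but integers are far smaller than model objects. Because the ID list is a snapshot, rows inserted afterwards are not visited, and none are repeated.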



Source: https://stackoverflow.com/questions/30510180/activerecord-alternative-to-find-in-batches
