ActiveRecord find_each combined with limit and order

温柔的废话 2020-12-02 11:41

I'm trying to run a query of about 50,000 records using ActiveRecord's find_each method, but it seems to be ignoring my other parameters like so:



        
13 answers
  • 2020-12-02 12:23

    This is easy with Kaminari or a similar pagination gem.

    Create a batch loader module.

    module BatchLoader
      extend ActiveSupport::Concern
    
      def batch_by_page(options = {})
        options = init_batch_options!(options)
    
        next_page = 1
    
        loop do
          next_page = yield(next_page, options[:batch_size])
    
          break next_page if next_page.nil?
        end
      end
    
      private
    
      def default_batch_options
        {
          batch_size: 50
        }
      end
    
      def init_batch_options!(options)
        options ||= {}
        default_batch_options.merge!(options)
      end
    end
    

    Create a repository.

    class ThingRepository
      include BatchLoader
    
      # @param [Integer] per_page
      # @param [Proc] block
      def batch_changes(per_page=100, &block)
        relation = Thing.active.order("created_at DESC")
    
        batch_by_page do |next_page|
          query = relation.page(next_page).per(per_page)
          yield query if block_given?
          query.next_page
        end
      end
    end
    

    Use the repository

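    repo = ThingRepository.new
    # batch_changes yields each page to the block and returns nil,
    # so pass the block directly instead of chaining .each onto it.
    repo.batch_changes(5000) do |g|
      g.each do |t|
        #...
      end
    end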
    
  • 2020-12-02 12:25

    You can try the ar-as-batches gem.

    According to its documentation, you can do something like this:

    Users.where(country_id: 44).order(:joined_at).offset(200).as_batches do |user|
      user.party_all_night!
    end
    
  • 2020-12-02 12:27

    Retrieve the ids first, then process them with in_groups_of:

    ordered_photo_ids = Photo.order(likes_count: :desc).pluck(:id)
    
    ordered_photo_ids.in_groups_of(1000, false).each do |photo_ids|
      photos = Photo.order(likes_count: :desc).where(id: photo_ids)
    
      # ...
    end
    

    It's important to add the ORDER BY to the inner call as well, because WHERE id IN (...) does not guarantee that rows come back in the order of the id list.
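
    To illustrate why the inner ordering matters, here is a plain-Ruby sketch with no database involved. The PHOTOS array is a made-up stand-in for the Photo table, and each_slice plays the role of in_groups_of(n, false); the select call mimics a WHERE id IN (...) that returns rows in storage order rather than in the order of the id list.

    ```ruby
    # Made-up stand-in for the Photo table (an assumption, not real data).
    PHOTOS = [
      { id: 1, likes_count: 5 },
      { id: 2, likes_count: 33 },
      { id: 3, likes_count: 12 },
      { id: 4, likes_count: 40 },
      { id: 5, likes_count: 7 }
    ].freeze

    # Stand-in for Photo.order(likes_count: :desc).pluck(:id)
    ordered_ids = PHOTOS.sort_by { |p| -p[:likes_count] }.map { |p| p[:id] }

    # each_slice behaves like in_groups_of(2, false): the last group is not padded.
    batches = ordered_ids.each_slice(2).map do |ids|
      # Simulates WHERE id IN (...): rows come back in storage order.
      rows = PHOTOS.select { |p| ids.include?(p[:id]) }
      # Re-apply the ordering inside the batch, as the inner ORDER BY does.
      rows.sort_by { |p| -p[:likes_count] }
    end

    processed_ids = batches.flatten.map { |p| p[:id] }
    ```

    With the re-sort in place, processed_ids matches ordered_ids; without it, the first batch would come back as ids 2 and 4 instead of 4 and 2.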

  • 2020-12-02 12:27

    Rails 6.1 adds support for descending order in find_each, find_in_batches and in_batches.
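
    On Rails 6.1 or later that would look something like Thing.active.find_each(order: :desc). Note that this still orders by primary key, only descending; it does not order by created_at. As a rough illustration of what descending primary-key batching does, here is a plain-Ruby sketch. The each_desc_batch helper and the sample ids are hypothetical, not part of Rails.

    ```ruby
    # Plain-Ruby sketch of descending primary-key batching (no Rails required).
    # each_desc_batch is a hypothetical helper, not a Rails API.
    def each_desc_batch(rows, batch_size:)
      cursor = nil # id of the last row seen; nil means "start from the highest id"
      loop do
        # Roughly: WHERE id < cursor ORDER BY id DESC LIMIT batch_size
        batch = rows.select { |r| cursor.nil? || r[:id] < cursor }
                    .sort_by { |r| -r[:id] }
                    .first(batch_size)
        break if batch.empty?

        yield batch
        cursor = batch.last[:id]
      end
    end

    # Sample rows with gaps in the ids, as a real table would have.
    rows = [1, 2, 4, 7, 8, 9, 12].map { |id| { id: id } }

    batches = []
    each_desc_batch(rows, batch_size: 3) do |batch|
      batches << batch.map { |r| r[:id] }
    end
    ```

    Each batch picks up strictly below the last id of the previous one, which is why gaps in the ids do not cause rows to be skipped or repeated.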

  • 2020-12-02 12:29

    find_each uses find_in_batches under the hood.

    It's not possible to choose the order of the records: as described in the find_in_batches documentation, the order is automatically set to ascending on the primary key ("id ASC") to make the batching work.

    The conditions are still applied, though, so what you can do is:

    Thing.active.find_each(batch_size: 50000) { |t| puts t.id }
    

    As for limit, it had not been implemented yet at the time of writing: https://github.com/rails/rails/pull/5696


    To answer your second question, you can build the logic yourself:

    total_records = 50000
    batch = 1000
    (0..(total_records - batch)).step(batch) do |i|
      puts Thing.active.order("created_at DESC").offset(i).limit(batch).to_sql
    end
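
    One thing to watch with the range above: when total_records is not a multiple of batch, the last partial window is never visited. A small sketch of the offset arithmetic that also covers the remainder (offset_windows is a hypothetical helper name; the pairs would feed offset and limit on the relation):

    ```ruby
    # Hypothetical helper: list the [offset, limit] pairs needed to cover
    # `total` rows in windows of `batch` rows, including a short final window.
    def offset_windows(total, batch)
      (0...total).step(batch).map do |offset|
        [offset, [batch, total - offset].min]
      end
    end

    windows = offset_windows(2500, 1000)

    # Each pair would be used along the lines of:
    #   Thing.active.order("created_at DESC").offset(offset).limit(limit)

    # For comparison, the offsets produced by the range used above:
    original_offsets = (0..(2500 - 1000)).step(1000).to_a
    ```

    For 2,500 rows this gives three windows, the last one 500 rows long, whereas the original range stops after offset 1000. Keep in mind that offset-based paging can skip or repeat rows if the table changes between queries.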
    
  • 2020-12-02 12:36

    I was looking for the same behaviour and came up with this solution. It DOES NOT order by created_at, but I thought I would post it anyway.

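    max_records_to_retrieve = 50000
    # Use the highest id rather than Thing.count: once ids have gaps,
    # the row count is lower than the last id and the window starts too early.
    last_index = Thing.maximum(:id).to_i
    start_index = [(last_index - max_records_to_retrieve + 1), 0].max
    Thing.active.find_each(:start => start_index) do |u|
      # do stuff
    end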
    

    Drawbacks of this approach:
    - You need two queries (the first one should be fast).
    - It guarantees a maximum of 50K records, but if ids have been skipped you will get fewer.
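
    To see how gaps in the ids affect the start index, here is a plain-Ruby sketch with made-up ids and no database. It compares taking the upper bound from the row count (what Thing.count would return) with taking it from the highest id (what Thing.maximum(:id) would return); the lambda imitates find_each with a start value, which visits rows whose id is greater than or equal to it.

    ```ruby
    # Made-up stand-in for Thing: ids 1..20 with rows 3, 4 and 17 deleted.
    existing_ids = (1..20).to_a - [3, 4, 17]
    max_records = 5

    # Rows that find_each(start: n) would visit: id >= n, in ascending order.
    visit = ->(start) { existing_ids.select { |id| id >= start } }

    # Upper bound taken from the row count.
    start_from_count = [existing_ids.size - max_records, 0].max
    via_count = visit.call(start_from_count)

    # Upper bound taken from the highest id.
    start_from_max_id = [existing_ids.max - max_records + 1, 0].max
    via_max_id = visit.call(start_from_max_id)
    ```

    In this sample the count-based start index visits 8 rows, more than the intended 5, while the max-id version visits 4: never more than the cap, and fewer whenever ids inside the window are missing.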
