How to find items with *all* matching categories

Posted by 天大地大妈咪最大 on 2019-12-14 02:37:43

Question


I have two models, Item and Category, joined by a join table. I would like to query Item to find only items that match a list of categories. My models look like:

class Item < ActiveRecord::Base
  has_and_belongs_to_many :categories
end

class Category < ActiveRecord::Base
  has_and_belongs_to_many :items
end

I can easily find items that match ANY category in the list. The following returns items that belong to category 1, 2, or 3:

Item.includes(:categories).where(categories: {id:[1,2,3]})

I would like to find only items that belong to all 3 categories. What is the best way to accomplish this using ActiveRecord?

Do I need to fall back to writing the where condition myself and if so, what is the correct syntax for PostgreSQL? I've tried various flavors of "WHERE ALL IN (1,2,3)", but just get syntax errors.

UPDATE:

Based on the accepted answer to Find Products matching ALL Categories (Rails 3.1) I can get pretty close.

category_ids = [7,10,12,13,52,1162]

Item.joins(:categories).
  where(categories: {id: category_ids}).
  group('items.id').
  having("count(categories_items.category_id) = #{category_ids.size}")

Unfortunately, when chaining .count or .size I get back a Hash instead of a record count:

{189 => 6, 3067 => 6, 406 => 6}

I can count the keys in the resulting hash to get the real record count, but this is a really inelegant solution.
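
The Hash appears because calling .count on a grouped relation returns one count per group, keyed by the grouped column. As a rough illustration only, the plain-Ruby sketch below mimics the where / group / having steps on an in-memory array. The join_rows data and all variable names are made up for the example, and no database is involved.

```ruby
# Hypothetical rows of the categories_items join table: [item_id, category_id].
join_rows = [
  [10, 1], [10, 2], [10, 3],
  [20, 1], [20, 3],
  [30, 1], [30, 2], [30, 3], [30, 4]
]
wanted = [1, 2, 3]

# WHERE category_id IN (wanted), then GROUP BY item_id with COUNT(*).
counts_per_item = join_rows
  .select { |_item_id, category_id| wanted.include?(category_id) }
  .group_by { |item_id, _category_id| item_id }
  .transform_values(&:size)

# HAVING count = wanted.size: the result is still one entry per item.
matching = counts_per_item.select { |_item_id, count| count == wanted.size }

puts matching.inspect  # a Hash of item_id => count, like the one in the question
puts matching.size     # the number of matching items
```

Each key of the Hash is one matching item, so counting its entries is equivalent to counting the matching records.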


Answer 1:


ActiveRecord

For ActiveRecord, you could put a method like this in your Item class:

def self.with_all_categories(category_ids)
  select(:id).distinct.
    joins(:categories).
    where('categories.id' => category_ids).
    group(:id).
    having('count(categories.id) = ?', category_ids.length)
end

Then you can filter your queries like so:

category_ids = [1,2,3]
Item.where(id: Item.with_all_categories(category_ids))

You could also make use of scopes to make it a little more friendly:

class Item
  scope :with_all_categories, ->(category_ids) { where(id: Item.ids_with_all_categories(category_ids)) }

  def self.ids_with_all_categories(category_ids)
    select(:id).distinct.
      joins(:categories).
      where('categories.id' => category_ids).
      group(:id).
      having('count(categories.id) = ?', category_ids.length)
  end
end

Item.with_all_categories([1,2,3])

Both will produce this SQL:

SELECT "items".*
FROM "items"
WHERE "items"."id" IN
  (SELECT DISTINCT "items"."id"
   FROM "items"
   INNER JOIN "categories_items" ON "categories_items"."item_id" = "items"."id"
   INNER JOIN "categories" ON "categories"."id" = "categories_items"."category_id"
   WHERE "categories"."id" IN (1, 2, 3)
   GROUP BY "items"."id" 
   HAVING count(categories.id) = 3)

You don't technically need the distinct part of that subquery, because GROUP BY already returns each item id only once. I'm not sure whether including it or leaving it out would be better for performance.

SQL

There are a couple of approaches in raw SQL:

SELECT *
FROM items
WHERE items.id IN (
  SELECT item_id
  FROM categories_items
  WHERE category_id IN (1,2,3)
  GROUP BY item_id
  HAVING COUNT(category_id) = 3
)

That works in SQL Server, and because it uses only standard SQL it should run unchanged in PostgreSQL. Alternatively:

SELECT *
FROM items
WHERE items.id IN (SELECT item_id FROM categories_items WHERE category_id = 1)
  AND items.id IN (SELECT item_id FROM categories_items WHERE category_id = 2)
  AND items.id IN (SELECT item_id FROM categories_items WHERE category_id = 3)
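
To make the difference between the two queries concrete, here is a minimal plain-Ruby sketch that applies both strategies to an in-memory array. JOIN_ROWS and the two method names are hypothetical, chosen only for this illustration, and the sketch assumes the join table holds no duplicate (item, category) rows and that the wanted list is not empty.

```ruby
# Hypothetical rows of the categories_items join table: [item_id, category_id].
JOIN_ROWS = [
  [10, 1], [10, 2], [10, 3],
  [20, 1], [20, 3],
  [30, 1], [30, 2], [30, 3], [30, 4]
].freeze

# First query: filter by category, group by item, keep groups whose size
# equals the number of wanted categories (GROUP BY ... HAVING COUNT = n).
def ids_by_group_count(rows, wanted)
  rows.select { |_item_id, category_id| wanted.include?(category_id) }
      .group_by { |item_id, _category_id| item_id }
      .select { |_item_id, group| group.size == wanted.size }
      .keys
      .sort
end

# Second query: one list of item ids per wanted category, then intersect
# them all (one "id IN (subquery)" condition per category, joined by AND).
def ids_by_intersection(rows, wanted)
  wanted.map { |wanted_id| rows.select { |_, c| c == wanted_id }.map(&:first) }
        .reduce(:&)
        .sort
end

puts ids_by_group_count(JOIN_ROWS, [1, 2, 3]).inspect
puts ids_by_intersection(JOIN_ROWS, [1, 2, 3]).inspect
```

Both methods select the same items here. The grouped version needs a single pass regardless of how many categories are requested, while the intersection version adds one sub-select per category.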



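Answer 2:


How about this code? Note that it matches items in ANY of the listed categories, the same as the query in the question: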

Item.all.joins(:categories).where(categories: { id: [1, 2, 3] })

The generated SQL is:

SELECT "items".*
FROM "items"
INNER JOIN "categories_items" ON "categories_items"."item_id" = "items"."id"
INNER JOIN "categories" ON "categories"."id" = "categories_items"."category_id"
WHERE "categories"."id" IN (1, 2, 3)



Answer 3:


I can't say for sure, but this might work:

categories = Category.find(1,2,3)
items = Item.includes(:categories)
items.select{|item| (categories-item.categories).blank?}

or just

Item.all.select{|item| (Category.find(1,2,3)-item.categories).blank?}
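
The array subtraction is what does the work in both snippets: if nothing is left after removing an item's categories from the wanted list, the item has all of them. A small plain-Ruby sketch of that check, using a made-up ItemRow struct and sample data in place of real models:

```ruby
# Hypothetical stand-in for an Item with its category ids already loaded.
ItemRow = Struct.new(:id, :category_ids)

items = [
  ItemRow.new(10, [1, 2, 3]),
  ItemRow.new(20, [1, 3]),
  ItemRow.new(30, [1, 2, 3, 4])
]
wanted = [1, 2, 3]

# An item matches when subtracting its categories leaves nothing wanted.
matching = items.select { |item| (wanted - item.category_ids).empty? }

puts matching.map(&:id).inspect
```

Keep in mind that this filtering happens in Ruby, so with real models every item and its categories are loaded into memory first, which may be slow on large tables.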



Answer 4:


I just tried Alex's amazing suggestion with a has_many :through setup, and it produced a surprising result: when I looked for items with EXACTLY the categories [6,7,8], it also returned items matching all of 6, 7, 8 AND more, i.e. items with the categories [6,7,8,9].

Technically that is the correct result for the code: the having clause only sees rows that already passed the where clause, so the count it computes in Alex's code can only be 1, 2, or 3, never 4 or more.

To overcome this, I added a category counter cache and prescreened on the category count before the having clause, so the query returns only items with exactly the categories [6,7,8] and no extras.

  def self.with_exact_categories(category_ids)    
    self.
      joins(:categories).
      where('categories.id': category_ids).
      where('items.categories_count = ?', category_ids.length).
      group('items.id').
      having('count(categories.id) = ?', category_ids.length)
  end
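
The logic of that query can be sketched in plain Ruby as two conditions: the item has every wanted category, and its total category count equals the size of the wanted list. ItemRecord, its fields, and the sample data below are hypothetical stand-ins for the real models, and the sketch assumes the wanted list contains no duplicates.

```ruby
# Hypothetical stand-in for an Item row; categories_count mimics the counter cache.
ItemRecord = Struct.new(:id, :category_ids, :categories_count)

def build_item(id, category_ids)
  ItemRecord.new(id, category_ids, category_ids.size)
end

# Exact match: same number of categories overall AND none of the wanted ones missing.
def with_exact_categories(items, wanted)
  items.select do |item|
    item.categories_count == wanted.size &&
      (wanted - item.category_ids).empty?
  end
end

items = [
  build_item(10, [6, 7, 8]),
  build_item(20, [6, 8]),
  build_item(30, [6, 7, 8, 9])
]

puts with_exact_categories(items, [6, 7, 8]).map(&:id).inspect
```

Item 30 has all three wanted categories but is rejected by the count check, which is the role the categories_count condition plays in the query above.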

For prescreening the category counts, I don't know how to use aggregate functions in the where clause, but I was very happy to learn that the counter cache still works in Rails 4.21. Here are my model settings:

class Item < ActiveRecord::Base
  has_many :categories_items
  has_many :categories, through: :categories_items
end

class CategoriesItem < ActiveRecord::Base
  belongs_to :category
  belongs_to :item, counter_cache: :categories_count
end

class Category < ActiveRecord::Base
  has_many :categories_items, dependent: :destroy
  has_many :items, through: :categories_items, dependent: :destroy
end

class AddCategoriesCountToItems < ActiveRecord::Migration
  def change
    add_column :items, :categories_count, :integer, default: 0
  end
end


Source: https://stackoverflow.com/questions/28733170/how-to-find-items-with-all-matching-categories
