问题
I have the following model defined with Flask-SQLAlchemy:
"""models.py"""
from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy()
skill_candidate = db.Table(
    'SkillCandidate',
    db.Column('skill_id', db.String, db.ForeignKey('skill.id')),
    db.Column('candidate_id', db.Integer, db.ForeignKey('candidate.id')))
class Candidate(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    skills = db.relationship("Skill", secondary=skill_candidate)
class Skill(db.Model):
    id = db.Column(db.String, primary_key=True)
    name = db.Column(db.String, nullable=False, unique=True)
What am trying to achieve is the following : I want to return all the candidates who possess skills provided in a list input (even ideally, a list of skill_id)
I tried the following :
def get_skilled_candidates(skill_ids):
    return Candidate.query.join(skill_candidate).\
       filter(and_(*[skill_candidate.c.skill_id == skill_id for skill_id in skill_ids])).\
            all()
The aim was to filter all candidates for every skill and compose it with a and_ statement
It works well if I use a list of 1 item (it returns all candidates that possess the skill) but does not if I add more skills in the input list (even tho I have candidates in base that fit the criteria)
回答1:
As noted in the comments, what you'd need is a FORALL operation (universal quantifier), or relational division.
FORALL x ( p(x) )
can be expressed as
NOT ( EXISTS x ( NOT ( p(x) ) ) )
which is a bit unwieldy and hard to reason about, if you don't know about FORALL and their relationship. Given your models it could look like:
def get_skilled_candidates(skill_ids):
    # Form a temporary derived table using unions
    skills = db.union_all(*[
        db.select([db.literal(sid).label('skill_id')])
        for sid in skill_ids]).alias()
    return Candidate.query.\
        filter(
            ~db.exists().select_from(skills).where(
                ~db.exists().
                    where(db.and_(skill_candidate.c.skill_id == skills.c.skill_id,
                                  skill_candidate.c.candidate_id == Candidate.id)).
                    correlate_except(skill_candidate))).\
        all()
There are of course other ways to express the same query, such as:
def get_skilled_candidates(skill_ids):
    return Candidate.query.\
        join(skill_candidate).\
        filter(skill_candidate.c.skill_id.in_(skill_ids)).\
        group_by(Candidate.id).\
        having(db.func.count(skill_candidate.c.skill_id.distinct()) ==
               len(set(skill_ids))).\
        all()
which essentially checks by count that all skill ids were matched.
If using Postgresql you could also do:
from sqlalchemy.dialects.postgresql import array_agg
def get_skilled_candidates(skill_ids):
    # The double filtering may seem redundant, but the WHERE ... IN allows
    # the query to use indexes, while the HAVING ... @> does the final filtering.
    return Candidate.query.\
        join(skill_candidate).\
        filter(skill_candidate.c.skill_id.in_(skill_ids)).\
        group_by(Candidate.id).\
        having(array_agg(skill_candidate.c.skill_id).contains(skill_ids)).\
        all()
This is somewhat equivalent with the partly Python solution from the other answer.
Also, the aggregate EVERY could be used:
def get_skilled_candidates(skill_ids):
    # Form a temporary derived table using unions
    skills = db.union_all(*[
        db.select([db.literal(sid).label('skill_id')])
        for sid in skill_ids]).alias()
    # Perform a CROSS JOIN between candidate and skills
    return Candidate.query.\
        join(skills, db.true()).\
        group_by(Candidate.id).\
        having(db.func.every(
            db.exists().
                where(db.and_(skill_candidate.c.skill_id == skills.c.skill_id,
                              skill_candidate.c.candidate_id == Candidate.id)).
                correlate_except(skill_candidate))).\
        all()
回答2:
You could query all candidates with any of the skills in your list and then filter the result with a list comprehension. This may not be as performant as the relational division approach mentioned by @IljaEverilä, but it certainly simplifies the query aspect.
skill_ids = ['id_1', 'id_2']
candidates = session.query(Candidate).\
    filter(Candidate.skills.any(Skill.id.in_(skill_ids)).\
    all()
candidates = [
    c for c in candidates
    if set(s.id for s in c.skills).issuperset(skill_ids)
]
来源:https://stackoverflow.com/questions/58080691/sqlalchemy-filter-on-list-attribute