SQLAlchemy: How to use group_by() correctly (only_full_group_by)?

问题

I'm trying to use the group_by() function of SQLAlchemy with the mysql+mysqlconnector engine:

rows = session.query(MyModel) \
        .order_by(MyModel.published_date.desc()) \
        .group_by(MyModel.category_id) \
        .all()

It works fine with SQLite, but for MySQL I get this error:

[42000][1055] Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column '...' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by

I know how to solve it in plain SQL, but I'd like to use the advantages of SQLAlchemy.

What's the proper solution with SQLAlchemy?

Thanks in advance

回答1:

One way to form the greatest-n-per-group query with well defined behaviour would be to use a LEFT JOIN, looking for MyModel rows per category_id that have no matching row with greater published_date:

my_model_alias = aliased(MyModel)

rows = session.query(MyModel).\
    outerjoin(my_model_alias,
              and_(my_model_alias.category_id == MyModel.category_id,
                   my_model_alias.published_date > MyModel.published_date)).\
    filter(my_model_alias.id == None).\
    all()

This will work in about any SQL DBMS. In SQLite 3.25.0 and MySQL 8 (and many others) you could use window functions to achieve the same:

sq = session.query(
        MyModel,
        func.row_number().
            over(partition_by=MyModel.category_id,
                 order_by=MyModel.published_date.desc()).label('rn')).\
    subquery()

my_model_alias = aliased(MyModel, sq)

rows = session.query(my_model_alias).\
    filter(sq.c.rn == 1).\
    all()

Of course you could use GROUP BY as well, if you then use the results in a join:

max_pub_dates = session.query(
        MyModel.category_id,
        func.max(MyModel.published_date).label('published_date')).\
    group_by(MyModel.category_id).\
    subquery()

rows = session.query(MyModel).\
    join(max_pub_dates,
         and_(max_pub_dates.category_id == MyModel.category_id,
              max_pub_dates.published_date == MyModel.published_date)).\
    all()

来源：https://stackoverflow.com/questions/58620448/sqlalchemy-how-to-use-group-by-correctly-only-full-group-by

标签

python

mysql

sql

sqlalchemy

greatest-n-per-group