What is the correct way to use distinct on (Postgres) with SqlAlchemy?

Posted by 点点圈 on 2021-02-18 16:56:03

Question


I want to get all the columns of a table for the rows with max(timestamp), grouped by name.

What I have tried so far is: normal_query = "Select max(timestamp) as time from table"

event_list = normal_query \
            .distinct(Table.name)\
            .filter_by(**filter_by_query) \
            .filter(*queries) \
            .group_by(*group_by_fields) \
            .order_by('').all()

The query I get:

SELECT  DISTINCT ON (schema.table.name) , max(timestamp)....

This query basically returns two columns, name and timestamp.

Whereas the query I want is:

SELECT DISTINCT ON (schema.table.name) * from table order by ....

That query returns all the columns in the table, which is the expected behavior, and when I run it directly I do get all the columns. How can I write it in Python so that it produces this statement? Basically the asterisk is missing. Can somebody help me?


Answer 1:


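What you seem to be after is the DISTINCT ON ... ORDER BY idiom in PostgreSQL for selecting greatest-n-per-group results (N = 1). So instead of grouping and aggregating, just select the entity itself, make it distinct on name, and order by name and then by timestamp in descending order: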

event_list = Table.query.\
    distinct(Table.name).\
    filter_by(**filter_by_query).\
    filter(*queries).\
    order_by(Table.name, Table.timestamp.desc()).\
    all()

This will end up selecting one row per name (the rows are "grouped" by name), namely the one having the greatest timestamp value.
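To make the semantics concrete, here is a minimal pure-Python sketch of what DISTINCT ON (name) combined with ORDER BY name, timestamp DESC computes. It is only an emulation for illustration, not how PostgreSQL or SQLAlchemy implement it, and the sample rows and the `value` field are made up:

```python
from datetime import datetime
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows standing in for the table's contents.
rows = [
    {"name": "a", "timestamp": datetime(2021, 1, 1), "value": 1},
    {"name": "a", "timestamp": datetime(2021, 1, 3), "value": 2},
    {"name": "b", "timestamp": datetime(2021, 1, 2), "value": 3},
]

# ORDER BY name, timestamp DESC.
# Python's sort is stable, so sort by the secondary key first,
# then by the primary key.
ordered = sorted(rows, key=itemgetter("timestamp"), reverse=True)
ordered = sorted(ordered, key=itemgetter("name"))

# DISTINCT ON (name): keep only the first row of each run of equal names.
latest = [next(group) for _, group in groupby(ordered, key=itemgetter("name"))]

for row in latest:
    print(row["name"], row["timestamp"], row["value"])
```

Each surviving row is a complete row, with every column intact, which is why no GROUP BY or max() is needed.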

You do not want to use the asterisk most of the time, not in your application code anyway, unless you're doing manual ad-hoc queries. The asterisk is basically "all columns from the FROM table/relation", which might then break your assumptions later, if you add columns, reorder them, and such.
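As a rough illustration of why the asterisk is not needed, the sketch below compiles a DISTINCT ON statement against the PostgreSQL dialect without connecting to a database. The `Event` model and its `payload` column are hypothetical stand-ins for the question's `Table`, and the code assumes SQLAlchemy 1.4 or later (for `select()` with an ORM entity):

```python
from sqlalchemy import Column, DateTime, Integer, String, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Event(Base):
    # Hypothetical model; the real table and columns will differ.
    __tablename__ = "event"

    id = Column(Integer, primary_key=True)
    name = Column(String)
    timestamp = Column(DateTime)
    payload = Column(String)


# Selecting the entity makes SQLAlchemy list every mapped column explicitly.
stmt = (
    select(Event)
    .distinct(Event.name)
    .order_by(Event.name, Event.timestamp.desc())
)

# Render the statement for PostgreSQL; no database connection is involved.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

The printed statement should start with SELECT DISTINCT ON (event.name) followed by each mapped column by name, so all columns come back without relying on `*`.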

In case you'd like to order the resulting rows by timestamp in the final result, you can, for example, use Query.from_self() to turn the query into a subquery, and order in the enclosing query:

event_list = Table.query.\
    distinct(Table.name).\
    filter_by(**filter_by_query).\
    filter(*queries).\
    order_by(Table.name, Table.timestamp.desc()).\
    from_self().\
    order_by(Table.timestamp.desc()).\
    all()
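Note that Query.from_self() is deprecated as of SQLAlchemy 1.4. A possible equivalent in the newer style is to build the DISTINCT ON statement as an explicit subquery and map the entity onto it with aliased(). The sketch below is one way to do that under stated assumptions: the `Event` model is a hypothetical stand-in for `Table`, the question's filters are omitted, and the statement is only compiled, not executed:

```python
from sqlalchemy import Column, DateTime, Integer, String, select
from sqlalchemy.dialects import postgresql
from sqlalchemy.orm import aliased, declarative_base

Base = declarative_base()


class Event(Base):
    # Hypothetical model; the real table and columns will differ.
    __tablename__ = "event"

    id = Column(Integer, primary_key=True)
    name = Column(String)
    timestamp = Column(DateTime)
    payload = Column(String)


# Inner query: the latest row per name, via DISTINCT ON ... ORDER BY.
latest = (
    select(Event)
    .distinct(Event.name)
    .order_by(Event.name, Event.timestamp.desc())
    .subquery()
)

# Map the entity onto the subquery so the outer query still yields Event rows.
LatestEvent = aliased(Event, latest)

# Outer query: order the per-name winners by timestamp, newest first.
stmt = select(LatestEvent).order_by(LatestEvent.timestamp.desc())

sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

With a session, something like `session.execute(stmt).scalars().all()` would then return the entities in the outer ordering.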


Source: https://stackoverflow.com/questions/57253307/what-is-the-correct-way-to-use-distinct-on-postgres-with-sqlalchemy
