distinct-on

PostgreSQL: Grouping then filtering table, with condition for nonexistence

こ雲淡風輕ζ submitted on 2020-01-16 04:15:07
Question: In PostgreSQL, I have a table that, abstractly, looks like this:
╔═══╦═══╦═══╦═══╗
║ A ║ B ║ C ║ D ║
╠═══╬═══╬═══╬═══╣
║ x ║ 0 ║ y ║ 0 ║
║ x ║ 0 ║ x ║ 1 ║
║ x ║ 1 ║ y ║ 0 ║
║ x ║ 1 ║ z ║ 1 ║
║ y ║ 0 ║ z ║ 0 ║
║ y ║ 0 ║ x ║ 0 ║
║ y ║ 1 ║ y ║ 0 ║
╚═══╩═══╩═══╩═══╝
I want to transform it in a query into this:
╔═══╦═══╦══════╗
║ A ║ B ║ D    ║
╠═══╬═══╬══════╣
║ x ║ 0 ║ 1    ║
║ x ║ 1 ║ null ║
║ y ║ 0 ║ null ║
║ y ║ 1 ║ 0    ║
╚═══╩═══╩══════╝
…such that: The input table’s rows are grouped by A and B, and

Join on multiple tables using distinct on

左心房为你撑大大i submitted on 2020-01-06 11:48:36
Question:
create table emp (
    emp_id serial primary key,
    emp_no integer,
    emp_ref_no character varying(15),
    emp_class character varying(15)
);
create table emp_detail (
    emp_detail_id serial primary key,
    emp_id integer,
    class_no integer,
    created_at timestamp without time zone,
    constraint con_fk foreign key(emp_id) references emp(emp_id)
);
create table class_detail (
    class_id serial primary key,
    emp_id integer,
    class_no integer,
    col1 JSONB,
    created_at timestamp without time zone default now(),
    constraint

distinct() function (not select qualifier) in postgres

三世轮回 submitted on 2019-12-23 07:58:18
Question: I just came across a SQL query, specifically against a Postgres database, that uses a function named "distinct". Namely:
    select distinct(pattern) as pattern, style, ... etc ...
    from styleview
    where ... etc ...
Note this is NOT the ordinary DISTINCT qualifier on a SELECT -- at least it's not the normal syntax for the DISTINCT qualifier; note the parentheses. It is apparently using DISTINCT as a function, or maybe this is some special syntax. Any idea what this means? I tried playing with it a
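For context on the question above: PostgreSQL has no `distinct()` function. The parentheses only wrap the first output expression, and the `DISTINCT` qualifier still applies to the whole select list. The sketch below shows both spellings returning the same rows. It uses Python's built-in `sqlite3` purely so it runs without a server (SQLite parses this form the same way), and the `styleview` rows are invented for illustration.

```python
import sqlite3

# In-memory database; the table contents are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE styleview (pattern TEXT, style TEXT)")
conn.executemany(
    "INSERT INTO styleview VALUES (?, ?)",
    [("p1", "bold"), ("p1", "bold"), ("p1", "thin"), ("p2", "bold")],
)

# "distinct(pattern)" is DISTINCT followed by a parenthesized expression.
with_parens = conn.execute(
    "SELECT DISTINCT(pattern) AS pattern, style "
    "FROM styleview ORDER BY pattern, style"
).fetchall()

# The ordinary spelling: DISTINCT over the (pattern, style) pair.
without_parens = conn.execute(
    "SELECT DISTINCT pattern, style "
    "FROM styleview ORDER BY pattern, style"
).fetchall()

print(with_parens)
print(without_parens)
```

Note that `p1` still appears twice in both results, because de-duplication is over the full `(pattern, style)` row rather than over `pattern` alone.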

How to order distinct tuples in a PostgreSQL query

那年仲夏 submitted on 2019-12-07 11:03:49
Question: I'm trying to submit a query in Postgres that only returns distinct tuples. In my sample query, I do not want duplicate entries where an entry exists multiple times for a cluster_id/feed_id combination. If I do a simple:
    select distinct on (cluster_info.cluster_id, feed_id) cluster_info.cluster_id, num_docs, feed_id, url_time
    from url_info
    join cluster_info on (cluster_info.cluster_id = url_info.cluster_id)
    where feed_id in (select pot_seeder from potentials)
      and num_docs > 5
      and url_time >

How do I take a DISTINCT ON subquery that is ordered by a separate column, and make it fast?

时光毁灭记忆、已成空白 submitted on 2019-12-06 07:49:42
Question: (AKA: with a query and data very similar to the question "Selecting rows ordered by some column and distinct on another", how can I get it to run fast?) Postgres 11. I have a table prediction with (article_id, prediction_date, predicted_as, article_published_date) that represents the output from a classifier over a set of articles. New articles are frequently added to a separate table (represented by the FK article_id), and new predictions are added as we tune our classifier. Sample data:
| id |

How to order distinct tuples in a PostgreSQL query

China☆狼群 submitted on 2019-12-05 15:10:54
I'm trying to submit a query in Postgres that only returns distinct tuples. In my sample query, I do not want duplicate entries where an entry exists multiple times for a cluster_id/feed_id combination. If I do a simple:
    select distinct on (cluster_info.cluster_id, feed_id) cluster_info.cluster_id, num_docs, feed_id, url_time
    from url_info
    join cluster_info on (cluster_info.cluster_id = url_info.cluster_id)
    where feed_id in (select pot_seeder from potentials)
      and num_docs > 5
      and url_time > '2012-04-16';
I get just that, but I'd also like to group according to num_docs. So, when I do the

How do I take a DISTINCT ON subquery that is ordered by a separate column, and make it fast?

萝らか妹 submitted on 2019-12-04 14:35:38
(AKA: with a query and data very similar to the question "Selecting rows ordered by some column and distinct on another", how can I get it to run fast?) Postgres 11. I have a table prediction with (article_id, prediction_date, predicted_as, article_published_date) that represents the output from a classifier over a set of articles. New articles are frequently added to a separate table (represented by the FK article_id), and new predictions are added as we tune our classifier. Sample data:
| id      | article_id | predicted_as | prediction_date | article_published_date |
| 1009381 | 362718     | negative     |

Selecting rows ordered by some column and distinct on another

南笙酒味 submitted on 2019-11-27 20:09:58
Related to: PostgreSQL DISTINCT ON with different ORDER BY
I have a table purchases (product_id, purchased_at, address_id). Sample data:
| id | product_id | purchased_at      | address_id |
| 1  | 2          | 20 Mar 2012 21:01 | 1          |
| 2  | 2          | 20 Mar 2012 21:33 | 1          |
| 3  | 2          | 20 Mar 2012 21:39 | 2          |
| 4  | 2          | 20 Mar 2012 21:48 | 2          |
The result I expect is the most recently purchased product (full row) for each address_id, and that result must be sorted in descending order by the purchased_at field:
| id | product_id | purchased_at      | address_id |
| 4  | 2          | 20 Mar 2012 21:48 | 2          |
| 2  | 2          | 20 Mar 2012 21:33 | 1          |
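One way to get the expected result above, sketched under assumptions: number the rows per address_id with a window function, keep the newest row of each group, then sort the survivors by purchased_at. This swaps in the portable ROW_NUMBER() form rather than DISTINCT ON so that it can run under Python's built-in `sqlite3` (SQLite 3.25 or later); in PostgreSQL the same shape can also be had by wrapping a DISTINCT ON query in a subquery and sorting in the outer query. Timestamps are rewritten as ISO strings here so that text ordering matches time ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases ("
    "id INTEGER PRIMARY KEY, product_id INTEGER, "
    "purchased_at TEXT, address_id INTEGER)"
)
# The four sample rows from the question, with ISO-formatted timestamps.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, 2, "2012-03-20 21:01", 1),
        (2, 2, "2012-03-20 21:33", 1),
        (3, 2, "2012-03-20 21:39", 2),
        (4, 2, "2012-03-20 21:48", 2),
    ],
)

# Inner query: rank rows within each address, newest first.
# Outer query: keep rank 1 per address, then sort by time only.
query = """
SELECT id, product_id, purchased_at, address_id
FROM (
    SELECT p.*,
           ROW_NUMBER() OVER (
               PARTITION BY address_id
               ORDER BY purchased_at DESC
           ) AS rn
    FROM purchases AS p
) AS latest
WHERE rn = 1
ORDER BY purchased_at DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this yields the rows with id 4 and id 2, in that order, matching the expected table.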

Selecting rows ordered by some column and distinct on another

佐手、 submitted on 2019-11-26 20:10:54
Question: Related to: PostgreSQL DISTINCT ON with different ORDER BY
I have a table purchases (product_id, purchased_at, address_id). Sample data:
| id | product_id | purchased_at      | address_id |
| 1  | 2          | 20 Mar 2012 21:01 | 1          |
| 2  | 2          | 20 Mar 2012 21:33 | 1          |
| 3  | 2          | 20 Mar 2012 21:39 | 2          |
| 4  | 2          | 20 Mar 2012 21:48 | 2          |
The result I expect is the most recently purchased product (full row) for each address_id, and that result must be sorted in descending order by the purchased_at field:
| id |

PostgreSQL DISTINCT ON with different ORDER BY

大城市里の小女人 submitted on 2019-11-26 03:28:09
I want to run this query:
    SELECT DISTINCT ON (address_id) purchases.address_id, purchases.*
    FROM purchases
    WHERE purchases.product_id = 1
    ORDER BY purchases.purchased_at DESC
But I get this error:
    PG::Error: ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY expressions
Adding address_id as the first ORDER BY expression silences the error, but I really don't want to add sorting over address_id. Is it possible to do this without ordering by address_id? The documentation says: DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate
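The rule in the error message follows from the documented behaviour quoted above: the "first row of each set" is only well defined if the sort places rows with equal DISTINCT ON expressions next to each other, so those expressions have to lead the ORDER BY. A rough Python model of that behaviour is sketched below. It is not how PostgreSQL implements it; `distinct_on` is a hypothetical helper and the rows are illustrative.

```python
from itertools import groupby
from operator import itemgetter


def distinct_on(rows, key, order_by, descending=False):
    """Model of SELECT DISTINCT ON (key) ... ORDER BY key, order_by [DESC]."""
    # Sort by the secondary column first, then stably by the key, so rows
    # sharing a key end up adjacent and ordered by the secondary column.
    ordered = sorted(rows, key=itemgetter(order_by), reverse=descending)
    ordered = sorted(ordered, key=itemgetter(key))
    # Keep only the first row of each run of equal keys.
    return [next(group) for _, group in groupby(ordered, key=itemgetter(key))]


# Hypothetical rows shaped like the purchases table.
purchases = [
    {"id": 1, "address_id": 1, "purchased_at": "2012-03-20 21:01"},
    {"id": 2, "address_id": 1, "purchased_at": "2012-03-20 21:33"},
    {"id": 3, "address_id": 2, "purchased_at": "2012-03-20 21:39"},
    {"id": 4, "address_id": 2, "purchased_at": "2012-03-20 21:48"},
]

# Latest purchase per address; the result comes back grouped by address_id.
latest = distinct_on(purchases, "address_id", "purchased_at", descending=True)

# A separate, later sort gives the order the question asks for. In SQL this
# corresponds to sorting in an outer query around the DISTINCT ON subquery.
by_time = sorted(latest, key=itemgetter("purchased_at"), reverse=True)
print([row["id"] for row in by_time])
```

The design point is that picking one row per group and choosing the final presentation order are two separate steps, which is why a single ORDER BY cannot serve both when the two orderings differ.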