postgresql-performance

How can I speed up this PostgreSQL UPDATE FROM SQL query? It currently takes days to finish running

只谈情不闲聊 submitted 2021-02-10 14:55:18

Question: How can I speed up the PostgreSQL UPDATE FROM SQL query below? It currently takes days to finish running.

```sql
UPDATE import_parts ip
SET    part_part_id = pp.id
FROM   parts.part_parts pp
WHERE  pp.upc = ip.upc
  AND  (ip.status IS NULL OR ip.status != '6');
```

And why does it take days to run in the first place? Most of the time I kill the query manually because it runs too long, more than 24 hours. The last time it finished successfully, it took almost 38 hours. The import_parts table has …
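A common first step (a sketch, assuming the table and column names from the question and that upc is not yet indexed) is to index the join key on both sides and to skip rows that already hold the right id, so re-runs do no useless writes:

```sql
-- Hypothetical index names; without an index on upc, the join
-- can degenerate into repeated scans of parts.part_parts.
CREATE INDEX IF NOT EXISTS part_parts_upc_idx   ON parts.part_parts (upc);
CREATE INDEX IF NOT EXISTS import_parts_upc_idx ON import_parts (upc);

UPDATE import_parts ip
SET    part_part_id = pp.id
FROM   parts.part_parts pp
WHERE  pp.upc = ip.upc
  AND  (ip.status IS NULL OR ip.status != '6')
  AND  ip.part_part_id IS DISTINCT FROM pp.id;  -- avoid rewriting unchanged rows
```

Each avoided rewrite also spares a dead tuple, which keeps table and index bloat down during a long-running update.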

Count on join of big tables with conditions is slow

孤街浪徒 submitted 2021-02-10 06:11:54

Question: This query had reasonable times when the table was small. I'm trying to identify the bottleneck, but I'm not sure how to analyze the EXPLAIN results.

```sql
SELECT COUNT(*)
FROM   performance_analyses
INNER  JOIN total_sales ON total_sales.id = performance_analyses.total_sales_id
WHERE  (size > 0)
  AND  total_sales.customer_id IN (
         SELECT customers.id
         FROM   customers
         WHERE  customers.active = 't'
           AND  customers.visible = 't'
           AND  customers.organization_id = 3
       )
  AND  total_sales.product_category_id IN (…
```
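Before restructuring anything, EXPLAIN (ANALYZE, BUFFERS) shows which plan node dominates; rewriting the IN subquery as a plain join often gives the planner more freedom, provided the join and filter columns are indexed (a sketch using the question's identifiers; the index names are assumptions):

```sql
-- Hypothetical supporting indexes for the join and filter columns.
CREATE INDEX IF NOT EXISTS ts_customer_idx ON total_sales (customer_id);
CREATE INDEX IF NOT EXISTS pa_ts_idx       ON performance_analyses (total_sales_id);

EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
FROM   performance_analyses pa
JOIN   total_sales ts ON ts.id = pa.total_sales_id
JOIN   customers   c  ON c.id  = ts.customer_id
WHERE  size > 0                 -- unqualified in the original snippet
  AND  c.active = 't'
  AND  c.visible = 't'
  AND  c.organization_id = 3;
```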

Postgresql 9.4 query gets progressively slower when joining TSTZRANGE with &&

感情迁移 submitted 2021-02-07 12:19:45

Question: I am running a query that gets progressively slower as records are added. Records are added continuously via an automated process (bash calling psql). I would like to correct this bottleneck; however, I don't know what my best option is. This is the output from pgBadger:

```
Hour   Count   Duration  Avg duration
00      9,990  10m3s      60ms   <--- ignore this hour
02          1  60ms       60ms   <--- ignore this hour
03      4,638  1m54s      24ms   <--- queries begin with table empty
04     30,991  55m49s    108ms   <--- first full hour of queries
```
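Progressive slowdown on a range-overlap (&&) predicate usually means the table is being scanned sequentially as it grows, since a plain b-tree cannot serve &&. A GiST index on the range column is the standard fix (a sketch; the question does not show its schema, so the table and column names below are hypothetical):

```sql
-- Hypothetical schema: a table with a tstzrange column "period".
CREATE TABLE IF NOT EXISTS events (
    id     bigserial PRIMARY KEY,
    period tstzrange NOT NULL
);

-- GiST supports the && (overlaps) operator, so this query stays
-- indexed as rows accumulate instead of degrading to a seq scan.
CREATE INDEX IF NOT EXISTS events_period_gist ON events USING gist (period);

SELECT id
FROM   events
WHERE  period && tstzrange('2021-02-07 00:00+00', '2021-02-07 01:00+00');
```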

Best performance in sampling repeated value from a grouped column

人走茶凉 submitted 2021-02-06 15:15:44

Question: This question is about the functionality of first_value(), using another function or a workaround. It is also about a "little gain in performance" on big tables. Using e.g. max() in the context explained below demands spurious comparisons: even if fast, it imposes some additional cost. This typical query

```sql
SELECT x, y, count(*) AS n
FROM   t
GROUP  BY x, y;
```

needs to repeat all columns in the GROUP BY to return more than one column. A syntactic sugar for this is to use positional references:

```sql
SELECT x, …
```
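The positional-reference sugar the question mentions, plus two ways of sampling a value that is constant within each group, can look like this (a sketch on the question's table t; column z is a hypothetical column whose repeated value we want to sample):

```sql
-- Positional references instead of repeating the output columns:
SELECT x, y, count(*) AS n
FROM   t
GROUP  BY 1, 2;

-- Sampling a repeated value per group: min() works but, as the question
-- notes, pays for comparisons. When only the sample is needed, a
-- DISTINCT ON scan backed by an index on (x, y) can be cheaper.
SELECT DISTINCT ON (x, y) x, y, z AS sample_z
FROM   t
ORDER  BY x, y;
```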

Compound index with three keys, what happens if I query skipping the middle one?

和自甴很熟 submitted 2020-08-07 06:55:26

Question: With PostgreSQL, I want to use a compound index on three columns A, B, C. B is the created_at datetime, and occasionally I might query without B. What happens if I create a compound index on (A, B, C) but then query with conditions on A and C, but not B? (That is, A and C, but over all time, not just some specific time range?) Is Postgres smart enough to still use the (A, B, C) compound index but just skip B?

Answer 1: Postgres can use non-leading columns in a b-tree index, but in a far less …
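The answer's point is easy to observe with EXPLAIN (a sketch; the table and index names are hypothetical stand-ins for the question's A, B, C):

```sql
CREATE TABLE IF NOT EXISTS demo (
    a int,
    b timestamptz,   -- the created_at middle key
    c int
);
CREATE INDEX IF NOT EXISTS demo_abc_idx ON demo (a, b, c);

-- a = 1 bounds the index scan; c = 2 cannot bound it because b is
-- unconstrained, so c is only applied as an in-index filter while
-- scanning every b value under a = 1.
EXPLAIN SELECT * FROM demo WHERE a = 1 AND c = 2;
```

This is why the scan is cheaper than a heap filter but still worse than a query that also constrains b.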

How to force evaluation of subquery before joining / pushing down to foreign server

断了今生、忘了曾经 submitted 2020-05-15 09:30:29

Question: Suppose I want to query a big table with a few WHERE filters. I am using Postgres 11 and a foreign table; the foreign data wrapper (FDW) is clickhouse_fdw. But I am also interested in a general solution. I can do so as follows:

```sql
SELECT id, c1, c2, c3
FROM   big_table
WHERE  id = 3 AND c1 = 2;
```

My FDW is able to do the filtering on the remote foreign data source, ensuring that the above query is quick and doesn't pull down too much data. The above works the same if I write: SELECT id,c1,c2,c3 from big_table …
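On Postgres 11 (as in the question) a CTE is always materialized, which makes it an optimization fence: the inner query runs first (and its filters can be pushed to the foreign server), while the remaining filter is applied locally afterwards. A sketch on the question's big_table:

```sql
WITH remote AS (
    SELECT id, c1, c2, c3
    FROM   big_table
    WHERE  id = 3            -- eligible for pushdown through the FDW
)
SELECT *
FROM   remote
WHERE  c1 = 2;               -- evaluated locally, after the fetch

-- From Postgres 12 on, CTEs may be inlined, so the fence must be
-- requested explicitly: WITH remote AS MATERIALIZED (...) SELECT ...
```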

Best way to get distinct count from a query joining two tables (multiple join possibilities)

巧了我就是萌 submitted 2020-04-18 05:48:05

Question: I have 2 tables, Actions and Users. Actions → Users is a many-to-one association.

Table Actions (thousands of rows): id, uuid, name, type, created_by, org_id
Table Users (at most a hundred rows): id, username, org_id, org_name

I am trying to find the best join query to obtain a count with a WHERE clause. I need the count of distinct created_by values from table Actions whose org_name in table Users contains 'myorg'. Also, Actions.created_by = Users.username. I currently have the below …
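A direct way to express this count (a sketch from the question's description; the "contains 'myorg'" match is assumed to be a substring test) is a semi-join, which avoids join duplicates inflating the rows scanned:

```sql
SELECT COUNT(DISTINCT a.created_by) AS distinct_creators
FROM   Actions a
WHERE  EXISTS (
    SELECT 1
    FROM   Users u
    WHERE  u.username = a.created_by
      AND  u.org_name LIKE '%myorg%'   -- "contains 'myorg'"
);
```

Since Users holds at most ~100 rows, each EXISTS probe is cheap; an index on Actions (created_by) can further help both the probe and the distinct count.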

Improve PostgreSQL aggregation query performance

╄→гoц情女王★ submitted 2020-02-23 10:10:22

Question: I am aggregating data from a Postgres table; the query takes approx 2 seconds, which I want to reduce to less than a second. Please find the execution details below.

Query:

```sql
SELECT a.search_keyword,
       hll_cardinality(hll_union_agg(a.users))::int    AS user_count,
       hll_cardinality(hll_union_agg(a.sessions))::int AS session_count,
       sum(a.total)                                    AS keyword_count
FROM   rollup_day a
WHERE  a.created_date BETWEEN '2018-09-01' AND '2019-09-30'
  AND  a.tenant_id = '62850a62-19ac-477d-9cd7-837f3d716885'
…
```
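With a query of this shape, one common lever (a sketch; the index name is an assumption, and the hll functions come from the postgresql-hll extension the query already uses) is a composite index whose leading column matches the equality filter:

```sql
-- tenant_id (equality) first, created_date (range) second, so the
-- scan touches only one tenant's rows inside the date window.
CREATE INDEX IF NOT EXISTS rollup_day_tenant_date_idx
    ON rollup_day (tenant_id, created_date);
```

If EXPLAIN ANALYZE shows most time in the aggregation rather than the scan, a coarser rollup (e.g. pre-unioned monthly HLL sketches) reduces the number of sketches merged per keyword.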
