What is the most performant way to rewrite a large IN clause?

Submitted by 感情迁移 on 2019-12-23 20:07:45

Question


I wrote an API using Go and GORM that runs calculations on our database and returns the results.

I just hit the parameter limit for an IN condition when using an aggregate. Example query:

SELECT SUM(total_amount) from Table where user_id in(...70k parameters) group by user_id

One of my current edge cases has > 65535 user ids so my Postgres client is throwing an error:

got 66037 parameters but PostgreSQL only supports 65535 parameters

I'm not sure what the best approach is. I need one that handles the large number of parameters in this edge case without affecting my typical use case. Should I chunk the ids and iterate through multiple queries, storing the results in memory until I have all the data I need? Or use ANY(VALUES)...?

As the query probably shows, I have very limited knowledge of Postgres, so any help would be greatly appreciated.


Answer 1:


You can replace user_id IN (value [, ...]) with one of:

user_id IN (subquery)
user_id = ANY (subquery)
user_id = ANY (array expression)

Neither subqueries nor arrays are subject to that limit: an array is passed as a single value, however many elements it holds. The shortest input syntax would be:

user_id = ANY ('{1,2,3}'::int[])  -- make array type match type of user_id
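Since the question uses Go, here is a minimal sketch of how the ids could be turned into such an array literal on the client side and sent as one bind parameter. The helper name int64ArrayLiteral, the query text, and the bigint[] cast are assumptions for illustration, not something from the question; adjust the cast to the actual type of user_id.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// int64ArrayLiteral (hypothetical helper) renders ids as a PostgreSQL
// array literal such as {1,2,3}. Integers need no quoting or escaping
// inside an array literal, so plain formatting is enough.
func int64ArrayLiteral(ids []int64) string {
	var b strings.Builder
	b.WriteByte('{')
	for i, id := range ids {
		if i > 0 {
			b.WriteByte(',')
		}
		b.WriteString(strconv.FormatInt(id, 10))
	}
	b.WriteByte('}')
	return b.String()
}

func main() {
	// Assumed query text: one placeholder, regardless of how many ids there are.
	const query = `SELECT user_id, SUM(total_amount)
FROM   tbl
WHERE  user_id = ANY ($1::bigint[])
GROUP  BY user_id`

	arg := int64ArrayLiteral([]int64{1, 2, 3})
	fmt.Println(query)
	fmt.Println(arg)
}
```

The string returned by the helper would be passed as the single argument for $1 through whatever driver call is already in use, so the statement carries one parameter whether the slice holds 3 ids or 70k.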

Details and more options:

  • How to use ANY instead of IN in a WHERE clause with Rails?

Alternatively, create a (temporary) table tmp_usr(user_id int) and import the ids into it. For very big sets, SQL COPY or psql \copy performs better than INSERT. Then join to the table like:

SELECT SUM(total_amount)
FROM   tbl
JOIN   tmp_usr USING (user_id)
GROUP  BY user_id;
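For the temporary-table route, the ids first have to be written in a form COPY can read. In COPY's default text format, a single integer column is just one value per line. A minimal Go sketch of producing that input follows; the helper name copyText and the file name used afterwards are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// copyText (hypothetical helper) renders ids in COPY text format for a
// table with a single integer column: one id per line.
func copyText(ids []int64) string {
	var b strings.Builder
	for _, id := range ids {
		b.WriteString(strconv.FormatInt(id, 10))
		b.WriteByte('\n')
	}
	return b.String()
}

func main() {
	// Print the rows; redirect stdout to a file to load it later.
	fmt.Print(copyText([]int64{1, 2, 3}))
}
```

If the output is saved to a file, say ids.txt, it could be loaded from psql with \copy tmp_usr FROM 'ids.txt' before running the join.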

BTW, GROUP BY user_id without including user_id in the SELECT list looks suspicious: the sums come back without any indication of which user each belongs to. It may just be a simplified example query.



Source: https://stackoverflow.com/questions/52712022/what-is-the-most-performant-way-to-rewrite-a-large-in-clause
