PostgreSQL: NOT IN versus EXCEPT performance difference (edited #2)

萌比男神i 2020-12-15 04:01

I have two queries that are functionally identical. One of them performs very well, the other one performs very poorly. I do not see from where the performance difference ar

5 answers
  •  南方客 (original poster)
    2020-12-15 04:40

    If id and position_id are both indexed (either on their own or first column in a multi-column index), then two index scans are all that are necessary - it's a trivial sorted-merge based set algorithm.
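A minimal sketch of the sorted-merge idea described above, assuming both inputs are already sorted (as the output of an index scan would be) and contain no NULLs. The function name `sorted_difference` is hypothetical, and this illustrates the algorithm only; it is not how PostgreSQL implements either query.

```python
def sorted_difference(left, right):
    """Return the items of sorted `left` that do not appear in sorted `right`.

    Both inputs are walked once, so the cost is O(len(left) + len(right)).
    """
    result = []
    j = 0  # cursor into `right`; it only ever moves forward
    for item in left:
        # Skip everything in `right` that is smaller than the current item.
        while j < len(right) and right[j] < item:
            j += 1
        # Keep the item if `right` is exhausted or has no equal value here.
        if j == len(right) or right[j] != item:
            result.append(item)
    return result


ids = [1, 2, 3, 5, 8]          # stand-in for a sorted scan of `id`
position_ids = [2, 3, 4, 8]    # stand-in for a sorted scan of `position_id`
missing = sorted_difference(ids, position_ids)
```

Unlike EXCEPT, this sketch keeps duplicates from the left input; deduplicating would need one extra comparison against the last emitted value.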

    Personally, I think PostgreSQL's planner simply isn't smart enough to recognize that it could do this here.

    (I came to this question after diagnosing a query that had been running for over 24 hours, when I could get the same result with sort x y y | uniq -u on the command line in seconds. The database was less than 50MB when exported with pg_dump.)
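For readers unfamiliar with the shell trick, here is a rough Python equivalent of what `sort x y y | uniq -u` computes: the lines that occur exactly once when `x` is concatenated with two copies of `y`. The function name `sort_uniq_u` and the sample data are hypothetical, and Python's default string ordering may differ from the locale-dependent ordering of `sort`.

```python
from collections import Counter


def sort_uniq_u(x_lines, y_lines):
    """Mimic `sort x y y | uniq -u` on two lists of lines.

    Every line of y is counted at least twice, so it can never survive;
    a line of x survives only if it occurs once in x and never in y.
    """
    counts = Counter(x_lines + y_lines + y_lines)
    return sorted(line for line, n in counts.items() if n == 1)


x = ["1", "2", "3", "5"]   # e.g. exported id values, one per line
y = ["2", "3", "4"]        # e.g. exported position_id values
only_in_x = sort_uniq_u(x, y)
```

Note the edge case: a line that appears more than once in `x` is dropped as well, so the trick matches a set difference only when `x` has no duplicate lines.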

    PS: more interesting comment here:

    more work has been put into optimizing EXCEPT and NOT EXISTS than NOT IN, because the latter is substantially less useful due to its unintuitive but spec-mandated handling of NULLs. We're not going to apologize for that, and we're not going to regard it as a bug.

    What it comes down to is that EXCEPT differs from NOT IN in how NULLs are handled, so the two queries are not strictly interchangeable. I haven't looked up the details, but it means PostgreSQL (pointedly) doesn't put the same optimization effort into NOT IN.
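The NULL difference can be seen in a small experiment. The sketch below uses SQLite through Python's built-in `sqlite3` module as a stand-in, on the assumption that it follows the same standard NULL rules for these operators; it says nothing about PostgreSQL's planner or timings. The table names `positions` and `employees` are hypothetical, chosen only to match the `id` and `position_id` columns mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE positions (id INTEGER);
    CREATE TABLE employees (position_id INTEGER);
    INSERT INTO positions VALUES (1), (2), (3);
    INSERT INTO employees VALUES (1), (NULL);  -- one NULL in the subquery side
""")

# NOT IN: `2 NOT IN (1, NULL)` evaluates to NULL rather than TRUE,
# so a single NULL in the subquery filters out every candidate row.
not_in_rows = conn.execute(
    "SELECT id FROM positions "
    "WHERE id NOT IN (SELECT position_id FROM employees)"
).fetchall()

# EXCEPT: a set operation that simply removes matching rows.
except_rows = sorted(conn.execute(
    "SELECT id FROM positions EXCEPT SELECT position_id FROM employees"
).fetchall())

# NOT EXISTS: the comparison with NULL never matches, so rows are kept.
not_exists_rows = sorted(conn.execute(
    "SELECT id FROM positions p WHERE NOT EXISTS "
    "(SELECT 1 FROM employees e WHERE e.position_id = p.id)"
).fetchall())

conn.close()
```

With this data, NOT IN returns no rows at all, while EXCEPT and NOT EXISTS both return ids 2 and 3. This is why a planner cannot blindly rewrite NOT IN as an anti-join unless it can prove the subquery column is never NULL.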
