PostgreSQL Joining Between Two Values

渐次进展 2020-12-22 03:46

I have the following tables and am trying to look up county codes for a list of several hundred thousand cities.

create table counties (
  zip_code_from  cha         


        
1 Answer
  •  一整个雨季
    2020-12-22 04:03

    Months later, this has reared its head again, and I decided to test some of my theories.

    The original query:

    select
      ci.city, ci.zip_code, co.fips_code
    from
      cities ci
      join counties co on
        ci.zip_code between co.from_zip_code and co.thru_zip_code
    

    does in fact implement a Cartesian product. The query returns 34,000 rows and takes 597 seconds.

    If I "pre-explode" the zip code ranges into discrete records:

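    -- expand every (from_zip_code, thru_zip_code) range into one row per zip code
    with exploded_zip as (
      select
        -- note: the cast to int drops leading zeros, so a zip code such as
        -- '01234' comes back as '1234' unless it is padded again
        generate_series (
          cast (from_zip_code as int),
          cast (thru_zip_code as int)
        )::text as zip_code,
        *
      from counties
    )
    -- the join is now a plain equality on zip_code
    select
      ci.city, ci.zip_code, co.fips_code
    from
      cities ci
      join exploded_zip co on
        ci.zip_code = co.zip_code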
    

    The query returns the exact same rows but finishes in 2.8 seconds.

    So it seems the bottom line is that using a between (or any inequality) as the join condition is a really bad idea.
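
To make the difference between the two queries concrete, here is a minimal Python sketch of the two strategies. It is not what PostgreSQL runs internally; it only mirrors the shape of the work. All data values and the function names `range_join` and `exploded_join` are made up for illustration. The first function compares every city with every range, which is presumably close to what a join on an inequality forces, since hash and merge joins need an equality condition. The second expands the ranges once and then does an equality lookup. Unlike the SQL above, the sketch pads the expanded codes back to five digits, on the assumption that zip codes are stored as five-character text.

```python
# Toy data; every value here is hypothetical.
counties = [
    # (from_zip_code, thru_zip_code, fips_code)
    ("10001", "10005", "001"),
    ("10451", "10453", "002"),
    ("11201", "11203", "003"),
]
cities = [
    # (city, zip_code)
    ("Alpha", "10003"),
    ("Bravo", "10452"),
    ("Charlie", "11203"),
    ("Delta", "99999"),  # falls in no range, so an inner join drops it
]


def range_join(cities, counties):
    """Check every city against every range: len(cities) * len(counties) comparisons."""
    rows = []
    for city, zip_code in cities:
        for from_zip, thru_zip, fips in counties:
            # Text comparison, valid here because all codes are 5-digit strings.
            if from_zip <= zip_code <= thru_zip:
                rows.append((city, zip_code, fips))
    return rows


def exploded_join(cities, counties):
    """Expand each range into discrete zip codes once, then join on equality."""
    exploded = {}
    for from_zip, thru_zip, fips in counties:
        for z in range(int(from_zip), int(thru_zip) + 1):
            # zfill restores leading zeros lost by the int conversion.
            exploded.setdefault(str(z).zfill(5), []).append(fips)
    rows = []
    for city, zip_code in cities:
        for fips in exploded.get(zip_code, []):  # one dictionary lookup per city
            rows.append((city, zip_code, fips))
    return rows


# Both strategies produce the same rows; only the amount of work differs.
assert range_join(cities, counties) == exploded_join(cities, counties)
```

The trade-off is memory for time: the expanded table is larger than the list of ranges, but each city then costs one lookup instead of one comparison per county range.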
