Postgres using an index for one table but not another

误落风尘 2020-12-22 06:53

I have three tables in my app, call them tableA, tableB, and tableC. tableA has fields for tableB_id and

2 answers
  •  春和景丽
    2020-12-22 07:31

    For starters, your LEFT JOIN is counteracted by the WHERE predicate on the joined table (tableB, the right side of the join) and is forced to act like an [INNER] JOIN. Replace it with:

    SELECT *
    FROM   tableA a
    JOIN   tableB b ON b.id = a.tableB_id
    WHERE  lower(b.foo) = lower(my_input);
    

    Or, if you actually want the LEFT JOIN to include all rows from tableA:

    SELECT *
    FROM   tableA a
    LEFT   JOIN tableB b ON b.id = a.tableB_id
                        AND lower(b.foo) = lower(my_input);
    

    I think you want the first one.
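
    To make the difference between the two variants concrete, here is a minimal sketch. The tables demo_a and demo_b and their rows are hypothetical stand-ins for tableA and tableB, not the schema from the question:

    ```sql
    -- Hypothetical miniature schema, for illustration only.
    CREATE TEMP TABLE demo_b (id int PRIMARY KEY, foo text);
    CREATE TEMP TABLE demo_a (id int PRIMARY KEY, demo_b_id int REFERENCES demo_b);

    INSERT INTO demo_b VALUES (1, 'Match'), (2, 'other');
    INSERT INTO demo_a VALUES (10, 1), (11, 2), (12, NULL);

    -- Filter in WHERE: rows where b.foo is NULL or different are discarded
    -- after the join, so this behaves like an inner join (1 row: id 10).
    SELECT a.id, b.foo
    FROM   demo_a a
    LEFT   JOIN demo_b b ON b.id = a.demo_b_id
    WHERE  lower(b.foo) = lower('MATCH');

    -- Filter in ON: every demo_a row is kept (3 rows: ids 10, 11, 12);
    -- b.foo is NULL wherever the join condition does not match.
    SELECT a.id, b.foo
    FROM   demo_a a
    LEFT   JOIN demo_b b ON b.id = a.demo_b_id
                        AND lower(b.foo) = lower('MATCH');
    ```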

    An index on (lower(foo::text)) like you posted is syntactically invalid. You had better post the verbatim output of \d tbl in psql, as I commented repeatedly. The shorthand syntax for a cast (foo::text) in an index definition needs an extra set of parentheses; alternatively, use the standard syntax cast(foo AS text):

    • Create index on first 3 characters (area code) of phone field?
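
    The parenthesization rule can be sketched as follows. The table demo_tbl and the index names are hypothetical, chosen only for this illustration:

    ```sql
    -- Hypothetical table, for illustration only.
    CREATE TEMP TABLE demo_tbl (foo varchar(255));

    -- A bare shorthand cast is an expression, so it needs its own
    -- set of parentheses inside the column list.
    CREATE INDEX demo_tbl_foo_text_idx ON demo_tbl ((foo::text));

    -- The standard cast syntax is accepted without the extra parentheses.
    CREATE INDEX demo_tbl_foo_cast_idx ON demo_tbl (cast(foo AS text));

    -- No explicit cast at all: index the expression on the column as declared.
    CREATE INDEX demo_tbl_lower_foo_idx ON demo_tbl (lower(foo));
    ```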

    But the cast is also unnecessary. You can index foo in its own data type (character varying(255)). Of course, the data type character varying(255) rarely makes sense in Postgres to begin with. The odd limit of 255 characters is inherited from limitations in other RDBMS that do not apply in Postgres. Details:

    • Refactor foreign key to fields

    Be that as it may, the perfect index for this kind of query would be a multicolumn index on tableB - if (and only if) you get index-only scans out of this:

    CREATE INDEX "tableB_lower_foo_id" ON tableB (lower(foo), id);
    

    You can then drop the mostly superseded index "index_tableB_on_lower_foo". Same for tableC.
    The rest is covered by the (more important!) indices in table A on tableB_id and tableC_id.
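
    Whether the planner actually picks an index-only scan can be checked with EXPLAIN. The sketch below uses hypothetical stand-in tables (chk_a, chk_b) with made-up data; the node type reported for chk_b in the plan shows which kind of scan was chosen. The plan you get depends on your Postgres version, data distribution, and statistics, so treat this as a way to check, not as a promise of a particular plan:

    ```sql
    -- Hypothetical stand-ins for tableB and tableA, with made-up data.
    CREATE TEMP TABLE chk_b (id int PRIMARY KEY, foo varchar(255));
    CREATE TEMP TABLE chk_a (id int PRIMARY KEY, chk_b_id int REFERENCES chk_b);

    INSERT INTO chk_b SELECT g, 'Foo' || g::text FROM generate_series(1, 10000) g;
    INSERT INTO chk_a SELECT g, (g % 10000) + 1  FROM generate_series(1, 50000) g;

    CREATE INDEX chk_b_lower_foo_id ON chk_b (lower(foo), id);
    CREATE INDEX chk_a_on_chk_b_id  ON chk_a (chk_b_id);

    -- Index-only scans rely on the visibility map, which VACUUM maintains;
    -- ANALYZE refreshes the planner statistics.
    VACUUM ANALYZE chk_b;
    VACUUM ANALYZE chk_a;

    -- Select only columns of chk_a, so that chk_b contributes nothing
    -- beyond the filter and the join column.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT a.*
    FROM   chk_a a
    JOIN   chk_b b ON b.id = a.chk_b_id
    WHERE  lower(b.foo) = lower('FOO42');
    ```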

    If there are multiple rows in tableA per tableB_id / tableC_id, then either one of these competing commands can swing the performance to favor the respective query by physically clustering related rows together:

    CLUSTER tableA USING "index_tableA_on_tableB_id";
    CLUSTER tableA USING "index_tableA_on_tableC_id";
    

    You can't have both. It's either B or C. CLUSTER also does everything a VACUUM FULL would do. But be sure to read the details first:

    • Optimize Postgres timestamp query range

    And don't use mixed case identifiers, sometimes quoted, sometimes not. This is very confusing and is bound to lead to errors. Use legal, lower-case identifiers exclusively - then it doesn't matter if you double-quote them or not.
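
    A short sketch of why mixed-case identifiers cause trouble. The table names here are hypothetical:

    ```sql
    -- Quoted identifiers keep their case; unquoted ones are folded to lower case.
    CREATE TEMP TABLE "DemoMixed" (id int);

    SELECT * FROM "DemoMixed";    -- works: exact, quoted name
    -- SELECT * FROM DemoMixed;   -- would fail: folded to demomixed, which does not exist

    -- With an all-lower-case name, every spelling resolves to the same table.
    CREATE TEMP TABLE demo_lower (id int);

    SELECT * FROM demo_lower;
    SELECT * FROM "demo_lower";
    SELECT * FROM DEMO_LOWER;     -- folded to demo_lower
    ```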
