How do I speed up counting rows in a PostgreSQL table?

悲哀的现实 2020-12-02 18:37

We need to count the number of rows in a PostgreSQL table. In our case, no conditions need to be met, and it would be perfectly acceptable to get a row estimate if that sig

6 answers
  • 2020-12-02 19:01

    Count is slow for big tables, so you can get a close estimate this way:

    SELECT reltuples::bigint AS estimate 
    FROM pg_class 
    WHERE relname='tableName';
    

    It's extremely fast. The cast to bigint means the result is an integer rather than a float, but it is still only a close estimate.

    • reltuples is a column of pg_class. It holds the "number of rows in the table. This is only an estimate used by the planner. It is updated by VACUUM, ANALYZE, and a few DDL commands such as CREATE INDEX" (manual)
    • "The catalog pg_class catalogs tables and most everything else that has columns or is otherwise similar to a table. This includes indexes (but see also pg_index), sequences, views, composite types, and some kinds of special relation" (manual)
    • "Why is "SELECT count(*) FROM bigtable;" slow?" : http://wiki.postgresql.org/wiki/FAQ#Why_is_.22SELECT_count.28.2A.29_FROM_bigtable.3B.22_slow.3F
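
    Because reltuples is only refreshed by VACUUM, ANALYZE and some DDL commands, the estimate can lag behind on a table that changes a lot. A minimal sketch of refreshing it first and then reading it with the schema pinned down, assuming a table my_schema.my_table (placeholder names, not from the question) and that running ANALYZE on it is acceptable:

    ```sql
    -- Assumption: my_schema.my_table is a placeholder for your own table.
    -- Refresh the planner statistics so reltuples reflects the current contents.
    ANALYZE my_schema.my_table;

    -- Join pg_namespace so tables with the same name in other schemas are excluded.
    SELECT c.reltuples::bigint AS estimate
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE n.nspname = 'my_schema'
      AND c.relname = 'my_table';
    ```

    In recent PostgreSQL versions a table that has never been vacuumed or analyzed reports reltuples = -1, so a negative result means "no estimate yet" rather than an empty table.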
  • 2020-12-02 19:12

    You can get the exact count cheaply by maintaining it yourself with an AFTER INSERT OR DELETE trigger. Something like this:

    CREATE TABLE tcounter(id serial primary key, table_schema text, table_name text, count bigint NOT NULL);
    
    insert into tcounter(table_schema, table_name,count) select 'my_schema', 'my_table', count(*) from my_schema.my_table;
    

    Then keep it up to date with a trigger:

    CREATE OR REPLACE FUNCTION ex_count()
    RETURNS trigger AS
    $BODY$
    BEGIN
        -- Adjust the cached count for the table that fired the trigger.
        IF (TG_OP = 'INSERT') THEN
          UPDATE tcounter SET count = count + 1 WHERE table_schema = TG_TABLE_SCHEMA::text AND table_name = TG_TABLE_NAME::text;
        ELSIF (TG_OP = 'DELETE') THEN
          UPDATE tcounter SET count = count - 1 WHERE table_schema = TG_TABLE_SCHEMA::text AND table_name = TG_TABLE_NAME::text;
        END IF;
        -- The return value of an AFTER row trigger is ignored;
        -- RETURN NULL avoids referencing NEW, which has no value for DELETE.
        RETURN NULL;
    END
    $BODY$
    LANGUAGE plpgsql VOLATILE
    COST 100;
    
    CREATE TRIGGER tg_counter  AFTER INSERT OR DELETE
      ON my_schema.my_table  FOR EACH ROW  EXECUTE PROCEDURE ex_count();
    

    Then read the count:

    select * from tcounter where table_schema = 'my_schema' and table_name = 'my_table';
    

    This way you only run count(*) once, to initialize the row in tcounter.
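
    The trigger above fires only for INSERT and DELETE, so a TRUNCATE would leave the cached count stale. A minimal sketch of one way to cover that case, assuming the tcounter table from this answer; the names ex_count_truncate and tg_counter_truncate are hypothetical:

    ```sql
    -- Hypothetical helper: reset the cached count when the table is truncated.
    CREATE OR REPLACE FUNCTION ex_count_truncate()
    RETURNS trigger AS
    $BODY$
    BEGIN
        UPDATE tcounter SET count = 0
        WHERE table_schema = TG_TABLE_SCHEMA::text
          AND table_name = TG_TABLE_NAME::text;
        RETURN NULL;  -- the return value is ignored for AFTER triggers
    END
    $BODY$
    LANGUAGE plpgsql;

    -- TRUNCATE triggers have to be statement-level (FOR EACH STATEMENT).
    CREATE TRIGGER tg_counter_truncate AFTER TRUNCATE
      ON my_schema.my_table FOR EACH STATEMENT EXECUTE PROCEDURE ex_count_truncate();
    ```

    Keep in mind that every writer updates the same tcounter row, so concurrent inserts and deletes on the counted table queue up on that row lock.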

  • 2020-12-02 19:13

    For a quick estimate:

    SELECT reltuples FROM pg_class WHERE oid = 'my_schema.my_table'::regclass;
    

    This is superior to the queries presented so far, including the advice in the Postgres Wiki on slow counting (which I have since updated):
    relname is not unique in pg_class. There can be multiple tables with the same relname in different schemas of the database. That's regularly the case in my installations.

    And a query on pg_stat_user_tables is many times slower, as that's a view on a couple of tables.

    If you do not schema-qualify the table name, a cast to regclass observes the current search_path to pick the best match. And if the table does not exist (or cannot be seen) in any of the schemas in the search_path you get an error message.

    Details on Object Identifier Types in the manual.
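
    To illustrate the lookup behavior described above, a small sketch; my_table and my_schema are placeholders, and to_regclass() (available since PostgreSQL 9.4) is shown as one possible way to avoid the error, not something the query above requires:

    ```sql
    -- Resolved through the current search_path;
    -- raises an error if no visible table matches.
    SELECT 'my_table'::regclass;

    -- Returns NULL instead of raising an error when the table cannot be found.
    SELECT to_regclass('my_schema.my_table');

    -- With to_regclass() the estimate query returns no row for a missing table
    -- instead of failing.
    SELECT reltuples::bigint AS estimate
    FROM pg_class
    WHERE oid = to_regclass('my_schema.my_table');
    ```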

    Related answer with new options:

    • Fast way to discover the row count of a table in PostgreSQL
  • 2020-12-02 19:19

    You can get an estimate from the statistics view pg_stat_user_tables:

    select schemaname, relname, n_live_tup 
    from pg_stat_user_tables 
    where schemaname = 'your_schema_name'
    and relname = 'your_table_name';
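
    Since n_live_tup is also only an estimate, it can help to see when statistics were last gathered for the table. A sketch of the same query extended with the last_analyze and last_autoanalyze columns of that view (the schema and table names remain placeholders):

    ```sql
    SELECT schemaname, relname, n_live_tup,
           last_analyze,      -- last manual ANALYZE, NULL if never run
           last_autoanalyze   -- last ANALYZE by the autovacuum daemon, NULL if never run
    FROM pg_stat_user_tables
    WHERE schemaname = 'your_schema_name'
      AND relname = 'your_table_name';
    ```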
    
  • 2020-12-02 19:20

    If your database is small, you can get an estimate of all your tables like @mike-sherrill-cat-recall suggested. This command will list all the tables though.

    SELECT schemaname,relname,n_live_tup 
    FROM pg_stat_user_tables 
    ORDER BY n_live_tup DESC;
    

    Output would be something like this:

     schemaname |      relname       | n_live_tup
    ------------+--------------------+------------
     public     | items              |      21806
     public     | tags               |      11213
     public     | sessions           |       3269
     public     | users              |        266
     public     | shops              |        259
     public     | quantities         |         34
     public     | schema_migrations  |         30
     public     | locations          |          8
    (8 rows)
    
  • 2020-12-02 19:24

    Aside from running COUNT() on an indexed field (which hopefully 'id' is), the next best thing would be to cache the row count in some table, kept current by a trigger on INSERT (and DELETE). Naturally, you'll then query the cache instead of counting.

    For an approximation you can try this (from https://wiki.postgresql.org/wiki/Count_estimate):

    select reltuples from pg_class where relname='tablename';
    