Why is SELECT COUNT(*) slower than SELECT * in Hive?

Happy的楠姐 2020-12-28 16:12

When I run queries with Hive in a VirtualBox sandbox, SELECT COUNT(*) is much slower than SELECT *.

Can an

3 answers
  •  渐次进展
    2020-12-28 17:02

    This is because the database uses a clustered primary key, so the query reads each row individually to find the key, row by agonizing row, instead of reading from an index.

    • Run OPTIMIZE TABLE. This ensures that the data pages are physically stored in sorted order, which could conceivably speed up a range scan on a clustered primary key.

    • Create an additional non-primary index on just the change_event_id column. This stores a copy of that column in index pages, which will be much faster to scan. After creating it, check the explain plan to make sure the query is using the new index.
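
    The two suggestions above could be sketched roughly as follows. This assumes a MySQL/InnoDB-style database, which is what OPTIMIZE TABLE and clustered primary keys suggest; Hive itself may not accept these statements. The table name change_events and the index name idx_change_event_id are hypothetical, since the post only names the change_event_id column.

    ```sql
    -- Hypothetical table name: change_events (the post does not name the table).

    -- 1. Rebuild the table so its data pages are stored in primary-key order.
    OPTIMIZE TABLE change_events;

    -- 2. Add a secondary (non-primary) index on just the key column.
    --    The index name is an illustrative assumption.
    CREATE INDEX idx_change_event_id ON change_events (change_event_id);

    -- 3. Inspect the plan and check whether the new index is chosen.
    EXPLAIN SELECT COUNT(*) FROM change_events;
    ```

    In MySQL, the key column of the EXPLAIN output shows which index, if any, the optimizer picked. If it does not show the new index, the count is probably still scanning the clustered primary key.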
