AWS Redshift: should DISTKEY / SORTKEY columns be compressed?

Question

I have a question about column compression on AWS Redshift. We are currently checking how much performance can be improved by choosing an appropriate DISTSTYLE, sort keys, and column compression.

If my understanding is correct, column compression helps reduce I/O cost. I tried "analyze compression table_name;", and Redshift mostly suggests 'zstd' or 'lzo' as the compression method for our columns.

Generally speaking, should the columns set as DISTKEY/SORTKEY also be compressed like the other columns?

I'm totally new to Redshift and any advice would be appreciated.

Sincerely.

Answer 1:

A DISTKEY column can be compressed, but the first SORTKEY column should be left uncompressed (ENCODE raw). If you have multiple sort keys (a compound sort key), the other sort key columns can be compressed.

Also, I generally recommend using a commonly filtered date/timestamp column (if one exists) as the first column of a compound sort key.
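
As a rough sketch of the two recommendations above, the table definition below keeps the first sort key column raw and compresses everything else, including the DISTKEY and the second sort key column. The table name, column names, and the choice of zstd are hypothetical and only for illustration; the encodings reported by ANALYZE COMPRESSION for your own data should take precedence.

```sql
-- Hypothetical table: all names and types here are assumptions for illustration.
CREATE TABLE events (
    event_time  TIMESTAMP     ENCODE raw,   -- first sort key column: left uncompressed
    customer_id BIGINT        ENCODE zstd,  -- DISTKEY and second sort key column: compressed
    event_type  VARCHAR(32)   ENCODE zstd,
    payload     VARCHAR(1024) ENCODE zstd
)
DISTSTYLE KEY
DISTKEY (customer_id)
COMPOUND SORTKEY (event_time, customer_id);

-- Check the resulting encodings and key assignments.
-- PG_TABLE_DEF only lists tables in schemas on the current search_path.
SELECT "column", type, encoding, distkey, sortkey
FROM pg_table_def
WHERE tablename = 'events';
```

In the output of the second query, the column with sortkey = 1 is the one that should show an encoding of none.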

Finally, if you are joining very large tables, try using the same dist and sort keys on both tables so that Redshift can use a faster merge join.
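
A minimal sketch of that setup, assuming two hypothetical tables that are joined on customer_id, might look like the following. Both tables use the join column as DISTKEY and as the leading sort key; whether the planner actually chooses a merge join also depends on how well sorted the tables are, so the EXPLAIN output is worth checking.

```sql
-- Hypothetical tables: names, columns, and encodings are assumptions for illustration.
CREATE TABLE customers (
    customer_id BIGINT       ENCODE raw,   -- join column: DISTKEY and first sort key
    name        VARCHAR(128) ENCODE zstd
)
DISTSTYLE KEY
DISTKEY (customer_id)
SORTKEY (customer_id);

CREATE TABLE orders (
    customer_id BIGINT        ENCODE raw,  -- same DISTKEY and first sort key as customers
    order_time  TIMESTAMP     ENCODE zstd,
    amount      DECIMAL(12,2) ENCODE zstd
)
DISTSTYLE KEY
DISTKEY (customer_id)
SORTKEY (customer_id);

-- Inspect the plan: look for a merge join step rather than a hash join.
EXPLAIN
SELECT c.customer_id, SUM(o.amount) AS total_amount
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
GROUP BY c.customer_id;
```

Note that leading the sort key with the join column is a trade-off against leading it with a commonly filtered date/timestamp column, so which one to favor depends on whether joins or range filters dominate the workload.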

Source: https://stackoverflow.com/questions/52625508/aws-redshift-distkey-sortkey-columns-should-be-compressed

Tags

amazon-redshift
