What does the column skew_sortkey1 in Amazon Redshift's svv_table_info imply?

Submitted by 断了今生、忘了曾经 on 2020-01-02 07:31:31

Question


Redshift's documentation (http://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLE_INFO.html) defines the column skew_sortkey1 as: "Ratio of the size of the largest non-sort key column to the size of the first column of the sort key, if a sort key is defined. Use this value to evaluate the effectiveness of the sort key."

What does this imply? What does it mean if this value is large, or if it is small?

Thanks!


Answer 1:


A large skew_sortkey1 value means that the largest non-sort-key column is much bigger than the first column of the sort key. In other words, the row offsets covered by one disk block of the sort key column correspond to several disk blocks of the data column.

For example, let's say the skew_sortkey1 value is 5 for a table. The row offsets in one disk block of the sort key then correspond to 5 disk blocks of the other data columns. The zone map stores the min and max values for each sort key disk block, so when you query this table with a WHERE clause on the sort key, Redshift identifies the sort key block that contains the data (block_min <= WHERE clause value <= block_max) and fetches the row offsets for that block. Since skew_sortkey1 is 5, it has to fetch 5 blocks of the data columns before filtering the records down to the desired ones.

So, to conclude, a high skew_sortkey1 value is not desirable.
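The example with a ratio of 5 can be sketched as a toy model in Python. This is only an illustration of the reasoning above, not how Redshift is implemented: the `Block` class, the helper functions, and the rows-per-block figures are all hypothetical and chosen so that one sort key block spans five data-column blocks.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical zone-map entry for one block of the sort key column."""
    min_value: int
    max_value: int
    first_row: int  # offset of the first row stored in the block
    last_row: int   # offset of the last row stored in the block (inclusive)


def build_blocks(sorted_values, rows_per_block):
    """Split an already sorted column into blocks and record min/max per block."""
    blocks = []
    for start in range(0, len(sorted_values), rows_per_block):
        chunk = sorted_values[start:start + rows_per_block]
        blocks.append(Block(chunk[0], chunk[-1], start, start + len(chunk) - 1))
    return blocks


def data_blocks_to_read(sort_blocks, rows_per_data_block, target):
    """Count the data-column blocks that cover the rows of every matching sort key block."""
    touched = set()
    for block in sort_blocks:
        if block.min_value <= target <= block.max_value:
            first = block.first_row // rows_per_data_block
            last = block.last_row // rows_per_data_block
            touched.update(range(first, last + 1))
    return len(touched)


# Assumption: the sort key fits 500 rows per block, the wide data column only 100,
# so the data column is 5 times larger (a skew_sortkey1 of 5 in this toy model).
sort_key = list(range(1000))
sort_blocks = build_blocks(sort_key, rows_per_block=500)
blocks_read = data_blocks_to_read(sort_blocks, rows_per_data_block=100, target=42)
print(blocks_read)  # 5
```

If the data column also held 500 rows per block (a ratio of 1), the same lookup would touch a single data block.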




Answer 2:


Sort keys define the order in which the rows of a table are stored in Redshift's disk blocks. This means that column data belonging to the same sort key region is stored together in a single disk block (1 MB in size). Since Redshift applies compression per column, sort key columns have the potential advantage of storing similar data within the same disk block, which leads to good compression and less space. The same cannot be said of the non-sort-key columns.

The column skew_sortkey1 in SVV_TABLE_INFO tries to capture exactly that. From that number, you can judge whether the chosen sort key has helped yield good compression.
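To make the compression argument concrete, here is a minimal Python sketch. It assumes a simple run-length model in which a column's stored size is proportional to its number of runs of equal values; that is a simplification for illustration, not the way Redshift measures column size, and the `run_count` helper and the sample data are hypothetical.

```python
import random
from itertools import groupby


def run_count(values):
    """Number of runs of equal consecutive values (a proxy for run-length encoded size)."""
    return sum(1 for _ in groupby(values))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Sort key column: stored in sorted order, so equal values sit next to each other.
sort_key = sorted(rng.randrange(10) for _ in range(10_000))
# Non-sort-key column: same value range, but its order is unrelated to the sort key.
other = [rng.randrange(10) for _ in range(10_000)]

sort_key_runs = run_count(sort_key)
other_runs = run_count(other)

# Analogue of skew_sortkey1 in this toy model: large when the sort key
# compresses far better than the other column.
ratio = other_runs / sort_key_runs
print(sort_key_runs, other_runs, round(ratio, 1))
```

In this model the sorted column collapses to at most ten runs, while the unordered column keeps thousands, so the ratio comes out large.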



Source: https://stackoverflow.com/questions/35857481/what-does-the-column-skew-sorkey1-in-amazon-redshifts-svv-table-info-imply
