Amazon Redshift Foreign Keys - Sort or Interleaved Keys

Submitted by 喜欢而已 on 2020-01-16 10:32:02

Question


We plan to import OLTP relational tables into AWS Redshift. The CustomerTransaction table joins to multiple lookup tables. I have included only three tables here, but we have more.

What should the sort key be on the CustomerTransaction table? In regular SQL Server, we have nonclustered indexes on the foreign keys in the CustomerTransaction table. For AWS Redshift, should I use a compound sort key or an interleaved sort key on the foreign key columns in CustomerTransaction? What is the best indexing strategy for this table design? Thanks.

create table dbo.CustomerTransaction
(
    CustomerTransactionId bigint primary key identity(1,1),
    ProductTypeId bigint,       -- foreign key to the ProductType table
    StatusTypeId bigint,        -- foreign key to the StatusType table
    DateOfPurchase date,
    PurchaseAmount float,
    ....
)

create table dbo.ProductType
(
    ProductTypeId bigint primary key identity(1,1),
    ProductName varchar(255),
    ProductDescription varchar(255)
    .....
)

create table dbo.StatusType
(
    StatusTypeId bigint primary key identity(1,1),
    StatusTypeName varchar(255),
    StatusDescription varchar(255)
    .....
)


Answer 1:


The general rules of thumb are:

  • Set the DISTKEY based on the column you most commonly JOIN on or GROUP BY
  • Set the SORTKEY based on the columns you most commonly filter on in WHERE clauses
  • Avoid interleaved sort keys (they are only optimal in rare circumstances and require frequent VACUUM REINDEX)
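
As a rough illustration of the first two rules, the question's CustomerTransaction table might be declared in Redshift along the following lines. This is only a sketch under assumptions the question does not state: that queries mostly filter on DateOfPurchase and join to ProductType, and that ProductTypeId has enough evenly spread values to avoid skew across slices. The key choices are placeholders, not a recommendation, and the columns elided in the question are left out.

```sql
-- Hypothetical Redshift version of the question's fact table.
-- Assumption: most queries filter on DateOfPurchase and join on ProductTypeId.
CREATE TABLE CustomerTransaction (
    CustomerTransactionId BIGINT IDENTITY(1,1),
    ProductTypeId         BIGINT,
    StatusTypeId          BIGINT,
    DateOfPurchase        DATE,
    PurchaseAmount        FLOAT,
    -- Redshift does not enforce primary or foreign keys;
    -- they are informational hints for the query planner.
    PRIMARY KEY (CustomerTransactionId)
)
DISTSTYLE KEY
DISTKEY (ProductTypeId)              -- co-locates rows that share a join value
COMPOUND SORTKEY (DateOfPurchase);   -- lets range filters on the date skip blocks
```

Redshift has no equivalent of SQL Server's nonclustered indexes, so the sort key stands in for only the most common filter pattern rather than indexing every foreign key.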

From Choose the Best Distribution Style - Amazon Redshift:

  • Distribute the fact table and one dimension table on their common columns
  • Choose the largest dimension based on the size of the filtered data set
  • Choose a column with high cardinality in the filtered result set
  • Change some dimension tables to use ALL distribution
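
The last point can be sketched with one of the lookup tables from the question. Assuming StatusType is small and rarely updated (the question does not say), declaring it with ALL distribution places a full copy on every node, so joins against it need no data redistribution:

```sql
-- Hypothetical Redshift version of a small lookup (dimension) table.
-- Assumption: the table is small and changes rarely, so copying it
-- to every node is cheap.
CREATE TABLE StatusType (
    StatusTypeId      BIGINT IDENTITY(1,1),
    StatusTypeName    VARCHAR(255),
    StatusDescription VARCHAR(255),
    PRIMARY KEY (StatusTypeId)       -- informational only, not enforced
)
DISTSTYLE ALL;

-- A join like this can then be resolved locally on each node,
-- whatever distribution the fact table uses.
SELECT s.StatusTypeName,
       SUM(t.PurchaseAmount) AS TotalPurchased
FROM CustomerTransaction AS t
JOIN StatusType AS s
  ON s.StatusTypeId = t.StatusTypeId
GROUP BY s.StatusTypeName;
```

The trade-off is storage and load time: every insert or update to an ALL table is applied on each node, which is why this style suits small lookup tables rather than large ones.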

So, it is not easy to recommend a particular DISTKEY and SORTKEY because the right choice depends on how you use the tables. Merely seeing the DDL is not sufficient to recommend the best way to optimize them.
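
Once the tables are loaded and queried, one way to check whether a chosen design is holding up is to look at the SVV_TABLE_INFO system view. The query below is a minimal sketch; the table names are assumed to match the question's tables (Redshift folds unquoted identifiers to lowercase).

```sql
-- Inspect distribution style, leading sort key, row skew and unsorted
-- percentage for the tables from the question (names are assumptions).
SELECT "table",
       diststyle,
       sortkey1,
       skew_rows,   -- ratio of rows on the fullest slice to the emptiest slice
       unsorted,    -- percentage of rows not yet in sort order
       tbl_rows
FROM svv_table_info
WHERE "table" IN ('customertransaction', 'producttype', 'statustype')
ORDER BY tbl_rows DESC;
```

A high skew_rows value suggests the DISTKEY is spreading rows unevenly, and a high unsorted value suggests the table needs a VACUUM before the sort key can help.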

Other references:

  • Amazon Redshift Best Practices for Designing Tables
  • Top 10 Performance Tuning Techniques for Amazon Redshift | AWS Big Data Blog


Source: https://stackoverflow.com/questions/50538013/amazon-redshift-foreign-keys-sort-or-interleaved-keys
