Question
We plan to import OLTP relational tables into AWS Redshift. The CustomerTransaction table joins to multiple lookup tables. I only included 3 here, but we have more.
What should the sort key be on the CustomerTransaction table? In regular SQL Server, we have nonclustered indexes on the foreign keys in the CustomerTransaction table. For AWS Redshift, should I use compound sort keys or interleaved sort keys on the foreign key columns in CustomerTransaction? What is the best indexing strategy for this table design? Thanks,
create table dbo.CustomerTransaction
(
CustomerTransactionId bigint primary key identity(1,1),
ProductTypeId bigint, -- foreign key to ProductType table
StatusTypeId bigint, -- foreign key to StatusType table
DateOfPurchase date,
PurchaseAmount float,
....
)
create table dbo.ProductType
(
ProductTypeId bigint primary key identity(1,1),
ProductName varchar(255),
ProductDescription varchar(255)
.....
)
create table dbo.StatusType
(
StatusTypeId bigint primary key identity(1,1),
StatusTypeName varchar(255),
StatusDescription varchar(255)
.....
)
Answer 1:
The general rules of thumb are:
- Set the DISTKEY based on what you commonly GROUP BY
- Set the SORTKEY based on what you commonly use in WHERE statements
- Avoid Interleaved Sort Keys (they are only optimal in rare circumstances and require frequent VACUUM)
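As a rough illustration of these rules of thumb, here is a minimal sketch of how the CustomerTransaction table from the question might be declared in Redshift. It assumes, purely for the sake of example, that queries most often group by ProductTypeId and filter on DateOfPurchase and then StatusTypeId; your actual workload may point to different columns. The dbo schema is omitted, and the elided columns are left out.

```sql
-- Sketch only: the key choices below are assumptions about the query workload,
-- not a recommendation derived from the DDL alone.
CREATE TABLE CustomerTransaction
(
    CustomerTransactionId BIGINT IDENTITY(1,1) PRIMARY KEY, -- constraint is informational only in Redshift
    ProductTypeId         BIGINT,  -- assumed to be the usual GROUP BY column
    StatusTypeId          BIGINT,
    DateOfPurchase        DATE,    -- assumed to be the usual WHERE filter
    PurchaseAmount        FLOAT
)
DISTKEY (ProductTypeId)
COMPOUND SORTKEY (DateOfPurchase, StatusTypeId);
```

A compound sort key helps most when queries filter on its leading column, so the column that appears most often in WHERE clauses goes first.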
From Choose the Best Distribution Style - Amazon Redshift:
- Distribute the fact table and one dimension table on their common columns
- Choose the largest dimension based on the size of the filtered data set
- Choose a column with high cardinality in the filtered result set
- Change some dimension tables to use ALL distribution
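The last point can be sketched for one of the lookup tables in the question. This assumes StatusType is small and rarely updated, which is typical for lookup tables but is not stated in the question. With ALL distribution, every node holds a full copy of the table, so joins to it do not need to move rows between nodes.

```sql
-- Sketch only: assumes StatusType is a small, slowly changing lookup table.
CREATE TABLE StatusType
(
    StatusTypeId      BIGINT IDENTITY(1,1) PRIMARY KEY, -- informational only in Redshift
    StatusTypeName    VARCHAR(255),
    StatusDescription VARCHAR(255)
)
DISTSTYLE ALL;
```

The trade-off is storage and load time: each node stores the whole table, so this suits small dimension tables rather than large ones.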
So, it is not easy to recommend a particular DISTKEY and SORTKEY because it depends on how you use the tables. Merely seeing the DDL is not sufficient to recommend the best way to optimize the tables.
Other references:
- Amazon Redshift Best Practices for Designing Tables
- Top 10 Performance Tuning Techniques for Amazon Redshift | AWS Big Data Blog
Source: https://stackoverflow.com/questions/50538013/amazon-redshift-foreign-keys-sort-or-interleaved-keys