amazon-redshift

Convert MM/DD/YYYY to YYYYMMDD in Redshift

Asked by 梦想与她 on 2020-01-03 02:19:06
Question: I have a requirement to convert MM/DD/YYYY to YYYYMMDD in an Amazon Redshift database. This query gives me a strange result. Can someone please help me?

    select to_date('07/17/2017', 'YYYYMMDD');
    -- returns 0007-07-20

Answer 1: The format string passed to to_date must describe the input string, not the desired output. If you just wish to convert the hard-coded string into a DATE:

    select to_date('07/17/2017', 'MM/DD/YYYY');

If you have a column already stored as DATE, then use:

    to_char(fieldname, 'YYYYMMDD')

Combining the two concepts:

    select to_char(to_date('07/17/2017', 'MM/DD/YYYY'), 'YYYYMMDD');
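The two-step parse-then-format logic in the answer can be sanity-checked outside Redshift; Python's strptime/strftime follow the same principle that the parse format must match the input. This is an illustrative sketch, not Redshift code:

```python
from datetime import datetime

# Parse using a format that matches the INPUT (MM/DD/YYYY), then format
# as the desired OUTPUT (YYYYMMDD) -- the same two-step logic as
# to_char(to_date('07/17/2017', 'MM/DD/YYYY'), 'YYYYMMDD').
parsed = datetime.strptime("07/17/2017", "%m/%d/%Y")
print(parsed.strftime("%Y%m%d"))  # -> 20170717
```

Passing a mismatched format is exactly the original bug: the parser consumes the digits in the wrong order, producing a nonsense date.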

Redshift - Calculate monthly active users

Asked by 筅森魡賤 on 2020-01-03 01:42:09
Question: I have a table which looks like this:

    Date      | User_ID
    2017-1-1  | 1
    2017-1-1  | 2
    2017-1-1  | 4
    2017-1-2  | 3
    2017-1-2  | 2
    ...       | ..
    2017-2-1  | 1
    2017-2-2  | 2
    ...       | ..

I'd like to calculate the monthly active users over a rolling 30-day period. I know Redshift does not support COUNT(DISTINCT) as a window function. What can I do to get the following output?

    Date      | MAU
    2017-1-1  | 3
    2017-1-2  | 4    <- We don't want to count user_id 2 twice.
    ...       | ..
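The desired output can be stated precisely: for each date, count the distinct user IDs seen in the trailing 30-day window. A minimal pure-Python sketch of that definition, using the sample rows from the question (this illustrates the logic only; it is not a Redshift query):

```python
from datetime import date, timedelta

# (date, user_id) event rows, mirroring the table in the question.
events = [
    (date(2017, 1, 1), 1), (date(2017, 1, 1), 2), (date(2017, 1, 1), 4),
    (date(2017, 1, 2), 3), (date(2017, 1, 2), 2),
]

def mau(events, as_of, window_days=30):
    """Distinct users active in the window_days-day window ending at as_of."""
    start = as_of - timedelta(days=window_days - 1)
    return len({uid for d, uid in events if start <= d <= as_of})

print(mau(events, date(2017, 1, 1)))  # -> 3
print(mau(events, date(2017, 1, 2)))  # -> 4 (user 2 counted only once)
```

The set comprehension is what COUNT(DISTINCT) over a window would compute; a common SQL workaround is a self-join of each date against the preceding 30 days followed by a plain COUNT(DISTINCT) in the GROUP BY.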

Does case matter when 'auto' loading data from S3 into a Redshift table? [duplicate]

Asked by 孤街浪徒 on 2020-01-02 11:04:49
Question (this question already has answers here: "Loading JSON data to AWS Redshift results in NULL values" (3 answers); closed 2 years ago): I am loading data from S3 into Redshift using the COPY command, the gzip flag and the 'auto' format, as per this documentation on loading from S3, this documentation for using the 'auto' format in AWS, and this documentation for addressing compressed files. My data is in a highly nested JSON format, and I have created the Redshift table such that the column names…
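The NULL-value symptom in the linked duplicate is consistent with case-sensitive key matching: JSON object keys are case-sensitive, so a key whose case does not match the (lowercase) Redshift column name is simply not found, and the column is left NULL. A toy illustration of the underlying lookup behavior (the field names here are hypothetical):

```python
import json

row = json.loads('{"UserName": "alice", "userid": 7}')

# JSON keys are case-sensitive: a lookup with the wrong case finds
# nothing, analogous to COPY ... format json 'auto' leaving a column
# NULL when no key exactly matches the column name.
print(row.get("username"))  # -> None (case mismatch with "UserName")
print(row.get("userid"))    # -> 7
```

If the source data's casing cannot be changed, a jsonpaths file mapping each JSON path to a column explicitly is the usual way to sidestep the name matching entirely.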

What does the column skew_sortkey1 in Amazon Redshift's svv_table_info imply?

Asked by 断了今生、忘了曾经 on 2020-01-02 07:31:31
Question: Redshift's documentation (http://docs.aws.amazon.com/redshift/latest/dg/r_SVV_TABLE_INFO.html) states that the definition of the column skew_sortkey1 is: "Ratio of the size of the largest non-sort key column to the size of the first column of the sort key, if a sort key is defined. Use this value to evaluate the effectiveness of the sort key." What does this imply? What does it mean if this value is large, or alternatively small? Thanks!

Answer 1: A large skew_sortkey1 value means that the ratio of…
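Taking the documented definition literally, the metric is just a size ratio between two columns of the same table. A sketch with hypothetical numbers (illustrative only, not real svv_table_info output):

```python
# Hypothetical per-column on-disk sizes, e.g. in 1 MB blocks.
largest_non_sortkey_col_blocks = 1200
first_sortkey_col_blocks = 100

# skew_sortkey1 per the documented definition: size of the largest
# non-sort-key column divided by the size of the first sort key column.
skew_sortkey1 = largest_non_sortkey_col_blocks / first_sortkey_col_blocks
print(skew_sortkey1)  # -> 12.0
```

Intuitively, the larger this ratio, the more data the sort key's ordering has to "carry": range-restricted scans skip blocks of the sort key column cheaply, but the payoff is measured against the much larger non-sort-key columns that must still be read for matching rows.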

Difference between RDS and Redshift

Asked by 断了今生、忘了曾经 on 2020-01-02 06:25:28
Question: Can anyone list the main differences between Amazon Redshift and RDS? I know both are relational databases, but why choose one over the other?

Answer 1: RDS is a managed service for Online Transaction Processing (OLTP) databases, i.e. a managed service for the usual MySQL, PostgreSQL, Oracle, MariaDB, Microsoft SQL Server or Aurora (Amazon's own relational database). Redshift is a managed service for data warehousing, i.e. column-oriented storage, typical for business-analytics workloads.

Copying only new records from AWS DynamoDB to AWS Redshift

Asked by 家住魔仙堡 on 2020-01-02 02:37:10
Question: I see there are tons of examples and documentation for copying data from DynamoDB to Redshift, but we are looking for an incremental copy process where only the new rows are copied from DynamoDB to Redshift. We will run this copy process every day, so there is no need to reload the entire Redshift table each day. Does anybody have any experience or thoughts on this topic?

Answer 1: DynamoDB has a feature (currently in preview) called Streams: "Amazon DynamoDB Streams maintains a time ordered sequence of…"
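Besides Streams, a common pattern for this kind of daily job is watermark-based incremental extraction: remember the last-synced modification timestamp and copy only rows newer than it. A schematic Python sketch of just the bookkeeping (the item shape, `updated_at` attribute, and watermark variable are hypothetical; a real pipeline would scan DynamoDB and COPY the batch into Redshift):

```python
# Each item carries an updated_at epoch timestamp; we keep a watermark
# and select only items modified after it -- the incremental-copy idea.
items = [
    {"id": 1, "updated_at": 100},
    {"id": 2, "updated_at": 205},
    {"id": 3, "updated_at": 310},
]

def new_items(items, last_synced):
    """Items modified strictly after the previous sync watermark."""
    return [it for it in items if it["updated_at"] > last_synced]

batch = new_items(items, last_synced=200)
print([it["id"] for it in batch])  # -> [2, 3]

# After loading `batch` into Redshift, advance the watermark so the
# next daily run starts where this one left off.
last_synced = max(it["updated_at"] for it in batch)  # 310
```

The caveat with timestamp watermarks is that they only capture inserts and updates, not deletes; Streams (once generally available) also surfaces deletions.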

Efficient GROUP BY a CASE expression in Amazon Redshift/PostgreSQL

Asked by 一笑奈何 on 2020-01-01 13:27:55
Question: In analytics processing there is often a need to collapse "unimportant" groups of data into a single row in the resulting table. One way to do this is to GROUP BY a CASE expression, where unimportant groups are coalesced into a single row by having the CASE expression return a single value for them, e.g. NULL. This question is about efficient ways to perform this grouping in Amazon Redshift, which is based on ParAccel: close to PostgreSQL 8.0 in terms of functionality. As an example, …
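The GROUP BY CASE pattern itself works on any SQL engine; here is a small sqlite3 demonstration with tiny hypothetical data (Redshift-specific efficiency concerns obviously do not carry over to SQLite):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INT)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("US", 10), ("EU", 20), ("TinyA", 1), ("TinyB", 2)])

# Collapse "unimportant" regions into a single NULL group: the CASE
# expression returns the region for important groups and NULL for the
# rest, so GROUP BY merges all minor regions into one row.
rows = conn.execute("""
    SELECT CASE WHEN region IN ('US', 'EU') THEN region END AS grp,
           SUM(amount)
    FROM sales
    GROUP BY CASE WHEN region IN ('US', 'EU') THEN region END
    ORDER BY grp
""").fetchall()
print(rows)  # -> [(None, 3), ('EU', 20), ('US', 10)]
```

The two minor regions land in one NULL-keyed row with their amounts summed, which is exactly the collapse the question describes; the efficiency question is whether the engine evaluates the CASE expression once or twice per row and how the grouping distributes across nodes.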