amazon-redshift

Last Non-Null Value in Redshift by Group

Submitted by 守給你的承諾、 on 2020-01-16 01:47:08
Question: I am using Redshift and want to receive the last non-NULL value by userid. Here is an example dataset:

    Date       UserID  Value
    4-18-2018  abc     1
    4-19-2018  abc     NULL
    4-20-2018  abc     NULL
    4-21-2018  abc     8
    4-19-2018  def     9
    4-20-2018  def     10
    4-21-2018  def     NULL
    4-22-2018  tey     NULL
    4-23-2018  tey     2

If a new user starts out with a NULL, replace it with 0. I want my final dataset to look like this:

    Date       UserID  Value
    4-18-2018  abc     1
    4-19-2018  abc     1
    4-20-2018  abc     1
    4-21-2018  abc     8
    4-19-2018  def     9
    4-20-2018  def     10
    4
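
Redshift's window functions handle this pattern directly. A minimal sketch, assuming a hypothetical table user_values with the columns above: LAST_VALUE(... IGNORE NULLS) carries the most recent non-NULL value forward within each user, and COALESCE supplies the 0 for users whose history starts with NULL.

    SELECT date,
           userid,
           COALESCE(
               LAST_VALUE(value IGNORE NULLS) OVER (
                   PARTITION BY userid
                   ORDER BY date
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
               0) AS value
    FROM user_values    -- hypothetical table name
    ORDER BY userid, date;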

Redshift copy creates different compression encodings from analyze

Submitted by 不羁的心 on 2020-01-15 06:29:28
Question: I've noticed that AWS Redshift recommends different column compression encodings from the ones it automatically creates when loading data (via COPY) into an empty table. For example, I created a table and loaded data from S3 as follows:

    CREATE TABLE Client (
        Id varchar(511),
        ClientId integer,
        CreatedOn timestamp,
        UpdatedOn timestamp,
        DeletedOn timestamp,
        LockVersion integer,
        RegionId varchar(511),
        OfficeId varchar(511),
        CountryId varchar(511),
        FirstContactDate timestamp,
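
The excerpt cuts off before the COPY command, but the two sources of encodings can be compared directly: COMPUPDATE lets COPY choose encodings when first loading an empty table, while ANALYZE COMPRESSION reports the (often different) encodings Redshift would recommend afterwards. A hedged sketch, with the S3 path and IAM role made up for illustration:

    COPY Client
    FROM 's3://my-bucket/client/'                               -- hypothetical S3 path
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'  -- hypothetical role
    CSV
    COMPUPDATE ON;    -- COPY picks encodings itself on first load into an empty table

    ANALYZE COMPRESSION Client;    -- reports recommended encodings for comparison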

AWS Athena Returning Zero Records from Tables Created from GLUE Crawler input csv from S3

Submitted by 人盡茶涼 on 2020-01-14 09:51:50
Question: Part one: I ran a Glue crawler on a dummy CSV loaded into S3. It created a table, but when I try to view the table in Athena and query it, it shows "Zero records returned." The ELB demo data in Athena works fine. Part two (scenario): Suppose I have an Excel file and a data dictionary describing how and in what format the data is stored in that file, and I want that data dumped into AWS Redshift. What would be the best way to achieve this? 回答1: I experienced the same issue. You need to give the folder path instead
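
The answer is cut off mid-sentence, but its surviving point is the standard fix: Athena tables (and Glue crawler include paths) should point at the containing S3 folder, not at a single file. A hedged sketch with hypothetical bucket, table, and column names:

    -- point LOCATION at the S3 folder, not at the CSV object itself
    CREATE EXTERNAL TABLE dummy_csv (
        id   int,
        name string
    )
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY ','
    LOCATION 's3://my-bucket/dummy/';    -- not 's3://my-bucket/dummy/data.csv'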

How can I find out the size of each column in a Redshift table?

Submitted by 谁说我不能喝 on 2020-01-13 11:44:31
Question: While trying out different compression settings in Redshift, it would be very useful to know the size of each column. I know how to get the size of a table, but I want to know the size of each individual column in that table. 回答1: This query will give you the size (in MB) of each column. It counts the number of data blocks, where each block uses 1 MB, grouped by table and column.

    SELECT TRIM(name) AS table_name,
           TRIM(pg_attribute.attname) AS column_name,
           COUNT(1) AS size
    FROM
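
The excerpt truncates at the FROM clause. A plausible completion, assuming the standard join between svv_diskusage and pg_attribute (svv_diskusage.col is zero-based, so it maps to attnum - 1):

    SELECT TRIM(name) AS table_name,
           TRIM(pg_attribute.attname) AS column_name,
           COUNT(1) AS size_mb          -- one 1 MB block per row counted
    FROM svv_diskusage
    JOIN pg_attribute
      ON svv_diskusage.tbl = pg_attribute.attrelid
     AND svv_diskusage.col = pg_attribute.attnum - 1
    GROUP BY 1, 2
    ORDER BY 1, 2;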

How to use ruby to write individual records to a Redshift database?

Submitted by 纵然是瞬间 on 2020-01-12 09:52:42
Question: Currently, we have a script that parses data and uploads it one record at a time to a MySQL database. Recently, we decided to switch to AWS Redshift. Is there a way I can use my Amazon login credentials and my Redshift cluster information to upload these records directly to the Redshift database? All the guides I'm finding online recommend importing text or CSV files from an S3 bucket, but that is not very practical for my use case. Thanks for any help. I'm looking to do something like this:
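
The code the asker intended to show is lost to truncation, but the SQL side is straightforward: Redshift speaks the PostgreSQL wire protocol, so any PostgreSQL driver (Ruby's pg gem, for example) can connect to the cluster endpoint on port 5439 and issue ordinary INSERTs. A hedged sketch of the statements such a script would send, with a hypothetical table:

    -- single-record insert, as the existing MySQL script presumably does
    INSERT INTO events (user_id, value) VALUES ('abc', 1);

    -- multi-row VALUES batches are far faster on Redshift than row-at-a-time inserts
    INSERT INTO events (user_id, value) VALUES
        ('abc', 1),
        ('def', 9),
        ('tey', 2);

For anything beyond modest volumes, the S3-plus-COPY route the guides recommend remains the performant path; direct INSERTs are fine for trickles of records.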

Local development and staging with Amazon Redshift

Submitted by …衆ロ難τιáo~ on 2020-01-11 15:43:12
Question: I like to set up tools and services with production, staging, and local development environments. I'd like to use Amazon Redshift, and starting at $180 a month seems pretty reasonable for a columnar-store database, but do I actually have to think of it as $180 × number of environments per month? Is there any way to have a free staging and local environment for Redshift? It's also nice to be able to develop against a local instance rather than relying on the network. I assume that's not possible with

AWS Redshift Pivot Table all Dimensions

Submitted by 谁说胖子不能爱 on 2020-01-11 12:45:47
Question: I am following the method to pivot a large table in Redshift described in "Pivot a table with Amazon RedShift / PostgreSQL". However, I have a large number of groups to pivot, i.e. m1, m2, ... How can I loop through all distinct values, apply the same logic to each of them, and alias the resulting column names? 回答1: If you want to be able to pivot to arbitrary numbers of groups, you can combine the groups into a JSON string and then extract the groups you are interested in with the Redshift JSON functions.
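
A minimal sketch of that idea, with hypothetical table and column names (source holding one row per userid/grp/value): build one JSON object per userid with LISTAGG, then pull out individual groups with JSON_EXTRACT_PATH_TEXT.

    WITH pivoted AS (
        SELECT userid,
               -- assumes grp values need no JSON escaping
               '{' || LISTAGG('"' || grp || '":' || value::varchar, ',')
                          WITHIN GROUP (ORDER BY grp) || '}' AS groups_json
        FROM source    -- hypothetical: one row per (userid, grp, value)
        GROUP BY userid
    )
    SELECT userid,
           JSON_EXTRACT_PATH_TEXT(groups_json, 'm1') AS m1,
           JSON_EXTRACT_PATH_TEXT(groups_json, 'm2') AS m2
    FROM pivoted;

The extracted values come back as varchar, so cast them if you need numerics downstream.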

Accessing Redshift from Lambda - Avoiding the 0.0.0.0/0 Security Group

Submitted by *爱你&永不变心* on 2020-01-11 09:48:20
Question: I am trying to access a Redshift database from a Lambda function. When I add 0.0.0.0/0 to the security group connections in the Redshift interface (as suggested by this article), I am able to connect successfully. From a security perspective, however, I don't feel comfortable using 0.0.0.0/0. Is there a way to allow only Lambda to access Redshift without opening it up to the public internet? I have tried adding the AWS IP ranges; however, this didn't work (as it only allows a limited number
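
No answer excerpt survives, but the usual pattern is to run the Lambda function inside the same VPC and authorize its security group, rather than an IP range, as the ingress source on the Redshift security group. A hedged sketch using the standard AWS CLI call, with made-up group IDs (5439 is Redshift's default port):

    # hypothetical security-group IDs: sg-0123redshift guards the cluster,
    # sg-0456lambda is attached to the Lambda function's VPC configuration
    aws ec2 authorize-security-group-ingress \
        --group-id sg-0123redshift \
        --protocol tcp \
        --port 5439 \
        --source-group sg-0456lambda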

Invalid operation: WITH RECURSIVE is not supported

Submitted by 流过昼夜 on 2020-01-10 05:44:06
Question: When I run the query below I get this message:

    [Amazon](500310) Invalid operation: WITH RECURSIVE is not supported;

Can someone explain to me why the recursive function doesn't work? (I'm working on Amazon Redshift.)

    WITH RECURSIVE r AS (
        SELECT 1 AS i, 1 AS factorial
        UNION
        SELECT i+1 AS i, factorial * (i+1) AS factorial
        FROM r
        WHERE i < 10
    )
    SELECT * FROM r;

回答1: The official Amazon Redshift documentation, "Unsupported PostgreSQL Features": These PostgreSQL features are not supported in Amazon Redshift.
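
Recursive CTEs were simply absent from Redshift at the time of this question (support was added later), so iterative computations like this factorial had to be rewritten without recursion. One hedged workaround sketch: generate the numbers explicitly and turn the running product into a running sum of logarithms with a window function.

    -- running product via logarithms: exp(sum(ln(i))) over an ordered frame;
    -- ROUND absorbs the floating-point error
    SELECT i,
           ROUND(EXP(SUM(LN(i)) OVER (ORDER BY i ROWS UNBOUNDED PRECEDING))) AS factorial
    FROM (SELECT 1 AS i UNION ALL SELECT 2 UNION ALL SELECT 3
          UNION ALL SELECT 4 UNION ALL SELECT 5) AS n;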