amazon-redshift | 易学教程

Amazon Redshift - Get week wise sales count by category

Question: I have daily sales data as shown below, and I am trying to group the sales by week. A plain GROUP BY gives the total count for the whole period. How can I modify the query to obtain the output shown below?

Expected output:

Last N days,Count,Category
Last 7 days,225,Category_1
Last 14 days,136,Category_2
Last 7 days,172,Category_1
Last 14 days,321,Category_2

Input data:

Date,Sales,Category
01-06-2018,10,Category_1
01-06-2018,19,Category_1
03-06-2018,3,Category_1
04-06-2018,13,Category_1
05-06

Find Top 1000 entries along with count and rank from table

Question: I have a table with around 30 billion rows in Redshift with the following structure:

userid  itemid   country   start_date
uid1    itemid1  country1  2018-07-25 00:00:00
uid2    itemid2  country1  2018-07-25 00:00:00
uid3    itemid1  country2  2018-07-25 00:00:00
uid4    itemid3  country1  2018-07-25 00:00:00
uid5    itemid1  country1  2018-07-25 00:00:00
uid1    itemid2  country2  2018-07-25 00:00:00
uid2    itemid2  country2  2018-07-25 00:00:00

Here, I want to find how many unique users bought each item, and then pick top 1000
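The counting and ranking described in this question can be illustrated in plain Python on the sample rows. This is only a sketch of the logic (distinct users per item, then a rank by that count), not a Redshift query and not the answer from the original thread. The function name `top_items` and the choice of dense ranking for ties are assumptions made for illustration.

```python
from collections import defaultdict


def top_items(rows, n=1000):
    """Return (itemid, unique_user_count, rank) for the n most-bought items.

    rows is an iterable of (userid, itemid) pairs. Ties share a rank
    (dense ranking); that is an assumption, the question does not say.
    """
    # Collect the set of distinct users seen for each item.
    users_by_item = defaultdict(set)
    for userid, itemid in rows:
        users_by_item[itemid].add(userid)

    # Sort by unique-user count (descending), then by item id for stability.
    counts = sorted(
        ((item, len(users)) for item, users in users_by_item.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )

    ranked = []
    rank = 0
    previous = None
    for item, count in counts[:n]:
        if count != previous:
            rank += 1
            previous = count
        ranked.append((item, count, rank))
    return ranked


# The (userid, itemid) columns of the sample rows from the question.
sample = [
    ("uid1", "itemid1"), ("uid2", "itemid2"), ("uid3", "itemid1"),
    ("uid4", "itemid3"), ("uid5", "itemid1"), ("uid1", "itemid2"),
    ("uid2", "itemid2"),
]
result = top_items(sample)
```

Note that uid2 appears twice for itemid2 in the sample, so counting distinct users rather than rows changes the result for that item.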

Adding an auto incremental column into existing redshift table

Question: I have a table in Redshift. I want to add a column that holds incremental values. I don't want to drop the table and create a new one. Please suggest the command to add a column with auto-incremental values to a Redshift table. Thanks!

Answer 1: It is not possible to add an IDENTITY column to an existing table. It might be easiest to create a new table with the new IDENTITY column, and copy the data into it. Note that the values aren't guaranteed to increase monotonically - i.e. there may

How to query user group privileges in postgresql?

Question: In PostgreSQL I can do something like:

SELECT has_table_privilege('myuser', 'mytable', 'select')

to see whether myuser has select access on mytable. Is there something similar for user groups? Basically, I'd like to be able to submit a query to see if a group has certain privileges on a specified table. Thanks!

Answer 1: You could make a simple function to query role privileges:

CREATE FUNCTION role_has_table_privilege(g NAME, tn NAME, pt NAME) RETURNS boolean AS 'SELECT EXISTS (SELECT 1 FROM

Redshift create all the combinations of any length for the values in one column

Question: How can we create all the combinations of any length for the values in one column and return the distinct count of another column for that combination?

Table:

+------+--------+
| Type | Name   |
+------+--------+
| A    | Tom    |
| A    | Ben    |
| B    | Ben    |
| B    | Justin |
| C    | Ben    |
+------+--------+

Output Table:

+-------------+-------+
| Combination | Count |
+-------------+-------+
| A           | 2     |
| B           | 2     |
| C           | 1     |
| AB          | 3     |
| BC          | 2     |
| AC          | 2     |
| ABC         | 3     |
+-------------+-------+

When the combination
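The expected output can be reproduced outside the database with a short Python sketch. It assumes, based on the output table (AB gives 3), that the count for a combination is the number of distinct names appearing under any of its types, i.e. the size of the union. The function name `combination_counts` is hypothetical, and this illustrates the logic only; it is not a Redshift solution.

```python
from itertools import combinations


def combination_counts(rows):
    """Map each combination of types to its distinct-name count.

    rows is an iterable of (type, name) pairs. The count of a combination
    is the size of the union of the name sets of its types (an assumption
    inferred from the expected output in the question).
    """
    # Group distinct names under each type.
    names_by_type = {}
    for type_, name in rows:
        names_by_type.setdefault(type_, set()).add(name)

    types = sorted(names_by_type)
    counts = {}
    # Enumerate combinations of every length from 1 up to all types.
    for size in range(1, len(types) + 1):
        for combo in combinations(types, size):
            names = set().union(*(names_by_type[t] for t in combo))
            counts["".join(combo)] = len(names)
    return counts


table = [("A", "Tom"), ("A", "Ben"), ("B", "Ben"), ("B", "Justin"), ("C", "Ben")]
counts = combination_counts(table)
```

With k distinct types there are 2^k - 1 non-empty combinations, so this enumeration is only practical when the number of types is small.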

redshift - how to insert into table generated time series

Question: I am trying to generate a time series in Redshift and insert it into a table, but no luck. What I have tried so far:

insert into date(dateid,date)
SELECT to_char(datum, 'YYYYMMDD')::int AS dateid, datum::date AS date
FROM (
    select '1970-01-01'::date + generate_series(0, 20000) as datum
) tbl;

I get the following error:

SQL Error [500310] [0A000]: [Amazon](500310) Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;

Any ideas or workarounds?

Answer 1:
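One possible workaround, sketched here under assumptions and not taken from the truncated answer, is to build the same rows on the client side and load them into the table afterwards. The Python below only generates the (dateid, date) pairs that the query above was meant to produce; the function name `date_rows` is hypothetical, and how the rows are loaded is left open.

```python
from datetime import date, timedelta


def date_rows(start=date(1970, 1, 1), days=20000):
    """Yield (dateid, date) pairs for start + 0 .. start + days, inclusive.

    dateid is the date formatted as YYYYMMDD and converted to int,
    mirroring to_char(datum, 'YYYYMMDD')::int in the original query.
    """
    for offset in range(days + 1):
        current = start + timedelta(days=offset)
        yield int(current.strftime("%Y%m%d")), current


rows = list(date_rows())
```

Like `generate_series(0, 20000)`, the range is inclusive at both ends, so the sketch yields 20001 rows.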

Deterministic sort order for window functions

Is my sort key being used?

Question: I have a table with a column updated_at, which is a sort key. After running both VACUUM and ANALYZE on the table, this is the query plan I get when filtering on updated_at:

EXPLAIN SELECT * FROM my_table WHERE updated_at > '2018-01-01';

QUERY PLAN
XN Seq Scan on my_table (cost=0.00..0.00 rows=1 width=723)
  Filter: (updated_at > '2018-01-01 00:00:00'::timestamp without time zone)

My understanding is that the query execution engine is doing a sequential scan of the table despite the sort key

How to verify TCP keep alive on Linux

Question: I want to set TCP keep-alive on my Linux machine. What I am doing is running this script:

if [ `/sbin/sysctl -n net.ipv4.tcp_keepalive_time` != 200 ] ; then /sbin/sysctl -w net.ipv4.tcp_keepalive_time=200;

But I still have issues with connections to Amazon Redshift. Can someone please show me how I can check whether TCP keep-alive is actually set?

Answer 1: To check if keep alive is active open a connection, don't exchange any data and verify with tcpdump or similar that packets gets
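Separately from the tcpdump check, it can help to look at the socket itself: the `net.ipv4.tcp_keepalive_time` sysctl only affects sockets that have `SO_KEEPALIVE` enabled, which applications must request per socket. The Python sketch below reads the keep-alive options of a socket before and after enabling the flag. The helper name `keepalive_settings` is hypothetical, and the `TCP_KEEP*` options are read only where the platform exposes them.

```python
import socket


def keepalive_settings(sock):
    """Return the keep-alive related options currently set on sock."""
    settings = {
        "SO_KEEPALIVE": sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
    }
    # Per-socket timers; these constants are not defined on every platform.
    for name in ("TCP_KEEPIDLE", "TCP_KEEPINTVL", "TCP_KEEPCNT"):
        option = getattr(socket, name, None)
        if option is not None:
            settings[name] = sock.getsockopt(socket.IPPROTO_TCP, option)
    return settings


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    before = keepalive_settings(sock)
    # Keep-alive probes are sent only once this flag is enabled.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    after = keepalive_settings(sock)
```

A freshly created socket reports `SO_KEEPALIVE` as 0, so a client that never sets the flag will not send probes regardless of the sysctl value.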

Error in Dataframe writing from R to Redshift

Question: I have a dataframe in R with several different data types. While writing the dataframe from R to a Redshift server, I get an error only with character and timestamp values. I am adding an R code snippet below to give you more idea about the issue.

library(lubridate)
library(dplyr)
dat <- data.frame(id = letters[1:2], x = 2:3, date = now())
dat
str(dat)
drv <- dbDriver("PostgreSQL")
conn <- dbConnect(drv, host="redshift.amazonaws.com", port="5439", dbname="abcd", user="xyz", password="abc")
DBI