amazon-redshift | 易学教程

Using Redshift Spectrum to read the data in external table in AWS Redshift

Question: I ran the following in an AWS Redshift cluster to read a Parquet file from S3: create external schema s3_external_schema from data catalog database 'dev' iam_role 'arn:aws:iam::<MyuniqueId>:role/<MyUniqueRole>' create external database if not exists; then CREATE external table s3_external_schema.SUPPLIER_PARQ_1 ( S_SuppKey BIGINT , S_Name varchar(64) , S_Address varchar(64) , S_NationKey int , S_Phone varchar(18) , S_AcctBal decimal(13, 2) , S_Comment varchar(105)) partitioned by (Supplier bigint,

Rewrite code with UNPIVOT from sql to redshift

Question: I have a SQL Server script that needs to be converted to Redshift. I have converted part of it; here is the part I am having trouble with: SELECT a.*, b.* FROM ( SELECT u.ContactId, u.Description, CONVERT(Float,u.SeatCharge) AS SeatCharge --CAST(u.SeatCharge AS numeric (18,4)) AS SeatCharge, --CONVERT(INT, CASE WHEN IsNumeric(CONVERT(VARCHAR(12), u.SeatCharge)) = 1 then CONVERT(VARCHAR(12), u.SeatCharge) else 0 End) FROM( SELECT md.contactid, CONVERT(float, MAX(CASE WHEN md.fieldid= 7172

redshift convert_timezone does not work

Question: When running Redshift queries using Razor SQL, UTC dates appear to be treated as being in the local timezone, complete with daylight saving time. For example, running SELECT 'first',CONVERT_TIMEZONE('UTC', 'America/New_York', '2016-03-27 06:00:00') UNION SELECT 'second', CONVERT_TIMEZONE('UTC', 'America/New_York', '2016-03-27 07:00:00') returns the same time for each: 2016-03-27 03:00. New York actually changed to daylight saving time on 13 March, and this does work: SELECT 'first'

WHERE EXISTS vs IN in Amazon Redshift

Question: I ran EXPLAIN on two versions of the same query in Amazon Redshift: SELECT t1.column FROM table1 t1 WHERE t1.column IN (SELECT t2.column FROM table2 t2); SELECT t1.column FROM table1 t1 WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t1.column = t2.column ); They seem to have the same query plan. Does that mean there is no performance difference between IN and WHERE EXISTS, because Redshift somehow optimizes the SQL input before compiling the query? Source: https://stackoverflow.com/questions/50800120
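
To check that the two forms select the same rows, both can be run against a toy dataset. The sketch below uses Python's built-in sqlite3 module rather than Redshift, so it says nothing about Redshift's planner or performance. The column name col and the sample values are made up for illustration (the question's placeholder name, column, is a reserved word).

```python
import sqlite3

# In-memory SQLite database standing in for the two tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col INTEGER);
    CREATE TABLE table2 (col INTEGER);
    INSERT INTO table1 VALUES (1), (2), (2), (3);
    INSERT INTO table2 VALUES (2), (2), (3), (4);
""")

# Version 1: IN with an uncorrelated subquery.
in_rows = conn.execute(
    "SELECT t1.col FROM table1 t1 "
    "WHERE t1.col IN (SELECT t2.col FROM table2 t2) "
    "ORDER BY t1.col"
).fetchall()

# Version 2: EXISTS with a correlated subquery.
exists_rows = conn.execute(
    "SELECT t1.col FROM table1 t1 "
    "WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t1.col = t2.col) "
    "ORDER BY t1.col"
).fetchall()

print(in_rows == exists_rows)
```

Both forms act as a semi-join here: duplicates in table2 do not multiply rows from table1. The two forms are not interchangeable once negated (NOT IN versus NOT EXISTS) when the subquery can return NULL.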

Why Redshift automatically trims varchar column when joining?

Question: I encountered a unique problem when using Redshift. Please see the illustrative example below: drop table if exists joinTrim_temp1; create table joinTrim_temp1(rowIndex1 int, charToJoin1 varchar(20)); insert into joinTrim_temp1 values(1, 'Sudan' ); insert into joinTrim_temp1 values(2, 'Africa' ); insert into joinTrim_temp1 values(3, 'USA' ); drop table if exists joinTrim_temp2; create table joinTrim_temp2(rowIndex2 int, charToJoin2 varchar(20)); insert into joinTrim_temp2 values(1, 'Sudan ' );

How to generate 12 digit unique number in redshift?

Question: I have 3 columns in a table: email_id, rid, and final_id. Rules for rid and final_id: if the email_id has a corresponding rid, use rid as the final_id. If the email_id does not have a corresponding rid (i.e. rid is null), generate a unique 12-digit number and insert it into the final_id field. How can I generate a 12-digit unique number in Redshift? Answer 1: From Creating a UUID function in Redshift: By default there is no UUID function in AWS Redshift. However with the Python User-Defined Function you
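
The rule stated in the question (keep rid when present, otherwise generate a fresh 12-digit number) can be sketched in plain Python. This is an illustration of the logic only, not the UDF from the quoted answer: the function names and sample rows are hypothetical, rid is assumed to be an integer, and the in-memory used set stands in for whatever uniqueness check the database would need.

```python
import secrets

def new_12_digit_id(used):
    """Return a random 12-digit integer not already in `used`, and record it."""
    while True:
        # Range 100000000000..999999999999, so the result always has 12 digits.
        candidate = 10**11 + secrets.randbelow(9 * 10**11)
        if candidate not in used:
            used.add(candidate)
            return candidate

def fill_final_ids(rows):
    """rows: list of (email_id, rid) pairs; returns (email_id, rid, final_id) triples."""
    # Seed the set with existing rids so generated values never collide with them.
    used = {rid for _, rid in rows if rid is not None}
    result = []
    for email_id, rid in rows:
        final_id = rid if rid is not None else new_12_digit_id(used)
        result.append((email_id, rid, final_id))
    return result

# Hypothetical sample data.
sample = [
    ("a@example.com", 123456789012),
    ("b@example.com", None),
    ("c@example.com", None),
]
filled = fill_final_ids(sample)
```

Random generation alone does not guarantee uniqueness; the explicit collision check is what does, and inside a database that check would have to be done differently.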

Assign a Sequence (session ID) to my table based on A value in field

Question: I am manually assigning a "Session ID" to my result set. I did this by ordering all events and, if the time difference between the current and next event is greater than 2 minutes, setting the "session" field to "New Session". My result set now looks like this.
Table name: tbl_sessions
╔════════════╦═════╦══════════════╗
║ date       ║ ID  ║ session      ║
╠════════════╬═════╬══════════════╣
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01/2018 ║ 100 ║ Same Session ║
║ 01/01
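
One way to turn the "Same Session" / "New Session" markers into a numeric sequence is a running count per ID that increases on every "New Session" row. The Python sketch below illustrates that idea under two assumptions: the rows are already in event order, and a "New Session" marker means the new session starts on that row. The sample rows and the function name are hypothetical. In SQL the same idea is commonly written as a running SUM over a CASE flag in a window partitioned by ID.

```python
def assign_session_ids(rows):
    """rows: (date, event_id, marker) tuples in event order.

    Returns the same tuples with a per-ID session number appended.
    """
    counters = {}
    numbered = []
    for date, event_id, marker in rows:
        if event_id not in counters:
            # First event seen for this ID always opens session 1.
            counters[event_id] = 1
        elif marker == "New Session":
            counters[event_id] += 1
        numbered.append((date, event_id, marker, counters[event_id]))
    return numbered

# Hypothetical sample data in the shape of tbl_sessions.
events = [
    ("01/01/2018", 100, "Same Session"),
    ("01/01/2018", 100, "Same Session"),
    ("01/01/2018", 100, "New Session"),
    ("01/01/2018", 100, "Same Session"),
    ("01/01/2018", 200, "Same Session"),
    ("01/01/2018", 200, "New Session"),
]
session_ids = [row[3] for row in assign_session_ids(events)]
```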

I need to know the list of tables on which there are locks currently

Question: select A.table_id, last_update, last_commit, lock_owner_pid, lock_status, B.table from stv_locks as A left outer join svv_table_info as B on A.table_id = B.table_id order by last_update asc I used the query above, but I am getting null as the table name for some table_ids. What are these? Answer 1: select distinct(id) table_id ,trim(datname) db_name ,trim(nspname) schema_name ,trim(relname) table_name from stv_locks join stv_tbl_perm on stv_locks.table_id = stv_tbl_perm.id join pg_class on pg_class.oid =

How can I find any non ASCII characters in Redshift database

Question: I have a database table from which I'd like to return all values where a column contains a non-ASCII character anywhere in the string. Is there an easy way to do this? I've tried this: select col_name, regexp_instr(col_name,'[^[:ascii:]]') from test_table s where created > sysdate - 1 and regexp_instr(col_name,'[^[:ascii:]]') > 0 limit 5; but I get this error: error: Invalid character class name, collating name, or character range. The error occured while parsing the regular expression: '[^[:>>>HERE>>>ascii:
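
The error points at the [:ascii:] class name itself. A common alternative is to spell out the ASCII range explicitly. The Python sketch below shows that pattern and mimics the 1-based, 0-if-absent result of regexp_instr. It runs in Python's re module only; whether Redshift accepts the same escape syntax is an assumption that would need checking. The function name and sample strings are hypothetical.

```python
import re

# Any character outside the 7-bit ASCII range U+0000..U+007F.
NON_ASCII = re.compile(r"[^\x00-\x7F]")

def first_non_ascii_position(text):
    """Return the 1-based position of the first non-ASCII character, or 0 if none."""
    match = NON_ASCII.search(text)
    return match.start() + 1 if match else 0

# Hypothetical sample values; \u00e9 is e-acute, \u00ef is i-diaeresis.
values = ["plain text", "caf\u00e9", "na\u00efve", "tab\there"]
flagged = [v for v in values if first_non_ascii_position(v) > 0]
```

Control characters such as the tab count as ASCII here, so "tab\there" is not flagged.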

Append results from a query to the same result row in PostgreSQL - Redshift

Question: I have a table with 3 columns, A, B, and C, where A is not the primary key. We need to select the B, C pairs for each distinct A (group by A) and append the results at the end of the final result set. Is this possible in SQL? A | B | C a1| b1| c1 a1| b2| c2 a1| b3| c3 a2| b1| c2 a2| b2| c5 I need to get a1 | (c1,b1) ; (c2,b2);(c3;b3) a2 | (c2,b1) ; (c5,b2) as the rows appended at the end. I normally do this via sqlalchemy, and then end up transforming the data in Python, is there a way in
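
For reference, the Python-side transformation the question describes can be sketched as a simple group-and-join over the sample rows. The function name is hypothetical and the " ; " separator is a guess at the intended output format. Redshift also documents a LISTAGG aggregate, which may allow the same concatenation to be done in SQL.

```python
def collect_pairs(rows):
    """Group (A, B, C) rows by A and join each group's (C,B) pairs into one string."""
    grouped = {}
    for a, b, c in rows:
        # dicts keep insertion order, so groups appear in first-seen order.
        grouped.setdefault(a, []).append(f"({c},{b})")
    return {a: " ; ".join(pairs) for a, pairs in grouped.items()}

# Sample rows taken from the question.
rows = [
    ("a1", "b1", "c1"),
    ("a1", "b2", "c2"),
    ("a1", "b3", "c3"),
    ("a2", "b1", "c2"),
    ("a2", "b2", "c5"),
]
combined = collect_pairs(rows)
```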