window-functions

How do I create named window partitions (aliases) in PostgreSQL?

懵懂的女人 submitted on 2019-12-10 09:26:37
Question: The documentation for PostgreSQL window functions seems to imply that you can use the same named window in multiple places in a query. However, I can't figure out how to create a named window.

    SELECT first_value(vin) OVER( PARTITION BY vin ) AS w,
           first_value(make) OVER w
    FROM inventory.vehicles
    WHERE lot_Id = 9999 AND make is not null;

This is a joke query I'm using to get the syntax accepted, but I'm getting this error:

    ERROR: window "w" does not exist

Answer 1: The answer was actually in the SELECT
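The named-window syntax the question asks about can be sketched as follows. This is a minimal illustration, not the truncated answer's own code: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (both accept a WINDOW clause, assuming SQLite 3.28 or later), and the table contents are hypothetical sample data.

```python
import sqlite3

# Hypothetical stand-in for inventory.vehicles; SQLite is used here in place
# of PostgreSQL purely so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicles (lot_id INTEGER, vin TEXT, make TEXT)")
conn.executemany(
    "INSERT INTO vehicles VALUES (?, ?, ?)",
    [(9999, "VIN1", "Ford"), (9999, "VIN2", "Audi"), (9999, "VIN3", None)],
)

# The window is declared once in a WINDOW clause (after WHERE, before ORDER BY)
# and then referenced by name from several OVER clauses.
query = """
    SELECT first_value(vin)  OVER w AS first_vin,
           first_value(make) OVER w AS first_make
    FROM vehicles
    WHERE lot_id = 9999 AND make IS NOT NULL
    WINDOW w AS (PARTITION BY vin)
    ORDER BY first_vin
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The point of the sketch is that `AS w` inside the select list only names an output column; a window name has to come from the WINDOW clause.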

PostgreSQL - column value changed - select query optimization

老子叫甜甜 submitted on 2019-12-10 03:56:22
Question: Say we have a table:

    CREATE TABLE p (
        id serial NOT NULL,
        val boolean NOT NULL,
        PRIMARY KEY (id)
    );

Populated with some rows:

    insert into p (val) values (true),(false),(false),(true),(true),(true),(false);

    ID VAL
    1  1
    2  0
    3  0
    4  1
    5  1
    6  1
    7  0

I want to determine when the value changes. So the result of my query should be:

    ID VAL
    2  0
    4  1
    7  0

I have a solution with joins and subqueries:

    select min(id) id, val from ( select p1.id, p1.val, max(p2.id) last_prev from p p1 join p p2 on p2.id
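The expected result above can be reproduced by comparing each row with the one before it, which is what a lag()-style window function does in SQL. The following is a minimal Python sketch of that idea, not the query from the question; the function name `changes` is a hypothetical helper.

```python
# Rows from the question as (id, val) pairs, already ordered by id.
rows = [(1, True), (2, False), (3, False), (4, True),
        (5, True), (6, True), (7, False)]


def changes(ordered_rows):
    """Return the rows whose val differs from the previous row's val."""
    result = []
    previous = None  # the first row has no predecessor, like lag() yielding NULL
    for row_id, val in ordered_rows:
        if previous is not None and val != previous:
            result.append((row_id, int(val)))
        previous = val
    return result


changed = changes(rows)
print(changed)  # [(2, 0), (4, 1), (7, 0)]
```

The first row is never reported because it has nothing to be compared against, which matches the expected output in the question.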

Select a row of first non-null values in a sparse table

混江龙づ霸主 submitted on 2019-12-10 01:36:56
Question: Using the following table:

    A | B    | C    | ts
    --+------+------+------------------
    1 | null | null | 2016-06-15 10:00
    4 | null | null | 2016-06-15 11:00
    4 | 9    | null | 2016-06-15 12:00
    5 | 1    | 7    | 2016-06-15 13:00

How do I select the first non-null value of each column in a running window of N rows? "First" is defined by the order of the timestamps in column ts. Querying the above table would result in:

    A | B | C
    --+---+---
    1 | 9 | 7

Answer 1: The window function first_value() allows for a rather short

ROW_Count() To Start Over Based On Order

ε祈祈猫儿з submitted on 2019-12-09 15:32:21
Question:

    Create Table #Test ( ID Int Primary Key Identity, Category VarChar(100) )

    Insert into #Test (Category) Values
        ('Banana'), ('Banana'), ('Banana'), ('Banana'), ('Banana'), ('Banana'),
        ('Strawberry'), ('Strawberry'), ('Strawberry'),
        ('Banana'), ('Banana')

    Select *
        ,ROW_NUMBER() Over (Partition by Category order by ID) as RowNum
    From #Test
    Order by ID

So this script returns this:

    ID Category   RowNum
    1  Banana     1
    2  Banana     2
    3  Banana     3
    4  Banana     4
    5  Banana     5
    6  Banana     6
    7  Strawberry 1
    8  Strawberry 2
    9
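Going by the title, the goal appears to be a row number that restarts whenever the category changes in ID order, rather than one counter per category. The excerpt is cut off before the desired output, so that reading is an assumption. Under it, the numbering can be sketched in Python by grouping consecutive runs; `restart_row_numbers` is a hypothetical helper name.

```python
from itertools import groupby

# Categories in ID order, as inserted in the question (IDs 1..11).
categories = ["Banana"] * 6 + ["Strawberry"] * 3 + ["Banana"] * 2


def restart_row_numbers(values):
    """Number each value within its run of consecutive equal values."""
    numbered = []
    row_id = 0
    # groupby() without sorting yields one group per consecutive run,
    # so the second block of bananas forms its own group.
    for _, run in groupby(values):
        for position, value in enumerate(run, start=1):
            row_id += 1
            numbered.append((row_id, value, position))
    return numbered


result = restart_row_numbers(categories)
for row in result:
    print(row)
```

With this data, rows 10 and 11 (the second block of bananas) are numbered 1 and 2 instead of continuing at 7 and 8.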

Pyspark : Custom window function

会有一股神秘感。 submitted on 2019-12-09 07:08:44
Question: I am currently trying to extract series of consecutive occurrences in a PySpark dataframe and order/rank them as shown below (for convenience I have ordered the initial dataframe by user_id and timestamp):

    df_ini
    +-------+--------------------+------------+
    |user_id|     timestamp      |  actions   |
    +-------+--------------------+------------+
    | 217498|           100000001|    'A'     |
    | 217498|           100000025|    'A'     |
    | 217498|           100000124|    'A'     |
    | 217498|           100000152|    'B'     |
    | 217498|           100000165|    'C'     |
    | 217498|           100000177|    'C'     |
    |

How to use window functions in PySpark?

我是研究僧i submitted on 2019-12-09 05:10:56
Question: I'm trying to use some window functions (ntile and percentRank) on a data frame, but I don't know how to use them. Can anyone help me with this, please? The Python API documentation has no examples of them. Specifically, I'm trying to get quantiles of a numeric field in my data frame. I'm using Spark 1.4.0.

Answer 1: To be able to use a window function, you have to create a window first. The definition is pretty much the same as in normal SQL, which means you can define either order, partition

SQL Server 2014 Merging Overlapping Date Ranges

好久不见. submitted on 2019-12-09 03:26:28
Question: I have a table with 200.000 rows in a SQL Server 2014 database looking like this:

    CREATE TABLE DateRanges (
        Contract VARCHAR(8),
        Sector VARCHAR(8),
        StartDate DATE,
        EndDate DATE
    );

    INSERT INTO DateRanges (Contract, Sector, StartDate, Enddate)
    SELECT '111', '999', '01-01-2014', '03-31-2014' union
    SELECT '111', '999', '04-01-2014', '06-30-2014' union
    SELECT '111', '999', '07-01-2014', '09-30-2014' union
    SELECT '111', '999', '10-01-2014', '12-31-2014' union
    SELECT '111', '888', '08-01-2014', '08

First and last value of window function in one row in PostgreSQL

自闭症网瘾萝莉.ら submitted on 2019-12-09 03:19:07
Question: I'd like to have the first value of one column and the last value of a second column in one row for a specified partition. For that I created this query:

    SELECT DISTINCT b.machine_id, batch, timestamp_sta, timestamp_stp,
           FIRST_VALUE(timestamp_sta) OVER w AS batch_start,
           LAST_VALUE(timestamp_stp) OVER w AS batch_end
    FROM db_data.sta_stp AS a
    JOIN db_data.ll_lu AS b ON a.ll_lu_id=b.id
    WINDOW w AS (PARTITION BY batch, machine_id ORDER BY timestamp_sta)
    ORDER BY timestamp_sta, batch, machine_id;

But as you

“Cumulative difference” function in R

给你一囗甜甜゛ submitted on 2019-12-08 14:42:25
Is there a pre-existing function to calculate the cumulative difference between consecutive values? Context: this is to estimate the change in altitude that a person has to undergo in both directions on a journey generated by CycleStreet.net. Reproducible example:

    x <- c(27, 24, 24, 27, 28) # create the data

Method 1: for loop

    for(i in 2:length(x)){ # for loop way
      if(i == 2) cum_change <- 0
      cum_change <- Mod(x[i] - x[i - 1]) + cum_change
      cum_change
    }
    ## 7

Method 2: vectorised

    diffs <- Mod(x[-1] - x[-length(x)]) # vectorised way
    sum(diffs)
    ## 7

Both seem to work. I'm just wondering if there's
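For comparison, the same calculation can be sketched in Python by pairing each value with its successor and summing the absolute differences. This mirrors the vectorised R method above rather than answering whether R has a built-in for it; `cumulative_change` is a hypothetical helper name.

```python
# Altitude samples from the question.
x = [27, 24, 24, 27, 28]


def cumulative_change(values):
    """Sum of absolute differences between consecutive values."""
    # zip(values, values[1:]) pairs each element with the next one,
    # the same pairing as x[-length(x)] and x[-1] in the R version.
    return sum(abs(nxt - cur) for cur, nxt in zip(values, values[1:]))


total = cumulative_change(x)
print(total)  # 7
```

The steps are 3 down, 0, 3 up and 1 up, which gives the same total of 7 as both R methods.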

MySQL datetime comparison with previous row

雨燕双飞 submitted on 2019-12-08 11:56:21
Question: I have a table with two date columns. DATE1 is sometimes NULL and sometimes contains duplicate values. DATE2 is always populated and unique. My table is sorted by latest DATE2 date. I'd like to create a new date column where DATE1 will be selected unless its value is duplicated from the next row or it's NULL. In that case, I want to take the value of DATE2. I also need two boolean columns that tell me when either of those conditions was met. Let me demonstrate using an example so it's