window-functions | 易学教程

Spark-SQL Window functions on Dataframe - Finding first timestamp in a group

Question: I have the DataFrame below (say UserData).

uid  region  timestamp
a    1       1
a    1       2
a    1       3
a    1       4
a    2       5
a    2       6
a    2       7
a    3       8
a    4       9
a    4       10
a    4       11
a    4       12
a    1       13
a    1       14
a    3       15
a    3       16
a    5       17
a    5       18
a    5       19
a    5       20

This data is nothing but a user (uid) travelling across different regions (region) at different times (timestamp). For now, timestamp is shown as an 'int' for simplicity. Note that the above DataFrame will not necessarily be in increasing order of timestamp. Also, there may be some rows in between from different

How to enumerate groups of partitions in my Postgres table with window functions?

Question: Suppose I have a table like this:

 id | part  | value
----+-------+-------
  1 |     0 |     8
  2 |     0 |     3
  3 |     0 |     4
  4 |     1 |     6
  5 |     0 |    13
  6 |     0 |     4
  7 |     1 |     2
  8 |     0 |    11
  9 |     0 |    15
 10 |     0 |     3
 11 |     0 |     2

I would like to enumerate the groups between rows that have the part attribute set to 1. So I would like to get this:

 id | part  | value | number
----+-------+-------+--------
  1 |     0 |     8 |      1
  2 |     0 |     3 |      1
  3 |     0 |     4 |      1
  4 |     1 |     6 |      0
  5 |     0 |    13 |      2
  6 |     0 |     4 |      2
  7 |     1 |     2 |      0
  8 |     0 |    11 |      3
  9 |     0 |    15 |      3
 10 |     0 |     3 |      3
 11 |
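
A minimal sketch of one way to produce the number column, assuming part only ever holds 0 or 1. A running SUM(part) counts the separator rows seen so far, so adding one to it numbers each block of part = 0 rows. The table name t is hypothetical, and an in-memory SQLite database (3.25 or later) stands in for Postgres only so that the example is runnable; the window-function syntax itself is standard SQL.

```python
import sqlite3

# In-memory SQLite is a stand-in for Postgres (an assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, part INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 0, 8), (2, 0, 3), (3, 0, 4), (4, 1, 6), (5, 0, 13), (6, 0, 4),
     (7, 1, 2), (8, 0, 11), (9, 0, 15), (10, 0, 3), (11, 0, 2)],
)

# SUM(part) OVER (ORDER BY id) is a running count of the part = 1 rows.
# Rows with part = 1 are separators and get 0; every other row gets
# "separators seen so far + 1" as its group number.
query = """
SELECT id, part, value,
       CASE WHEN part = 1 THEN 0
            ELSE 1 + SUM(part) OVER (ORDER BY id)
       END AS number
FROM t
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For the first ten rows this reproduces the number values listed in the question.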

Decode maximum number in rows for sql

Question: I am using #standardSQL in BigQuery and trying to code the maximum ranking of each customer_id as 1 and the rest as 0. This is the query result so far. The query for the ranking is this:

ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY booking_date ASC) AS ranking

What I need is to create another column that codes the maximum ranking of each customer_id as 1 and the rankings below it as 0, just like the table below. Thanks.

Answer 1: Based on your sample data, your ranking is

How to group following rows by not unique value

Question: I have data like this:

table1
_____________
id  way  time
1   1    00:01
2   1    00:02
3   2    00:03
4   2    00:04
5   2    00:05
6   3    00:06
7   3    00:07
8   1    00:08
9   1    00:09

I would like to know during which time interval I was on which way:

desired output
_________________
id  way  from   to
1   1    00:01  00:02
3   2    00:03  00:05
6   3    00:06  00:07
8   1    00:08  00:09

I tried to use a window function:

SELECT DISTINCT
    first_value(id) OVER w AS id,
    first_value(way) OVER w as way,
    first_value(time) OVER w as from,
    last_value(time) OVER w as to
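
A hedged sketch of a different route from the truncated query above: the "gaps and islands" technique, where the difference between an overall ROW_NUMBER() and a per-way ROW_NUMBER() stays constant within each run of consecutive rows that share the same way. The column alias grp is hypothetical, and in-memory SQLite (3.25 or later) is assumed as a runnable stand-in for the asker's database.

```python
import sqlite3

# In-memory SQLite is a stand-in for the real database (demo assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, way INTEGER, time TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [(1, 1, "00:01"), (2, 1, "00:02"), (3, 2, "00:03"), (4, 2, "00:04"),
     (5, 2, "00:05"), (6, 3, "00:06"), (7, 3, "00:07"), (8, 1, "00:08"),
     (9, 1, "00:09")],
)

# Within one unbroken run of the same way, both row numbers advance in
# lockstep, so their difference (grp) is constant. When a way reappears
# later, the difference changes, which separates the two runs.
query = """
WITH marked AS (
    SELECT id, way, time,
           ROW_NUMBER() OVER (ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY way ORDER BY id) AS grp
    FROM table1
)
SELECT MIN(id) AS id, way, MIN(time) AS "from", MAX(time) AS "to"
FROM marked
GROUP BY way, grp
ORDER BY 1  -- order by the first output column (the starting id of each run)
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this yields the four intervals from the desired output. Grouping by both way and grp matters, because two different ways can happen to share the same grp value.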

Sum across partitions with window functions

Question: I have the following problem...

Time | A  | B  | C  -- Sum should be
1    | a1 | b1 | c1 -- a1 + b1 + c1
2    | a2 | b2 | x  -- a2 + b1 + c1
3    | a3 | x  | x  -- a3 + b2 + c1
4    | x  | b3 | c2 -- a3 + b3 + c2

Essentially, the sum needs to be across the most recent value in time for each of the three rows. Each data column doesn't necessarily have a value for the current time. I have tried several approaches using window functions and have been unsuccessful. I have written a stored procedure that does what I need, but it is SLOW.

CREATE OR

select case with “over partition by”

Question: What's the correct syntax, or is it even possible, to use CASE in a SELECT with PARTITION BY inside it? (Using SQL Server 2012.)

a = unique id
b = a string 'xf%'
c = values
d = values
e = values

select case when b like 'xf%' then (sum(c*e)/100*3423 over (partition by a)) end as sumProduct from #myTable

This is something I need to solve as part of a problem I had previously: sumProduct in sql.

Edit: upon request, adding some sample data and the expected result:

create table #testing (b varchar (20), a date,

Sum of time difference between rows

Question: I have a table which records every status change of an entity:

id   recordTime           Status
ID1  2014-03-01 11:33:00  Disconnected
ID1  2014-03-01 12:13:00  Connected
ID2  2014-03-01 12:21:00  Connected
ID1  2014-03-01 12:24:00  Disconnected
ID1  2014-03-01 12:29:00  Connected
ID2  2014-03-01 12:40:00  Disconnected
ID2  2014-03-01 13:03:00  Connected
ID2  2014-03-01 13:13:00  Disconnected
ID2  2014-03-01 13:29:00  Connected
ID1  2014-03-01 13:30:00  Disconnected

I need to calculate the total inactive time i.e time

how to rank over partition in MySql

Question: I'm new to MySQL. I'm facing a problem that I could solve in SQL Server, but I can't do it in MySQL. Here is my case:

MyTable:
Name  Price
abs   100
abs   200
abs   60
trx   19
trx   20
abs   10
qwe   25
qwe   50
qwe   10
qwe   10

Expected result:
Name  Price  Rank
abs   200    4
abs   100    3
abs   60     2
abs   10     1
qwe   50     4
qwe   25     3
qwe   10     2
qwe   10     1
trx   20     2
trx   19     1

Could anyone help me write a query in MySQL that produces a result like the one shown?

Answer 1: Using a variable, you can find the rank, like this:

SELECT Name,
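
As an alternative to the variable-based approach in the answer excerpt, here is a minimal sketch using ROW_NUMBER(), which MySQL supports from version 8.0 onward. The expected result gives the two tied qwe rows different ranks (2 and 1), which is why ROW_NUMBER() is used rather than RANK(). The alias rnk is hypothetical (RANK is a reserved word in MySQL 8.0), and in-memory SQLite (3.25 or later) is assumed as a stand-in so the example runs without a MySQL server.

```python
import sqlite3

# In-memory SQLite is a stand-in for MySQL 8.0+ (demo assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Name TEXT, Price INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("abs", 100), ("abs", 200), ("abs", 60), ("trx", 19), ("trx", 20),
     ("abs", 10), ("qwe", 25), ("qwe", 50), ("qwe", 10), ("qwe", 10)],
)

# Number the rows of each Name from the cheapest price upward, then list
# them from the highest rank down, as in the expected result.
query = """
SELECT Name, Price,
       ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Price) AS rnk
FROM MyTable
ORDER BY Name, rnk DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Which of the two tied qwe rows receives rank 1 is arbitrary here; since they are identical, the output is the same either way.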

Delete all rows but one with the greatest value per group

Question: I recently asked a question, "Update using a subquery with aggregates and groupby in Postgres", and it turns out I was going about my issue with flawed logic. In the same scenario as in that question, instead of updating all the rows to have the max quantity, I'd like to delete the rows that don't have the max quantity (and any duplicate max quantities). Essentially, I just need to convert the below to a DELETE statement that preserves only the largest quantities per item_name. I'm

How I can get Second max salary using “over(partition by)” in oracle SQL?

Question: I can already get it with this query:

SELECT * FROM (
    SELECT emp_id, salary, row_number() over(order by salary desc) AS rk
    FROM test_qaium
) where rk=2;

But one of my friends asked me to find the second MAX salary from the employees table, and it must use "over(partition by )" in Oracle SQL. Could anybody please help me, and explain the concept of "Partition by" in Oracle SQL?

Answer 1: Oracle Setup:

CREATE TABLE test_qaium ( emp_id, salary, department_id ) AS
SELECT 1, 10000, 1 FROM DUAL UNION ALL
SELECT 2, 20000,
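
To illustrate what PARTITION BY does, here is a rough sketch that is not taken from the truncated answer above: the ranking restarts inside each department_id, so filtering on rank 2 gives the second-highest salary per department. The sample rows are invented for illustration, DENSE_RANK() is one possible choice (it treats tied top salaries as a single rank), and in-memory SQLite (3.25 or later) is assumed as a stand-in for Oracle.

```python
import sqlite3

# In-memory SQLite is a stand-in for Oracle (demo assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_qaium (emp_id INTEGER, salary INTEGER, department_id INTEGER)"
)
# Hypothetical rows, not the data from the original answer.
conn.executemany(
    "INSERT INTO test_qaium VALUES (?, ?, ?)",
    [(101, 10000, 1), (102, 20000, 1), (103, 30000, 1),
     (104, 40000, 2), (105, 50000, 2), (106, 50000, 2)],
)

# PARTITION BY department_id makes DENSE_RANK() start again at 1 for each
# department; ORDER BY salary DESC puts the highest salary at rank 1.
query = """
SELECT emp_id, salary, department_id
FROM (
    SELECT emp_id, salary, department_id,
           DENSE_RANK() OVER (PARTITION BY department_id
                              ORDER BY salary DESC) AS rk
    FROM test_qaium
) AS ranked
WHERE rk = 2
ORDER BY department_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

In department 2 the two employees on 50000 share rank 1, so the row with 40000 is reported as the second-highest salary. With ROW_NUMBER() instead, one of the tied 50000 rows would come back.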