window-functions

Cumulative sum with a dynamic base in Postgres

為幸葍努か submitted on 2020-08-05 04:37:25
Question: I have the following scenario in Postgres (I'm using 9.4.1). I have a table of this format:

    create table test(
        id serial,
        val numeric not null,
        created timestamp not null default(current_timestamp),
        fk integer not null
    );

What I then have is a numeric threshold field in another table which should be used to label each row of test: every row whose value pushes the running count to >= threshold should be marked true, and at that point the running count should reset to 0 for the subsequent rows, e.g. Data
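
A window aggregate alone cannot reset a running SUM mid-stream, so the usual Postgres approach is a recursive CTE that carries the running total forward and zeroes it whenever the threshold is hit. A minimal sketch, with the threshold hard-coded as 100 where the real query would join the other table:

    WITH RECURSIVE ordered AS (
        -- number the rows so the recursion can walk them in insertion order
        SELECT id, val, row_number() OVER (ORDER BY created, id) AS rn
        FROM test
    ), running AS (
        SELECT rn, val,
               val >= 100 AS marked,
               CASE WHEN val >= 100 THEN 0 ELSE val END AS total
        FROM ordered
        WHERE rn = 1
        UNION ALL
        -- carry the total forward; zero it on the row that reaches the threshold
        SELECT o.rn, o.val,
               r.total + o.val >= 100,
               CASE WHEN r.total + o.val >= 100 THEN 0 ELSE r.total + o.val END
        FROM running r
        JOIN ordered o ON o.rn = r.rn + 1
    )
    SELECT rn, val, marked, total FROM running ORDER BY rn;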

MySQL: show the sum of the difference of two values

倾然丶 夕夏残阳落幕 submitted on 2020-07-16 05:48:12
Question: Below is my query:

    SELECT n.`name`, n.`customer_id`, m.`msn`, m.kwh,
           m.kwh - LAG(m.kwh) OVER (PARTITION BY n.`customer_id`
                                    ORDER BY m.`data_date_time`) AS kwh_diff
    FROM mdc_node n
    INNER JOIN `mdc_meters_data` m ON n.`customer_id` = m.`cust_id`
    WHERE n.`lft` = 5
      AND n.`icon` NOT IN ('folder')
      AND m.`data_date_time` BETWEEN NOW() - INTERVAL 30 DAY AND NOW()

which gives me the result below. I want to sum up the kwh_diff and show only a one-row record, not multiple, like below: name customer_id msn sum_kwh
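
A window function's result cannot be fed straight into SUM() in the same SELECT, so the usual fix is to wrap the query in a derived table and aggregate over it. A minimal sketch reusing the query above unchanged (MySQL 8+):

    SELECT name, customer_id, msn, SUM(kwh_diff) AS sum_kwh
    FROM (
        -- the original LAG query goes here unchanged
        SELECT n.`name`, n.`customer_id`, m.`msn`,
               m.kwh - LAG(m.kwh) OVER (PARTITION BY n.`customer_id`
                                        ORDER BY m.`data_date_time`) AS kwh_diff
        FROM mdc_node n
        INNER JOIN `mdc_meters_data` m ON n.`customer_id` = m.`cust_id`
        WHERE n.`lft` = 5
          AND n.`icon` NOT IN ('folder')
          AND m.`data_date_time` BETWEEN NOW() - INTERVAL 30 DAY AND NOW()
    ) AS t
    GROUP BY name, customer_id, msn;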

Get the last element of a window in Spark 2.1.1

点点圈 submitted on 2020-07-05 04:44:06
Question: I have a dataframe with subcategories, and I want the last element of each subcategory.

    val windowSpec = Window.partitionBy("name").orderBy("count")
    sqlContext
      .createDataFrame(
        Seq[(String, Int)](
          ("A", 1), ("A", 2), ("A", 3),
          ("B", 10), ("B", 20), ("B", 30)
        ))
      .toDF("name", "count")
      .withColumn("firstCountOfName", first("count").over(windowSpec))
      .withColumn("lastCountOfName", last("count").over(windowSpec))
      .show()

returns me something strange: +----+-----+-------------
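
The odd lastCountOfName comes from the default window frame: when a window has an ORDER BY, the frame ends at the current row, so last() just returns the current row's value. Widening the frame fixes it. A minimal sketch in Spark SQL, assuming the dataframe has been registered with df.createOrReplaceTempView("counts"); the Scala equivalent is windowSpec.rowsBetween(Window.unboundedPreceding, Window.unboundedFollowing):

    SELECT name, `count`,
           last(`count`) OVER (
               PARTITION BY name ORDER BY `count`
               -- override the default frame (UNBOUNDED PRECEDING .. CURRENT ROW)
               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
           ) AS lastCountOfName
    FROM counts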

Return duration of an item from its transactions, many to many, SQL

不问归期 submitted on 2020-07-04 00:19:01
Question: Hopefully I can get some help on this. Situation: there are two incoming stations and one outgoing station. Items are scanned in and out, and I need to know how long an item was in the station. Let's consider 'in station' to be the time between its incoming date scan and its outgoing date scan. Problem: an item can be (accidentally) scanned multiple times into either station (for this I was thinking of identifying whether a scan was made on the same day (not looking at hours) and then returning the earliest
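
One common shape for this: collapse the accidental duplicates by keeping the earliest scan per item, direction, and calendar day, then pair each incoming scan with the first outgoing scan after it. A Postgres-flavored sketch, with a hypothetical scans(item_id, direction, scan_time) table and 'in'/'out' values standing in for the real stations and column names, which the excerpt does not give:

    WITH deduped AS (
        -- keep only the earliest scan per item, direction, and calendar day
        SELECT item_id, direction, MIN(scan_time) AS scan_time
        FROM scans
        GROUP BY item_id, direction, CAST(scan_time AS DATE)
    )
    SELECT i.item_id,
           i.scan_time                    AS time_in,
           MIN(o.scan_time)               AS time_out,
           MIN(o.scan_time) - i.scan_time AS duration
    FROM deduped i
    LEFT JOIN deduped o
           ON o.item_id = i.item_id
          AND o.direction = 'out'
          AND o.scan_time > i.scan_time   -- first outgoing scan after this incoming one
    WHERE i.direction = 'in'
    GROUP BY i.item_id, i.scan_time;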

Unexpected data in a typical recursive query

醉酒当歌 submitted on 2020-06-29 03:53:08
Question: It's hard for me to describe this in words, so here's the sample:

    select * into t from (values
        (10, 'A'), (25, 'B'), (30, 'C'), (45, 'D'), (52, 'E'),
        (61, 'F'), (61, 'G'), (61, 'H'), (79, 'I'), (82, 'J')
    ) v(userid, name)

Notice how F, G and H have the same userid. Now consider the following recursive query:

    with tn as (
        select t.userId, t.name,
               row_number() over (order by userid, newid()) as seqnum
        from t
    ), cte as (
        select userId, name, seqnum as seqnum
        from tn
        where seqnum = 1
        union all
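
The likely culprit: in SQL Server a CTE is just an inline view, so tn is re-evaluated each time the recursive member reads it, and the non-deterministic newid() tie-breaker can hand F, G and H different seqnum values on every pass. Materializing the row numbers once makes them stable. A minimal T-SQL sketch:

    -- materialize the row numbers once so newid() cannot reshuffle them
    select t.userid, t.name,
           row_number() over (order by userid, newid()) as seqnum
    into #tn
    from t;

    with cte as (
        select userid, name, seqnum from #tn where seqnum = 1
        union all
        select tn.userid, tn.name, tn.seqnum
        from cte
        join #tn tn on tn.seqnum = cte.seqnum + 1
    )
    select * from cte;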

Select only rows that have a column changed from the rows before them, given a unique ID

ぐ巨炮叔叔 submitted on 2020-06-27 16:57:50
Question: I have a PostgreSQL database where I want to record how a specific column changes for each id over time. Table1:

    personID | status | unixtime | column d | column e | column f
    ---------+--------+----------+----------+----------+---------
    1        | 2      | 213214   | x        | y        | z
    1        | 2      | 213325   | x        | y        | z
    1        | 2      | 213326   | x        | y        | z
    1        | 2      | 213327   | x        | y        | z
    1        | 2      | 213328   | x        | y        | z
    1        | 3      | 214330   | x        | y        | z
    1        | 3      | 214331   | x        | y        | z
    1        | 3      | 214332   | x        | y        | z
    1        | 2      | 324543   | x        | y        | z

I want to track all changes of status over time. So based on this I want a new table, table2, with the following data: personID | status | unixtime | column d | column e |
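
Comparing each row with its predecessor via lag() is the usual way to keep only the rows where status changes. A minimal Postgres sketch against Table1 as shown, assuming the table is literally named table1 and the extra columns are named d, e and f:

    SELECT *
    FROM (
        SELECT t.*,
               LAG(status) OVER (PARTITION BY personID ORDER BY unixtime) AS prev_status
        FROM table1 t
    ) s
    -- IS DISTINCT FROM also keeps the very first row, where prev_status IS NULL
    WHERE prev_status IS DISTINCT FROM status;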

Apply OFFSET and LIMIT in Oracle for complex join queries?

♀尐吖头ヾ submitted on 2020-06-26 12:16:40
Question: I'm using Oracle 11g and have a complex join query. I really want to apply OFFSET and LIMIT to this query so that it can be used effectively with the Spring Batch framework. I went through 'How do I limit the number of rows returned by an Oracle query after ordering?' and 'Alternatives to LIMIT and OFFSET for paging in Oracle', but things are not very clear to me. My query:

    SELECT DEPT.ID rowobjid,
           DEPT.CREATOR createdby,
           DEPT.CREATE_DATE createddate,
           DEPT.UPDATED_BY updatedby,
           DEPT.LAST_UPDATE_DATE
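
Oracle 11g predates the OFFSET ... FETCH syntax added in 12c, so paging is normally emulated by nesting the ordered query and filtering on ROWNUM (or ROW_NUMBER()). A minimal sketch of that pattern; the two inner SELECT columns are just a stand-in for the complex join, and :offset/:limit are bind variables:

    SELECT *
    FROM (
        SELECT q.*, ROWNUM rn
        FROM (
            -- the full complex join, including its ORDER BY, goes here
            SELECT DEPT.ID rowobjid, DEPT.CREATOR createdby
            FROM DEPT
            ORDER BY DEPT.ID
        ) q
        WHERE ROWNUM <= :offset + :limit   -- cap the total while ROWNUM is being assigned
    )
    WHERE rn > :offset                     -- then skip the first :offset rows

The ORDER BY must sit in the innermost query: ROWNUM is assigned before sorting, so filtering on it at the same level as the ORDER BY would page through unsorted rows.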

Finding a percentile per group in Spark-Scala

吃可爱长大的小学妹 submitted on 2020-06-20 15:34:33
Question: I am trying to compute a percentile over a column using a window function, as below. I have referred here to use the ApproxQuantile definition over a group.

    val df1 = Seq(
        (1, 10.0), (1, 20.0), (1, 40.6), (1, 15.6), (1, 17.6), (1, 25.6), (1, 39.6),
        (2, 20.5), (2, 70.3), (2, 69.4), (2, 74.4), (2, 45.4),
        (3, 60.6), (3, 80.6),
        (4, 30.6), (4, 90.6)
    ).toDF("ID", "Count")

    val idBucketMapping = Seq((1, 4), (2, 3), (3, 2), (4, 2))
        .toDF("ID", "Bucket")

    //jpp
    import org.apache.spark.sql.Column
    import org
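
Since Spark 2.1, percentile_approx is available as a built-in SQL aggregate, which sidesteps hand-rolling ApproxQuantile over a window when one value per group is enough. A minimal sketch, assuming df1 has been registered with df1.createOrReplaceTempView("df1"); the 0.5 (the median) stands in for whatever percentile is actually needed:

    SELECT ID,
           percentile_approx(`Count`, 0.5) AS p50  -- any percentage in [0, 1] works
    FROM df1
    GROUP BY ID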
