window-functions

Cumulative sum of values by month, filling in for missing months

泪湿孤枕 submitted on 2019-12-07 05:34:29
Question: I have this data table and I'm wondering whether it is possible to write a query that computes a cumulative sum by month, covering every month up to the current month.

date_added                    | qty
------------------------------------
2015-08-04 22:28:24.633784-03 | 1
2015-05-20 20:22:29.458541-03 | 1
2015-04-08 14:16:09.844229-03 | 1
2015-04-07 23:10:42.325081-03 | 1
2015-07-06 18:50:30.164932-03 | 1
2015-08-22 15:01:54.03697-03  | 1
2015-08-06 18:25:07.57763-03  | 1
2015-04-07 23:12:20.850783-03 | 1
2015-07-23 17:45
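One way to approach the question above is to generate every month in the range, left-join the monthly totals onto it, and take a running SUM() OVER. The sketch below is an illustration, not the answer from the original thread: it runs on SQLite through Python's sqlite3 module, the table name `items`, the `last_month` parameter, and the simplified date-only sample rows are all assumptions, and PostgreSQL would more likely use generate_series for the month list.

```python
import sqlite3

# Hypothetical, simplified sample: dates only, no time or time zone.
rows = [
    ("2015-04-07", 1), ("2015-04-08", 1), ("2015-05-20", 1),
    ("2015-07-06", 1), ("2015-08-04", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (date_added TEXT, qty INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", rows)

query = """
WITH RECURSIVE months(m) AS (
    -- first month that has data
    SELECT date(min(date_added), 'start of month') FROM items
    UNION ALL
    -- keep adding months until the requested last month is reached
    SELECT date(m, '+1 month') FROM months WHERE m < :last_month
),
monthly AS (
    SELECT date(date_added, 'start of month') AS m, SUM(qty) AS qty
    FROM items
    GROUP BY date(date_added, 'start of month')
)
SELECT strftime('%Y-%m', months.m) AS month,
       -- months without rows contribute 0, so the running total carries over
       SUM(COALESCE(monthly.qty, 0)) OVER (ORDER BY months.m) AS running_qty
FROM months
LEFT JOIN monthly ON monthly.m = months.m
ORDER BY months.m
"""

# "Current month" is passed in explicitly to keep the example deterministic.
result = conn.execute(query, {"last_month": "2015-09-01"}).fetchall()
for month, running_qty in result:
    print(month, running_qty)
```

June and September have no rows in the sample, yet they still appear in the output with the total carried forward from the previous month.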

Will Postgres push down a WHERE clause into a VIEW with a Window Function (Aggregate)?

*爱你&永不变心* submitted on 2019-12-06 20:04:10
Question: The docs for Pg's window functions say:

The rows considered by a window function are those of the "virtual table" produced by the query's FROM clause as filtered by its WHERE, GROUP BY, and HAVING clauses if any. For example, a row removed because it does not meet the WHERE condition is not seen by any window function. A query can contain multiple window functions that slice up the data in different ways by means of different OVER clauses, but they all act on the same collection of rows

Semantic exception error in HIVE while using last_value window function

拥有回忆 submitted on 2019-12-06 15:50:51
I have a table with the following data:

dt          device    id             count
2018-10-05  computer  7541185957382  6
2018-10-20  computer  7541185957382  3
2018-10-14  computer  7553187775734  6
2018-10-17  computer  7553187775734  10
2018-10-21  computer  7553187775734  2
2018-10-22  computer  7549187067178  5
2018-10-20  computer  7553187757256  3
2018-10-11  computer  7549187067178  10

I want to get the last and first dt for each id. Hence, I used the window functions first_value and last_value as follows:

    select id, last_value(dt) over (partition by id order by dt) last_dt
    from table
    order by id;

But I am getting this error:
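Independently of the Hive error, which is cut off in this excerpt, a query of this shape has a second pitfall: with ORDER BY and no explicit frame, standard SQL ends the window at the current row, so last_value returns the current row's dt rather than the partition's last one. The sketch below shows an explicit frame on SQLite via Python's sqlite3; it does not reproduce or diagnose the Hive exception, and the table name `events` and column name `cnt` are assumptions made to avoid reserved words.

```python
import sqlite3

rows = [
    ("2018-10-05", "computer", 7541185957382, 6),
    ("2018-10-20", "computer", 7541185957382, 3),
    ("2018-10-14", "computer", 7553187775734, 6),
    ("2018-10-17", "computer", 7553187775734, 10),
    ("2018-10-21", "computer", 7553187775734, 2),
    ("2018-10-22", "computer", 7549187067178, 5),
    ("2018-10-20", "computer", 7553187757256, 3),
    ("2018-10-11", "computer", 7549187067178, 10),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (dt TEXT, device TEXT, id INTEGER, cnt INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)

query = """
SELECT DISTINCT id,
       first_value(dt) OVER w AS first_dt,
       last_value(dt)  OVER w AS last_dt
FROM events
-- the explicit frame makes every row see its whole partition
WINDOW w AS (PARTITION BY id ORDER BY dt
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY id
"""

first_last = conn.execute(query).fetchall()
for row in first_last:
    print(row)
```

When only the first and last values are needed, a plain GROUP BY id with MIN(dt) and MAX(dt) gives the same result without any window function.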

Optimizing SUM OVER PARTITION BY for several hierarchical groups

不问归期 submitted on 2019-12-06 11:38:48
I have a table like the one below:

Region  Country  Manufacturer  Brand  Period  Spend
R1      C1       M1            B1     2016    5
R1      C1       M1            B1     2017    10
R1      C1       M1            B1     2017    20
R1      C1       M1            B2     2016    15
R1      C1       M1            B3     2017    20
R1      C2       M1            B1     2017    5
R1      C2       M2            B4     2017    25
R1      C2       M2            B5     2017    30
R2      C3       M1            B1     2017    35
R2      C3       M2            B4     2017    40
R2      C3       M2            B5     2017    45

I need to compute SUM([Spend]) over different groups, as follows:

- Total Spend over all the rows in the whole table
- Total Spend for each Region
- Total Spend for each Region and Country group
- Total Spend for each Region, Country and Advertiser group

So I wrote this query below:

    SELECT [Period] ,[Region] ,[Country] ,
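The four totals listed in the question can each be written as a SUM() OVER with a progressively narrower PARTITION BY. The sketch below runs that idea on SQLite via Python's sqlite3 using the sample rows from the question; it is not the asker's truncated query and says nothing about SQL Server performance. The table name `spend` is hypothetical, and treating "Advertiser" as the Manufacturer column is an assumption.

```python
import sqlite3

rows = [
    ("R1", "C1", "M1", "B1", 2016, 5),  ("R1", "C1", "M1", "B1", 2017, 10),
    ("R1", "C1", "M1", "B1", 2017, 20), ("R1", "C1", "M1", "B2", 2016, 15),
    ("R1", "C1", "M1", "B3", 2017, 20), ("R1", "C2", "M1", "B1", 2017, 5),
    ("R1", "C2", "M2", "B4", 2017, 25), ("R1", "C2", "M2", "B5", 2017, 30),
    ("R2", "C3", "M1", "B1", 2017, 35), ("R2", "C3", "M2", "B4", 2017, 40),
    ("R2", "C3", "M2", "B5", 2017, 45),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spend (region TEXT, country TEXT, manufacturer TEXT,"
    " brand TEXT, period INTEGER, spend INTEGER)"
)
conn.executemany("INSERT INTO spend VALUES (?, ?, ?, ?, ?, ?)", rows)

query = """
SELECT DISTINCT region, country, manufacturer,
       SUM(spend) OVER ()                                              AS total_all,
       SUM(spend) OVER (PARTITION BY region)                           AS total_region,
       SUM(spend) OVER (PARTITION BY region, country)                  AS total_country,
       SUM(spend) OVER (PARTITION BY region, country, manufacturer)    AS total_manufacturer
FROM spend
ORDER BY region, country, manufacturer
"""

# Map each (region, country, manufacturer) group to its four totals.
totals = {
    (region, country, manufacturer): (t_all, t_region, t_country, t_manufacturer)
    for region, country, manufacturer, t_all, t_region, t_country, t_manufacturer
    in conn.execute(query)
}
for key, value in totals.items():
    print(key, value)
```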

Limit the number of rows per ID

故事扮演 submitted on 2019-12-06 08:50:00
I am trying to limit the number of rows per case to only 5 rows. Some cases have only 1 or 2 rows but some have 15 or more. This is an example of a stored procedure that I am using to count the number of rows per case.

    SELECT ROW_NUMBER() OVER(partition by rce.reportruncaseid ORDER BY rce.Reportruncaseid) AS Row,
           rce.ReportRunCaseId AS CaseId,
           YEAR(rce.EcoDate) AS EcoYear
    FROM PhdRpt.ReportCaseList AS rcl
    INNER JOIN PhdRpt.RptCaseEco AS rce
        ON rce.ReportId = rcl.ReportId
       AND rce.ReportRunCaseId = rcl.ReportRunCaseId
    GROUP BY rce.ReportId, rce.ReportRunCaseId, YEAR(rce.EcoDate)
    Order by rce
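A common pattern for capping rows per group is to compute ROW_NUMBER() in a subquery and filter on it in the outer query, because a window function's result cannot be referenced in the same query's WHERE clause. The sketch below illustrates that pattern on SQLite via Python's sqlite3; the table `case_eco` and its columns are hypothetical stand-ins for the asker's schema. It orders by the year within each case, since ordering by the partition column itself (as in the query above) leaves the numbering arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE case_eco (case_id INTEGER, eco_year INTEGER)")

# Case 1 has seven rows, case 2 has only two.
rows = [(1, year) for year in range(2010, 2017)] + [(2, 2015), (2, 2016)]
conn.executemany("INSERT INTO case_eco VALUES (?, ?)", rows)

query = """
SELECT case_id, eco_year
FROM (
    SELECT case_id, eco_year,
           ROW_NUMBER() OVER (PARTITION BY case_id ORDER BY eco_year) AS rn
    FROM case_eco
) AS numbered
WHERE rn <= 5            -- keep at most five rows per case
ORDER BY case_id, eco_year
"""

limited = conn.execute(query).fetchall()
for row in limited:
    print(row)
```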

PARTITION BY alternative in HSQLDB

最后都变了- submitted on 2019-12-06 08:40:53
I would like to run the query suggested in https://stackoverflow.com/a/3800572/2968357 on an HSQLDB database using select *, such as

    WITH tmpTable AS (
        SELECT p.* ,
               ROW_NUMBER() OVER(PARTITION BY p.groupColumn order by p.groupColumn desc) AS rowCount
        FROM sourceTable p)
    SELECT * FROM tmpTable WHERE tmpTable.rowCount = 1

but I get the following error:

    Caused by: org.hsqldb.HsqlException: unexpected token: PARTITION required: )

meaning PARTITION BY is not supported. Is there a work-around for my specific query on HSQLDB?

The second query in that answer is supported by HSQLDB. If you use the
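For reference, one window-free way to keep a single row per group is to join the table to a GROUP BY subquery that picks one key per group. The sketch below shows that idea; it is not necessarily the query the linked answer refers to, it was written against SQLite via Python's sqlite3 rather than HSQLDB, and the unique `id` column and `payload` column are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sourceTable (id INTEGER PRIMARY KEY, groupColumn TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO sourceTable VALUES (?, ?, ?)",
    [
        (1, "a", "a-first"), (2, "a", "a-second"),
        (3, "b", "b-first"), (4, "b", "b-second"),
        (5, "c", "c-only"),
    ],
)

query = """
SELECT p.*
FROM sourceTable p
JOIN (
    -- one representative key per group, chosen without window functions
    SELECT groupColumn, MIN(id) AS min_id
    FROM sourceTable
    GROUP BY groupColumn
) firsts
  ON firsts.groupColumn = p.groupColumn
 AND firsts.min_id = p.id
ORDER BY p.id
"""

one_per_group = conn.execute(query).fetchall()
for row in one_per_group:
    print(row)
```

Unlike the ROW_NUMBER() version, which orders by the partition column and therefore picks an arbitrary row, this variant always returns the row with the smallest `id` in each group.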

Selecting every Nth row per user in Postgres

岁酱吖の submitted on 2019-12-06 04:41:27
I was using this SQL statement:

    SELECT "dateId", "userId", "Salary"
    FROM (
        SELECT *, (row_number() OVER (ORDER BY "userId", "dateId"))%2 AS rn
        FROM user_table
    ) sa
    WHERE sa.rn=1 AND "userId" = 789 AND "Salary" > 0;

But every time the table gets new rows the result of the query is different. Am I missing something?

Assuming that ("dateId", "userId") is unique and new rows always have a bigger (later) dateId.

After some comments: What I think you need:

    SELECT "dateId", "userId", "Salary"
    FROM (
        SELECT "dateId", "userId", "Salary"
              ,(row_number() OVER (
                   PARTITION BY "userId" -- either this
                   ORDER
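The reason the original query is unstable is that row_number() numbers all users in one sequence, so rows added for other users shift the numbering. Partitioning by "userId" restarts the count per user. The sketch below checks that behaviour on SQLite via Python's sqlite3 with made-up rows; the helper name `every_other_row` and the sample data are hypothetical, and the SQL is a completion in the spirit of the truncated answer rather than a copy of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE user_table ("dateId" INTEGER, "userId" INTEGER, "Salary" INTEGER)')
conn.executemany(
    "INSERT INTO user_table VALUES (?, ?, ?)",
    [(1, 789, 10), (3, 789, 20), (5, 789, 30), (7, 789, 40)],
)


def every_other_row(connection, user_id):
    """Return rows 1, 3, 5, ... of one user, numbered per user by dateId."""
    query = """
    SELECT "dateId", "userId", "Salary"
    FROM (
        SELECT "dateId", "userId", "Salary",
               (row_number() OVER (PARTITION BY "userId" ORDER BY "dateId")) % 2 AS rn
        FROM user_table
    ) sa
    WHERE sa.rn = 1 AND "userId" = ? AND "Salary" > 0
    ORDER BY "dateId"
    """
    return connection.execute(query, (user_id,)).fetchall()


before = every_other_row(conn, 789)

# Rows for a different user no longer disturb user 789's numbering.
conn.executemany(
    "INSERT INTO user_table VALUES (?, ?, ?)",
    [(2, 100, 50), (4, 100, 60), (6, 100, 70)],
)
after = every_other_row(conn, 789)

print(before)
print(after)
```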

row_number() over partition in hql

余生长醉 submitted on 2019-12-06 02:29:53
Question: What is the equivalent of row_number() over partition in HQL? I have the following query in HQL:

    select s.Companyname, p.Productname, sum(od.Unitprice * od.Quantity - od.Discount) as SalesAmount FROM OrderDetails as od inner join od.Orders as o inner join od.Products as p " + "inner join p.Suppliers as s" + " where o.Orderdate between '2010/01/01' and '2014/01/01' GROUP BY s.Companyname,p.Productname"

I want to partition by s.Companyname where RowNumber <= n.

Answer 1: As far as I know you

Django filtering on Window functions

半腔热情 submitted on 2019-12-06 02:09:37
Question: I have two models in Django, A and B. Each A has several Bs assigned to it, and the Bs are ordered, which is done with a field B.order_index that counts upwards from zero for any A. I want to write a query that checks whether there is any A where some of the Bs have a gap or have duplicate order_index values. In SQL, this could be done like this:

    SELECT order_index, RANK() OVER(PARTITION BY a_id ORDER BY order_index ASC) - 1 AS rnk WHERE rnk = order_index'

However, when I try this in Django with
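As written, the SQL in the question has no FROM clause and filters on a window alias in the same query level, which SQL does not allow. A runnable version of the idea needs a subquery, and the comparison has to look for mismatches. The sketch below does this in plain SQL on SQLite via Python's sqlite3, not through the Django ORM; the table name `b` is hypothetical. It also swaps in ROW_NUMBER() for the question's RANK(): RANK() gives tied rows the same rank, so a duplicate at the end of a sequence (0, 1, 2, 2) would go unnoticed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (a_id INTEGER, order_index INTEGER)")
conn.executemany(
    "INSERT INTO b VALUES (?, ?)",
    [
        (1, 0), (1, 1), (1, 2),           # A=1: contiguous, no problem
        (2, 0), (2, 1), (2, 2), (2, 2),   # A=2: duplicate order_index
        (3, 0), (3, 2),                   # A=3: gap at index 1
    ],
)

query = """
SELECT DISTINCT a_id
FROM (
    SELECT a_id, order_index,
           ROW_NUMBER() OVER (PARTITION BY a_id ORDER BY order_index) - 1 AS expected
    FROM b
) AS numbered
WHERE expected <> order_index    -- any mismatch means a gap or a duplicate
ORDER BY a_id
"""

broken_a_ids = [a_id for (a_id,) in conn.execute(query)]
print(broken_a_ids)
```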

PostgreSQL IGNORE NULLS in window functions

痞子三分冷 submitted on 2019-12-05 21:44:26
The left panel shows the data without IGNORE NULLS; the right panel shows it with IGNORE NULLS. I need to get the right-hand variant in PostgreSQL, that is, to emulate Oracle's IGNORE NULLS in window functions (LEAD and LAG) in PostgreSQL.

    SELECT empno, ename, orig_salary,
           LAG(orig_salary, 1, 0) IGNORE NULLS OVER (ORDER BY orig_salary) AS sal_prev
    FROM tbl_lead;

If there are NULLs, it should return the latest non-null value. I've tried it via PostgreSQL user-defined aggregate functions, but the methodology is rather hard to understand: https://www.postgresql.org/docs/9.6/static/sql-createaggregate.html The
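One way to emulate LAG ... IGNORE NULLS without a custom aggregate is a two-step trick: a running COUNT of non-null values assigns each row to the group of the most recent non-null value, the group's value is carried forward, and an ordinary LAG over that carried value gives the previous non-null salary. The sketch below is an illustration on SQLite via Python's sqlite3, not the answer from the thread. It uses only standard window functions, but it has not been run on PostgreSQL here. Ordering by `empno` instead of `orig_salary` and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_lead (empno INTEGER, orig_salary INTEGER)")
conn.executemany(
    "INSERT INTO tbl_lead VALUES (?, ?)",
    [(1, 100), (2, None), (3, None), (4, 200), (5, None), (6, 300)],
)

query = """
WITH grouped AS (
    -- COUNT skips NULLs, so grp only increases on rows with a salary
    SELECT empno, orig_salary,
           COUNT(orig_salary) OVER (ORDER BY empno) AS grp
    FROM tbl_lead
),
filled AS (
    -- each group holds at most one non-null salary; MAX carries it forward
    SELECT empno, orig_salary,
           MAX(orig_salary) OVER (PARTITION BY grp) AS last_seen
    FROM grouped
)
SELECT empno, orig_salary,
       -- the previous row's carried value is the last non-null before this row;
       -- 0 mirrors the default argument in the question's LAG call
       COALESCE(LAG(last_seen) OVER (ORDER BY empno), 0) AS sal_prev
FROM filled
ORDER BY empno
"""

emulated = conn.execute(query).fetchall()
for row in emulated:
    print(row)
```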