aggregate-functions

Returning Multiple Arrays from User-Defined Aggregate Function (UDAF) in Apache Spark SQL

点点圈 submitted on 2019-12-30 03:13:10
Question: I am trying to create a user-defined aggregate function (UDAF) in Java using Apache Spark SQL that returns multiple arrays on completion. I have searched online and cannot find any examples or suggestions on how to do this. I can return a single array, but cannot figure out how to get the data into the correct format in the evaluate() method to return multiple arrays. The UDAF does work, since I can print out the arrays in the evaluate() method; I just can't figure out how to return

What can an aggregate function do in the ORDER BY clause?

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-30 03:03:33
Question: Let's say I have a plant table:
    id  fruit
    1   banana
    2   apple
    3   orange
I can do these:
    SELECT * FROM plant ORDER BY id;
    SELECT * FROM plant ORDER BY fruit DESC;
which do the obvious thing. But I was bitten by this; what do these do?
    SELECT * FROM plant ORDER BY SUM(id);
    SELECT * FROM plant ORDER BY COUNT(fruit);
    SELECT * FROM plant ORDER BY COUNT(*);
    SELECT * FROM plant ORDER BY SUM(1) DESC;
All of these return just the first row (the one with id = 1). What's happening under the hood? What are the
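The single-row result described above is consistent with the engine treating the query as an aggregate over one implicit group. The question does not name its engine, so that reading is an assumption. A minimal sketch using Python's built-in sqlite3 module shows the underlying rule with an explicit aggregate query (SQLite is only a stand-in here, and it may reject the exact ORDER BY SUM(id) form from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plant (id INTEGER, fruit TEXT)")
conn.executemany(
    "INSERT INTO plant VALUES (?, ?)",
    [(1, "banana"), (2, "apple"), (3, "orange")],
)

# No GROUP BY: the aggregates run over the whole table as a single group,
# so exactly one row comes back.
aggregated = conn.execute("SELECT COUNT(*), SUM(id) FROM plant").fetchall()

# With an explicit GROUP BY there is one row per group, and ordering by an
# aggregate becomes meaningful.
per_fruit = conn.execute(
    "SELECT fruit, SUM(id) AS total FROM plant "
    "GROUP BY fruit ORDER BY SUM(id) DESC"
).fetchall()

print(aggregated)
print(per_fruit)
```

With only one group there is nothing left to sort, which is why the ORDER BY appears to have no effect in the queries from the question.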

Efficiently Include Column not in Group By of SQL Query

一曲冷凌霜 submitted on 2019-12-29 06:15:51
Question: Given
    Table A
        Id    INTEGER
        Name  VARCHAR(50)
    Table B
        Id    INTEGER
        FkId  INTEGER  ; Foreign key to Table A
I wish to count the occurrences of each FkId value:
    SELECT FkId, COUNT(FkId) FROM B GROUP BY FkId
Now I simply want to also output the Name from Table A. This will not work:
    SELECT FkId, COUNT(FkId), a.Name FROM B b INNER JOIN A a ON a.Id=b.FkId GROUP BY FkId
because a.Name is not contained in the GROUP BY clause (produces is invalid in the select list because it is not contained in either an
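One common way around the error described above is to aggregate B on its own first and only then join to A, so that Name never has to appear in a GROUP BY. The sketch below uses Python's sqlite3 module as a stand-in engine; the sample rows are made up for illustration and are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (Id INTEGER PRIMARY KEY, Name VARCHAR(50));
    CREATE TABLE B (Id INTEGER PRIMARY KEY, FkId INTEGER REFERENCES A(Id));
    -- Hypothetical sample data
    INSERT INTO A VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO B VALUES (1, 1), (2, 1), (3, 2);
    """
)

# Count per FkId in a derived table, then look up the name for each count.
query = """
SELECT c.FkId, c.cnt, a.Name
FROM (SELECT FkId, COUNT(FkId) AS cnt FROM B GROUP BY FkId) AS c
INNER JOIN A AS a ON a.Id = c.FkId
ORDER BY c.FkId
"""
counts = conn.execute(query).fetchall()
print(counts)
```

The other usual option is to add a.Name to the GROUP BY list; which of the two is faster depends on the engine and its indexes, so it is worth measuring rather than assuming.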

GROUP BY and COUNT in PostgreSQL

爷,独闯天下 submitted on 2019-12-29 03:33:47
Question: The query:
    SELECT COUNT(*) as count_all, posts.id as post_id FROM posts INNER JOIN votes ON votes.post_id = posts.id GROUP BY posts.id;
returns n records in PostgreSQL:
     count_all | post_id
    -----------+---------
             1 |       6
             3 |       4
             3 |       5
             3 |       1
             1 |       9
             1 |      10
    (6 rows)
I just want to retrieve the number of records returned: 6. I used a subquery to achieve what I want, but this doesn't seem optimal:
    SELECT COUNT(*) FROM ( SELECT COUNT(*) as count_all, posts.id as post_id FROM posts INNER JOIN votes ON
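Since the number of rows produced by the GROUP BY equals the number of distinct posts.id values that survive the join, COUNT(DISTINCT ...) can stand in for the subquery. The following sketch checks that both forms agree, using Python's sqlite3 module as a stand-in for PostgreSQL; the vote rows are reconstructed from the output shown in the question, and post 2 (no votes) is a hypothetical extra.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE votes (id INTEGER PRIMARY KEY, post_id INTEGER)")

# Post 2 is a hypothetical post without votes; it must not be counted.
conn.executemany(
    "INSERT INTO posts (id) VALUES (?)",
    [(i,) for i in (1, 2, 4, 5, 6, 9, 10)],
)
# One vote row per count shown in the question's result set.
vote_post_ids = [6, 4, 4, 4, 5, 5, 5, 1, 1, 1, 9, 10]
conn.executemany(
    "INSERT INTO votes (post_id) VALUES (?)",
    [(p,) for p in vote_post_ids],
)

# Subquery form: count the groups produced by GROUP BY.
subquery_count = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT posts.id
        FROM posts INNER JOIN votes ON votes.post_id = posts.id
        GROUP BY posts.id
    ) AS grouped
    """
).fetchone()[0]

# Flat form: count the distinct grouping keys directly.
distinct_count = conn.execute(
    """
    SELECT COUNT(DISTINCT posts.id)
    FROM posts INNER JOIN votes ON votes.post_id = posts.id
    """
).fetchone()[0]

print(subquery_count, distinct_count)
```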

ORDER BY Alias not working

被刻印的时光 ゝ submitted on 2019-12-28 06:35:36
Question: UPDATING QUESTION: ERROR: column "Fruits" does not exist. Running Postgres 7.4 (yes, we are upgrading). Why can't I ORDER BY the column alias? It wants tof."TypeOfFruits" in the ORDER BY as well; why?
    SELECT (CASE WHEN tof."TypeOfFruits" = 'A' THEN 'Apple'
                 WHEN tof."TypeOfFruits" = 'P' THEN 'Pear'
                 WHEN tof."TypeOfFruits" = 'G' THEN 'Grapes'
                 ELSE 'Other' END) AS "Fruits",
           SUM(CASE WHEN r.order_date BETWEEN DATE_TRUNC('DAY', LOCALTIMESTAMP) AND DATE_TRUNC('DAY', LOCALTIMESTAMP) + INTERVAL '1 DAY'

GROUP BY without aggregate function

守給你的承諾、 submitted on 2019-12-27 16:49:48
Question: I am trying to understand GROUP BY without an aggregate function (I am new to Oracle DBMS). How does it operate? Here is what I have tried, run against the EMP table.
    SELECT ename , sal FROM emp GROUP BY ename , sal
    SELECT ename , sal FROM emp GROUP BY ename;
Result:
    ORA-00979: not a GROUP BY expression
    00979. 00000 - "not a GROUP BY expression"
    *Cause:
    *Action:
    Error at Line: 397 Column: 16
    SELECT ename , sal FROM emp GROUP BY sal;
Result:
    ORA-00979: not a GROUP BY expression
    00979. 00000 -
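When every selected column is also listed in the GROUP BY and no aggregate is used, the query simply collapses duplicate rows, much like SELECT DISTINCT. A small sketch of that equivalence follows, run on Python's sqlite3 module rather than Oracle; the emp rows are hypothetical. Note that SQLite tolerates selecting a column missing from the GROUP BY, so it will not reproduce ORA-00979.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER)")
# Hypothetical rows, including exact duplicates.
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("SMITH", 800), ("SMITH", 800), ("ALLEN", 1600),
     ("WARD", 1250), ("ALLEN", 1600)],
)

# GROUP BY on every selected column, no aggregate function.
grouped = sorted(
    conn.execute("SELECT ename, sal FROM emp GROUP BY ename, sal").fetchall()
)

# The same result expressed with DISTINCT.
distinct = sorted(
    conn.execute("SELECT DISTINCT ename, sal FROM emp").fetchall()
)

print(grouped)
print(grouped == distinct)
```

Oracle raises ORA-00979 for the other two queries in the question because sal (or ename) is neither grouped nor aggregated, so there is no single value to show per group.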

Showing Distinct Values with Aggregates

走远了吗. submitted on 2019-12-25 18:32:13
Question: I have a table for recording daily prices from different suppliers. My goal is to find the best (lowest-price) supplier. The table structure is:
    Table Name: lab1
    Columns: ID, Product_ID, Price_date, Price, Supplier

    ID  Product_ID  Price_date  Price  Supplier
    --  ----------  ----------  -----  --------
    1   8           26-10-2014  1300   SP1
    2   8           05-10-2014  1600   SP2
    3   8           15-10-2014  1300   SP1
    4   8
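The question is cut off before it states the exact output wanted, so the following is only one plausible reading: list, per product, each supplier that offers the minimum price, without repeating a supplier that hit that price on several dates. The sketch uses Python's sqlite3 module and only the three complete rows visible above; the query shape is an assumption, not the asker's or an answerer's method.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lab1 (ID INTEGER, Product_ID INTEGER, "
    "Price_date TEXT, Price INTEGER, Supplier TEXT)"
)
# Only the rows that are fully visible in the question.
conn.executemany(
    "INSERT INTO lab1 VALUES (?, ?, ?, ?, ?)",
    [
        (1, 8, "26-10-2014", 1300, "SP1"),
        (2, 8, "05-10-2014", 1600, "SP2"),
        (3, 8, "15-10-2014", 1300, "SP1"),
    ],
)

# Find the minimum price per product, then join back to pick up the
# supplier; DISTINCT removes repeats of the same supplier and price.
best = conn.execute(
    """
    SELECT DISTINCT l.Product_ID, l.Supplier, l.Price
    FROM lab1 AS l
    INNER JOIN (
        SELECT Product_ID, MIN(Price) AS min_price
        FROM lab1
        GROUP BY Product_ID
    ) AS m ON m.Product_ID = l.Product_ID AND l.Price = m.min_price
    ORDER BY l.Product_ID, l.Supplier
    """
).fetchall()

print(best)
```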

How to Use count and Group By with Self join in the same table in sql server 2008?

半腔热情 submitted on 2019-12-25 18:20:16
Question: I have a single table with columns st_name and id. I need to get the count of st_name and group by st_name. How do I do this?
Answer 1:
    select st_name, count(*) as grp_cnt, (select count(distinct st_name) from your_table) as st_cnt from your_table group by st_name
Source: https://stackoverflow.com/questions/19633094/how-to-use-count-and-group-by-with-self-join-in-the-same-table-in-sql-server-200
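To see what the answer's query returns, here is a small run of it on Python's sqlite3 module (a stand-in for SQL Server 2008). The rows are hypothetical, and an ORDER BY is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (st_name TEXT, id INTEGER)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 3)],
)

# grp_cnt: rows per st_name; st_cnt: number of distinct st_name values,
# repeated on every output row by the scalar subquery.
result = conn.execute(
    """
    select st_name,
           count(*) as grp_cnt,
           (select count(distinct st_name) from your_table) as st_cnt
    from your_table
    group by st_name
    order by st_name
    """
).fetchall()

print(result)
```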

Replicating an Excel SUMIFS formula

让人想犯罪 __ submitted on 2019-12-25 15:19:10
Question: I need to replicate, or at least find an alternative solution for, a SUMIFS function I have in Excel. I have a transactional database:
    SegNbr  Index  Revenue  SUMIF
    A       1      10       30
    A       1      20       30
    A       2      30       100
    A       2      40       100
    B       1      50       110
    B       1      60       110
    B       3      70       260
    B       3      80       260
and I need to create another column that sums the Revenue, by SegmentNumber, for all indexes that are less than or equal to the Index in that row. It is a distorted rolling revenue, as it will be the same for each SegmentNumber/Index key. This is the
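The rule described above (sum Revenue within the same segment over every row whose index is less than or equal to the current row's index) can be written as a correlated subquery. The question is truncated before it names the target tool, so this sketch uses Python's sqlite3 module purely for illustration. The table name tx and the column name Idx (Index is a reserved word in SQLite) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "tx" and "Idx" are illustrative names, not from the question.
conn.execute("CREATE TABLE tx (SegNbr TEXT, Idx INTEGER, Revenue INTEGER)")
conn.executemany(
    "INSERT INTO tx VALUES (?, ?, ?)",
    [
        ("A", 1, 10), ("A", 1, 20), ("A", 2, 30), ("A", 2, 40),
        ("B", 1, 50), ("B", 1, 60), ("B", 3, 70), ("B", 3, 80),
    ],
)

# For each row, sum the revenue of all rows in the same segment whose
# index is less than or equal to this row's index.
rows = conn.execute(
    """
    SELECT t.SegNbr, t.Idx, t.Revenue,
           (SELECT SUM(u.Revenue)
              FROM tx AS u
             WHERE u.SegNbr = t.SegNbr
               AND u.Idx <= t.Idx) AS SumIf
    FROM tx AS t
    ORDER BY t.SegNbr, t.Idx, t.Revenue
    """
).fetchall()

for row in rows:
    print(row)
```

The computed SumIf values match the SUMIF column in the question's table: 30 and 100 for segment A, 110 and 260 for segment B.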

Replicating an Excel SUMIFS formula

吃可爱长大的小学妹 submitted on 2019-12-25 15:18:09
Question: I need to replicate, or at least find an alternative solution for, a SUMIFS function I have in Excel. I have a transactional database:
    SegNbr  Index  Revenue  SUMIF
    A       1      10       30
    A       1      20       30
    A       2      30       100
    A       2      40       100
    B       1      50       110
    B       1      60       110
    B       3      70       260
    B       3      80       260
and I need to create another column that sums the Revenue, by SegmentNumber, for all indexes that are less than or equal to the Index in that row. It is a distorted rolling revenue, as it will be the same for each SegmentNumber/Index key. This is the