aggregate-functions

Implement aggregation in Teradata

拥有回忆 submitted on 2019-12-23 04:41:08
Question: I want to aggregate on two fields, proct_dt and dw_job_id, in ascending order. My scenario should be clear from the queries and results below.

First query:

    sel * from scratch.COGIPF_RUNREPORT_test1 where dw_job_id = 10309 order by proct_dt, dw_job_id

Output:

       dw_job_id  proct_dt             start_ts             end_ts                      time_diff
    1  10,309     2018-03-06 00:00:00  2018-03-06 07:04:18  2018-03-06 07:04:22.457000  0
    2  10,309     2018-03-06 00:00:00  2018-03-06 06:58:50  2018-03-06 06:58:51.029000  0
    3  10,309     2018-03-07 00:00:00  2018-03-07 06:35
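Since the question mentions aggregating over proct_dt and dw_job_id, a grouped query is probably the goal. A minimal Teradata sketch, with two caveats: the intended aggregate is not stated in the (truncated) question, so SUM over time_diff is an assumption, and in Teradata the WHERE clause must come before ORDER BY:

    sel dw_job_id, proct_dt, sum(time_diff) as total_diff  -- SUM is an assumed aggregate
    from scratch.COGIPF_RUNREPORT_test1
    where dw_job_id = 10309
    group by dw_job_id, proct_dt
    order by proct_dt asc, dw_job_id asc;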

How to Calculate Aggregated Product Function in SQL Server

馋奶兔 submitted on 2019-12-23 02:53:40
Question: I have a table with two columns:

    No.  Name  Serial
    1    Tom   1
    2    Bob   5
    3    Don   3
    4    Jim   6

I want to add a column whose content is the running product of the Serial column, like this:

    No.  Name  Serial  Multiply
    1    Tom   2       2
    2    Bob   5       10
    3    Don   3       30
    4    Jim   6       180

How can I do that?

Answer 1: Oh, this is a pain. Most databases do not support a product aggregation function. You can emulate it with logs and powers. So, something like this might work:

    select t.*, (select exp(sum(log(serial))) from table t2 where t2.no <= t.no ) as
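On SQL Server 2012 or later, the same log/exp trick can be written as a window aggregate instead of a correlated subquery. A minimal sketch, assuming a table named t (hypothetical) and strictly positive Serial values, since LOG() is undefined for zero and negatives:

    -- running product emulated as exp of a running sum of logs
    select no, name, serial,
           exp(sum(log(serial)) over (order by no)) as multiply
    from t;

Because EXP and LOG work in floating point, wrapping the result in ROUND() may be needed to get exact integers back.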

Using AVG() function between two tables

纵饮孤独 submitted on 2019-12-23 01:37:07
Question: I have two tables, and I need to determine the company that offers the highest average salary for any position. My tables are as follows:

    employer(eID (primary key), eName, location)
    position(eID (primary key), pName (primary key), salary)

The code I wrote finds every salary that is higher than the average, but I need to find only the highest average salary over all. Here is my code so far:

    SQL> select eName
         from Employer E inner join position P on E.eID = P.eID
         where salary > (select avg
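A minimal sketch of one way to get just the top employer: average the salaries per employer, sort descending, and keep the first row. FETCH FIRST assumes Oracle 12c or later (the SQL> prompt suggests Oracle); on older versions the same query can be wrapped in a subquery filtered with ROWNUM = 1:

    select E.eID, E.eName, avg(P.salary) as avg_salary
    from Employer E
    inner join position P on E.eID = P.eID
    group by E.eID, E.eName
    order by avg_salary desc
    fetch first 1 row only;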

Performance of UDAF versus Aggregator in Spark

拈花ヽ惹草 submitted on 2019-12-22 17:10:11
Question: I am trying to write performance-conscious code in Spark and am wondering whether I should write an Aggregator or a user-defined aggregate function (UDAF) for my rollup operations on a DataFrame. I have not been able to find any data on how fast each of these approaches is, or which one should be used for Spark 2.0+.

Source: https://stackoverflow.com/questions/45356452/performance-of-udaf-versus-aggregator-in-spark

What is causing a scope parameter error in my SSRS chart?

限于喜欢 submitted on 2019-12-22 07:54:09
Question: Why am I getting this error in my chart? (Chart image) I am using these expressions in the chart:

    Series:   =Sum(Fields!Mins_Att.Value)/Sum(Fields!Mins_Poss.Value)
    Series 1: =Sum(Fields!Mins_Att.Value, "Chart2_CategoryGroup2")/Sum(Fields!Mins_Poss.Value, "Chart2_CategoryGroup2")

and I am getting this error:

    The Y expression for the Chart has a scope parameter that is not valid for an aggregate function. The scope parameter must be set to a string constant that is equal the name of group, data

Performance difference: select top 1 order by vs. select min(val)

巧了我就是萌 submitted on 2019-12-21 17:25:09
Question: The question is simple. Which query will be faster:

    SELECT TOP 1 value FROM table ORDER BY value

or

    SELECT TOP 1 MIN(value) FROM table

We can assume two cases: Case 1, no index, and Case 2, an index on value. Any insights are appreciated. Thanks!

Answer 1: In the case where no index exists: MIN(value) should be implemented in O(N) time with a single scan; TOP 1 ... ORDER BY will require O(N log N) time because of the specified sort (unless the DB engine is smart enough to read intent,
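With an index on the column, both forms can typically be satisfied by a single seek to the first entry of the ordered B-tree, so the difference disappears. A minimal T-SQL sketch (the table and index names are hypothetical):

    -- an ordered index lets the engine read the smallest value directly
    CREATE INDEX ix_t_value ON t (value);

    SELECT TOP 1 value FROM t ORDER BY value;  -- index seek on the first row
    SELECT MIN(value) FROM t;                  -- answered by the same seek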

Group by column “grp” and compress DataFrame - (take last not null value for each column ordering by column “ord”)

二次信任 submitted on 2019-12-21 15:27:47
Question: Assuming I have the following DataFrame:

    +---+--------+---+----+----+
    |grp|null_col|ord|col1|col2|
    +---+--------+---+----+----+
    |  1|    null|  3|null|  11|
    |  2|    null|  2| xxx|  22|
    |  1|    null|  1| yyy|null|
    |  2|    null|  7|null|  33|
    |  1|    null| 12|null|null|
    |  2|    null| 19|null|  77|
    |  1|    null| 10| s13|null|
    |  2|    null| 11| a23|null|
    +---+--------+---+----+----+

Here is the same sample DF with comments, sorted by grp and ord:

    scala> df.orderBy("grp", "ord").show
    +---+--------+---+----+----+
    |grp|null_col
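One common way to compress such a frame is Spark's last(expr, ignoreNulls) aggregate over a window that spans the whole group. A minimal Spark SQL sketch, assuming the DataFrame has been registered as a temp view named df (the view name is an assumption):

    SELECT DISTINCT
           grp,
           last(col1, true) OVER (PARTITION BY grp ORDER BY ord
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING) AS col1,
           last(col2, true) OVER (PARTITION BY grp ORDER BY ord
                                  ROWS BETWEEN UNBOUNDED PRECEDING
                                           AND UNBOUNDED FOLLOWING) AS col2
    FROM df;

The explicit ROWS frame matters: with an ORDER BY in the window spec, the default frame ends at the current row, so last() would otherwise just return the current row's own value.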

PostgreSQL aggregate or window function to return just the last value

限于喜欢 submitted on 2019-12-21 12:16:48
Question: I'm using an aggregate function with the OVER clause in PostgreSQL 9.1 and I want to return just the last row for each window. The last_value() window function sounds like it might do what I want - but it doesn't. It returns a row for each row in the window, whereas I want just one row per window. A simplified example:

    SELECT a, some_func_like_last_value(b) OVER (PARTITION BY a ORDER BY b)
    FROM (
      SELECT 1 AS a, 'do not want this' AS b
      UNION
      SELECT 1, 'just want this'
    ) sub

I want this to
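PostgreSQL's DISTINCT ON is the usual way to keep exactly one row per group. A minimal sketch against the sample subquery above; the b DESC ordering is an assumption about what "last" should mean here:

    SELECT DISTINCT ON (a) a, b
    FROM (
      SELECT 1 AS a, 'do not want this' AS b
      UNION
      SELECT 1, 'just want this'
    ) sub
    ORDER BY a, b DESC;  -- keeps the first row per a under this ordering: 'just want this'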

How to use the Group By clause with an aggregate function in joins?

[亡魂溺海] submitted on 2019-12-21 10:49:55
Question: I want to join three tables and calculate Sum(Quantity) from TableA. I tried something and got the desired output, but I am still confused about the aggregate function and the GROUP BY clause. When calculating a sum by joining two or more tables, which columns do we need to mention in the GROUP BY clause, and why do we need to give those columns? For example, here are my tables and the desired query:

    TableA: ItemID, JobOrderID, CustomerID, DivisionID, Quantity
    TableB:
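The general rule: every column in the SELECT list that is not wrapped in an aggregate function must appear in GROUP BY, because each output row represents one group and any non-aggregated column must have a single value within that group. A minimal sketch with hypothetical join keys, since the definitions of TableB and TableC are cut off above:

    SELECT a.ItemID,
           SUM(a.Quantity) AS TotalQuantity
    FROM TableA a
    JOIN TableB b ON b.JobOrderID = a.JobOrderID   -- assumed join key
    JOIN TableC c ON c.CustomerID = a.CustomerID   -- assumed join key
    GROUP BY a.ItemID;  -- every non-aggregated SELECT column appears here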