aggregate-functions

MySQL 8 Calculating Average by Partitioning By Date

隐身守侯 submitted on 2021-02-19 09:12:23
Question: I've set up a fiddle here: https://www.db-fiddle.com/f/snDGExYZgoYASvWkDGHKDC/2

But also:

Schema:

    CREATE TABLE `scores` (
      `id` bigint unsigned NOT NULL AUTO_INCREMENT,
      `shift_id` int unsigned NOT NULL,
      `employee_name` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
      `score` double(8,2) unsigned NOT NULL,
      `created_at` timestamp NOT NULL,
      PRIMARY KEY (`id`)
    );

    INSERT INTO scores(shift_id, employee_name, score, created_at) VALUES
    (1, "John", 6.72, "2020-04-01 00:00:00"),
    (1, "Bob", 15.71, "2020

Pyspark - How to get basic stats (mean, min, max) along with quantiles (25%, 50%) for numerical cols in a single dataframe

此生再无相见时 submitted on 2021-02-17 05:37:26
Question: I have a Spark DataFrame:

    spark_df = spark.createDataFrame(
        [(1, 7, 'foo'), (2, 6, 'bar'), (3, 4, 'foo'), (4, 8, 'bar'), (5, 1, 'bar')],
        ['v1', 'v2', 'id']
    )

Expected output:

       id   avg(v1)   avg(v2)  min(v1)  min(v2)  0.25(v1)    0.25(v2)    0.5(v1)     0.5(v2)
    0  bar  3.666667  5.0      2        1        some-value  some-value  some-value  some-value
    1  foo  2.000000  5.5      1        4.       some-value  some-value  some-value  some-value

So far I can compute the basic stats such as avg, min, and max, but I am not able to get the quantiles. I know this can be achieved

Selecting rows from a table with specific values per id

匆匆过客 submitted on 2021-02-11 14:16:54
Question: I have the table below.

Table 1:

    Id  WFID  data1  data2
    1   12    'd'    'e'
    1   13    '3'    '4f'
    1   15    'e'    'dd'
    2   12    'f'    'ee'
    3   17    'd'    'f'
    2   17    'd'    'f'
    4   12    'd'    'f'
    5   20    'd'    'f'

From this table I want to select only the rows whose Id has the WFID values 12 and 17 exclusively. In other words, I want to retrieve the distinct Ids 2, 3, and 4. Id 1 is excluded because it has 12 but also has 13 and 15. Id 5 is excluded because it has 20. Id 2 is included because it has just 12 and 17. Id 3 is included because it has just 17. 4 is
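
The "only 12 and 17" condition described above can be expressed by grouping on Id and rejecting any group that contains some other WFID. The following is a minimal sketch of that idea, not the asker's or an answerer's actual solution. It uses Python's built-in `sqlite3` module as a stand-in database; the table name `table1` and the function name are assumptions made for illustration.

```python
import sqlite3

# Rows copied from the table in the question: (Id, WFID, data1, data2).
ROWS = [
    (1, 12, "d", "e"), (1, 13, "3", "4f"), (1, 15, "e", "dd"),
    (2, 12, "f", "ee"), (3, 17, "d", "f"), (2, 17, "d", "f"),
    (4, 12, "d", "f"), (5, 20, "d", "f"),
]


def ids_with_only_allowed_wfids(rows):
    """Return the Ids whose WFID values all fall within {12, 17}."""
    conn = sqlite3.connect(":memory:")
    # `table1` is a hypothetical name; the question only calls it "Table 1".
    conn.execute(
        "CREATE TABLE table1 (Id INTEGER, WFID INTEGER, data1 TEXT, data2 TEXT)"
    )
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT Id
        FROM table1
        GROUP BY Id
        -- Count rows with a disallowed WFID; keep groups that have none.
        HAVING SUM(CASE WHEN WFID NOT IN (12, 17) THEN 1 ELSE 0 END) = 0
        ORDER BY Id
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result
```

Calling `ids_with_only_allowed_wfids(ROWS)` on the rows shown in the question yields the Ids 2, 3, and 4. The same `GROUP BY ... HAVING` pattern should carry over to other SQL dialects, though that is not verified here.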

How do I remove results based on conditions to calculate an average and specific movie

不想你离开。 submitted on 2021-02-10 16:20:51
Question: I have the schema below. A quick explanation of it:

    bob rated the movie "up" 5/5
    james rated the movie "up" 1/5
    macy rated the movie "up" 5/5
    No one has rated the movie "avengers".

The logic: If I am personA, look up everyone I have blocked. Look up all the movie reviews. Remove from the calculation anyone who has left a movie review and whom personA has blocked. Calculate the average rating of the movies.

    CREATE TABLE movies ( id integer AUTO_INCREMENT primary key, name varchar(100) NOT NULL
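
The logic described above (drop reviews from blocked users, then average per movie) might be sketched as follows. This is only an illustration under loud assumptions: the schema in the question is cut off after the `movies` table, so the `ratings` and `blocks` tables, their columns, and the choice that personA has blocked james are all hypothetical. Python's built-in `sqlite3` module stands in for the real database.

```python
import sqlite3


def average_ratings_for(viewer, blocked_pairs):
    """Average rating per movie, ignoring reviewers that `viewer` has blocked."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    # Hypothetical tables: the original schema is truncated after `movies`.
    conn.execute("CREATE TABLE ratings (movie_id INTEGER, reviewer TEXT NOT NULL, rating INTEGER)")
    conn.execute("CREATE TABLE blocks (blocker TEXT NOT NULL, blocked TEXT NOT NULL)")

    conn.executemany("INSERT INTO movies VALUES (?, ?)", [(1, "up"), (2, "avengers")])
    # Ratings as described in the question: bob 5/5, james 1/5, macy 5/5 for "up".
    conn.executemany(
        "INSERT INTO ratings VALUES (?, ?, ?)",
        [(1, "bob", 5), (1, "james", 1), (1, "macy", 5)],
    )
    conn.executemany("INSERT INTO blocks VALUES (?, ?)", blocked_pairs)

    query = """
        SELECT m.name, AVG(r.rating) AS avg_rating
        FROM movies AS m
        LEFT JOIN ratings AS r
               ON r.movie_id = m.id
              -- Filter inside the join so unrated movies still appear (as NULL).
              AND r.reviewer NOT IN (SELECT blocked FROM blocks WHERE blocker = ?)
        GROUP BY m.id, m.name
        ORDER BY m.name
    """
    result = conn.execute(query, (viewer,)).fetchall()
    conn.close()
    return result
```

For example, `average_ratings_for("personA", [("personA", "james")])` assumes personA has blocked james, so only bob's and macy's ratings count toward "up", while "avengers" has no ratings and comes back as NULL (`None` in Python). Placing the block filter in the `ON` clause rather than in `WHERE` is a design choice that keeps movies without any remaining ratings in the result.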

left join multiplying values

爷,独闯天下 submitted on 2021-02-10 12:31:41
Question: I have the following queries:

    SELECT COUNT(capture_id) as count_captures FROM captures WHERE user_id = 9

...returns 5

    SELECT COUNT(id) as count_items FROM items WHERE creator_user_id = 9

...returns 22

I tried the following query:

    SELECT COUNT(capture_id) as count_captures, COUNT(items.id) as count_items FROM captures LEFT JOIN items ON captures.user_id = items.creator_user_id WHERE user_id = 9

...but it returns two columns, both with 110 as the value. I would want 5 in one column and 22 in
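
The value 110 is what one would expect from the join: each of the 5 captures is paired with each of the 22 items for the same user, giving 5 × 22 = 110 joined rows, and both `COUNT` calls count those rows. A minimal sketch of the effect and of one possible fix (counting each table in its own scalar subquery) follows. It uses Python's built-in `sqlite3` module as a stand-in database; the function name and the generated rows are assumptions for illustration, with only the table names, column names, and counts taken from the question.

```python
import sqlite3


def count_captures_and_items(user_id=9, n_captures=5, n_items=22):
    """Return (counts from the joined query, counts from separate subqueries)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE captures (capture_id INTEGER PRIMARY KEY, user_id INTEGER)")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, creator_user_id INTEGER)")
    # Generated rows that only reproduce the counts stated in the question.
    conn.executemany("INSERT INTO captures (user_id) VALUES (?)", [(user_id,)] * n_captures)
    conn.executemany("INSERT INTO items (creator_user_id) VALUES (?)", [(user_id,)] * n_items)

    # The query from the question: every capture row matches every item row.
    joined = conn.execute(
        """
        SELECT COUNT(capture_id) AS count_captures, COUNT(items.id) AS count_items
        FROM captures
        LEFT JOIN items ON captures.user_id = items.creator_user_id
        WHERE user_id = ?
        """,
        (user_id,),
    ).fetchone()

    # One possible fix: count each table independently, with no join at all.
    separate = conn.execute(
        """
        SELECT
            (SELECT COUNT(*) FROM captures WHERE user_id = ?) AS count_captures,
            (SELECT COUNT(*) FROM items WHERE creator_user_id = ?) AS count_items
        """,
        (user_id, user_id),
    ).fetchone()
    conn.close()
    return joined, separate
```

With the default arguments the joined query reports 110 in both columns, while the subquery version reports 5 and 22. Using `COUNT(DISTINCT ...)` on each key inside the joined query is another common way to get the same numbers, at the cost of still materializing the multiplied rows.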

Calculating follower growth over time for each influencer

血红的双手。 submitted on 2021-02-08 07:01:47
Question: I have a table with influencers and their follower count for each day:

    influencer_id | date       | followers
    1             | 2020-05-29 | 7361
    1             | 2020-05-28 | 7234
    ...
    2             | 2020-05-29 | 82
    2             | 2020-05-28 | 85
    ...
    3             | 2020-05-29 | 3434
    3             | 2020-05-28 | 2988
    3             | 2020-05-27 | 2765
    ...

Let's say I want to calculate how many followers each individual influencer has gained in the last 7 days and get the following table:

    influencer_id | growth
    1             | <num followers last day - num followers first day>
    2             | "
    3             | "

As a
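
The growth figure described above (followers on the last day minus followers on the first day of the window) can be sketched by finding each influencer's first and last date in the window and joining back to read the counts on those two days. This is a hedged illustration rather than the accepted answer: the table name `follower_counts` and the function name are assumptions, only the rows visible in the question are loaded (the elided `...` rows are left out), one row per influencer per date is assumed, and Python's built-in `sqlite3` module stands in for the real database.

```python
import sqlite3

# Only the rows visible in the question; the elided "..." rows are not invented.
ROWS = [
    (1, "2020-05-29", 7361), (1, "2020-05-28", 7234),
    (2, "2020-05-29", 82), (2, "2020-05-28", 85),
    (3, "2020-05-29", 3434), (3, "2020-05-28", 2988), (3, "2020-05-27", 2765),
]


def follower_growth(rows, start_date, end_date):
    """Return (influencer_id, growth) between each influencer's first and last day in the window."""
    conn = sqlite3.connect(":memory:")
    # `follower_counts` is a hypothetical table name.
    conn.execute(
        "CREATE TABLE follower_counts (influencer_id INTEGER, date TEXT, followers INTEGER)"
    )
    conn.executemany("INSERT INTO follower_counts VALUES (?, ?, ?)", rows)
    query = """
        SELECT b.influencer_id, l.followers - f.followers AS growth
        FROM (
            -- First and last recorded day per influencer inside the window.
            SELECT influencer_id, MIN(date) AS first_day, MAX(date) AS last_day
            FROM follower_counts
            WHERE date BETWEEN ? AND ?
            GROUP BY influencer_id
        ) AS b
        JOIN follower_counts AS f
          ON f.influencer_id = b.influencer_id AND f.date = b.first_day
        JOIN follower_counts AS l
          ON l.influencer_id = b.influencer_id AND l.date = b.last_day
        ORDER BY b.influencer_id
    """
    result = conn.execute(query, (start_date, end_date)).fetchall()
    conn.close()
    return result
```

For instance, `follower_growth(ROWS, "2020-05-23", "2020-05-29")` covers the seven days ending on 2020-05-29. On the visible rows alone it gives a growth of 127 for influencer 1, -3 for influencer 2, and 669 for influencer 3; the real table would contain more days per influencer, so the actual figures would differ.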
