Calculating multiple averages across different parts of the table?

安稳与你 提交于 2021-02-11 13:12:16

问题


I have the following transactions table:

customer_id purchase_date   product         category        department      quantity    store_id
    1       2020-10-01      Kit Kat         Candy           Food                2       store_A
    1       2020-10-01      Snickers        Candy           Food                1       store_A
    1       2020-10-01      Snickers        Candy           Food                1       store_A
    2       2020-10-01      Snickers        Candy           Food                2       store_A
    2       2020-10-01      Baguette        Bread           Food                5       store_A
    2       2020-10-01      iPhone          Cell phones     Electronics         2       store_A
    3       2020-10-01      Sony PS5        Games           Electronics         1       store_A

I would like to calculate the average number of products purchased (for each product in the table). I'm also looking to calculate averages across each category and each department by accounting for all products within the same category or department respectively. Care should be taken to divide over unique customers AND the product quantity being greater than 0 (a 0 quantity indicates a refund, and should not be accounted for).

So basically, the output table would like below:

...where store_id and average_level_type are partition columns.

Is there a way to achieve this in a single pass over the transactions table? or do I need to break down my approach into multiple steps?

Thanks!


回答1:


How about using “union all” as below -

Select store_id, 'product' as average_level_type,product as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,product
Union all
Select store_id, 'category' as average_level_type, category as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,category
Union all
Select store_id, 'department' as average_level_type,department as id, sum(quantity) as total_quantity,
Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average
from transactions
where quantity > 0
group by store_id,department;

If you want to avoid using union all in that case you can use something like rollup() or group by grouping sets() to achieve the same but the query would be a little more complicated to get the output in the exact format which you have shown in the question.

EDIT : Below is how you can use grouping sets to get the same output -

Select store_id,
       case when G_ID = 3 then 'product' 
            when G_ID = 5 then 'category'
            when G_ID = 6 then 'department' end As average_level_type,
       case when G_ID = 3 then product 
            when G_ID = 5 then category
            when G_ID = 6 then department end As id,
       total_quantity,
       unique_customer_count,
       average
from            
    (select store_id, product, category, department, sum(quantity) as total_quantity, Count(distinct customer_id) as unique_customer_count, sum(quantity)/count(distinct customer_id) as average, GROUPING__ID As G_ID
    from transactions
    group by store_id,product,category,department
    grouping sets((store_id,product),(store_id,category),(store_id,department))
    ) Tab
order by 2    
;


来源:https://stackoverflow.com/questions/64456712/calculating-multiple-averages-across-different-parts-of-the-table

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!