How to pivot on dynamic values in Snowflake

Submitted by 我只是一个虾纸丫 on 2021-02-16 20:04:05

Question


I want to pivot a table based on a field which can contain "dynamic" values (not always known beforehand).

I can make it work by hard coding the values (which is undesirable):

SELECT *
FROM my_table
  pivot(SUM(amount) FOR type_id IN (1,2,3,4,5,20,50,83,141,...));

But I can't make it work using a query to provide the values dynamically:

SELECT *
FROM my_table
  pivot(SUM(amount) FOR type_id IN (SELECT id FROM types));
---
090150 (22000): Single-row subquery returns more than one row. 

SELECT *
FROM my_table
  pivot(SUM(amount) FOR type_id IN (SELECT ARRAY_AGG(id) FROM types));
---
001038 (22023): SQL compilation error:
Can not convert parameter 'my_table.type_id' of type [NUMBER(38,0)] into expected type [ARRAY]

Is there a way to accomplish this?


Answer 1:


I don't think it's possible in native SQL, but I wrote an article and published some code showing how my team does this by generating the query from Python.

You can call the Python script directly, passing arguments similar to the options Excel gives you for pivot tables:

python generate_pivot_query.py                  \
    --dbtype snowflake --database mydb          \
    --host myhost.url --port 5432               \
    --user me --password myp4ssw0rd             \
    --base-columns customer_id                  \
    --pivot-columns category                    \
    --exclude-columns order_id                  \
    --aggfunction-mappings amount=sum           \
    myschema orders

Or, if you're using Airflow, you can use a CreatePivotTableOperator to create tasks directly.
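
To illustrate the general idea of generating the query text outside SQL, here is a minimal Python sketch. It is not the generate_pivot_query.py script from the article; build_pivot_query is a hypothetical helper, and the list of ids is assumed to have been fetched beforehand (for example with SELECT id FROM types).

```python
def build_pivot_query(table, agg, value_col, pivot_col, values):
    """Return a PIVOT statement with the IN list spelled out literally.

    Hypothetical helper for illustration only. Table and column names are
    interpolated as-is, so they must come from a trusted source.
    """
    if not values:
        raise ValueError("need at least one pivot value")
    literals = []
    for v in values:
        if isinstance(v, (int, float)):
            # Numbers go into the IN list unquoted.
            literals.append(str(v))
        else:
            # Strings are single-quoted, with embedded quotes doubled.
            literals.append("'" + str(v).replace("'", "''") + "'")
    in_list = ", ".join(literals)
    return (
        f"SELECT *\n"
        f"FROM {table}\n"
        f"  PIVOT({agg}({value_col}) FOR {pivot_col} IN ({in_list}));"
    )


# Assumed ids, as if returned by: SELECT id FROM types
type_ids = [1, 2, 3, 20, 50]
query = build_pivot_query("my_table", "SUM", "amount", "type_id", type_ids)
print(query)
```

The generated text can then be sent to Snowflake like any other statement, so the values never have to be hard-coded by hand.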




Answer 2:


I wrote a Snowflake stored procedure to get dynamic pivots inside Snowflake; see:

  • https://hoffa.medium.com/dynamic-pivots-in-sql-with-snowflake-c763933987c

3 steps:

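  1. Run your query. Its result must expose columns named pivot_column and pivot_value, which the procedure reads.
  2. Call the stored procedure: call pivot_prev_results()
  3. Find the results: select * from table(result_scan(last_query_id(-2)))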

The procedure:

create or replace procedure pivot_prev_results()
returns string
language javascript
execute as caller as
$$
  var cols_query = `
      select '\\'' 
        || listagg(distinct pivot_column, '\\',\\'') within group (order by pivot_column)
        || '\\'' 
      from table(result_scan(last_query_id(-1)))
  `;
  var stmt1 = snowflake.createStatement({sqlText: cols_query});
  var results1 = stmt1.execute();
  results1.next();
  var col_list = results1.getColumnValue(1);
  
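  // cols_query has run in between, so the original query is now two back (-2).
  // Declared with var so the name does not become an implicit global.
  var pivot_query = `
         select *
         from (select * from table(result_scan(last_query_id(-2))))
         pivot(max(pivot_value) for pivot_column in (${col_list}))
     `;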
  var stmt2 = snowflake.createStatement({sqlText: pivot_query});
  stmt2.execute();
  return `select * from table(result_scan('${stmt2.getQueryId()}'));\n  select * from table(result_scan(last_query_id(-2)));`;
$$;
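
To make the string handling in the procedure easier to follow, the sketch below mirrors it in plain Python: it builds the quoted column list the way the listagg expression does, then splices it into the pivot statement. The function names and sample values are hypothetical, and Python's sort order may differ from Snowflake's ordering in edge cases.

```python
def build_col_list(pivot_values):
    # Same shape as the listagg expression in the procedure:
    # distinct values, sorted, each wrapped in single quotes.
    distinct_sorted = sorted(set(pivot_values))
    return "'" + "','".join(distinct_sorted) + "'"


def build_pivot_sql(col_list):
    # Same statement the procedure assembles before executing it.
    return (
        "select * "
        "from (select * from table(result_scan(last_query_id(-2)))) "
        f"pivot(max(pivot_value) for pivot_column in ({col_list}))"
    )


# Assumed contents of the pivot_column column in the previous query result
pivot_column = ["north", "south", "north", "east"]
col_list = build_col_list(pivot_column)
pivot_sql = build_pivot_sql(col_list)
print(col_list)   # 'east','north','south'
print(pivot_sql)
```

Like the procedure, this sketch does not escape single quotes inside the values, so a value containing a quote would break the generated statement.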


Source: https://stackoverflow.com/questions/57172520/how-to-pivot-on-dynamic-values-in-snowflake
