AWS Athena: Delete partitions between date range

渐次进展 2020-12-21 06:50

I have an Athena table partitioned by date, with partition values like this:

20190218

I want to delete all the partitions that were created last year.

2 answers
  • 2020-12-21 07:19

    While Athena SQL may not support this at the moment, the Glue API call GetPartitions (which Athena uses under the hood for queries) supports complex filter expressions, similar to what you can write in a SQL WHERE clause.

    Instead of deleting partitions through Athena, you can call GetPartitions and then BatchDeletePartition through the Glue API.
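
The GetPartitions plus BatchDeletePartition approach could be sketched roughly as follows with boto3. This is only an illustration under assumptions: the partition key name `dt`, the database and table names, and the 2019 date range are hypothetical and not taken from the question. The `glue` argument is expected to be a Glue client such as `boto3.client("glue")`.

```python
def list_partition_values(glue, database, table, expression):
    """Collect the Values list of every partition matching the filter expression."""
    paginator = glue.get_paginator("get_partitions")
    values = []
    for page in paginator.paginate(
        DatabaseName=database, TableName=table, Expression=expression
    ):
        for partition in page["Partitions"]:
            values.append(partition["Values"])
    return values


def delete_partitions(glue, database, table, expression, batch_size=25):
    """Delete matching partitions from the catalog; return how many were deleted.

    BatchDeletePartition accepts at most 25 partitions per call, hence the batching.
    """
    values = list_partition_values(glue, database, table, expression)
    for start in range(0, len(values), batch_size):
        batch = values[start:start + batch_size]
        response = glue.batch_delete_partition(
            DatabaseName=database,
            TableName=table,
            PartitionsToDelete=[{"Values": v} for v in batch],
        )
        errors = response.get("Errors", [])
        if errors:
            raise RuntimeError(f"Failed to delete some partitions: {errors}")
    return len(values)


# Hypothetical usage (names and key `dt` are assumptions):
#   import boto3
#   glue = boto3.client("glue")
#   delete_partitions(glue, "mydb", "mytable",
#                     "dt >= '20190101' AND dt <= '20191231'")
```

Note that this only removes partition metadata from the Glue catalog; the underlying files in S3 are left in place and would need to be deleted separately if that is wanted.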

  • 2020-12-21 07:29

    According to https://docs.aws.amazon.com/athena/latest/ug/alter-table-drop-partition.html, ALTER TABLE tblname DROP PARTITION takes a partition spec, so no ranges are allowed.

    In Presto you would do DELETE FROM tblname WHERE ..., but DELETE is not supported by Athena either.

    For these reasons, you need to leverage some external solution.

    For example:

    1. list the files as in https://stackoverflow.com/a/48824373/65458
    2. delete the files and containing directories
    3. update the partition information (https://docs.aws.amazon.com/athena/latest/ug/msck-repair-table.html should be helpful)
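
Steps 1 and 2 above could look something like the sketch below. It lists objects directly through the S3 API rather than through the approach in the linked answer, and it assumes a Hive-style layout such as `logs/dt=20190218/`; the bucket name, base prefix, and key name `dt` are hypothetical. The `s3` argument is expected to be an S3 client such as `boto3.client("s3")`.

```python
from datetime import date, timedelta


def date_prefixes(base_prefix, start, end, key="dt"):
    """Return one S3 key prefix per day in [start, end], e.g. 'logs/dt=20190101/'."""
    days = (end - start).days + 1
    return [
        f"{base_prefix}{key}={(start + timedelta(days=i)):%Y%m%d}/"
        for i in range(days)
    ]


def delete_prefix(s3, bucket, prefix):
    """Delete every object under the prefix; return the number of keys deleted."""
    paginator = s3.get_paginator("list_objects_v2")
    deleted = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Each page holds at most 1000 keys, which is also the DeleteObjects limit.
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if not keys:
            continue
        response = s3.delete_objects(
            Bucket=bucket, Delete={"Objects": keys, "Quiet": True}
        )
        errors = response.get("Errors", [])
        if errors:
            raise RuntimeError(f"Failed to delete some objects: {errors}")
        deleted += len(keys)
    return deleted


# Hypothetical usage (bucket and prefix are assumptions):
#   import boto3
#   s3 = boto3.client("s3")
#   for prefix in date_prefixes("logs/", date(2019, 1, 1), date(2019, 12, 31)):
#       delete_prefix(s3, "my-bucket", prefix)
```

For step 3, as far as I know MSCK REPAIR TABLE adds partitions it finds in S3 but does not remove entries whose data is gone, so the stale partitions may still have to be dropped with ALTER TABLE ... DROP PARTITION or through the Glue API.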