Accessing tables being updated in Athena

天大地大妈咪最大 posted on 2020-04-30 16:33:54

Question


When issuing the MSCK REPAIR TABLE statement, is the table still accessible for querying during the update?

I ask because I'm trying to figure out the best update schedule for a relatively large S3-backed Hive table that drives some reports in QuickSight. Will issuing this command break a QuickSight report based on this table if someone happens to be running it at the same time?


Answer 1:


Yes, the table will be available for queries while you are running MSCK REPAIR TABLE; it's a background process. Queries that run while the command is running may see different sets of partitions, though, because the partitions the command discovers are added as they are found.

Be aware that MSCK REPAIR TABLE is a very inefficient process: with many partitions it will run for a very long time, and it is not incremental. This doesn't affect query performance, but if it takes a long time now, it will only take longer as partitions accumulate, and it might not be a viable long-term strategy. There are other questions here on Stack Overflow that describe other strategies for keeping your tables up to date.
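One commonly used alternative, offered here only as a sketch and not necessarily what those other questions recommend, is to register each new partition explicitly with ALTER TABLE ... ADD IF NOT EXISTS PARTITION as soon as its data lands in S3. The table name, partition key, and bucket below are hypothetical, and the helper does no escaping of values, so it assumes trusted input. The submit function assumes boto3 is installed and AWS credentials are configured.

```python
def add_partition_sql(table, partition, location):
    """Build an ALTER TABLE statement that registers a single partition.

    `partition` maps partition column names to values, in key order.
    Values are inserted as-is (no escaping), so pass trusted input only.
    """
    spec = ", ".join(f"{key}='{value}'" for key, value in partition.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}'"
    )


def submit(sql, database, output_location):
    """Submit a statement to Athena and return its query execution id."""
    import boto3  # imported here so the builder above works without boto3

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical table, partition key, and bucket, for illustration only.
statement = add_partition_sql(
    "events",
    {"dt": "2020-04-30"},
    "s3://example-bucket/events/dt=2020-04-30/",
)
print(statement)
```

Because each statement touches only the partition it names, the cost of an update does not grow with the total number of partitions already in the table, unlike a full repair.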



Source: https://stackoverflow.com/questions/55621399/accessing-tables-being-updated-in-athena
