Query CSV tables stored in S3 through Athena

試著忘記壹切 submitted on 2019-12-10 11:58:49

Question


Recently we started storing our backups in AWS S3. They are all CSV files that we need to query through AWS Athena. We tried to add the tables one by one, but it is taking too long; it is a fair amount of data. Is there an API we can use, or something that is already set up for this? We were about to build something with Spark, but maybe there is a simpler way, or something that has already been done. Thanks.


Answer 1:


You can simply create an external table on top of CSV files with the required properties.

Reference : Create External Table on AWS Athena
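As a rough illustration of the external-table approach, the sketch below builds a CREATE EXTERNAL TABLE statement in Python and submits it to Athena with boto3. The database, table, column names, and bucket paths are hypothetical placeholders, not values from the question. It assumes plain comma-separated files with a header row and no quoted fields; files with quoted fields would likely need a different SerDe.

```python
def build_create_table_ddl(database, table, columns, s3_location, skip_header=True):
    """Return an Athena CREATE EXTERNAL TABLE statement for comma-separated files.

    columns is a list of (name, type) pairs; s3_location is the S3 folder
    (not a single file) that holds the CSV files for this table.
    """
    column_defs = ",\n  ".join(f"`{name}` {col_type}" for name, col_type in columns)
    # Optionally tell Athena to ignore the first line of each file (the header).
    header_prop = "\nTBLPROPERTIES ('skip.header.line.count'='1')" if skip_header else ""
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"  {column_defs}\n"
        ")\n"
        "ROW FORMAT DELIMITED\n"
        "FIELDS TERMINATED BY ','\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{s3_location}'"
        f"{header_prop}"
    )


def submit_ddl(ddl, output_location):
    """Send the statement to Athena and return the query execution id."""
    import boto3  # imported lazily so the DDL builder works without boto3 installed

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=ddl,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    ddl = build_create_table_ddl(
        database="backups",
        table="orders",
        columns=[("id", "bigint"), ("customer", "string"), ("total", "double")],
        s3_location="s3://example-backup-bucket/orders/",
    )
    print(ddl)
    # submit_ddl(ddl, "s3://example-athena-results/")  # requires AWS credentials
```

Looping over a list of table definitions and calling these two functions for each would avoid creating the tables by hand, provided the column layout of each CSV is known in advance.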

You can also use a Glue crawler and configure it to automatically populate the tables for you.

Reference : Cataloging tables with a crawler
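The crawler route can be scripted as well. The following is a minimal sketch, assuming boto3 and an existing IAM role that Glue can assume to read the bucket; the crawler name, role ARN, database name, and S3 paths are all hypothetical. The crawler infers the schema from the files, so no column list is needed.

```python
def build_crawler_config(name, role_arn, database, s3_paths):
    """Return keyword arguments for the Glue create_crawler call.

    Each entry in s3_paths is an S3 prefix the crawler should scan.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": path} for path in s3_paths]},
    }


def create_and_start_crawler(config):
    """Create the crawler in Glue and start a run immediately."""
    import boto3  # imported lazily so the config builder works without boto3 installed

    glue = boto3.client("glue")
    glue.create_crawler(**config)
    glue.start_crawler(Name=config["Name"])


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    config = build_crawler_config(
        name="backup-csv-crawler",
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
        database="backups",
        s3_paths=["s3://example-backup-bucket/orders/", "s3://example-backup-bucket/customers/"],
    )
    print(config)
    # create_and_start_crawler(config)  # requires AWS credentials and the IAM role
```

Once the crawler run finishes, the tables it creates in the Glue Data Catalog should be queryable from Athena under the chosen database.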

Several AWS SDKs are available to automate tasks such as uploading files to S3, creating Athena tables, or cataloging tables through a Glue crawler.



Source: https://stackoverflow.com/questions/52041500/query-csv-tables-stored-s3-through-athena
