AWS Athena Returning Zero Records from Tables Created from GLUE Crawler input csv from S3

人盡茶涼 submitted on 2020-01-14 09:51:50

Question


Part One:

I ran a Glue crawler on a dummy CSV file loaded into S3. It created a table, but when I view that table in Athena and query it, the result is "Zero records returned".

The ELB demo data in Athena works fine, though.

Part Two (Scenario):

Suppose I have an Excel file and a data dictionary describing how, and in what format, the data is stored in that file. I want to load that data into AWS Redshift. What would be the best way to achieve this?


Answer 1:


I experienced the same issue. You need to give the crawler the folder path instead of the path to the file itself, then run it again. Once I pointed the crawler at the folder, it worked. Hope this helps.
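A minimal sketch of that fix in Python, assuming the crawler target is given as an `s3://` URI. The helper name `crawler_target_path` and the bucket and key names are hypothetical, and any path that does not end in `/` is assumed to name a file:

```python
def crawler_target_path(s3_uri: str) -> str:
    """Return the folder (prefix) URI that contains the given S3 object.

    Assumption: a URI that does not end with "/" points at a file,
    so its last path component is dropped.
    """
    scheme = "s3://"
    if not s3_uri.startswith(scheme):
        raise ValueError(f"not an S3 URI: {s3_uri!r}")

    bucket, _, key = s3_uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {s3_uri!r}")

    if not key.endswith("/"):
        # Drop the file name and keep only the folder part of the key.
        key = key.rpartition("/")[0]
    key = key.strip("/")

    return f"s3://{bucket}/{key}/" if key else f"s3://{bucket}/"


# Hypothetical example: the crawler should be pointed at the folder,
# not at the CSV file inside it.
print(crawler_target_path("s3://my-bucket/dummy/data.csv"))
```

The resulting folder URI is what would go into the crawler's S3 include path, in place of the file URI.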




Answer 2:


I experienced the same issue. Try creating a separate folder for each table in the S3 bucket, then rerun the Glue crawler. You will get a new table in the Glue Data Catalog with the same name as the S3 folder.
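One way to plan that layout, sketched in Python. It assumes each CSV file should become its own table and that the table folder is named after the file (minus its extension). The function name `plan_table_folders` and the example keys are hypothetical, and the sketch only computes the new keys without touching S3:

```python
import posixpath


def plan_table_folders(keys):
    """Map each object key to a new key inside its own per-table folder.

    Assumption: the table folder is named after the file stem, e.g.
    "raw/orders.csv" becomes "raw/orders/orders.csv".
    """
    plan = {}
    for key in keys:
        parent = posixpath.dirname(key)
        filename = posixpath.basename(key)
        table_name, _extension = posixpath.splitext(filename)
        plan[key] = posixpath.join(parent, table_name, filename)
    return plan


# Hypothetical example: two CSV files that currently share one folder.
for old_key, new_key in plan_table_folders(["raw/orders.csv", "raw/users.csv"]).items():
    print(f"{old_key} -> {new_key}")
```

After moving the objects to the new keys, the crawler would be rerun against the parent folder so that each subfolder is picked up as a separate table.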




Answer 3:


Delete the crawler and create it again. Make sure only one CSV file is present in S3 when you run the crawler. Important: with a single CSV file in place, run the crawler, and the records can then be viewed in Athena.




Answer 4:


I was indeed providing the S3 folder path instead of the filename and still couldn't get Athena to return any records ("Zero records returned", "Data scanned: 0KB").

Turns out the problem was that the input files (my rotated log files automatically uploaded to S3 from Elastic Beanstalk) start with underscore (_), e.g. _var_log_nginx_rotated_access.log1534237261.gz! Apparently that's not allowed.
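This matches Athena's documented behavior of treating source files whose names start with an underscore (`_`) or a dot (`.`) as hidden. A small Python sketch for spotting such keys before querying; the helper names `is_hidden_from_athena` and `suggest_visible_key` are hypothetical, the check looks only at the file name and not at the folder names, and stripping the leading characters is just one possible renaming scheme:

```python
import posixpath


def is_hidden_from_athena(key: str) -> bool:
    """Return True if the file name part of the key starts with "_" or "."."""
    filename = posixpath.basename(key)
    return filename.startswith(("_", "."))


def suggest_visible_key(key: str) -> str:
    """Suggest a key whose file name no longer starts with "_" or "."."""
    parent = posixpath.dirname(key)
    filename = posixpath.basename(key).lstrip("_.")
    return posixpath.join(parent, filename)


# Example using the log file name from this answer; the "logs/" prefix is made up.
key = "logs/_var_log_nginx_rotated_access.log1534237261.gz"
if is_hidden_from_athena(key):
    print(f"{key} would be skipped; consider renaming to {suggest_visible_key(key)}")
```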



Source: https://stackoverflow.com/questions/47266924/aws-athena-returning-zero-records-from-tables-created-from-glue-crawler-input-cs
