Question
I have external tables in AWS Athena that query data in S3, but the location path contains 1000+ files. I need the name of the file each record came from to be displayed as a column in the table.
select file_name, col1 from table where file_name = 'test20170516'
In short, I need the equivalent of Hive's INPUT__FILE__NAME in AWS Athena (Presto), or any other way to achieve the same result.
Answer 1:
You can do this with the $path pseudo column.
select "$path" from table
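The pseudo column can also be used in the WHERE clause, which covers the filter from the question. The query below is a minimal sketch; my_table and col1 are hypothetical placeholder names, and the LIKE pattern assumes the file name starts with test20170516.

```sql
-- my_table and col1 are placeholder names, not a real schema.
-- "$path" holds the full S3 URI of the object, so match on part of it
-- rather than comparing it to a bare file name.
SELECT "$path" AS file_path, col1
FROM my_table
WHERE "$path" LIKE '%/test20170516%';
```

Because "$path" contains the whole object URI (bucket, prefix and file name), an equality test against just the file name would not match any row.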
Answer 2:
If you need just the filename, you can extract it with regexp_extract().
To use it in Athena on the "$path" column, you can do something like this:
SELECT regexp_extract("$path", '[^/]+$') AS filename from table;
If you need the filename without the extension, you can do:
SELECT regexp_extract("$path", '[ \w-]+?(?=\.)') AS filename_without_extension from table;
For details, see the Presto documentation on Regular Expression Functions.
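One caveat with the extension-stripping pattern above: it is applied to the whole path and returns the first run of word characters that is followed by a dot, so a bucket or folder name containing a dot (for example s3://my.bucket/...) would match before the file name does. A possible alternative, sketched below, first takes the last path segment and then removes the final extension. my_table, col1 and the .csv extension are hypothetical placeholders.

```sql
-- my_table, col1 and 'test20170516.csv' are placeholders for illustration.
SELECT filename,
       -- remove the last ".ext" from the file name, if there is one
       regexp_replace(filename, '\.[^.]*$', '') AS filename_without_extension,
       col1
FROM (
    -- last path segment, i.e. everything after the final "/"
    SELECT regexp_extract("$path", '[^/]+$') AS filename, col1
    FROM my_table
) AS t
WHERE filename = 'test20170516.csv';
```

The subquery is there so the filename alias can be reused in the outer SELECT and WHERE. Only the last extension is removed, so a name like data.tar.gz would become data.tar.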
Source: https://stackoverflow.com/questions/44011433/how-to-get-input-file-name-as-column-in-aws-athena-external-tables