Part of Filename as a column in Hive Table

Posted by 丶灬走出姿态 on 2019-12-11 06:59:29

Question


I want to get the first part of my filename as a column in my Hive table.

My filename is: 20151102114450.46400_Always_1446482638967.xml

I wrote the query below, using regex in Hive on Microsoft Azure, to get the first part of it, i.e. 20151102114450.

But when I run the query, I get 20151102164358 as the output:

select CAST(regexp_replace(regexp_replace(regexp_replace(CAST(CAST(regexp_replace(split(INPUT__FILE__NAME,'[_]')[2],'.xml','') AS BIGINT) as TimeStamp),':',''),'-',''),' ','') AS BIGINT) as VERSION

Can anyone tell me where I am going wrong and what needs to be corrected?


Answer 1:


I tried this in Cloudera; hopefully it works in Azure as well.

select from_unixtime(unix_timestamp(regexp_extract('20151102114450.46400_Always_1446482638967.xml','^(.*?)\\.'),'yyyyMMddHHmmss'),'yyyy-MMM-dd HH:mm:ss');

2015-Nov-02 11:44:50
Time taken: 19.644 seconds, Fetched: 1 row(s)

Another option:

select from_unixtime(unix_timestamp(split('20151102114450.46400_Always_1446482638967.xml','\\.')[0],'yyyyMMddHHmmss'),'yyyy-MMM-dd HH:mm:ss');
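
As for where the original query goes wrong: split(INPUT__FILE__NAME,'[_]')[2] picks the third underscore-separated piece (1446482638967.xml), not the first. Read as epoch milliseconds, 1446482638967 is 2015-11-02 16:43:58 UTC, which matches the 20151102164358 reported in the question. Below is a minimal sketch that applies the same extraction idea to the virtual column directly. The table name my_xml_table is hypothetical, and the pattern assumes INPUT__FILE__NAME holds the full path, so it only looks at the last path component.

```sql
-- Sketch only: my_xml_table is a hypothetical table name.
-- INPUT__FILE__NAME usually carries the full path, so the pattern captures
-- the text between the last '/' and the first '.' of the file name.
SELECT
  CAST(regexp_extract(INPUT__FILE__NAME, '([^/.]+)\\.[^/]*$', 1) AS BIGINT) AS VERSION
FROM my_xml_table;
```

For a file named 20151102114450.46400_Always_1446482638967.xml this should give 20151102114450. If the leading part of a file name is not purely numeric, the CAST returns NULL, so drop the CAST if you want to keep the value as a string.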


Source: https://stackoverflow.com/questions/37331487/part-of-filename-as-a-column-in-hive-table
