问题
The documentation says that timestamps support the following conversion:
•Floating point numeric types: Interpreted as UNIX timestamp in seconds with decimal precision
First of all, I'm not sure how to interpret this. If I have a timestamp 2013-01-01 12:00:00.423, can I convert this to a numeric type that retains the milliseconds? Because that is what I want.
More generally, I need to do comparisons between timestamps such as
select maxts - mints as latency from mytable
where maxts and mints are timestamp columns. Currently, this gives me NullPointerException using Hive 0.11.0. I am able to perform queries if I do something like
select unix_timestamp(maxts) - unix_timestamp(mints) as latency from mytable
but this only works for seconds, not millisecond precision.
Any help appreciated. Tell me if you need additional information.
回答1:
If you want to work with milliseconds, don't use the unix timestamp functions because these consider date as seconds since epoch.
hive> describe function extended unix_timestamp;
unix_timestamp([date[, pattern]]) - Returns the UNIX timestamp
Converts the current or specified time to number of seconds since 1970-01-01.
Instead, convert the JDBC compliant timestamp to double.
E.g:
Given a tab delimited data:
cat /user/hive/ts/data.txt :
a 2013-01-01 12:00:00.423 2013-01-01 12:00:00.433
b 2013-01-01 12:00:00.423 2013-01-01 12:00:00.733
CREATE EXTERNAL TABLE ts (txt string, st Timestamp, et Timestamp)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/hive/ts';
Then you may query the difference between startTime(st) and endTime(et) in milliseconds as follows:
select
txt,
cast(
round(
cast((e-s) as double) * 1000
) as int
) latency
from (select txt, cast(st as double) s, cast(et as double) e from ts) q;
来源:https://stackoverflow.com/questions/18737561/how-do-i-get-millisecond-precision-in-hive