Generating time series between two dates in PostgreSQL

Posted by China☆狼群 on 2019-11-26 07:59:01

This can be done without converting to/from int, by converting to/from timestamp instead:

SELECT date_trunc('day', dd)::date
FROM generate_series
        ( '2007-02-01'::timestamp 
        , '2008-04-01'::timestamp
        , '1 day'::interval) dd
        ;

This should be the optimal way:

SELECT day::date 
FROM   generate_series(timestamp '2004-03-07'
                     , timestamp '2004-08-16'
                     , interval  '1 day') AS t(day);
  • Additional date_trunc() is not needed. The cast to date (day::date) does that implicitly.

  • But there is also no point in casting date literals to date as input parameter. Au contraire, timestamp is the best choice. The advantage in performance is small, but there is no reason not to take it. And you do not needlessly involve DST (daylight saving time) rules coupled with the conversion from date to timestamp with time zone and back. See below.

Equivalent short syntax:

SELECT day::date 
FROM   generate_series(timestamp '2004-03-07', '2004-08-16', '1 day') day;

Or with the set-returning function in the SELECT list:

SELECT generate_series(timestamp '2004-03-07', '2004-08-16', '1 day')::date AS day;

The AS keyword is required in the last variant; Postgres would misinterpret the column alias day otherwise. And I would not advise that variant before Postgres 10, at least not with more than one set-returning function in the same SELECT list.
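To illustrate the caveat about several set-returning functions in one SELECT list, here is a minimal sketch. The two integer series are made up for the example and are not part of the original answer:

```sql
-- Two set-returning functions with different row counts in one SELECT list.
-- Postgres 10 or later: both run in lockstep, giving 3 rows;
--   the shorter series is padded with NULL.
-- Before Postgres 10: rows repeat up to the least common multiple
--   of the two row counts, giving 6 rows here.
SELECT generate_series(1, 2) AS a
     , generate_series(1, 3) AS b;
```

With a single set-returning function in the SELECT list, as in the date query above, the result is the same in both cases.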

Why?

There are a number of overloaded variants of generate_series(). Currently (Postgres 11):

SELECT oid::regprocedure   AS function_signature
     , prorettype::regtype AS return_type
FROM   pg_proc
WHERE  proname = 'generate_series';

function_signature                                                                | return_type                
:-------------------------------------------------------------------------------- | :--------------------------
generate_series(integer,integer,integer)                                          | integer                    
generate_series(integer,integer)                                                  | integer                    
generate_series(bigint,bigint,bigint)                                             | bigint                     
generate_series(bigint,bigint)                                                    | bigint                     
generate_series(numeric,numeric,numeric)                                          | numeric                    
generate_series(numeric,numeric)                                                  | numeric                    
generate_series(timestamp without time zone,timestamp without time zone,interval) | timestamp without time zone
generate_series(timestamp with time zone,timestamp with time zone,interval)       | timestamp with time zone

(The numeric variants were added with Postgres 9.5.) The relevant ones are the last two, taking and returning timestamp / timestamptz.

As you can see, there is no variant taking or returning date. An explicit cast is needed to return date. Passing timestamp resolves to the best variant directly without descending into function type resolution rules and without additional cast for the input.

timestamp '2004-03-07' is perfectly valid, btw. The omitted time part defaults to 00:00 with ISO format.

Thanks to function type resolution we can still pass date. But that requires more work from Postgres. There is an implicit cast from date to timestamp as well as one from date to timestamptz. That would be ambiguous, but timestamptz is the "preferred" type among "Date/time types". So the match is decided at step 4d. of the function type resolution rules:

Run through all candidates and keep those that accept preferred types (of the input data type's type category) at the most positions where type conversion will be required. Keep all candidates if none accept preferred types. If only one candidate remains, use it; else continue to the next step.

In addition to the extra work in function type resolution, this adds an extra cast to timestamptz. The cast to timestamptz not only adds more cost, it can also introduce problems with DST, leading to unexpected results in rare cases. (DST is a moronic concept, btw, can't stress this enough.)
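One way to see which variant Postgres actually picks is pg_typeof(). This is a minimal sketch that reuses the start date from above; the column alias resolved_type is just an illustrative name:

```sql
-- date input: function type resolution settles on the timestamptz variant
SELECT pg_typeof(day) AS resolved_type
FROM   generate_series(date '2004-03-07'
                     , date '2004-03-08'
                     , interval '1 day') AS t(day)
LIMIT  1;
-- timestamp with time zone

-- timestamp input: matches the timestamp variant directly
SELECT pg_typeof(day) AS resolved_type
FROM   generate_series(timestamp '2004-03-07'
                     , timestamp '2004-03-08'
                     , interval '1 day') AS t(day)
LIMIT  1;
-- timestamp without time zone
```

The first result is what pulls in the session time zone setting, and with it the DST rules, when the values are later cast back to date.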

I added demos to the fiddle showing the more expensive query plan:

dbfiddle here

You can generate series directly with dates. No need to use ints or timestamps:

select date::date 
from generate_series(
  '2004-03-07'::date,
  '2004-08-16'::date,
  '1 day'::interval
) date;

You can also use it like this:

select generate_series('2012-12-31'::timestamp, '2018-10-31'::timestamp, '1 day'::interval)::date;
