Return array of years as year ranges

Submitted by 二次信任 on 2019-12-23 16:33:21

Question


I'm attempting to query a table which contains a character varying[] column of years, and return those years as a string of comma-delimited year ranges. The year ranges would be determined by sequential years present within the array, and years/year ranges which are not sequential should be separated by commas.

The column is character varying[] rather than integer[] because a few of the rows contain ALL instead of a list of years. We can omit these rows from the result.

So far I've had little luck approaching the problem as I'm not really even sure where to start.

Would someone be able to give me some guidance or provide a useful example of how one might solve such a challenge?

years_table Example

+=========+============================+
| id      | years                      |
| integer | character varying[]        |
+=========+============================+
| 1       | {ALL}                      |
| 2       | {1999,2000,2010,2011,2012} |
| 3       | {1990,1991,2007}           |
+---------+----------------------------+

Output Goal:

Example SQL Query:

SELECT id, [year concat logic] AS year_ranges
FROM years_table WHERE 'ALL' <> ALL (years)

Result:

+====+======================+
| id | year_ranges          |
+====+======================+
| 2  | 1999-2000, 2010-2012 |
| 3  | 1990-1991, 2007      |
+----+----------------------+

Answer 1:


SELECT id, string_agg(year_range, ', ') AS year_ranges
FROM (
   SELECT id, CASE WHEN count(*) > 1
               THEN min(year)::text || '-' ||  max(year)::text 
               ELSE min(year)::text
              END AS year_range
   FROM  (
      SELECT *, row_number() OVER (ORDER BY id, year) - year AS grp
      FROM  (
         SELECT id, unnest(years) AS year
         FROM  (VALUES (2::int, '{1999,2000,2010,2011,2012}'::int[])
                      ,(3,      '{1990,1991,2007}')
               ) AS tbl(id, years)
         ) sub1
      ) sub2
   GROUP  BY id, grp
   ORDER  BY id, min(year)
   ) sub3
GROUP  BY id
ORDER  BY id

Produces exactly the desired result.

If you are dealing with an array of varchar (varchar[]), just cast it to int[] before you proceed. The values seem to be in perfectly legal form for that:

years::int[]

Replace the inner sub-select with the name of your source table in production code.

 FROM  (VALUES (2::int, '{1999,2000,2010,2011,2012}'::int[])
              ,(3,      '{1990,1991,2007}')
       ) AS tbl(id, years)

->

FROM  tbl

Since we are dealing with a naturally ascending number (the year), we can use a shortcut to form groups of consecutive years (each group forming a range). I subtract the year itself from the row number (ordered by year). For consecutive years, both row number and year increment by one, so the difference stays the same and they get the same grp number. Otherwise, the difference changes and a new range starts.
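
To make the arithmetic behind grp easier to follow, here is a minimal sketch of the same idea outside the database, written in Python purely for illustration. The function name year_ranges is hypothetical and not part of the answer. Unlike the SQL above, the sketch also drops duplicate years before numbering, which is an added assumption.

```python
from itertools import groupby


def year_ranges(years):
    """Collapse a list of years into a comma-separated string of ranges."""
    # Assumption: duplicates are dropped first; the SQL query does not do this.
    ordered = sorted(set(years))
    parts = []
    # enumerate() plays the role of row_number(): consecutive years keep
    # (row number - year) constant, so that difference is the group key.
    numbered = enumerate(ordered, start=1)
    for _, run in groupby(numbered, key=lambda pair: pair[0] - pair[1]):
        run_years = [year for _, year in run]
        first, last = run_years[0], run_years[-1]
        parts.append(str(first) if first == last else f"{first}-{last}")
    return ", ".join(parts)


print(year_ranges([1999, 2000, 2010, 2011, 2012]))  # 1999-2000, 2010-2012
print(year_ranges([1990, 1991, 2007]))              # 1990-1991, 2007
```

For id 2 the differences are -1998, -1998, -2007, -2007, -2007, which gives two groups, just as grouping by grp does in the query.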

More on window functions in the PostgreSQL manual.

A plpgsql function might be even faster in this case. You'd have to test. Examples in these related answers:
Ordered count of consecutive repeats / duplicates
ROW_NUMBER() shows unexpected values




Answer 2:


Not the output format you asked for, but I think it can be more useful (tested on SQL Fiddle):

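select id, g, min(year), max(year)
from (
    select id, year,
        -- running count of range starts: all years of one range share the same g
        count(not g or null) over(partition by id order by year) as g
    from (
        select id, year,
            -- true when the previous year of this id is exactly year - 1
            lag(year, 1, 0) over(partition by id order by year) = year - 1 as g
        from (
            select id, unnest(years)::integer as year
            from years_table  -- table name as given in the question
            where years != '{ALL}'
        ) s
    ) s
) s
group by 1, 2
order by 1, 2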
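
The query works in two passes: a flag marks whether each year continues the previous one, and a running count of the non-continuing years numbers the ranges. A rough Python sketch of that logic, for illustration only (the name year_spans is hypothetical, and the guard for an empty result list is an addition that the SQL does not need):

```python
def year_spans(years):
    """Return (group number, first year, last year) per run of consecutive years."""
    spans = []
    previous = 0  # mirrors the default in lag(year, 1, 0)
    group = 0
    for year in sorted(years):
        # Same test as: lag(year, 1, 0) = year - 1
        continues = previous == year - 1
        if continues and spans:
            spans[-1][2] = year  # extend the current range
        else:
            group += 1  # running count of range starts
            spans.append([group, year, year])
        previous = year
    return [tuple(span) for span in spans]


print(year_spans([1999, 2000, 2010, 2011, 2012]))
# [(1, 1999, 2000), (2, 2010, 2012)]
```

Each tuple corresponds to one output row of the query for a single id: the group number g, then min(year) and max(year).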


Source: https://stackoverflow.com/questions/17533040/return-array-of-years-as-year-ranges
