Return array of years as year ranges

Time: 2022-03-28 22:58:02

I'm attempting to query a table which contains a character varying[] column of years, and return those years as a string of comma-delimited year ranges. The year ranges would be determined by sequential years present within the array, and years/year ranges which are not sequential should be separated by commas.

The reason the data-type is character varying[] rather than integer[] is because a few of the values contain ALL instead of a list of years. We can omit these results.

So far I've had little luck approaching the problem as I'm not really even sure where to start.

Would someone be able to give me some guidance or provide a useful example of how one might solve such a challenge?

years_table Example

+=========+============================+
| id      | years                      |
| integer | character varying[]        |
+=========+============================+
| 1       | {ALL}                      |
| 2       | {1999,2000,2010,2011,2012} |
| 3       | {1990,1991,2007}           |
+---------+----------------------------+

Output Goal:

Example SQL Query:

SELECT id, [year concat logic] AS year_ranges
FROM years_table WHERE 'ALL' <> ALL (years)

Result:

+====+======================+
| id | year_ranges          |
+====+======================+
| 2  | 1999-2000, 2010-2012 |
| 3  | 1990-1991, 2007      |
+----+----------------------+

2 Solutions

#1

SELECT id, string_agg(year_range, ', ') AS year_ranges
FROM (
   SELECT id, CASE WHEN count(*) > 1
               THEN min(year)::text || '-' ||  max(year)::text 
               ELSE min(year)::text
              END AS year_range
   FROM  (
      SELECT *, row_number() OVER (ORDER BY id, year) - year AS grp
      FROM  (
         SELECT id, unnest(years) AS year
         FROM  (VALUES (2::int, '{1999,2000,2010,2011,2012}'::int[])
                      ,(3,      '{1990,1991,2007}')
               ) AS tbl(id, years)
         ) sub1
      ) sub2
   GROUP  BY id, grp
   ORDER  BY id, min(year)
   ) sub3
GROUP  BY id
ORDER  BY id

Produces exactly the desired result.

If you are dealing with an array of varchar (varchar[]), just cast it to int[] before you proceed. It seems to be in perfectly legal form for that:

years::int[]

In production code, replace the inner sub-select with the name of your source table.

 FROM  (VALUES (2::int, '{1999,2000,2010,2011,2012}'::int[])
              ,(3,      '{1990,1991,2007}')
       ) AS tbl(id, years)

->

FROM  tbl
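
Putting the cast and the table replacement together, the complete query might look like the sketch below. It assumes the table and column names from the question (years_table, years) and skips rows containing ALL with 'ALL' <> ALL (years). The PARTITION BY id and the ORDER BY inside string_agg() are small variations that make the grouping and the output order explicit; they are not part of the query above.

```sql
-- Sketch only: assumes years_table(id integer, years character varying[])
-- as described in the question.
SELECT id, string_agg(year_range, ', ' ORDER BY first_year) AS year_ranges
FROM  (
   SELECT id
        , min(year) AS first_year              -- used to sort ranges per id
        , CASE WHEN count(*) > 1
               THEN min(year)::text || '-' || max(year)::text
               ELSE min(year)::text
          END AS year_range
   FROM  (
      SELECT id, year
             -- consecutive years share the same difference (grp)
           , row_number() OVER (PARTITION BY id ORDER BY year) - year AS grp
      FROM  (
         SELECT id, unnest(years::int[]) AS year
         FROM   years_table
         WHERE  'ALL' <> ALL (years)           -- true when no element equals 'ALL'
         ) sub1
      ) sub2
   GROUP  BY id, grp
   ) sub3
GROUP  BY id
ORDER  BY id;
```

With the sample rows from the question, this should return the two rows shown in the question's Result table.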

Since we are dealing with a naturally ascending number (the year) we can use a shortcut to form groups of consecutive years (forming a range). I subtract the year itself from row number (ordered by year). For consecutive years, both row number and year increment by one and produce the same grp number. Else, a new range starts.
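
To make the grouping trick visible, the following sketch lists row number, year and their difference for one sample array from the question. The alias t and the column name rn are only for illustration.

```sql
-- Illustration only: row number minus year is constant within a run
-- of consecutive years and changes whenever there is a gap.
SELECT year
     , row_number() OVER (ORDER BY year)        AS rn
     , row_number() OVER (ORDER BY year) - year AS grp
FROM   unnest('{1999,2000,2010,2011,2012}'::int[]) AS t(year)
ORDER  BY year;

--  year | rn |  grp
-- ------+----+-------
--  1999 |  1 | -1998
--  2000 |  2 | -1998
--  2010 |  3 | -2007
--  2011 |  4 | -2007
--  2012 |  5 | -2007
```

The two distinct grp values correspond to the two ranges 1999-2000 and 2010-2012.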

More on window functions in the manual here and here.

A plpgsql function might be even faster in this case. You'd have to test. Examples in these related answers:
Ordered count of consecutive repeats / duplicates
ROW_NUMBER() shows unexpected values
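
A minimal sketch of what such a plpgsql function could look like follows. The function name f_year_ranges and all variable names are hypothetical, it has not been benchmarked against the window-function query, and the usage query assumes the table and column names from the question.

```sql
-- Hypothetical helper: turns an int[] of years into 'a-b, c, d-e'.
CREATE OR REPLACE FUNCTION f_year_ranges(_years int[])
  RETURNS text
  LANGUAGE plpgsql IMMUTABLE AS
$func$
DECLARE
   _year   int;
   _start  int;              -- first year of the range being built
   _prev   int;              -- previous year seen in the loop
   _ranges text[] := '{}';   -- finished ranges, in ascending order
BEGIN
   -- walk the distinct years in ascending order
   FOR _year IN
      SELECT DISTINCT y FROM unnest(_years) AS y WHERE y IS NOT NULL ORDER BY y
   LOOP
      IF _start IS NULL THEN
         _start := _year;                     -- very first year
      ELSIF _year <> _prev + 1 THEN
         -- gap found: close the current range and start a new one
         _ranges := array_append(_ranges,
                       CASE WHEN _start = _prev THEN _start::text
                            ELSE _start::text || '-' || _prev::text END);
         _start := _year;
      END IF;
      _prev := _year;
   END LOOP;

   -- close the last open range, if any
   IF _start IS NOT NULL THEN
      _ranges := array_append(_ranges,
                    CASE WHEN _start = _prev THEN _start::text
                         ELSE _start::text || '-' || _prev::text END);
   END IF;

   RETURN NULLIF(array_to_string(_ranges, ', '), '');  -- NULL for empty input
END
$func$;

-- Possible usage against the table from the question:
SELECT id, f_year_ranges(years::int[]) AS year_ranges
FROM   years_table
WHERE  'ALL' <> ALL (years)
ORDER  BY id;
```

Whether this is actually faster than the pure SQL query depends on the data, so it would need to be tested as noted above.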

#2

SQL Fiddle. Not the output format you asked for, but I think it can be more useful:

select id, g, min(year), max(year)
from (
    select id, year,
        count(not g or null) over(partition by id order by year) as g
    from (
        select id, year,
            lag(year, 1, 0) over(partition by id order by year) = year - 1 as g
        from (
            select id, unnest(years)::integer as year
            from years
            where years != '{ALL}'
        ) s
    ) s
) s
group by 1, 2
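
If the comma-delimited string from the question is needed after all, the one-row-per-range result can be folded with string_agg(). The sketch below wraps the same lag() and running count logic: the boolean g is true when the previous year is exactly one less, and count(NOT g OR NULL) counts the range starts seen so far, which yields a range number per id. It assumes the table and column names from the question (years_table, years) rather than the table name years used in this answer; the aliases s1 to s4, first_year and last_year are only for illustration.

```sql
-- Sketch only: same grouping as above, aggregated into one string per id.
SELECT id
     , string_agg(CASE WHEN first_year = last_year
                       THEN first_year::text
                       ELSE first_year::text || '-' || last_year::text
                  END, ', ' ORDER BY first_year) AS year_ranges
FROM  (
   SELECT id, g, min(year) AS first_year, max(year) AS last_year
   FROM  (
      SELECT id, year
             -- running count of range starts = range number within each id
           , count(NOT g OR NULL) OVER (PARTITION BY id ORDER BY year) AS g
      FROM  (
         SELECT id, year
                -- true when this year directly follows the previous one
              , lag(year, 1, 0) OVER (PARTITION BY id ORDER BY year) = year - 1 AS g
         FROM  (
            SELECT id, unnest(years)::integer AS year
            FROM   years_table
            WHERE  'ALL' <> ALL (years)
            ) s1
         ) s2
      ) s3
   GROUP  BY id, g
   ) s4
GROUP  BY id
ORDER  BY id;
```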
