Question
I have two columns, id integer and version text. I am trying to convert the strings in version into integers so that I may select the maximum (most recent) version of the id.
However, the first instance of the id stores itself as the version. Example:
id | version
---+--------
10 | '10'
as opposed to:
id | version
---+--------
10 | '10-0'
Additional rows follow the convention id: 10, version: 10-1, and so on.
How can I accomplish this? I have tried split_part() and casting to int. However, split_part(version, '-', 2) returns what looks like an empty string. I also tried wrapping it in COALESCE(split_part(...), '0'), to no avail: it still tries to read the empty field returned for field index 2.
Answer 1:
Use the combination of coalesce() and nullif(), example:
with my_table(version) as (
    values
        ('10'), ('10-1'), ('10-2')
)
select
    version,
    split_part(version, '-', 1)::int as major,
    coalesce(nullif(split_part(version, '-', 2), ''), '0')::int as minor
from my_table
version | major | minor
---------+-------+-------
10 | 10 | 0
10-1 | 10 | 1
10-2 | 10 | 2
(3 rows)
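To tie this back to the question (finding the most recent version per id), the same expression can be fed to max(). This is only a sketch: the id column and the extra row for id 11 are assumptions added for illustration and are not part of the example above.

```sql
-- Hypothetical sample data: an id column is added to the CTE from the example.
WITH my_table(id, version) AS (
    VALUES (10, '10'), (10, '10-1'), (10, '10-2'), (11, '11')
)
SELECT id,
       -- nullif() turns '' into NULL, coalesce() then substitutes '0'
       max(coalesce(nullif(split_part(version, '-', 2), ''), '0')::int) AS latest_minor
FROM   my_table
GROUP  BY id
ORDER  BY id;
```

With this data, id 10 should report 2 and id 11 should report 0, since '11' has no part after the hyphen.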
Answer 2:
To get around the version strings which have no hyphen, you can use a CASE expression:
CASE WHEN version LIKE '%-%'
     THEN SPLIT_PART(version, '-', 2)::int
     ELSE 0 END
The basic idea is to use the version number, cast to an int, when a hyphen is present, but otherwise to assume that the version is zero if the hyphen is absent.
With this hurdle out of the way, your query now just reduces to a ROW_NUMBER() query. Here, the partition is the id, and the ordering is given using the above CASE expression for the version.
SELECT
    t.id, t.version
FROM
(
    SELECT
        id,
        CASE WHEN version LIKE '%-%'
             THEN version
             ELSE version || '-0' END AS version,
        ROW_NUMBER() OVER (PARTITION BY id
                           ORDER BY
                               CASE WHEN version LIKE '%-%'
                                    THEN SPLIT_PART(version, '-', 2)::int
                                    ELSE 0 END DESC) rn
    FROM yourTable
) t
WHERE t.rn = 1
ORDER BY t.id;
Answer 3:
split_part() returns the empty string ('') - not NULL - when the part to be returned is empty or non-existent. That's why COALESCE does nothing here. And the empty string ('') has no representation as an integer value, so casting it raises an error.
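A quick way to see this for yourself is to probe the return value directly. The query below is a minimal sketch using only literals, so it needs no table:

```sql
SELECT split_part('10', '-', 2) IS NULL                 AS is_null        -- false
     , split_part('10', '-', 2) = ''                    AS is_empty       -- true
     , COALESCE(split_part('10', '-', 2), '0') = ''     AS coalesce_noop  -- true: COALESCE only replaces NULL
     , NULLIF(split_part('10', '-', 2), '') IS NULL     AS nullif_to_null -- true: '' converted to NULL
;
```

Casting the first expression with ::int is what fails, because '' is not valid input for integer.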
The shortest way in this example should be GREATEST(split_part( ... ) , '0') before casting, since the empty string sorts before any other non-empty string or even NULL (in any locale). Then use DISTINCT ON () to get the row with the "biggest" version for each id.
Test setup
CREATE TABLE tbl (
id integer NOT NULL
, version text NOT NULL
);
INSERT INTO tbl VALUES
(10, '10-2')
, (10, '10-1')
, (10, '10') -- missing subversion
, (10, '10-111') -- multi-digit number
, (11, '11-1')
, (11, '11-0') -- proper '0'
, (11, '11-') -- missing subversion but trailing '-'
, (11, '11-2');
Solutions
SELECT DISTINCT ON (id) *
FROM tbl
ORDER BY id, GREATEST(split_part(version, '-', 2), '0')::int DESC;
Result:
 id | version
----+---------
 10 | 10-111
 11 | 11-2
Or you could also use NULLIF and use NULLS LAST (in descending order) to sort:
SELECT DISTINCT ON (id) *
FROM tbl
ORDER BY id, NULLIF(split_part(version, '-', 2), '')::int DESC NULLS LAST;
Same result.
Or a more explicit CASE expression:
CASE WHEN split_part(version, '-', 2) = '' THEN '0' ELSE split_part(version, '-', 2) END
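Dropped into the ORDER BY, that expression could look roughly like this. The CTE name sample_versions is hypothetical and only makes the snippet self-contained; its rows are a subset of the test setup.

```sql
-- Hypothetical inline sample mirroring part of the test setup.
WITH sample_versions(id, version) AS (
   VALUES (10, '10'), (10, '10-2'), (10, '10-111'), (11, '11-'), (11, '11-2')
)
SELECT DISTINCT ON (id) *
FROM   sample_versions
ORDER  BY id
        , (CASE WHEN split_part(version, '-', 2) = '' THEN '0'
                ELSE split_part(version, '-', 2)
           END)::int DESC;  -- cast after substituting '0' for the empty part
```

For these rows it should pick 10-111 for id 10 and 11-2 for id 11, the same as the two queries above.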
Related:
- Order varchar string as numeric
- Select first row in each GROUP BY group?
- PostgreSQL sort by datetime asc, null first?
- How to convert empty to null in PostgreSQL?
Source: https://stackoverflow.com/questions/45766644/substituting-value-in-empty-field-after-using-split-part