Question
I have a column named MR which is a varchar. When I run a query with an ORDER BY it doesn't seem to be ordered correctly.
select MR, LName, FName
from users
order by MR
Results:
MR       | LNAME | FNAME
---------+-------+-------
1234-234 | HEN   | LO
2343MA2  | SY    | JACK
MR20001  | LINA  | MARY
MR200011 | TEST  | CASE
MR20002  | KO    | MIKE
Why does MR200011 show before MR20002? Any ideas on how I can sort this properly? The format of MR is not fixed.
Answer 1:
The rows are being sorted as strings, not by the value of the number. The comparison is decided by the first character that differs, which is in position 7:
MR200011
MR20002
      ^
And because '2' > '1', this is the order you end up with. The 8th character is never compared, because the character-based sort order doesn't depend on it.
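The comparison can be reproduced outside the database. The following is a small Python sketch for illustration only: Python compares strings by code point, which for these plain ASCII values gives the same result as a simple character-based sort, but a real database applies the column's collation, so treat this as an analogy rather than the server's exact behavior.

```python
# Illustration only: Python compares strings character by character,
# much like a simple character-based ORDER BY on a varchar column.
a, b = "MR200011", "MR20002"

# 0-based index of the first position where the two strings differ.
first_diff = next(i for i, (x, y) in enumerate(zip(a, b)) if x != y)

# 1-based position of the deciding character, and the two characters compared.
print(first_diff + 1, a[first_diff], b[first_diff])

# String order: the comparison stops at the first differing character.
string_order = sorted([b, a])
print(string_order)

# Numeric order of the part after the "MR" prefix, for comparison.
numeric_order = sorted([b, a], key=lambda s: int(s[2:]))
print(numeric_order)
```

The characters after the first difference play no part in the string ordering, which is why the longer value sorts first here.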
To 'fix' this issue, create a stored function which takes your varchar value, and returns a new 'sort string' which pads the numeric components to a fixed length.
e.g.
MR20002 -> MR0020002
MR200011 -> MR0200011
but more importantly, if you have two blocks of numbers, they don't become corrupted:
A1234-234 -> A000000001234-000000000234
A1234-5123 -> A000000001234-000000005123
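The padding idea itself is independent of the database. As a minimal sketch, assuming a width of 12 digits as in the A1234-234 example, it could look like this in Python; `numeric_sort_key` and `PAD_WIDTH` are names made up for this illustration. Unlike a fixed-width implementation, this version does not truncate digit runs longer than the pad width.

```python
import re

PAD_WIDTH = 12  # assumed width, matching the A1234-234 example above


def numeric_sort_key(value):
    """Left-pad every run of ASCII digits in value with zeros to PAD_WIDTH."""
    return re.sub(r"[0-9]+", lambda m: m.group().zfill(PAD_WIDTH), value)


# The MR values from the question, sorted by the padded key.
mrs = ["1234-234", "2343MA2", "MR20001", "MR200011", "MR20002"]
ordered = sorted(mrs, key=numeric_sort_key)
print(numeric_sort_key("A1234-234"))
print(ordered)
```

Because every digit run has the same length after padding, plain string comparison of the keys agrees with numeric comparison of each block.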
The following function performs this transformation on SQL Server; you would have to adapt it for MySQL:
create function dbo.get_numeric_sort_key(@value varchar(100))
returns varchar(200)
as
begin
    -- Builds a sort key in which every run of digits is zero-padded to 12 characters.
    declare @pad_characters varchar(12)
    declare @numeric_block varchar(12)
    declare @output varchar(200)
    set @pad_characters = '000000000000'
    set @output = ''
    set @numeric_block = ''
    declare @idx int
    declare @len int
    declare @char char(1)
    set @idx = 1
    set @len = len(@value)
    while @idx <= @len
    begin
        set @char = SUBSTRING(@value, @idx, 1)
        if @char in ('0','1','2','3','4','5','6','7','8','9')
        begin
            -- Digit: collect it in the current numeric block.
            set @numeric_block = @numeric_block + @char
        end
        else
        begin
            -- Non-digit: write out the padded numeric block (if any), then the character.
            if (@numeric_block <> '')
            begin
                set @output = @output + right(@pad_characters + @numeric_block, 12)
                set @numeric_block = ''
            end
            set @output = @output + @char
        end
        set @idx = @idx + 1
    end
    -- Write out a numeric block that ends the string.
    if (@numeric_block <> '')
        set @output = @output + right(@pad_characters + @numeric_block, 12)
    return @output
end
Then change your order by clause to use the new function:
select MR, LName, FName
from users
order by dbo.get_numeric_sort_key(MR)
If you have a large amount of data, it would be worth adding a computed column to your table definition (populated by this function and stored with the row), so that the sort key does not have to be recalculated for every row each time you run this query.
Answer 2:
Strings that mix letters and digits sort in the expected order only when all entries have the same length. In your case, MR200011 and MR20002 have different lengths (MR20002 has no 8th character), so the character-by-character comparison does not match the numeric order.
Answer 3:
Maybe this query doesn't look really nice, but it will sort the rows in the order you want:
select
  MR,
  LName,
  FName
from (
  select
    MR,
    LName,
    FName,
    least(
      case when locate('0', MR)>0 then locate('0', MR) else length(MR)+1 end,
      case when locate('1', MR)>0 then locate('1', MR) else length(MR)+1 end,
      case when locate('2', MR)>0 then locate('2', MR) else length(MR)+1 end,
      case when locate('3', MR)>0 then locate('3', MR) else length(MR)+1 end,
      case when locate('4', MR)>0 then locate('4', MR) else length(MR)+1 end,
      case when locate('5', MR)>0 then locate('5', MR) else length(MR)+1 end,
      case when locate('6', MR)>0 then locate('6', MR) else length(MR)+1 end,
      case when locate('7', MR)>0 then locate('7', MR) else length(MR)+1 end,
      case when locate('8', MR)>0 then locate('8', MR) else length(MR)+1 end,
      case when locate('9', MR)>0 then locate('9', MR) else length(MR)+1 end) pos
  from users
) users_pos
order by
  left(MR, pos-1),
  mid(MR, pos, length(MR)-pos+1)+0
In the subquery users_pos I calculate the position of the first digit. I then order by left(MR, pos-1), which is the non-numeric beginning of the string, and by mid(MR, pos, length(MR)-pos+1)+0, which is the numeric part of the string. Adding 0 converts it to a number, so it is ordered as a number (and 20002 comes before 200011).
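The same two-part ordering can be sketched in Python to check the logic against the sample data. This is an approximation under stated assumptions: `prefix_and_number` is a name invented for this sketch, and it only takes the leading run of digits after the split point, which mirrors the string-to-number conversion for values like these but not every case the database may handle.

```python
import re


def prefix_and_number(mr):
    """Split at the first digit: (text before it, leading digits after it as an int)."""
    m = re.search(r"[0-9]", mr)
    pos = m.start() if m else len(mr)  # no digit: the whole string is the prefix
    leading_digits = re.match(r"[0-9]*", mr[pos:]).group()
    return mr[:pos], int(leading_digits or 0)


# The MR values from the question, ordered by prefix, then by number.
mrs = ["1234-234", "2343MA2", "MR20001", "MR200011", "MR20002"]
ordered = sorted(mrs, key=prefix_and_number)
print(ordered)
```

Only the first block of digits takes part in the ordering here, so values that differ only in a later block (for example after a dash) compare as equal on the numeric part.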
Source: https://stackoverflow.com/questions/14497692/how-do-i-sort-a-varchar-with-numbers-and-letters-without-a-specific-format