Question
I have the following Perl script, intended to show collation in Danish.
#!/usr/local/ActivePerl-5.16/bin/perl
use 5.014_001;
use utf8;
use Unicode::Collate;
use strict;
use warnings;
use Carp;
use Data::Dump;
use Encode qw( encode_utf8 );
use Unicode::Collate::Locale;
binmode STDOUT, ':encoding(UTF-8)';
my @words = ("AAI Document Type", "Apple", "Zebra");
my $coll = Unicode::Collate::Locale->new(locale => "da");
my @result = $coll->sort(@words);
foreach my $item (@result) {
    print $item, "\n";
}
It outputs:
Apple
Zebra
AAI Document Type
Why does "AAI Document Type" go to the end? There seems to be something about "AA" that triggers this behavior.
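Answer 1:
AA is treated as a single letter in Danish, also written as Å. Because Å is the last letter of the Danish alphabet, a string beginning with AA sorts after everything beginning with Z under Danish collation.
Details here.
Obviously, in an abbreviation like AAI, treating the AA as Å isn't appropriate (it really is two A characters). I suppose the way to avoid that would be to use a different collation.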
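As a minimal sketch of the "different collation" idea, the script below sorts the same list twice: once with the Danish tailoring from the question, and once with a plain Unicode::Collate object, which uses the default, untailored ordering. Choosing the untailored collator is an assumption about what would be acceptable here, not something the answer prescribes. The variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use Unicode::Collate;
use Unicode::Collate::Locale;

binmode STDOUT, ':encoding(UTF-8)';

my @words = ("AAI Document Type", "Apple", "Zebra");

# Danish tailoring: "AA" is collated like "Å", which sorts after "Z".
my $danish       = Unicode::Collate::Locale->new(locale => "da");
my @danish_order = $danish->sort(@words);

# Default (untailored) collation: "AA" is simply two "A" characters.
my $default       = Unicode::Collate->new;
my @default_order = $default->sort(@words);

print "da:      ", join(", ", @danish_order),  "\n";
print "default: ", join(", ", @default_order), "\n";
```

With the default collator, "AAI Document Type" should come first, ahead of "Apple" and "Zebra". The trade-off is that the Danish-specific ordering is lost entirely, so words containing Æ, Ø, or Å would no longer sort after Z either.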
Source: https://stackoverflow.com/questions/15257974/what-is-it-about-aa-and-danish-collation