What is it about AA and Danish Collation?

Posted by 孤街浪徒 on 2019-12-24 00:39:02

Question


I have the following Perl script, intended to show collation in Danish.

#!/usr/local/ActivePerl-5.16/bin/perl

use 5.014_001;
use utf8;
use Unicode::Collate;
use strict;
use warnings;
use Carp;
use Data::Dump;
use Encode qw( encode_utf8 );
use Unicode::Collate::Locale;


binmode STDOUT, ':encoding(UTF-8)';

my @words =("AAI Document Type", "Apple", "Zebra");

my $coll = Unicode::Collate::Locale->new(locale => "da");

my @result = $coll->sort(@words);


foreach my $item (@result) {
    print $item, "\n";
}

It outputs

Apple
Zebra
AAI Document Type

Why does "AAI Document Type" go to the end? There seems to be something about "AA" that triggers this behavior.


Answer 1:


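In Danish, AA is treated as a single letter, an older spelling of Å. Å is the last letter of the Danish alphabet, so the Danish collation sorts a string beginning with AA after everything that starts with Z.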

Details here.
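A quick way to check this is to compare strings at the primary level only, where case and accent differences are ignored. This is a minimal sketch; it assumes the installed Unicode::Collate::Locale ships the Danish ("da") tailoring, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Collate::Locale;

binmode STDOUT, ':encoding(UTF-8)';

# level => 1 compares primary (base letter) weights only.
my $da = Unicode::Collate::Locale->new(locale => 'da', level => 1);

# Should be 1 if the tailoring maps "AA" onto the same letter as "Å".
my $aa_is_aring = $da->eq('AA', 'Å') ? 1 : 0;

# Negative when "Z" sorts before "AA", as in the question's output.
my $z_vs_aa = $da->cmp('Z', 'AA');

print "AA eq Å at primary level: $aa_is_aring\n";
print "Z cmp AA: $z_vs_aa\n";
```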

Obviously in an abbreviation like AAI, treating the AA as Å isn't appropriate (it really is two A characters). I suppose the way to avoid that would be to use a different collation.
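As a sketch of what "a different collation" could look like, the plain Unicode::Collate collator (no locale tailoring) compares AA as two separate A letters. Whether that is acceptable depends on the data, so treat this as one option rather than the recommended fix.

```perl
use strict;
use warnings;
use Unicode::Collate;

my @words = ("AAI Document Type", "Apple", "Zebra");

# No locale tailoring: "AA" is compared as two ordinary "A" letters.
my $root = Unicode::Collate->new;
my @sorted = $root->sort(@words);

print "$_\n" for @sorted;
```

With this collator the list comes out as AAI Document Type, Apple, Zebra. The trade-off is that the Danish-specific ordering is lost as well: words that really do begin with Æ, Ø, Å or a Danish AA no longer sort after Z.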



Source: https://stackoverflow.com/questions/15257974/what-is-it-about-aa-and-danish-collation
