How to transliterate Cyrillic to Latin text

情话喂你 2020-12-05 05:01

I have a method which turns any Latin text (e.g. English, French, German, Polish) into its slug form,

e.g. Alpha Bravo Charlie => alpha-bravo-char

10 answers
  •  南笙 (original poster)
    2020-12-05 05:48

    You can use UnidecodeSharpFork, an open-source .NET library, to transliterate Cyrillic and many other scripts to Latin.

    Example usage:

    Assert.AreEqual("Rabota s kirillitsey", "Работа с кириллицей".Unidecode());
    Assert.AreEqual("CZSczs", "ČŽŠčžš".Unidecode());
    Assert.AreEqual("Hello, World!", "Hello, World!".Unidecode());
    

    Testing Cyrillic:

    /// <summary>
    /// According to http://en.wikipedia.org/wiki/Romanization_of_Russian BGN/PCGN.
    /// http://en.wikipedia.org/wiki/BGN/PCGN_romanization_of_Russian
    /// With "ё" converted to "yo".
    /// </summary>
    [TestMethod]
    public void RussianAlphabetTest()
    {
        string russianAlphabetLowercase = "а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я";
        string russianAlphabetUppercase = "А Б В Г Д Е Ё Ж З И Й К Л М Н О П Р С Т У Ф Х Ц Ч Ш Щ Ъ Ы Ь Э Ю Я";
    
        string expectedLowercase = "a b v g d e yo zh z i y k l m n o p r s t u f kh ts ch sh shch \" y ' e yu ya";
        string expectedUppercase = "A B V G D E Yo Zh Z I Y K L M N O P R S T U F Kh Ts Ch Sh Shch \" Y ' E Yu Ya";
    
        Assert.AreEqual(expectedLowercase, russianAlphabetLowercase.Unidecode());
        Assert.AreEqual(expectedUppercase, russianAlphabetUppercase.Unidecode());
    }
    

    Simple, fast, and powerful. It is also easy to extend or modify the transliteration table if you need to.
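
To connect the answer back to the slug use case in the question, here is a minimal, self-contained sketch of table-based transliteration followed by slug generation. It does not use UnidecodeSharpFork: the class `SlugSketch` and its methods `Transliterate` and `Slugify` are hypothetical names, and the table is a hand-written copy of the lowercase Russian mappings from the test above. Dropping the hard and soft signs (instead of emitting `"` and `'`) is an assumption made here because quotes are not useful in a slug.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of UnidecodeSharpFork.
public static class SlugSketch
{
    // Lowercase Russian letters only, mirroring the expected values in the test above.
    // Assumption: the hard sign and soft sign map to an empty string.
    private static readonly Dictionary<char, string> Table = new Dictionary<char, string>
    {
        ['а'] = "a",  ['б'] = "b",  ['в'] = "v",  ['г'] = "g",  ['д'] = "d",
        ['е'] = "e",  ['ё'] = "yo", ['ж'] = "zh", ['з'] = "z",  ['и'] = "i",
        ['й'] = "y",  ['к'] = "k",  ['л'] = "l",  ['м'] = "m",  ['н'] = "n",
        ['о'] = "o",  ['п'] = "p",  ['р'] = "r",  ['с'] = "s",  ['т'] = "t",
        ['у'] = "u",  ['ф'] = "f",  ['х'] = "kh", ['ц'] = "ts", ['ч'] = "ch",
        ['ш'] = "sh", ['щ'] = "shch", ['ъ'] = "", ['ы'] = "y",  ['ь'] = "",
        ['э'] = "e",  ['ю'] = "yu", ['я'] = "ya"
    };

    // Lowercases each character, then replaces it if the table has an entry.
    // Characters without an entry are passed through unchanged (already lowercased).
    public static string Transliterate(string input)
    {
        var sb = new StringBuilder(input.Length * 2);
        foreach (char c in input)
        {
            char lower = char.ToLowerInvariant(c);
            if (Table.TryGetValue(lower, out string mapped))
                sb.Append(mapped);
            else
                sb.Append(lower);
        }
        return sb.ToString();
    }

    // Keeps ASCII letters and digits; every run of other characters becomes one hyphen.
    // Leading and trailing runs produce no hyphen.
    public static string Slugify(string input)
    {
        var sb = new StringBuilder();
        bool pendingHyphen = false;
        foreach (char c in Transliterate(input))
        {
            bool isAsciiAlnum = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
            if (isAsciiAlnum)
            {
                if (pendingHyphen && sb.Length > 0)
                    sb.Append('-');
                sb.Append(c);
                pendingHyphen = false;
            }
            else
            {
                pendingHyphen = true;
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, `SlugSketch.Slugify("Работа с кириллицей")` yields `rabota-s-kirillitsey`. In a project that references UnidecodeSharpFork, the `Transliterate` step could presumably be swapped for the library's `Unidecode()` extension method shown in the answer, which covers far more scripts than this small table.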
