How to remove accents and tilde in a C++ std::string

半阙折子戏 2020-12-15 21:26

I have a problem with a string in C++ which has several words in Spanish. This means that I have a lot of words with accents and tildes. I want to replace them for their not

8 Answers
  •  难免孤独
    2020-12-15 22:14

    First, this is a really bad idea: you’re mangling somebody’s language by removing letters. Although the extra dots in words like “naïve” seem superfluous to people who only speak English, there are literally thousands of writing systems in the world in which such distinctions are very important. Writing software to mutilate someone’s speech puts you squarely on the wrong side of the tension between using computers as a means to broaden the realm of human expression and using them as tools of oppression.

    What is the reason you’re trying to do this? Is something further down the line choking on the accents? Many people would love to help you solve that.

    That said, libicu can do this for you. Open the transform demo; copy and paste your Spanish text into the “Input” box; enter

    NFD; [:M:] remove; NFC
    

    as “Compound 1” and click transform.

    (With help from slide 9 of Unicode Transforms in ICU. Slides 29-30 show how to use the API.)
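
    To make the rule above concrete: NFD splits a letter such as “ó” into a base letter plus a combining mark, `[:M:] remove` drops the mark, and NFC recomposes whatever is left. In ICU that rule string is the kind of ID you would hand to `icu::Transliterator::createInstance`. The sketch below is not ICU. It is a hand-written stand-in using only the standard library, assuming the input is valid UTF-8 and contains only Spanish accented letters. The function name `strip_spanish_accents` and the lookup table are made up for illustration, and anything outside the table passes through unchanged.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper (not part of ICU): strips accents from UTF-8 Spanish text.
// It only knows the letters in the table below; it is not a general Unicode solution.
std::string strip_spanish_accents(const std::string& utf8) {
    // Precomposed letter (as UTF-8 bytes) -> base letter. For these letters this is
    // the net effect of "decompose, then drop the combining mark".
    static const std::unordered_map<std::string, char> base = {
        {"\xC3\xA1", 'a'}, {"\xC3\xA9", 'e'}, {"\xC3\xAD", 'i'},
        {"\xC3\xB3", 'o'}, {"\xC3\xBA", 'u'}, {"\xC3\xBC", 'u'},
        {"\xC3\xB1", 'n'},
        {"\xC3\x81", 'A'}, {"\xC3\x89", 'E'}, {"\xC3\x8D", 'I'},
        {"\xC3\x93", 'O'}, {"\xC3\x9A", 'U'}, {"\xC3\x9C", 'U'},
        {"\xC3\x91", 'N'},
    };

    std::string out;
    out.reserve(utf8.size());
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        if ((lead == 0xC3 || lead == 0xCC || lead == 0xCD) && i + 1 < utf8.size()) {
            const unsigned char next = static_cast<unsigned char>(utf8[i + 1]);
            // Input that is already decomposed: skip combining diacritical marks
            // U+0300..U+036F, which encode as CC 80..CC BF and CD 80..CD AF.
            if (lead == 0xCC || (lead == 0xCD && next <= 0xAF)) {
                i += 2;
                continue;
            }
            // Precomposed letter from the table: emit its base letter instead.
            const auto it = base.find(utf8.substr(i, 2));
            if (it != base.end()) {
                out += it->second;
                i += 2;
                continue;
            }
        }
        out += utf8[i];  // anything else is copied through byte by byte
        ++i;
    }
    return out;
}
```

    With this sketch, `strip_spanish_accents("canci\xC3\xB3n")` returns `"cancion"`. For anything beyond Spanish, a real normalizer such as ICU is the safer route, since a table like this cannot keep up with the full Unicode mark category.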
