Converting Unicode string to ASCII

余生颓废 提交于 2019-12-23 06:13:08

问题


I have strings containing characters which are not found in ASCII; such as á, é, í, ó, ú; and I need a function to convert them into something acceptable such as a, e, i, o, u. This is because I will be creating IIS web sites from those strings (i.e. I will be using them as domain names).


回答1:


function Convert-DiacriticCharacters {
    param(
        [string]$inputString
    )
    [string]$formD = $inputString.Normalize(
            [System.text.NormalizationForm]::FormD
    )
    $stringBuilder = new-object System.Text.StringBuilder
    for ($i = 0; $i -lt $formD.Length; $i++){
        $unicodeCategory = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($formD[$i])
        $nonSPacingMark = [System.Globalization.UnicodeCategory]::NonSpacingMark
        if($unicodeCategory -ne $nonSPacingMark){
            $stringBuilder.Append($formD[$i]) | out-null
        }
    }
    $stringBuilder.ToString().Normalize([System.text.NormalizationForm]::FormC)
}

The resulting function will convert diacritics in the follwoing way:

PS C:\> Convert-DiacriticCharacters "Ångström"
Angstrom
PS C:\> Convert-DiacriticCharacters "Ó señor"
O senor

Copied from: http://cosmoskey.blogspot.nl/2009/09/powershell-function-convert.html




回答2:


Taking this answer from a C#/.Net question it seems to work in PowerShell ported roughly like this:

function Remove-Diacritics
{
    Param([string]$Text)


    $chars = $Text.Normalize([System.Text.NormalizationForm]::FormD).GetEnumerator().Where{ 

        [System.Char]::GetUnicodeCategory($_) -ne [System.Globalization.UnicodeCategory]::NonSpacingMark

    }


    (-join $chars).Normalize([System.Text.NormalizationForm]::FormC)

}

e.g.

PS C:\> Remove-Diacritics 'abcdeéfg'
abcdeefg


来源:https://stackoverflow.com/questions/46658267/converting-unicode-string-to-ascii

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!