Remove Unicode characters in a String

一笑奈何 提交于 2019-11-27 07:08:29

问题


How do I remove all special characters which don't fall under ASCII category in VBA?

These are some of the symbols which appear in my string.

Œ œ Š š Ÿ ƒ

There are many more such characters.

These don't belong to ASCII category as you can see here http://www.ascii.cl/htmlcodes.htm

I tried something like this

strName = Replace(strName, ChrW(376), " ")

回答1:


Would a RegEx solution be of interest to you?

There are plenty of examples for different languages on this site - here's a C# one: How can you strip non-ASCII characters from a string? (in C#).

Try this for VBA:

Private Function GetStrippedText(txt As String) As String
    Dim regEx As Object

    Set regEx = CreateObject("vbscript.regexp")
    regEx.Pattern = "[^\u0000-\u007F]"
    GetStrippedText = regEx.Replace(txt, "")

End Function



回答2:


What do you get when you write the following in the immediate window?

?Replace("ŸŸŸŸ", ChrW(376), "ale")

I get: alealealeale




回答3:


Try with below

Function ClearUnwantedString(fulltext As String) As String
    Dim output As String
    Dim character As String
    For i = 1 To Len(fulltext)
        character = Mid(fulltext, i, 1)
        If (character >= "a" And character <= "z") Or (character >= "0" And character <= "9") Or (character >= "A" And character <= "Z") Then
            output = output & character
        End If
    Next
    ClearUnwantedString = output
End Function

Sub test()
    a = ClearUnwantedString("dfjŒœŠdskl")
End Sub



回答4:


try application.clean()

it will remove all unprintable characters




回答5:


Supposing you have:

Then the following code will get the String from A1 and let through only ANSI (code 0 to 255) in A2.

Sub test()
 Dim s1 As String, s2 As String, c As String, i As Long, iAsc As Integer

 s1 = Range("A1").Value

 s2 = ""

 For i = 1 To Len(s1)
  c = Mid(s1, i, 1)
  iAsc = AscW(c)
  If iAsc <= 255 Then
   s2 = s2 & c
  End If
 Next

 Range("A2").Value = s2

End Sub



回答6:


Don't need to loop each character

Maybe late, but maybe it helps someone:

Public Function StripNonAsciiChars(ByVal InputString As String) As String
    Dim i As Integer
    Dim RegEx As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        .Global = True
        .MultiLine = True
        .IgnoreCase = True
        .Pattern = "[^\u0000-\u007F]"
        StripNonAsciiChars = Application.WorksheetFunction.Trim(RegEx.Replace(InputString, " "))
    End With
End Function


来源:https://stackoverflow.com/questions/37024107/remove-unicode-characters-in-a-string

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!