问题
I have this code and I still can't seem to replace non English characters like Vietnamese or Thai from my data with a simple "placeholder".
Sub NonLatin()
Dim cell As Range
For Each cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
s = cell.Value
For i = 1 To Len(s)
If Mid(s, i, 1) Like "[!A-Za-z0-9@#$%^&* * ]" Then cell.Value = "placeholder"
Next
Next
End Sub
Appreciate your help
回答1:
You can replace any chars that are out of e. g. ASCII range (first 128 chars) with placeholder using the below code:
Option Explicit
Sub Test()
Dim oCell As Range
With CreateObject("VBScript.RegExp")
.Global = True
.Pattern = "[^u0000-u00F7]"
For Each oCell In [A1:C4]
oCell.Value = .Replace(oCell.Value, "*")
Next
End With
End Sub
回答2:
See this question for details about using Regular Expressions in your VBA code.
Then use regular expressions in a function like this one to process strings. Here I am assuming you want to replace each invalid character with a placeholder, rather than the entire string. If it's the entire string then you don't need to do individual character checks, you can simply use the + or * qualifiers for multiple characters in your Regular Expression's pattern, and test the entire string together.
Function LatinString(str As String) As String
' After including a reference to "Microsoft VBScript Regular Expressions 5.5"
' Set up the regular expressions object
Dim regEx As New RegExp
With regEx
.Global = True
.MultiLine = True
.IgnoreCase = False
' This is the pattern of ALLOWED characters.
' Note that special characters should be escaped using a slash e.g. \$ not $
.Pattern = "[A-Za-z0-9]"
End With
' Loop through characters in string. Replace disallowed characters with "?"
Dim i As Long
For i = 1 To Len(str)
If Not regEx.Test(Mid(str, i, 1)) Then
str = Left(str, i - 1) & "?" & Mid(str, i + 1)
End If
Next i
' Return output
LatinString = str
End Function
You can use this in your code by
Dim cell As Range
For Each cell In Range("A1", Cells(Rows.Count, "A").End(xlUp))
cell.Value = LatinString(cell.Value)
Next
For a byte-level method which converts a Unicode string to a UTF8 string, without using Regular Expressions, check out this article
来源:https://stackoverflow.com/questions/45544464/how-to-write-a-vba-code-to-remove-and-replace-utf8-characters