I was wondering whether anyone has decoded UTF-8 in VB6. I'm having a problem where characters with ANSI codes of 127 and greater are not being decoded properly.
For instance gets decoded into
and I'm not sure why.
Here's what I've done. Use MultiByteToWideChar, as Comintern suggested:
Private Const CP_UTF8 As Long = 65001 ' UTF-8 code page

' System call to convert a multi-byte string to wide (UTF-16) characters
Private Declare Function MultiByteToWideChar Lib "KERNEL32" ( _
    ByVal CodePage As Long, _
    ByVal dwFlags As Long, _
    ByVal lpMultiByteStr As Long, _
    ByVal cchMultiByte As Long, _
    ByVal lpWideCharStr As Long, _
    ByVal cchWideChar As Long) As Long
Note that I've specified code page 65001 (CP_UTF8), which tells Windows to treat the input bytes as UTF-8.
Next, here is my decode function, which I've called DecodeURI:
'------------------------------------------------------------------
' NAME:         DecodeURI (PRIVATE)
' DESCRIPTION:  Decodes a percent-encoded, UTF-8 encoded string
' CALLED BY:    HandleNavigate
' PARAMETERS:
'  EncodedURI   (I,REQ) - the UTF-8 encoded string to decode
' RETURNS:      the decoded string
'------------------------------------------------------------------
Private Function DecodeURI(ByVal EncodedURI As String) As String
    Dim bANSI() As Byte
    Dim bUTF8() As Byte
    Dim lIndex As Long
    Dim lUTFIndex As Long

    If Len(EncodedURI) = 0 Then
        Exit Function
    End If

    EncodedURI = Replace$(EncodedURI, "+", " ")     ' "+" encodes a space in form-style URL encoding
    bANSI = StrConv(EncodedURI, vbFromUnicode)      ' Convert from Unicode text to ANSI bytes
    ReDim bUTF8(UBound(bANSI))                      ' Output is never longer than the input
    For lIndex = 0 To UBound(bANSI)
        If bANSI(lIndex) = &H25 Then                ' ASCII 37 is "%": two hex digits follow
            bUTF8(lUTFIndex) = Val("&H" & Mid$(EncodedURI, lIndex + 2, 2)) ' Convert the hex digits to one byte
            lIndex = lIndex + 2                     ' Skip the two hex digits just consumed
        Else
            bUTF8(lUTFIndex) = bANSI(lIndex)        ' Otherwise copy the byte unchanged
        End If
        lUTFIndex = lUTFIndex + 1                   ' Advance the output index
    Next
    DecodeURI = FromUTF8(bUTF8, lUTFIndex)          ' Convert the UTF-8 bytes to a string
End Function
And converting from UTF-8 using the system call:
'------------------------------------------------------------------
' NAME:         FromUTF8 (PRIVATE)
' DESCRIPTION:  Uses the system call MultiByteToWideChar to
'               convert UTF-8 bytes, including multi-byte
'               sequences, and returns the whole string
' CALLED BY:    DecodeURI
' PARAMETERS:
'  UTF8         (I,REQ) - the UTF-8 encoded bytes to convert
'  Length       (I,REQ) - number of bytes in UTF8 to convert
' RETURNS:      the decoded string
'------------------------------------------------------------------
Private Function FromUTF8(ByRef UTF8() As Byte, ByVal Length As Long) As String
    Dim lDataLength As Long

    lDataLength = MultiByteToWideChar(CP_UTF8, 0, VarPtr(UTF8(0)), Length, 0, 0) ' Get the required length in characters
    FromUTF8 = String$(lDataLength, 0)                                           ' Allocate a string big enough
    MultiByteToWideChar CP_UTF8, 0, VarPtr(UTF8(0)), _
                        Length, StrPtr(FromUTF8), lDataLength                    ' Decode into the string buffer
End Function
Hope that helps! I tested it with your character and it appeared to work (as all characters should).
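For anyone wondering why bytes of 128 and above can't be mapped one-to-one: in UTF-8 those bytes are always part of a multi-byte sequence, so several bytes have to be combined into a single character. Below is a minimal sketch of that combination for the two-byte case only. DecodeTwoByteUTF8 is a hypothetical helper written for illustration, not part of the code above, and it assumes valid input without checking it.

```vb
' Hypothetical helper (illustration only): combines a two-byte UTF-8
' sequence into one code point.
' Assumes Lead is in &HC2..&HDF and Trail is in &H80..&HBF; no validation is done.
Private Function DecodeTwoByteUTF8(ByVal Lead As Byte, ByVal Trail As Byte) As Long
    ' Lead byte  110xxxxx -> keep the low 5 bits, shift left by 6 (multiply by 64)
    ' Trail byte 10yyyyyy -> keep the low 6 bits
    DecodeTwoByteUTF8 = (Lead And &H1F) * 64& + (Trail And &H3F)
End Function
```

For example, %C3%A9 is the UTF-8 encoding of U+00E9, and Hex$(DecodeTwoByteUTF8(&HC3, &HA9)) gives "E9". In real code MultiByteToWideChar is the safer choice, since it also handles three- and four-byte sequences.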
Public Function UTF8ENCODE(ByVal sStr As String) As String
    Dim L&, lChar&, sUtf8$
    For L& = 1 To Len(sStr)
        ' AscW returns a negative value above &H7FFF, so mask to 0..65535
        lChar& = AscW(Mid$(sStr, L&, 1)) And &HFFFF&
        If lChar& < 128 Then
            ' 1 byte: plain ASCII
            sUtf8$ = sUtf8$ + Mid$(sStr, L&, 1)
        ElseIf ((lChar& > 127) And (lChar& < 2048)) Then
            ' 2 bytes: 110xxxxx 10xxxxxx
            sUtf8$ = sUtf8$ + Chr(((lChar& \ 64) Or 192))
            sUtf8$ = sUtf8$ + Chr(((lChar& And 63) Or 128))
        Else
            ' 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            sUtf8$ = sUtf8$ + Chr(((lChar& \ 4096) Or 224))
            sUtf8$ = sUtf8$ + Chr((((lChar& \ 64) And 63) Or 128))
            sUtf8$ = sUtf8$ + Chr(((lChar& And 63) Or 128))
        End If
    Next L&
    UTF8ENCODE = sUtf8$
End Function
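If you would rather not do the bit arithmetic by hand, the Windows API also offers WideCharToMultiByte, the counterpart of MultiByteToWideChar. Here is a rough sketch of encoding with it. ToUTF8Bytes is a name I made up for illustration, and it returns a Byte array instead of a String, which is an assumption about what the caller wants. If you paste it into a module that already declares CP_UTF8, drop the duplicate constant.

```vb
Private Const CP_UTF8 As Long = 65001 ' UTF-8 code page

' System call to convert wide (UTF-16) characters to a multi-byte string
Private Declare Function WideCharToMultiByte Lib "kernel32" ( _
    ByVal CodePage As Long, _
    ByVal dwFlags As Long, _
    ByVal lpWideCharStr As Long, _
    ByVal cchWideChar As Long, _
    ByVal lpMultiByteStr As Long, _
    ByVal cbMultiByte As Long, _
    ByVal lpDefaultChar As Long, _
    ByVal lpUsedDefaultChar As Long) As Long

' Hypothetical helper (illustration only): returns the UTF-8 bytes of Text.
' Returns an unallocated array for an empty string or on failure.
Private Function ToUTF8Bytes(ByVal Text As String) As Byte()
    Dim lSize As Long
    Dim bOut() As Byte

    If Len(Text) = 0 Then Exit Function

    ' First call: ask how many bytes the UTF-8 form needs.
    ' The last two arguments must be 0 (NULL) when the code page is CP_UTF8.
    lSize = WideCharToMultiByte(CP_UTF8, 0, StrPtr(Text), Len(Text), 0, 0, 0, 0)
    If lSize = 0 Then Exit Function

    ' Second call: write the bytes into the buffer
    ReDim bOut(lSize - 1)
    WideCharToMultiByte CP_UTF8, 0, StrPtr(Text), Len(Text), VarPtr(bOut(0)), lSize, 0, 0
    ToUTF8Bytes = bOut
End Function
```

Unlike the manual loop, this also handles surrogate pairs (characters outside the Basic Multilingual Plane), which should come out as four-byte sequences.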