Android ICS 4.0 NDK NewStringUTF is crashing down the App

长情又很酷 2020-12-02 15:01

I have a JNI method in C/C++ that takes a jstring and returns a jstring, something like the following:

  NATIVE_CALL(jstring, method)(JNIEnv * env, jobject obj         


        
10 Answers
  •  再見小時候
    2020-12-02 15:24

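    The cause of this problem is directly related to a known UTF-8 issue in the JNI function GetStringUTFChars() (and probably related functions such as NewStringUTF()). These functions do not convert supplementary Unicode characters (those at U+10000 and above) to standard UTF-8. They work in the Modified UTF-8 encoding defined by the JNI specification, which writes such a character as a surrogate pair of two 3-byte sequences instead of a single 4-byte sequence. Code that expects standard UTF-8 sees invalid data, and crashes follow.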

    I encountered the crash when handling user input text that contained emoticon characters (see the corresponding Unicode chart). Emoticon characters lie in the Supplementary Unicode character range.

    Analysis of the Problem

    1. The Java client passes a string containing a supplementary Unicode character to JNI/NDK.
    2. JNI uses the NDK function GetStringUTFChars() to extract the contents of the Java string.
    3. GetStringUTFChars() returns the string data as incorrect and invalid UTF-8.

    There is a known NDK bug whereby GetStringUTFChars() incorrectly converts supplementary Unicode characters, producing an incorrect and invalid UTF-8 sequence.
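    To make the invalid sequence concrete, here is a minimal sketch, assuming the bytes in question are the Modified UTF-8 form that the JNI specification defines for these functions. The function names are hypothetical and nothing here calls JNI; the code only contrasts the two encodings of a single code point.

```cpp
#include <string>

// Standard UTF-8: one sequence of 1 to 4 bytes per code point.
// (Hypothetical helper for illustration, not part of JNI or the NDK.)
std::string encode_standard_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Modified UTF-8 as described by the JNI specification:
//  - U+0000 is written as the two bytes C0 80;
//  - a supplementary character is split into a UTF-16 surrogate pair
//    and each surrogate is written as its own 3-byte sequence.
std::string encode_modified_utf8(char32_t cp) {
    if (cp == 0) return std::string("\xC0\x80", 2);
    if (cp < 0x10000) return encode_standard_utf8(cp);
    char32_t v = cp - 0x10000;
    char32_t high = 0xD800 + (v >> 10);    // high (lead) surrogate
    char32_t low = 0xDC00 + (v & 0x3FF);   // low (trail) surrogate
    return encode_standard_utf8(high) + encode_standard_utf8(low);
}
```

    For U+1F600 (an emoticon), the standard form is the four bytes F0 9F 98 80, while the Modified form is the six bytes ED A0 BD ED B8 80. A consumer that expects standard UTF-8 treats the second form as malformed.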

    In my case, the resulting string was a JSON buffer. When the buffer was passed to the JSON parser, the parser promptly failed because one of the characters in the extracted string had an invalid UTF-8 prefix byte.
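    The kind of check a strict parser applies can be sketched as follows. This is an illustrative validator under my own assumptions, not the code of any particular JSON library, and the function name is hypothetical. It rejects, among other things, the 3-byte encoded surrogates that appear when a supplementary character is not written as a single 4-byte sequence.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns true only if `s` is well-formed standard UTF-8:
// valid lead bytes, correct continuation bytes, no overlong forms,
// no encoded surrogates, and nothing above U+10FFFF.
bool is_valid_standard_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        uint8_t b = static_cast<uint8_t>(s[i]);
        std::size_t extra;  // continuation bytes expected after the lead byte
        uint32_t cp;        // code point being assembled
        uint32_t min;       // smallest code point allowed for this length
        if (b < 0x80)                { extra = 0; cp = b;        min = 0; }
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (s.size() - i <= extra) return false;  // sequence is truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            uint8_t c = static_cast<uint8_t>(s[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min) return false;                      // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // encoded surrogate
        if (cp > 0x10FFFF) return false;                 // outside Unicode
        i += extra + 1;
    }
    return true;
}
```

    Running a check like this on the extracted buffer before handing it to the parser turns a crash deep inside the parser into an early, explainable failure.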

    Possible Workaround

    The solution I've used can be summarized as follows:

    1. The goal is to prevent GetStringUTFChars() from performing the incorrect UTF-8 encoding of the supplementary Unicode character.
    2. This is done by the Java client encoding the request string as Base64.
    3. The Base64-encoded request is passed to JNI.
    4. JNI calls GetStringUTFChars(), which extracts the Base64-encoded string unchanged: Base64 text is pure ASCII, so it contains no supplementary characters for the conversion to mangle.
    5. The JNI code then decodes the Base64 data, producing the original UTF-16 (wide char) request string, including the supplementary Unicode character.

    In this way we circumvent the problem of extracting supplementary Unicode characters from the Java string. Instead, we convert the data to Base64 ASCII before calling GetStringUTFChars(), extract the Base64 ASCII characters using GetStringUTFChars(), and convert the Base64 data back to wide characters.
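    The native half of this workaround might look like the sketch below. It rests on assumptions that are not stated above: that the Java client encodes the string's UTF-16 big-endian bytes (for example via getBytes("UTF-16BE")) and uses the standard Base64 alphabet with '=' padding. The helper names are hypothetical, and the JNI calls are mentioned only in comments so that the code stands alone.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Decode standard-alphabet Base64 (RFC 4648, '=' padding) into raw bytes.
// In the JNI method, `in` would be the ASCII text returned by
// GetStringUTFChars() for the Base64-encoded jstring.
std::vector<uint8_t> base64_decode(const std::string& in) {
    std::vector<uint8_t> out;
    uint32_t acc = 0;  // bit accumulator
    int bits = 0;      // number of not-yet-emitted bits held in acc
    for (char c : in) {
        int v;
        if (c >= 'A' && c <= 'Z') v = c - 'A';
        else if (c >= 'a' && c <= 'z') v = c - 'a' + 26;
        else if (c >= '0' && c <= '9') v = c - '0' + 52;
        else if (c == '+') v = 62;
        else if (c == '/') v = 63;
        else if (c == '=') break;  // padding marks the end of the data
        else throw std::invalid_argument("not a Base64 character");
        acc = (acc << 6) | static_cast<uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<uint8_t>((acc >> bits) & 0xFF));
        }
    }
    return out;
}

// Reassemble big-endian byte pairs into UTF-16 code units.
// Assumption: the Java side produced the bytes as UTF-16BE.
std::u16string utf16be_to_units(const std::vector<uint8_t>& bytes) {
    if (bytes.size() % 2 != 0) throw std::invalid_argument("odd byte count");
    std::u16string units;
    for (std::size_t i = 0; i < bytes.size(); i += 2) {
        units.push_back(static_cast<char16_t>((bytes[i] << 8) | bytes[i + 1]));
    }
    // A reply could be returned to Java with the JNI function NewString(),
    // which takes UTF-16 code units and a length rather than UTF-8 bytes.
    return units;
}
```

    Because the string only ever crosses the JNI boundary as ASCII or as raw UTF-16 code units, the UTF-8 conversion never sees a supplementary character. The cost is roughly a 33% size increase from Base64, on top of using two bytes per UTF-16 code unit.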
