How can I tell if a string contains multibyte characters in JavaScript?

没有蜡笔的小新 2020-12-04 17:18

Is it possible in JavaScript to detect whether a string contains multibyte characters? If so, is it possible to tell which ones?

The problem I'm running into is this (ap

1 Answer
  • 2020-12-04 17:26

    JavaScript strings are sequences of 16-bit UTF-16 code units (often loosely described as UCS-2). Code points in the Basic Multilingual Plane (U+0000 - U+D7FF and U+E000 - U+FFFF) take a single code unit. Code points outside it are represented by two code units (a UTF-16 surrogate pair): the first must be in the range U+D800 - U+DBFF and the second in the range U+DC00 - U+DFFF.
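
    To make the encoding concrete, here is a minimal sketch of the surrogate-pair arithmetic. The helper name toSurrogatePair is hypothetical (it is not a built-in), and the code assumes an ES2015+ environment. It splits a code point above U+FFFF into its two 16-bit code units:

```javascript
// Hypothetical helper (not a built-in): split a code point above U+FFFF
// into its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
    const offset = codePoint - 0x10000;      // 20-bit value
    const high = 0xD800 + (offset >> 10);    // top 10 bits, lands in U+D800..U+DBFF
    const low = 0xDC00 + (offset & 0x3FF);   // low 10 bits, lands in U+DC00..U+DFFF
    return [high, low];
}

const clef = "\uD834\uDD1E"; // U+1D11E MUSICAL SYMBOL G CLEF

console.log(clef.length); // 2, because length counts code units, not characters
console.log(toSurrogatePair(0x1D11E).map(n => n.toString(16)).join(" ")); // d834 dd1e
```

    The single character U+1D11E therefore occupies two positions in the string, which is why length reports 2.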

    Based on this, it's easy to detect whether a string contains any characters that lie outside the Basic Multilingual Plane, which is what I think you're asking: you want to know whether a string contains any characters outside the range of code points that JavaScript represents as a single character. Note that the test matches any surrogate code unit, so it also returns true for an unpaired (lone) surrogate:

    function containsSurrogatePair(str) {
        return /[\uD800-\uDFFF]/.test(str);
    }
    
    alert( containsSurrogatePair("foo") ); // false
    alert( containsSurrogatePair("f
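
    For the second half of the question (telling which characters they are), one possible approach is to walk the string by code point rather than by code unit. This is only a sketch under assumptions: the helper name findAstralCharacters is made up for illustration, and it relies on ES2015+ features (for...of over strings and codePointAt).

```javascript
// Hypothetical helper: list the characters in str that need a surrogate pair.
function findAstralCharacters(str) {
    const found = [];
    // for...of iterates a string by code point, so a well-formed
    // surrogate pair arrives as a single two-unit string.
    for (const ch of str) {
        const codePoint = ch.codePointAt(0);
        if (codePoint > 0xFFFF) {
            found.push({
                char: ch,
                codePoint: "U+" + codePoint.toString(16).toUpperCase()
            });
        }
    }
    return found;
}

console.log(findAstralCharacters("foo").length);            // 0
console.log(findAstralCharacters("a\uD834\uDD1Eb")[0].codePoint); // U+1D11E
```

    Unlike the regular expression test, this sketch ignores lone surrogates, because an unpaired surrogate is iterated as a single code unit whose value is below U+10000.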