Character Encoding independent character swap

Deadly 提交于 2019-12-23 01:38:10

问题


I like to use this piece of code when I want to reverse a string. [When I am not using std::string or other inbuilt functions in C]. As a beginner when I initially thought of this I had ASCII table in mind. I think this can work well with Unicode too. I assumed since the difference in values (ASCII etc) is fixed, so it works.

Are there any character encodings in which this code may not work?

char a[11],t;
int len,i;
strcpy(a,"Particl");    
printf("%s\n",a);
len = strlen(a);
for(i=0;i<(len/2);i++)
{
    a[i] += a[len-1-i];
    a[len-1-i] = a[i] - a[len-1-i];
    a[i] -= a[len-1-i];
}
printf("%s\n",a);

Update:

This link is informative in association with this question.


回答1:


This will not work with any encoding in which some (not necessarily all) codepoints require more than one char unit to represent, because you are reversing byte-by-byte instead of codepoint-by-codepoint. For the usual 8-bit char this includes all encodings that can represent all of Unicode.

For example: in UTF-16BE, the string "hello" maps to the byte sequence 00 68 00 65 00 6c 00 6c 00 6f. Your algorithm applied to this byte sequence will produce the sequence 6f 00 6c 00 6c 00 65 00 68 00, which is the UTF-16BE encoding of the string "漀氀氀攀栀".

It gets worse -- doing a codepoint-by-codepoint reversal of a Unicode string still won't produce the correct results in all cases, because Unicode has many codepoints that act on their surroundings rather than standing alone as characters. As a trivial example, codepoint-reversing the string "Spın̈al Tap", which contains U+0308 COMBINING DIAERESIS, will produce "paT länıpS" -- see how the diaeresis has migrated from the N to the A? The consequences of codepoint-by-codepoint reversal on a string containing bidirectional overrides or conjoining jamo would be even more dire.



来源:https://stackoverflow.com/questions/16547194/character-encoding-independent-character-swap

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!