icu4c--> ushape.c missing character in shaping?

℡╲_俬逩灬. posted on 2019-12-10 10:41:05

Question


In our language we write with Arabic characters, with some differences. ICU's ushape.c (the Arabic shaper) only handles the main Arabic characters and doesn't shape my language-specific characters (e.g. 0x6D5). I've changed ushape.c to work with my language, and it works well except for one character, 0x649: in Arabic it has only 2 shapes, but in my language it has 4.

I changed line 183 from

1                + 256 * 0x7F,/*0x0649*/

to

1+2+8             + 256 * 0x98 /*0x649*/

and changed line 121 from

static const UChar yehHamzaToYeh[] =
{
/* isolated*/ 0xFEEF,
/* final   */ 0xFEF0
};

to

static const UChar yehHamzaToYeh[] =
{
/* isolated*/ 0xFEEF,
              0xFBE8, // my language specific
              0xFBE9, // my language specific
/* final   */ 0xFEF0
};

(both in ushape.c).

Now it produces 3 shapes with no problem (initial, isolated and final), but the medial shape is displayed as a square (missing character).

I tried replacing "* 0x98" with other numbers, but this is the best I can get.

What should I do?
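
A minimal sketch of how the two table entries above might decode. It assumes, and this is an assumption about ushape.c rather than something stated in the question, that the low byte of an entry holds link flags, that flag 8 selects the Arabic Presentation Forms-A block (U+FB50) instead of Forms-B (U+FE70), and that the high byte is an offset added to that base together with a shape index. The names `shaped_code_point`, `LINK_RIGHT`, `LINK_LEFT` and `A_PRESENT` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed meanings of the low-byte flags in a link-table entry. */
enum {
    LINK_RIGHT = 1,  /* letter joins to the preceding letter */
    LINK_LEFT  = 2,  /* letter joins to the following letter */
    A_PRESENT  = 8   /* forms are taken from Presentation Forms-A */
};

/* First code points of the two Arabic presentation-form blocks. */
enum { FORMS_A_BASE = 0xFB50, FORMS_B_BASE = 0xFE70 };

/* Code point an entry would map to for shape index 0..3,
   under the decoding assumed above. */
static uint16_t shaped_code_point(uint16_t entry, unsigned shape_index)
{
    uint16_t base = (entry & A_PRESENT) ? FORMS_A_BASE : FORMS_B_BASE;
    assert(shape_index < 4);
    return (uint16_t)(base + (entry >> 8) + shape_index);
}
```

Under that assumption the original entry `1 + 256 * 0x7F` gives 0xFE70 + 0x7F = 0xFEEF, and the modified entry `1+2+8 + 256 * 0x98` gives 0xFB50 + 0x98 = 0xFBE8, which matches the code points used in the question. It would also mean one entry can only reach four consecutive code points, while the four forms wanted for 0x649 sit in two separate ranges (0xFBE8, 0xFBE9 and 0xFEEF, 0xFEF0).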


Answer 1:


Uighur? I have discussed Uighur rendering with a couple of people, not this particular issue but in general.

When you say you get a square, which Unicode character do you get?
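
One way to find out is to dump the shaper's output buffer as hex instead of relying on how it renders. A minimal sketch, assuming the shaped text is available as 16-bit UTF-16 code units (ICU's UChar is a 16-bit type); the helper name `format_code_units` is hypothetical and not part of ICU.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes each UTF-16 code unit of `text` into `out` as "U+XXXX",
   separated by single spaces. Returns the number of characters
   written (excluding the terminating NUL), or -1 if `out` is too small. */
static int format_code_units(const uint16_t *text, size_t length,
                             char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';
    for (size_t i = 0; i < length; ++i) {
        int n = snprintf(out + used, out_size - used, "%sU+%04X",
                         i ? " " : "", (unsigned)text[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;  /* encoding error or output truncated */
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Printing the result for the shaped string shows exactly which code point sits where the square appears, which also tells you whether the font or the shaper is at fault.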

What you really should do is to file a bug with ICU and discuss it there. This is a feature request, not a usage question.

My rusty recollection is that Uighur makes different use of shaping, and you will basically want a different mode in the shaper.




Answer 2:


ICU indeed seems to have problems for shaping with some languages, e.g. Urdu.

Your specific character U+0649, however, is probably not the character you are looking for.

U+0649 is alef maksura, which looks identical to Farsi yeh (U+06CC), which ICU shapes properly.

They do have different presentation forms: alef maksura has only isolated and final forms (U+FEEF, U+FEF0), while Farsi yeh has all four (U+FBFC, U+FBFD, U+FBFE, U+FBFF).
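
The difference can be written down as a small lookup table. This is only an illustrative sketch: the types and the function `presentation_form` are hypothetical, not ICU API, and the form order (isolated, final, initial, medial) is an assumption chosen to match the order of the U+FBFC..U+FBFF range.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum {
    FORM_ISOLATED, FORM_FINAL, FORM_INITIAL, FORM_MEDIAL
} JoiningForm;

typedef struct {
    uint16_t base;      /* the letter as stored in text */
    uint16_t forms[4];  /* indexed by JoiningForm; 0 means no such form */
} FormEntry;

static const FormEntry kYehLikeForms[] = {
    /* alef maksura: isolated and final only, as listed above */
    { 0x0649, { 0xFEEF, 0xFEF0, 0x0000, 0x0000 } },
    /* Farsi yeh: all four forms */
    { 0x06CC, { 0xFBFC, 0xFBFD, 0xFBFE, 0xFBFF } },
};

/* Returns the presentation form for `base`, or 0 if the letter is
   not in the table or has no form for that joining position. */
static uint16_t presentation_form(uint16_t base, JoiningForm form)
{
    size_t count = sizeof kYehLikeForms / sizeof kYehLikeForms[0];
    for (size_t i = 0; i < count; ++i) {
        if (kYehLikeForms[i].base == base) {
            return kYehLikeForms[i].forms[form];
        }
    }
    return 0;
}
```

With this table, asking for the medial form of U+0649 yields nothing, while U+06CC yields U+FBFF, which is why text encoded with U+06CC gets a medial shape.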



Source: https://stackoverflow.com/questions/3855940/icu4c-ushape-c-missing-character-in-shaping
