icu4c: ushape.c missing character in shaping?

Question

In our language we write with Arabic characters, with some differences. ICU's ushape.c (the Arabic shaper) only works with the main Arabic characters and doesn't shape my language-specific characters (e.g. 0x6D5). I've changed ushape.c to work with my language, and it works well except for one character, 0x649: in Arabic it has only 2 shapes, but in my language it has 4.

I changed line 183 of ushape.c from

1                + 256 * 0x7F,/*0x0649*/

to

1+2+8             + 256 * 0x98, /*0x0649*/

and changed line 121 from

static const UChar yehHamzaToYeh[] =
{
/* isolated*/ 0xFEEF,
/* final   */ 0xFEF0
};

to

static const UChar yehHamzaToYeh[] =
{
    /* isolated */ 0xFEEF,
                   0xFBE8, // my language specific
                   0xFBE9, // my language specific
    /* final    */ 0xFEF0
};
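One way to read the two table entries is as a packed value, with flag bits in the low byte and an offset in the high byte. The sketch below decodes them under assumptions the post does not state: that flag 8 switches the base from 0xFE70 to 0xFB50, and that a shape index 0 to 3 (assumed order: isolated, final, initial, medial) is added to the result. The function and enum names are hypothetical and are not ushape.c identifiers.

```c
#include <stdint.h>

/* Hypothetical names for the low-byte flags seen in the table entries. */
enum {
    JOIN_FLAG_1 = 1, /* assumed: letter joins on one side */
    JOIN_FLAG_2 = 2, /* assumed: letter joins on the other side */
    FORMS_A     = 8  /* assumed: forms are in Arabic Presentation Forms-A */
};

/* Assumed decoding: base (0xFB50 or 0xFE70) + high-byte offset + shape index. */
static uint16_t shaped_code_point(uint16_t link, unsigned shape)
{
    uint16_t base = (link & FORMS_A) ? 0xFB50u : 0xFE70u;
    return (uint16_t)(base + (link >> 8) + shape);
}
```

Under those assumptions the original entry (1 + 256 * 0x7F) gives 0xFEEF and 0xFEF0 for shapes 0 and 1, which matches the yehHamzaToYeh values, while the modified entry (1+2+8 + 256 * 0x98) gives four consecutive code points starting at 0xFBE8. If the shaper really works this way, the medial case would land on 0xFBEB rather than 0xFBE9, so it may be worth checking which code point the output actually contains.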

Now it produces three shapes without problems (initial, isolated, and final), but the medial shape is displayed as a square (missing character).

I tried replacing "* 0x98" with other numbers, but this is the best I can get.

What should I do?

Answer 1:

Uighur? I have discussed Uighur rendering with a couple of people, not this particular issue but in general.

When you say you get a square, which Unicode character do you actually get?
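To answer that, it helps to dump the shaped buffer as code points instead of looking at rendered glyphs. A minimal sketch in plain C, assuming the shaped text is available as an array of 16-bit UTF-16 code units (the helper name `format_code_units` is made up for this example and is not part of ICU):

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Write each UTF-16 code unit as "U+XXXX", separated by spaces, into out.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if the output buffer is too small. */
static int format_code_units(const uint16_t *text, size_t len,
                             char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        int n = snprintf(out + used, out_size - used, "%sU+%04X",
                         i ? " " : "", (unsigned)text[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Printing the result for the word that shows the square tells you whether the shaper emitted an unexpected code point or whether the font simply has no glyph for the expected one.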

What you really should do is to file a bug with ICU and discuss it there. This is a feature request, not a usage question.

My rusty recollection is that Uighur makes different use of shaping, and you will basically want a different mode on the shaper.

Answer 2:

ICU does indeed seem to have problems shaping some languages, e.g. Urdu.

However, your specific character U+0649 is probably not the character you are looking for.

U+0649 is alef maksura, which looks identical to Farsi yeh (U+06CC), and U+06CC is shaped properly by ICU.

They do have different presentation forms:

Alef maksura has only isolated and final forms: U+FEEF, U+FEF0
Farsi yeh has all four forms: U+FBFC, U+FBFD, U+FBFE, U+FBFF
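The difference can be written down as a small lookup table. This is only an illustrative sketch, not how ICU stores the data: the struct, enum, and function names are hypothetical, and a 0 entry means "no form listed in this answer".

```c
#include <stdint.h>
#include <stddef.h>

enum form { ISOLATED, FINAL, INITIAL, MEDIAL };

struct forms {
    uint16_t base;    /* the letter as stored in text */
    uint16_t form[4]; /* indexed by enum form; 0 = none listed */
};

static const struct forms TABLE[] = {
    /* alef maksura: only isolated and final are listed above */
    { 0x0649, { 0xFEEF, 0xFEF0, 0, 0 } },
    /* Farsi yeh: all four forms */
    { 0x06CC, { 0xFBFC, 0xFBFD, 0xFBFE, 0xFBFF } },
};

/* Return the presentation form for a letter, or 0 if none is listed. */
static uint16_t presentation_form(uint16_t base, enum form f)
{
    for (size_t i = 0; i < sizeof TABLE / sizeof TABLE[0]; i++)
        if (TABLE[i].base == base)
            return TABLE[i].form[f];
    return 0;
}
```

With this table, `presentation_form(0x06CC, MEDIAL)` yields 0xFBFF, while the same request for 0x0649 yields 0. The 0xFBE8 and 0xFBE9 values from the question are, as far as I can tell, the Unicode initial and medial forms of the Uighur/Kazakh/Kirghiz alef maksura, so they would be the candidates for the two empty slots if U+0649 is kept.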

Source: https://stackoverflow.com/questions/3855940/icu4c-ushape-c-missing-character-in-shaping

Tags

icu