Visually-identical characters in Unicode

微笑、不失礼 提交于 2021-02-04 13:44:25

问题


I want to find visually identical characters for a specific character in Unicode. I know how to find canonical or compatibility decompositions of a character; but they do not give me what I want. I want to find characters that are visually identical (not similar), and their only difference can be their sizes.

for example I want : (s,S), or (S,S) (whose code points are different). I do not want (ß, β), or (e, é).

Any suggestions? Thanks.


回答1:


For a particular character, you could start from annotations in the code charts in the Unicode standard. The annotations often refer to other characters for various reasons, including similarity or identity of shape. But the annotations are not meant to cover everything.

You could also draw your character at http://shapecatcher.com/ and ask it to recognize it. You often get a long list of visually similar alternatives.

As @TedHopp writes in his comment, visual identity is font-dependent. For example, “s” and “S” need not be identical in shape; in most fonts, they are not – the basic form is the same, but there are various differences in stroke width variation, curvature, serifs, etc. However, some characters can be expected to be visually identical in any font that contains them, such as Latin capital A, Greek capital alpha Α, and Cyrillic capital А.

You did not specify the purpose of the study, but you might be doing something that has been carried out to some extent by the Unicode Consortium. See UTR #6, Unicode Security Considerations, which also contains references to related work, including UTS #9, Unicode Security Mechanisms, which contains confusables.txt, Recommended confusable mapping for IDN (i.e., for a particular context, but it may be of interest for other purposes as well).



来源:https://stackoverflow.com/questions/13260890/visually-identical-characters-in-unicode

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!