Ideas for converting straight quotes to curly quotes

星月不相逢 2021-02-07 16:40

I have a file that contains "straight" (normal, ASCII) quotes, and I'm trying to convert them to real quotation mark glyphs (“curly” quotes, U+2018 to U+201D). Since the tran

9 answers
  •  半阙折子戏
    2021-02-07 17:19

    Computational linguistics anyone?

    Somebody mentioned that it might be feasible if you had a vast amount of cultural context. So the overkill, but most accurate, automated solution to the problem is shallow parsing. This requires a corpus in whatever language and mode you're dealing with (e.g. the Brown corpus for general English).

    Train a classifier for curly quotes on the syntactic context in which they occur in the corpus. Then give the classifier the syntactic context around any straight quote, and out pops the most probable quote character!
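
To make the idea concrete, here is a minimal sketch of such a classifier. It is not real shallow parsing: as a stand-in for syntactic context it only looks at the character class on either side of the quote, and the "corpus" is a made-up sentence rather than the Brown corpus. The names `train`, `curl`, `context` and `TOY_CORPUS` are hypothetical, chosen only for this illustration.

```python
from collections import Counter, defaultdict

# Map each curly quote to the straight quote it would be typed as.
CURLY_TO_STRAIGHT = {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}


def char_class(ch):
    """Coarse class of a neighbouring character ('' means start/end of text)."""
    if ch == "":
        return "edge"
    if ch.isspace():
        return "space"
    if ch.isalnum():
        return "alnum"
    return "punct"


def context(text, i):
    """Feature tuple for the quote at index i: (straight form, left class, right class)."""
    prev = text[i - 1] if i > 0 else ""
    nxt = text[i + 1] if i + 1 < len(text) else ""
    return (CURLY_TO_STRAIGHT.get(text[i], text[i]), char_class(prev), char_class(nxt))


def train(corpus):
    """Count which curly quote appears in each context in an already-curled corpus."""
    counts = defaultdict(Counter)
    for i, ch in enumerate(corpus):
        if ch in CURLY_TO_STRAIGHT:
            counts[context(corpus, i)][ch] += 1
    return counts


def curl(text, counts):
    """Replace each straight quote with the most frequent curly quote for its context.

    Quotes in a context never seen during training are left straight.
    """
    out = []
    for i, ch in enumerate(text):
        seen = counts.get(context(text, i)) if ch in "'\"" else None
        out.append(seen.most_common(1)[0][0] if seen else ch)
    return "".join(out)


# Toy training text (an assumption for the demo; a real corpus would be far larger).
TOY_CORPUS = "“Don’t go,” she said. ‘Why not?’ he asked. “It’s late.”"

model = train(TOY_CORPUS)
print(curl('"I can\'t stay," he said.', model))  # prints: “I can’t stay,” he said.
```

With a real corpus you would replace the character classes with richer features (part-of-speech tags of the neighbouring tokens, for instance), but the train-then-look-up structure would stay the same.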
