Ideas for converting straight quotes to curly quotes

星月不相逢 2021-02-07 16:40

I have a file that contains "straight" (normal, ASCII) quotes, and I'm trying to convert them to real quotation mark glyphs (“curly” quotes, U+2018 to U+201D). Since the tran

9 answers

半阙折子戏 (original poster)

2021-02-07 17:19

Computational linguistics, anyone?

Somebody mentioned that with a vast amount of cultural context this might be feasible. So the overkill, but most accurate, automated solution to the problem is shallow parsing. This requires a corpus for whatever language and mode you're dealing with (e.g. the Brown corpus for general English).

Train a classifier on the syntactic context in which each curly quote occurs in the corpus. Then feed it the syntactic context around any straight quote, and out pops the most probable quote character!
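To make the idea concrete, here is a minimal sketch in Python. It is not real shallow parsing: as a stand-in for syntactic context it uses only the coarse class of the characters on either side of the quote, and it picks the most frequent curly quote seen in that context. The names (`char_class`, `features`, `train`, `curl`) and the tiny `TOY_CORPUS` are made up for illustration; a serious attempt would train on a large corpus that already contains curly quotes and would use chunk or part-of-speech features instead.

```python
from collections import Counter, defaultdict

# Map each curly quote to the straight quote it would replace.
CURLY_TO_STRAIGHT = {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}

# Toy training text (an assumption for this sketch, not a real corpus).
TOY_CORPUS = "“Don’t go,” she said. He replied, ‘I won’t.’ “Why not?” “It’s late.”"


def char_class(ch):
    """Coarse class of a neighbouring character (stand-in for syntactic features)."""
    if ch == "":
        return "EDGE"
    if ch.isspace():
        return "SPACE"
    if ch.isalnum():
        return "ALNUM"
    return "PUNCT"


def features(text, i):
    """Context of the quote at position i: (straight form, left class, right class)."""
    prev_ch = text[i - 1] if i > 0 else ""
    next_ch = text[i + 1] if i + 1 < len(text) else ""
    straight = CURLY_TO_STRAIGHT.get(text[i], text[i])
    return (straight, char_class(prev_ch), char_class(next_ch))


def train(corpus):
    """Count which curly quote appears in each context in the corpus."""
    counts = defaultdict(Counter)
    for i, ch in enumerate(corpus):
        if ch in CURLY_TO_STRAIGHT:
            counts[features(corpus, i)][ch] += 1
    return counts


def curl(text, counts):
    """Replace each straight quote with the most frequent curly quote for its context."""
    out = []
    for i, ch in enumerate(text):
        if ch in ("'", '"'):
            seen = counts.get(features(text, i))
            if seen:  # contexts never seen in training are left straight
                ch = seen.most_common(1)[0][0]
        out.append(ch)
    return "".join(out)


model = train(TOY_CORPUS)
print(curl('"I can\'t," he said.', model))
```

With this toy corpus the call above turns the opening and closing double quotes and the apostrophe into their curly forms, while a quote in a context the model has never seen stays straight. That fallback is a deliberate choice in the sketch: leaving a quote unconverted is easier to spot and fix than a wrong guess.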
