I have a file that contains "straight" (normal, ASCII) quotes, and I'm trying to convert them to real quotation mark glyphs (“curly” quotes, U+2018 to U+201D). Since the tran
Computational linguistics, anyone?
Somebody mentioned that this might be feasible if you had a vast amount of cultural context. So the overkill, but most accurate, automated solution to the problem is shallow parsing. This requires a corpus in whatever language and mode you're dealing with (e.g. the Brown corpus for general English).
Train a classifier for curly quotes on the syntactic context in which each curly quote occurs in the corpus. Then give the classifier the syntactic context around any straight quote, and out pops the most probable quote character!
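To make the idea concrete, here is a toy sketch in Python. It is not real shallow parsing: instead of syntactic chunks it uses a much cruder stand-in for context, namely the coarse class of the character on each side of the quote, and it "trains" by simple counting. The training string, the feature choice, and all function names (`char_class`, `context`, `train`, `curl`) are made up for illustration; a serious version would train on a large corpus that actually contains curly quotes and use richer features.

```python
from collections import Counter, defaultdict

# Map each curly quote to the straight quote it would have been typed as.
CURLY_TO_STRAIGHT = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / apostrophe
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}

def char_class(ch):
    """Coarse class of a neighbouring character; '' marks a text boundary."""
    if ch == "":
        return "edge"
    if ch.isspace():
        return "space"
    if ch.isalnum():
        return "alnum"
    return "punct"

def context(text, i):
    """Feature for the quote at index i: classes of the previous and next character."""
    prev_ch = text[i - 1] if i > 0 else ""
    next_ch = text[i + 1] if i + 1 < len(text) else ""
    return (char_class(prev_ch), char_class(next_ch))

def train(corpus):
    """Count which curly quote appears in each (straight quote, context) pair."""
    counts = defaultdict(Counter)
    for i, ch in enumerate(corpus):
        if ch in CURLY_TO_STRAIGHT:
            counts[(CURLY_TO_STRAIGHT[ch], context(corpus, i))][ch] += 1
    return counts

def curl(text, counts):
    """Replace each straight quote with the most frequent curly quote for its context.

    Quotes whose context never occurred in training are left unchanged.
    """
    out = []
    for i, ch in enumerate(text):
        seen = counts.get((ch, context(text, i))) if ch in "'\"" else None
        out.append(seen.most_common(1)[0][0] if seen else ch)
    return "".join(out)

# Made-up training text; a real corpus would be vastly larger.
TRAINING_TEXT = "“Don’t go,” she said. ‘Why not?’ he asked. “It’s late.”"
model = train(TRAINING_TEXT)

print(curl('"It\'s fine," he said.', model))
```

With that tiny training string, the straight quotes in the example come out as an opening double quote, an apostrophe, and a closing double quote. Contexts the model has never seen (say, a double quote directly between a letter and a space) are left as straight quotes, which is exactly where a bigger corpus and real syntactic features would earn their keep.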