A “regex for words” (semantic replacement) - any example syntax and libraries?

狂风中的少年 posted on 2019-12-12 10:37:18

Question


I'm looking for syntactic examples or common techniques for doing regular-expression-style transformations on words instead of characters, given a procedural language.

For example, to trace copying, one would want to create a document with similar meaning but with different word choices.

I'd like to be able to concisely define these possible transformations that I can apply to a text stream.

E.g. "fast noun" to "rapid noun", but "go fast." wouldn't get transformed (no noun afterwards).
Or: "Alice will sing song" to "song will be sung by Alice"

I'd expect this to be done in grammatical checkers, such as detecting passive voice.

A C# implementation for this sort of language processing would be really neat, but I think the bulk of any effort is coming up with the right rules. Keeping the rules clear and understandable seems like a place to begin.


Answer 1:


You could try WordNet::QueryData, a Perl module by Jason Rennie (distribution WordNet-QueryData-1.47).




Answer 2:


One good place to start researching would be WordNet: it's a dictionary of semantics that groups words together by similar meaning and also records the relationships between words in useful ways.

There are a bunch of software projects that build on WordNet; one of them may be what you need.
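
To make the "grouping by meaning plus relationships" idea concrete, here is a minimal Python sketch of that kind of data structure. The entries, the identifiers such as `car.n`, and the helper names `synonyms` and `hypernym_words` are all hand-made assumptions for illustration; none of it is WordNet's actual data or API.

```python
# Hand-made stand-in data, NOT taken from WordNet itself.
# Each entry groups words with a similar meaning and may point to a
# more general group (a "hypernym").
SYNSETS = {
    "quick.a": {"words": ["fast", "rapid", "quick", "speedy"], "hypernym": None},
    "car.n": {"words": ["car", "automobile"], "hypernym": "vehicle.n"},
    "vehicle.n": {"words": ["vehicle"], "hypernym": None},
}


def synonyms(word):
    """Return the other words that share a group with `word`."""
    found = []
    for synset in SYNSETS.values():
        if word in synset["words"]:
            found.extend(w for w in synset["words"] if w != word)
    return found


def hypernym_words(word):
    """Return the words of the more general group(s) above `word`."""
    found = []
    for synset in SYNSETS.values():
        if word in synset["words"] and synset["hypernym"] is not None:
            found.extend(SYNSETS[synset["hypernym"]]["words"])
    return found


print(synonyms("fast"))       # ['rapid', 'quick', 'speedy']
print(hypernym_words("car"))  # ['vehicle']
```

A lookup like `synonyms` only supplies candidate replacement words; deciding when a replacement is grammatical still needs rules on top of it.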




Answer 3:


If you aren't tied to a particular language, Haskell has Aarne Ranta's Grammatical Framework:

http://www.grammaticalframework.org/

which is explicitly designed to generate parsers, etc., for natural language processing of this sort.




Answer 4:


A good place to start would be SIL's CARLAStudio, with its "Computer Assisted Related Language Adaptation" suite, or alternatively SIL's Adapt It. SIL has a huge range of linguistic analysis software, which is the direction you appear to be going. It's certainly a big jump from regular expressions, which don't care about meaning, to something that can handle linguistic analysis.




Answer 5:


If you want something more robust for natural language parsing/transforming, you could try the C# port of OpenNLP.




Answer 6:


I am not aware of any existing syntax for the kind of English language processing you describe. You would need to create your own DSL using one of the toolsets out there (such as WordNet).
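
As a rough idea of what such a home-grown DSL might look like, here is a minimal Python sketch (Python rather than C#, purely for brevity). Everything in it is an assumption made for illustration: the `<TAG>` pattern notation, the `TOY_TAGS` lexicon standing in for a real part-of-speech tagger, the `PARTICIPLES` table, and the function names. A rule is a pair of a word-level pattern and a replacement; in the replacement, an integer copies the matched word at that position, a string is emitted literally, and a function computes a word from the match.

```python
import re

# Hypothetical toy lexicon standing in for a real part-of-speech tagger.
TOY_TAGS = {
    "alice": "PROPN", "train": "NOUN", "song": "NOUN",
    "sing": "VERB", "go": "VERB", "took": "VERB",
}
# Hypothetical participle table; a real system would need proper morphology.
PARTICIPLES = {"sing": "sung"}


def tokenize(text):
    """Split text into words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def detokenize(tokens):
    """Join tokens with spaces, attaching punctuation to the previous word."""
    out = ""
    for tok in tokens:
        if out and re.match(r"\w", tok):
            out += " "
        out += tok
    return out


def matches(element, token):
    """'<NOUN>' matches any token with that tag; anything else is a literal."""
    if element.startswith("<") and element.endswith(">"):
        return TOY_TAGS.get(token.lower(), "X") == element[1:-1]
    return element.lower() == token.lower()


def apply_rule(pattern, replacement, tokens):
    """Scan left to right, rewriting every window that matches the pattern."""
    out, i, n = [], 0, len(pattern)
    while i < len(tokens):
        window = tokens[i:i + n]
        if len(window) == n and all(matches(p, t) for p, t in zip(pattern, window)):
            for item in replacement:
                if callable(item):
                    out.append(item(window))   # computed from the match
                elif isinstance(item, int):
                    out.append(window[item])   # copy a matched word
                else:
                    out.append(item)           # literal word
            i += n
        else:
            out.append(tokens[i])
            i += 1
    return out


def rewrite(text, rules):
    """Apply each rule in order to the tokenized text."""
    tokens = tokenize(text)
    for pattern, replacement in rules:
        tokens = apply_rule(pattern, replacement, tokens)
    return detokenize(tokens)


RULES = [
    (["fast", "<NOUN>"], ["rapid", 1]),
    (["<PROPN>", "will", "<VERB>", "<NOUN>"],
     [3, "will", "be", lambda m: PARTICIPLES.get(m[2].lower(), m[2]), "by", 0]),
]

print(rewrite("We took a fast train.", RULES))  # We took a rapid train.
print(rewrite("go fast.", RULES))               # go fast.
print(rewrite("Alice will sing song", RULES))   # song will be sung by Alice
```

The rules stay short and readable because tagging and morphology are kept out of them. In practice those two pieces would come from a real tagger and lexicon rather than the toy tables used here.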



Source: https://stackoverflow.com/questions/228658/a-regex-for-words-semantic-replacement-any-example-syntax-and-libraries
