Is there an algorithm that tells the semantic similarity of two phrases?

孤独总比滥情好 2020-11-27 09:44

input: phrase 1, phrase 2

output: semantic similarity value (between 0 and 1), or the probability these two phrases are talking about the same thing

11 answers

温柔的废话 (original poster)

2020-11-27 10:21

There's a short and a long answer to this.

The short answer:

Use the WordNet::Similarity Perl package. If Perl is not your language of choice, check the WordNet project page at Princeton, or search for a wrapper library in your language.
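To give a feel for what a WordNet-style measure computes, here is a minimal sketch of path similarity, which scores two words by how close they sit in an is-a hierarchy. The TOY_HYPERNYMS tree and the function names are made up for illustration; this is not the WordNet database or the WordNet::Similarity API, just the same idea on a hand-built tree.

```python
# Hypothetical toy "is-a" tree (child -> parent). Real WordNet is far larger
# and lets a word have several senses and several parents.
TOY_HYPERNYMS = {
    "chair": "seat",
    "sofa": "seat",
    "seat": "furniture",
    "table": "furniture",
    "furniture": "artifact",
    "car": "vehicle",
    "vehicle": "artifact",
    "artifact": "entity",
}


def ancestors(word):
    """Return [word, parent, grandparent, ...] up to the root of the toy tree."""
    chain = [word]
    while chain[-1] in TOY_HYPERNYMS:
        chain.append(TOY_HYPERNYMS[chain[-1]])
    return chain


def path_similarity(a, b):
    """Score in (0, 1]: 1 / (1 + edges on the shortest path between a and b).

    Returns 0.0 when the two words share no ancestor in the toy tree.
    """
    chain_a, chain_b = ancestors(a), ancestors(b)
    depth_in_b = {node: i for i, node in enumerate(chain_b)}
    for i, node in enumerate(chain_a):
        if node in depth_in_b:
            # i edges up from a, plus depth_in_b[node] edges up from b
            return 1.0 / (1 + i + depth_in_b[node])
    return 0.0


if __name__ == "__main__":
    for pair in [("chair", "sofa"), ("chair", "table"), ("chair", "car")]:
        print(pair, round(path_similarity(*pair), 3))
```

This only compares single words. For phrases you would still need some way to combine word scores (for example, averaging the best match for each word), which is a design choice and not something the hierarchy gives you for free.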

The long answer:

Determining word similarity is a complicated issue, and research is still very hot in this area. To compute similarity, you need an appropriate representation of the meaning of a word. But what would be a representation of the meaning of, say, 'chair'? In fact, what is the exact meaning of 'chair'? If you think long and hard about this, it will twist your mind, you will go slightly mad, and finally take up a research career in Philosophy or Computational Linguistics to find the truth™. Both philosophers and linguists have tried to come up with an answer for literally thousands of years, and there's no end in sight.

So, if you're interested in exploring this problem in a little more depth, I highly recommend reading Chapter 20.7 in Speech and Language Processing by Jurafsky and Martin, some of which is available through Google Books. It gives a very good overview of the state of the art in distributional methods, which use word co-occurrence statistics to define a measure of word similarity. You are not likely to find libraries implementing these, however.
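The distributional idea can be sketched in a few lines: represent each word by counts of the words that appear near it, then compare vectors with cosine similarity. Everything below is an assumption for illustration: the tiny TOY_CORPUS, the window size of 2, and the choice to sum word vectors to get a phrase vector. A real system would use a large corpus and better weighting, so treat this as a rough sketch and not the method from the book.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus; real distributional models need far more text.
TOY_CORPUS = [
    "the cat sat on the chair",
    "the dog sat on the sofa",
    "she sat on the chair",
    "he slept on the sofa",
    "the stock market fell sharply",
    "the stock price rose sharply",
]


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def phrase_similarity(phrase1, phrase2, vectors):
    """Sum the word vectors of each phrase, then compare them with cosine.

    Counts are non-negative, so the score falls between 0 and 1.
    """
    def phrase_vector(phrase):
        total = Counter()
        for word in phrase.lower().split():
            total.update(vectors.get(word, Counter()))
        return total

    return cosine(phrase_vector(phrase1), phrase_vector(phrase2))


if __name__ == "__main__":
    vecs = cooccurrence_vectors(TOY_CORPUS)
    print(phrase_similarity("the chair", "the sofa", vecs))
    print(phrase_similarity("the chair", "stock market", vecs))
```

Note that the score is a similarity of contexts, not a calibrated probability that two phrases mean the same thing. Words never seen in the corpus contribute nothing, so an entirely unseen phrase scores 0.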
