Question
I have a table posts:
CREATE TABLE posts (
id serial primary key,
content text
);
When a user submits a post, how can I compare it with the existing posts and find similar ones?
I'm looking for something like StackOverflow does with the "Similar Questions".
Answer 1:
While Text Search is an option, it is not primarily meant for this type of search. Its typical use case is finding words in a document based on dictionaries and stemming, not comparing whole documents.
I am sure StackOverflow has put some smarts into the similarity search, as this is not a trivial matter.
You can get halfway decent results with the similarity function and operators provided by the pg_trgm module:
SELECT content, similarity(content, 'grand new title asking foo') AS sim_score
FROM posts
WHERE content % 'grand new title asking foo'
ORDER BY 2 DESC, content;
Be sure to have a GiST index on content for this.
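A minimal sketch of the setup this query relies on, assuming PostgreSQL 9.1 or later with the contrib modules installed. The index name posts_content_trgm_idx is made up for illustration; gist_trgm_ops is the operator class pg_trgm provides for GiST.

```sql
-- Enable the trigram module once per database.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name. The trigram operator class lets the % operator use the index.
CREATE INDEX posts_content_trgm_idx ON posts USING gist (content gist_trgm_ops);

-- The % operator only matches rows whose similarity exceeds the current threshold.
SELECT show_limit();
```

If too few or too many rows come back, the threshold reported by show_limit() is the first thing to check.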
But you'll probably have to do more. You could combine it with Text Search after identifying keywords in the new content.
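One way such a combination might look, as a sketch only: full text search narrows the candidates to posts sharing at least one keyword, and trigram similarity ranks them. The keywords 'title' and 'foo' are assumed to have been extracted from the new post beforehand (that step is not shown), and the 'english' configuration and the LIMIT are assumptions as well.

```sql
SELECT content,
       similarity(content, 'grand new title asking foo') AS sim_score
FROM   posts
-- Keep only posts containing at least one of the assumed keywords.
WHERE  to_tsvector('english', content) @@ to_tsquery('english', 'title | foo')
-- Rank the remaining candidates by trigram similarity to the whole new post.
ORDER  BY sim_score DESC
LIMIT  10;
```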
Answer 2:
You need to use Full Text Search in Postgres.
http://www.postgresql.org/docs/9.1/static/textsearch-intro.html
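For illustration, a basic full text search over the posts table could look roughly like this. The index name posts_content_fts_idx and the 'english' configuration are assumptions, not something the linked documentation prescribes for this table.

```sql
-- Hypothetical index name. An expression index, so no extra tsvector column is needed.
CREATE INDEX posts_content_fts_idx ON posts USING gin (to_tsvector('english', content));

SELECT content,
       ts_rank(to_tsvector('english', content), query) AS rank
FROM   posts,
       plainto_tsquery('english', 'grand new title asking foo') AS query
WHERE  to_tsvector('english', content) @@ query
ORDER  BY rank DESC
LIMIT  10;
```

Note that plainto_tsquery combines the words with AND, so only posts containing every non-stopword term match. For a looser "similar posts" search, a to_tsquery expression joining the terms with | may be a better fit.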
Source: https://stackoverflow.com/questions/17842196/finding-similar-posts-with-postgresql