Question
I have the WordNet lexical database in MySQL. I am looking to find synonyms of given words. Currently the data is set out in three tables as a many-to-many relationship:
words - (147,000 rows)
wordid, word
synsets - (119,000 rows)
synsetid
sense - (206,000 rows)
wordid, synsetid
All tables have indexes set up on them.
Each word can have several synsets and each synset can have several words. I want to return all words in all synsets for a given word. There tend to be around 2 synsets for each word (one for the verb usage, one for the noun). The SQL query I'm using for this is:
SELECT w.word
FROM sense s
INNER JOIN words w
    ON s.wordid = w.wordid
WHERE s.synsetid IN
(
    SELECT s.synsetid
    FROM words w
    INNER JOIN sense s
        ON w.wordid = s.wordid
    WHERE w.word = 'word_to_search'
)
AND w.word <> 'word_to_search'
ORDER BY s.synsetid
This takes a very long time, however (~0.75 s). Run separately, the inner query takes ~0.0005 s, and each of the outer queries takes about the same.
So what am I doing wrong? Is there a much more appropriate way to structure this query?
EDIT:
So the solution I have come up with after reading the linked articles below is:
SELECT w.word
FROM sense s
INNER JOIN words w
    ON s.wordid = w.wordid
JOIN
(
    SELECT s.synsetid
    FROM words w
    INNER JOIN sense s
        ON w.wordid = s.wordid
    WHERE w.word = 'word_to_search'
) i
    ON i.synsetid = s.synsetid
This executes in ~0.0008 sec.
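As a rough check that the IN form and the derived-table JOIN form return the same synonyms, here is a minimal sketch using Python's built-in sqlite3 module. SQLite only stands in for MySQL here, so the timings above will not carry over, and the rows and the `synonyms` helper are invented for illustration. The JOIN variant in this sketch keeps the `w.word <> ...` filter, which the edited query above leaves out.

```python
import sqlite3

# Toy stand-in for the WordNet tables; these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE words (wordid INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE sense (wordid INTEGER, synsetid INTEGER,
                        PRIMARY KEY (wordid, synsetid));
    CREATE INDEX sense_synset ON sense (synsetid);
    INSERT INTO words VALUES (1, 'run'), (2, 'sprint'), (3, 'operate'), (4, 'walk');
    INSERT INTO sense VALUES (1, 10), (2, 10), (1, 20), (3, 20), (4, 30);
""")

# Original shape: filter synsets with an IN subquery.
IN_QUERY = """
    SELECT w.word
    FROM sense s
    INNER JOIN words w ON s.wordid = w.wordid
    WHERE s.synsetid IN (
        SELECT s2.synsetid
        FROM words w2
        INNER JOIN sense s2 ON w2.wordid = s2.wordid
        WHERE w2.word = ?
    )
    AND w.word <> ?
    ORDER BY s.synsetid
"""

# Rewritten shape: join against the same subquery as a derived table.
JOIN_QUERY = """
    SELECT w.word
    FROM sense s
    INNER JOIN words w ON s.wordid = w.wordid
    INNER JOIN (
        SELECT s2.synsetid
        FROM words w2
        INNER JOIN sense s2 ON w2.wordid = s2.wordid
        WHERE w2.word = ?
    ) i ON i.synsetid = s.synsetid
    WHERE w.word <> ?
    ORDER BY s.synsetid
"""

def synonyms(query, word):
    """Run one of the queries above, binding the search word twice."""
    return [row[0] for row in conn.execute(query, (word, word))]

in_result = synonyms(IN_QUERY, "run")
join_result = synonyms(JOIN_QUERY, "run")
print(in_result)    # ['sprint', 'operate']
print(join_result)  # ['sprint', 'operate']
```

Binding the word as a parameter instead of pasting it into the SQL string also sidesteps the quoting question of single versus double quotes around the literal.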
Answer 1:
Avoiding IN and NOT IN (in your case just IN) by using INNER JOIN ... ON instead could boost performance.
edit:
These links:
link 1
link 2
examine the performance of JOINs versus INs and other interchangeable operations. They conclude, however, that IN and NOT IN do not need to be avoided.
Answer 2:
Perhaps this (updated):
SELECT w2.word, s2.synsetid
FROM words w
INNER JOIN sense s ON s.wordid = w.wordid
INNER JOIN sense s2 ON s2.synsetid = s.synsetid
INNER JOIN words w2 ON w2.wordid = s2.wordid
WHERE w.word = 'word_to_search'
GROUP BY w2.word, s2.synsetid
ORDER BY s2.synsetid, w2.word
Now I think I understand what you want: all words within the same synsets as the requested word.
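To illustrate "all words within the same synsets as the requested word", here is a small sketch that walks words -> sense -> sense -> words with a self-join on sense and then groups the rows per synset. It uses Python's sqlite3 module as a stand-in for MySQL. The table and column names follow the question, while the sample rows and the `by_synset` dictionary are assumptions made up for the example.

```python
import sqlite3
from collections import defaultdict

# Toy stand-in for the WordNet tables; these rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE words (wordid INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE sense (wordid INTEGER, synsetid INTEGER,
                        PRIMARY KEY (wordid, synsetid));
    INSERT INTO words VALUES (1, 'run'), (2, 'sprint'), (3, 'operate'), (4, 'walk');
    INSERT INTO sense VALUES (1, 10), (2, 10), (1, 20), (3, 20), (4, 30);
""")

# s finds the synsets of the searched word; s2 finds every word in those synsets.
SELF_JOIN = """
    SELECT DISTINCT s2.synsetid, w2.word
    FROM words w
    INNER JOIN sense s  ON s.wordid = w.wordid
    INNER JOIN sense s2 ON s2.synsetid = s.synsetid
    INNER JOIN words w2 ON w2.wordid = s2.wordid
    WHERE w.word = ?
    ORDER BY s2.synsetid, w2.word
"""

rows = conn.execute(SELF_JOIN, ("run",)).fetchall()

# Group the flat (synsetid, word) rows into one word list per synset.
by_synset = defaultdict(list)
for synsetid, word in rows:
    by_synset[synsetid].append(word)

print(dict(by_synset))  # {10: ['run', 'sprint'], 20: ['operate', 'run']}
```

Note that this form returns the searched word itself in each synset; adding `AND w2.word <> ?` to the WHERE clause would drop it, as the query in the question does.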
Source: https://stackoverflow.com/questions/10666531/sql-nested-query-slow-using-in