I have a big rhyme database with 360000 words (entries). Every word has a category (for example, 'sheet' and 'meet' have the category 'eet'). A query to find suitable
Perhaps you could implement the Levenshtein algorithm in MySQL as a stored function. Below is an example; I hope it helps:
DELIMITER //
CREATE FUNCTION levenshtein( s1 VARCHAR(255), s2 VARCHAR(255) )
RETURNS INT
DETERMINISTIC
BEGIN
    DECLARE s1_len, s2_len, i, j, c, c_temp, cost INT;
    DECLARE s1_char CHAR;
    -- max strlen=255
    -- cv1 holds the previous row of distances, cv0 the row being built (one byte per cell)
    DECLARE cv0, cv1 VARBINARY(256);
    SET s1_len = CHAR_LENGTH(s1), s2_len = CHAR_LENGTH(s2), cv1 = 0x00, j = 1, i = 1, c = 0;
    IF s1 = s2 THEN
        RETURN 0;
    ELSEIF s1_len = 0 THEN
        RETURN s2_len;
    ELSEIF s2_len = 0 THEN
        RETURN s1_len;
    ELSE
        -- first row: 0, 1, 2, ..., s2_len
        WHILE j <= s2_len DO
            SET cv1 = CONCAT(cv1, UNHEX(HEX(j))), j = j + 1;
        END WHILE;
        WHILE i <= s1_len DO
            SET s1_char = SUBSTRING(s1, i, 1), c = i, cv0 = UNHEX(HEX(i)), j = 1;
            WHILE j <= s2_len DO
                -- insertion: cell to the left + 1
                SET c = c + 1;
                IF s1_char = SUBSTRING(s2, j, 1) THEN
                    SET cost = 0;
                ELSE
                    SET cost = 1;
                END IF;
                -- substitution (or match): diagonal cell + cost
                SET c_temp = CONV(HEX(SUBSTRING(cv1, j, 1)), 16, 10) + cost;
                IF c > c_temp THEN SET c = c_temp; END IF;
                -- deletion: cell above + 1
                SET c_temp = CONV(HEX(SUBSTRING(cv1, j+1, 1)), 16, 10) + 1;
                IF c > c_temp THEN
                    SET c = c_temp;
                END IF;
                SET cv0 = CONCAT(cv0, UNHEX(HEX(c))), j = j + 1;
            END WHILE;
            SET cv1 = cv0, i = i + 1;
        END WHILE;
    END IF;
    RETURN c;
END//
DELIMITER ;
Source: http://www.artfulsoftware.com/infotree/queries.php#552 (fixed by adding DELIMITER //)
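In case it helps to see what the stored function is doing: it keeps only two rows of the edit-distance table (cv1 is the previous row, cv0 the row being built). Here is a rough PHP sketch of the same idea. The name levenshtein_two_row is my own and not part of the source above, and it compares bytes, so it assumes single-byte (ASCII) words.

```php
<?php
// Hypothetical helper (not from the linked source): the same two-row
// dynamic-programming scheme as the stored function, written in PHP.
// Assumption: single-byte strings, compared case-sensitively.
function levenshtein_two_row(string $s1, string $s2): int
{
    $len1 = strlen($s1);
    $len2 = strlen($s2);
    if ($s1 === $s2) {
        return 0;
    }
    if ($len1 === 0) {
        return $len2;
    }
    if ($len2 === 0) {
        return $len1;
    }

    // Previous row: distance from the empty prefix of $s1 to each prefix of $s2.
    $prev = range(0, $len2);

    for ($i = 1; $i <= $len1; $i++) {
        // First cell of the current row: delete $i characters from $s1.
        $curr = [$i];
        for ($j = 1; $j <= $len2; $j++) {
            $cost = ($s1[$i - 1] === $s2[$j - 1]) ? 0 : 1;
            $curr[$j] = min(
                $curr[$j - 1] + 1,     // insertion
                $prev[$j] + 1,         // deletion
                $prev[$j - 1] + $cost  // substitution (or match)
            );
        }
        $prev = $curr;
    }
    return $prev[$len2];
}

// Compare against PHP's built-in levenshtein() on a few pairs.
foreach ([['eet', 'meet'], ['eet', 'bet'], ['kitten', 'sitting']] as [$a, $b]) {
    printf("%s / %s: %d (built-in: %d)\n", $a, $b, levenshtein_two_row($a, $b), levenshtein($a, $b));
}
```

The two numbers on each line should agree. Note that the MySQL version compares characters using the column collation, so it may treat upper and lower case as equal where this sketch does not.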
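Test example script:
<?php
class DB {
    private $db;
    private $dbhost;
    private $dbname;
    private $dbuser;
    private $dbpass;

    public function __construct($host, $dbname, $user, $pass){
        $this->dbhost = $host;
        $this->dbname = $dbname;
        $this->dbuser = $user;
        $this->dbpass = $pass;
    }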
    // Open the PDO connection lazily, on first use.
    private function connect(){
        if (!$this->db instanceof PDO){
            $this->db = new PDO('mysql:dbname='.$this->dbname.';host='.$this->dbhost, $this->dbuser, $this->dbpass);
            $this->db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
        }
    }

    // A model method for the levenshtein query.
    public function levenshtein_query($word, $dist){
        $this->connect();
        // Bind the distance as well instead of interpolating it into the SQL string.
        $sql = "SELECT `word` FROM `words` WHERE levenshtein( :word ,`word` ) BETWEEN 0 AND :dist";
        $statement = $this->db->prepare($sql);
        $statement->bindParam(':word', $word, PDO::PARAM_STR);
        $statement->bindValue(':dist', (int)$dist, PDO::PARAM_INT);
        $statement->execute();
        return $statement->fetchAll(PDO::FETCH_ASSOC);
    }
}
// Initialize the model class
$model = new DB('localhost','test_db','root','');
// The word posted
$word = 'eet';
$result = $model->levenshtein_query($word,1);
print_r($result);
/*
// The result
Array
(
    [0] => Array
        (
            [word] => bet
        )
    [1] => Array
        (
            [word] => get
        )
    [2] => Array
        (
            [word] => jet
        )
    [3] => Array
        (
            [word] => let
        )
    [4] => Array
        (
            [word] => met
        )
    [5] => Array
        (
            [word] => pet
        )
    [6] => Array
        (
            [word] => set
        )
    [7] => Array
        (
            [word] => vet
        )
    [8] => Array
        (
            [word] => wet
        )
    [9] => Array
        (
            [word] => yet
        )
    [10] => Array
        (
            [word] => meet
        )
)
*/
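If you want to try the filtering logic without a database, something like the following should behave like the query above on a small in-memory list. This is only a sketch: words_within_distance and the sample words are made up for illustration, and it uses PHP's built-in levenshtein() rather than the stored function. It also skips candidates whose length differs from the search word by more than the allowed distance, since the edit distance can never be smaller than the length difference.

```php
<?php
// Hypothetical helper (not from the original script): filter an in-memory
// word list the way the SQL query filters the `words` table.
function words_within_distance(array $words, string $word, int $dist): array
{
    $len = strlen($word);
    $matches = [];
    foreach ($words as $candidate) {
        // The edit distance is at least the length difference, so
        // candidates that differ too much in length are skipped cheaply.
        if (abs(strlen($candidate) - $len) > $dist) {
            continue;
        }
        if (levenshtein($word, $candidate) <= $dist) {
            $matches[] = $candidate;
        }
    }
    return $matches;
}

// Made-up sample data, for illustration only.
$sample = ['bet', 'meet', 'sheet', 'street', 'eet', 'wet'];
print_r(words_within_distance($sample, 'eet', 1));
```

The same length bound could probably be added to the SQL WHERE clause (a CHAR_LENGTH(`word`) range check before the levenshtein() call) so the stored function runs on fewer rows, though I have not measured that on a table of this size.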
Perhaps it's of some interest...