Puzzle Solving: Finding All Words Within a Larger Word in PHP


So I have a database of words between 3 and 20 characters long. I want to code something in PHP that finds all of the smaller words that are contained within a larger word.


1 Answer

梦如初夏

2020-12-20 03:12
Here is a simple solution that should be fairly efficient, but it only works up to a certain word length. It will probably break down at around 15-20 characters, depending on whether the word is made up of high-frequency letters (which get small primes) or low-frequency letters (which get large primes):
1. Assign each letter a prime number according to its frequency, so that the most common letters get the smallest primes: e = 2, t = 3, a = 5, etc., using frequency values from a letter-frequency table or some similar source.
2. Precalculate the value of each word in your word list by multiplying the prime values for the letters in the word, and store it in the table in a bigint column. For instance, tea would have a value of 3*2*5=30. If a word has repeated letters, repeat the factor, so that teat would have a value of 3*2*5*3=90.
3. When checking whether a word such as rain is contained inside another word such as inward (that is, whether all of its letters can be taken from the larger word), it's sufficient to check whether the value for rain divides the value for inward. In this case, inward = 14213045, rain = 7315, and 14213045 is divisible by 7315 (14213045 / 7315 = 1943), so the word rain is inside the word inward.
4. A bigint column maxes out at 9223372036854775807, which should be fine up to about 15-20 characters (depending on the frequencies of the letters in the word). For instance, I picked the first 20-letter word from a word list, which is antiinstitutionalism, and it has a value of 6901041299724096525, which would just barely fit inside the bigint column. However, the 14-letter word xylopyrography has a value of 635285791503081662905, which is too big. You might have to handle the really large ones as special cases using an alternate method, but hopefully there are few enough of them that it would still be relatively efficient.
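The steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than PHP, and several things in it are assumptions: the frequency order in `FREQUENCY_ORDER` is one common ordering (the answer only fixes e, t and a), so values for words other than tea and teat will differ from the numbers quoted above, and the helper names (`word_value`, `fits_bigint`, `contains`, `contains_by_counts`) are made up for this example. The letter-count comparison is just one possible "alternate method" for words whose product overflows a bigint.

```python
from collections import Counter

# Assumed frequency order, most to least common. Only e, t, a come from the answer.
FREQUENCY_ORDER = "etaoinshrdlcumwfgypbvkjxqz"
BIGINT_MAX = 9223372036854775807  # largest signed 64-bit value


def first_primes(count):
    """Return the first `count` primes using simple trial division."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


# Most frequent letter gets the smallest prime: e -> 2, t -> 3, a -> 5, ...
LETTER_PRIME = dict(zip(FREQUENCY_ORDER, first_primes(26)))


def word_value(word):
    """Multiply the primes of all letters; repeated letters repeat the factor."""
    value = 1
    for letter in word.lower():
        value *= LETTER_PRIME[letter]
    return value


def fits_bigint(word):
    """True if the word's value could be stored in a signed 64-bit column."""
    return word_value(word) <= BIGINT_MAX


def contains(large, small):
    """True if every letter of `small` can be drawn from `large` (step 3)."""
    return word_value(large) % word_value(small) == 0


def contains_by_counts(large, small):
    """Hypothetical fallback for oversized words: compare letter counts directly."""
    return not (Counter(small.lower()) - Counter(large.lower()))
```

Python integers do not overflow, which is why `fits_bigint` has to compare against the limit explicitly; in a language with fixed 64-bit integers the product would need an arbitrary-precision type or the count-based fallback instead.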
The query would work something like the demo I've prepared here: http://www.sqlfiddle.com/#!2/9bd27/8
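The linked demo is not reproduced here, but the shape of such a query can be sketched with an in-memory SQLite database. Everything in this sketch is an assumption made for illustration: the table and column names (`words`, `word`, `value`), the frequency order, and the helper functions. It only shows the idea of selecting every stored word whose value divides the value of the larger word.

```python
import sqlite3

# Assumed frequency order; only e = 2, t = 3, a = 5 come from the answer.
FREQUENCY_ORDER = "etaoinshrdlcumwfgypbvkjxqz"
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41,
          43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101]
LETTER_PRIME = dict(zip(FREQUENCY_ORDER, PRIMES))


def word_value(word):
    """Product of the letter primes, with repeated letters repeating the factor."""
    value = 1
    for letter in word:
        value *= LETTER_PRIME[letter]
    return value


def build_table(words):
    """Create a hypothetical `words` table holding each word and its value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (word TEXT PRIMARY KEY, value INTEGER)")
    conn.executemany(
        "INSERT INTO words VALUES (?, ?)",
        [(w, word_value(w)) for w in words],
    )
    return conn


def words_inside(conn, large):
    """Return stored words whose value divides the value of `large`."""
    rows = conn.execute(
        "SELECT word FROM words WHERE ? % value = 0 AND word <> ? ORDER BY word",
        (word_value(large), large),
    )
    return [row[0] for row in rows]
```

For example, after `conn = build_table(["rain", "inward", "tea", "war", "din", "rat"])`, calling `words_inside(conn, "inward")` returns `["din", "rain", "war"]`. Words whose value exceeds the 64-bit limit cannot be stored this way and would need the special-case handling mentioned in step 4.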