Validate words against an English dictionary in Rails?

慢半拍i 2020-12-06 12:35

I've done some Google searching but couldn't find what I was looking for.

I'm developing a Scrabble-type word game in Rails, and was wondering if there was a simp

2 Answers
  • 2020-12-06 12:59

    One piece of language-agnostic advice: if you only care whether a word exists (which you do here) and you plan to load the entire word list into the application (which your question suggests you're considering), a DAWG (directed acyclic word graph) lets you check for a word in O(n) time, where n is the length of the word. Dictionary size has no effect on lookup time, so for words of bounded length the lookup is effectively O(1). The structure is also relatively compact in memory; indeed, some insertions actually reduce its size: a DAWG for "top, tap, taps, tops" has fewer nodes than one for "tops, tap".
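
    To make the node-count claim concrete, here is a minimal Ruby sketch, assuming the whole word list fits in memory. The `Dawg` class and its method names are hypothetical, and building a trie first and then merging identical subtrees is just one simple way to arrive at a minimal graph; a production implementation would more likely build it incrementally.

```ruby
# Hypothetical Dawg class: builds a trie, then merges identical subtrees.
class Dawg
  Node = Struct.new(:children, :terminal)

  def initialize(words)
    root = Node.new({}, false)
    words.each do |word|
      node = root
      word.each_char { |ch| node = (node.children[ch] ||= Node.new({}, false)) }
      node.terminal = true
    end
    @registry = {}            # signature => canonical node
    @root = minimize(root)
  end

  # Walks one edge per character, so the cost depends only on word length.
  def include?(word)
    node = @root
    word.each_char do |ch|
      node = node.children[ch]
      return false if node.nil?
    end
    node.terminal
  end

  # Number of distinct nodes left after merging.
  def node_count
    @registry.size
  end

  private

  # Bottom-up: canonicalize the children first, then reuse any existing node
  # with the same terminal flag and the same outgoing edges.
  def minimize(node)
    node.children = node.children.transform_values { |child| minimize(child) }
    edges = node.children.sort_by { |ch, _| ch }.map { |ch, child| [ch, child.object_id] }
    @registry[[node.terminal, edges]] ||= node
  end
end

small = Dawg.new(%w[tops tap])
large = Dawg.new(%w[top tap taps tops])
p small.node_count        # 6
p large.node_count        # 5
p large.include?("taps")  # true
p large.include?("to")    # false
```

    With all four words present, the "o" and "a" branches accept exactly the same suffixes ("p" and "ps"), so they collapse into one node, which is why the larger word set yields the smaller graph.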

  • 2020-12-06 13:00

    You need two things:

    1. a word list
    2. some code

    The word list is the tricky part. On most Unix systems there's a word list at /usr/share/dict/words or /usr/dict/words -- see http://en.wikipedia.org/wiki/Words_(Unix) for more details. The one on my Mac has 234,936 words in it. But they're not all valid Scrabble words. So you'd have to somehow acquire a Scrabble dictionary, make sure you have the right license to use it, and process it so it's a text file.

    (Update: The word list for LetterPress is now open source, and available on GitHub.)

    The code is no problem in the simple case. Here's a script I whipped up just now:

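    # Load the system word list into a hash for constant-time lookups.
    words = {}
    File.foreach("/usr/share/dict/words") do |line|
      # strip drops the trailing newline; keys keep their original case
      words[line.strip] = true
    end
    p words["magic"]    # a real word
    p words["saldkaj"]  # not a word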
    

    This will output

    true
    nil
    

    I leave it as an exercise for the reader to make it into a proper Words object. (Technically it's not a Dictionary since it has no definitions.) Or to use a DAWG instead of a hash, even though a hash is probably fine for your needs.
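
    One possible shape for that Words object, as a rough sketch rather than a definitive design. The `WordList` name and its methods are hypothetical, and lowercasing every entry is an assumption made here so that lookups are case-insensitive; the script above keeps the original case.

```ruby
require "set"

# Hypothetical WordList class; the name and API are illustrative only.
class WordList
  # Build from a text file with one word per line.
  def self.load(path)
    new(File.foreach(path))
  end

  # Accepts any enumerable of lines (an open file, an array, an enumerator).
  def initialize(lines)
    @words = Set.new
    lines.each do |line|
      word = line.strip.downcase   # assumption: case-insensitive matching
      @words << word unless word.empty?
    end
  end

  def include?(word)
    @words.include?(word.to_s.strip.downcase)
  end

  def size
    @words.size
  end
end

words = WordList.new(["magic\n", "Tap\n", "\n"])
p words.include?("Magic")    # true
p words.include?("saldkaj")  # false
p words.size                 # 2
```

    To use a real list, call `WordList.load("/usr/share/dict/words")`, or pass the path to whatever Scrabble word list you end up licensing.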
