Best data structure for crossword puzzle search

无人共我 2020-12-05 21:24

I have a large database for solving crossword puzzles, in which each entry consists of a word and a description. My application allows searching for words of a specific length that have specific characters at specific positions.

5 answers
  •  失恋的感觉
    2020-12-05 22:00

    Since you use a database, create a Suffixes table.
    For example:

      Suffix          |   WordID   | SN
      ----------------+------------+----   
      StackOverflow           10      1
      tackOverflow            10      2
      ackOverflow             10      3
      ckOverflow              10      4
      kOverflow               10      5
      ...
    

    With that table it's easy to get all words that contain a particular char in a specific position,
    like this:

    SELECT WordID FROM suffixes
    WHERE suffix >= 't' AND suffix < 'u' AND SN = 2
    

    This returns all words that have 't' at position 2.
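    As a rough illustration, the table and query above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the `words` table, the index name, the helper function names and the sample words are assumptions made for the example, not part of the original setup.

```python
import sqlite3

def build_suffix_table(conn, words):
    """Create a words table and a suffixes table; SN is the 1-based start of each suffix."""
    # The words table is an assumption; the answer only mentions WordID.
    conn.execute("CREATE TABLE words (WordID INTEGER PRIMARY KEY, word TEXT)")
    conn.execute("CREATE TABLE suffixes (suffix TEXT, WordID INTEGER, SN INTEGER)")
    # Index on (SN, suffix) so the position + range predicate can use it.
    conn.execute("CREATE INDEX idx_suffixes_sn_suffix ON suffixes (SN, suffix)")
    for word_id, word in enumerate(words, start=1):
        conn.execute("INSERT INTO words VALUES (?, ?)", (word_id, word))
        conn.executemany(
            "INSERT INTO suffixes VALUES (?, ?, ?)",
            [(word[i:], word_id, i + 1) for i in range(len(word))],
        )

def words_with_char_at(conn, char, position):
    """Return words whose character at `position` (1-based) equals `char`."""
    upper = chr(ord(char) + 1)  # next code point, e.g. 't' -> 'u'
    rows = conn.execute(
        "SELECT w.word FROM suffixes s JOIN words w ON w.WordID = s.WordID "
        "WHERE s.suffix >= ? AND s.suffix < ? AND s.SN = ? ORDER BY w.word",
        (char, upper, position),
    )
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
build_suffix_table(conn, ["StackOverflow", "atlas", "stone", "apple"])
print(words_with_char_at(conn, "t", 2))
```

    Note that the comparison is case-sensitive under SQLite's default collation, so 't' and 'T' are treated as different characters here.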

    Update: if you want to save space, and sacrifice a bit of speed, you can use a suffix array.

    You can store all the words in a single line (array) with a separator between them, e.g. $, and create a suffix array that holds pointers to the chars. Given a char c, you can then find all instances of words that contain it rather fast. You'll still have to check that it's in the right position (by checking how far it is from the $ separators).

    With the above technique, the search will probably be about 10 times faster than scanning all the words as in your original program.
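    A minimal sketch of the suffix array idea, assuming the words never contain the `$` separator. The function names are made up for this example, and the suffix array is built naively by sorting suffix strings, which is fine for a demonstration but too slow for a large word list.

```python
from bisect import bisect_left, bisect_right

SEP = "$"

def build_index(words):
    """Join words with separators and sort all suffix start offsets."""
    text = SEP + SEP.join(words) + SEP  # e.g. "$stack$tea$"
    # Naive construction: sort offsets by the suffix that starts there.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # First character of each suffix, in suffix-array order (non-decreasing).
    first_chars = [text[i] for i in suffix_array]
    return text, suffix_array, first_chars

def words_with_char_at(index, char, position):
    """Return words that have `char` at 1-based `position`."""
    text, suffix_array, first_chars = index
    # All suffixes starting with `char` form one contiguous run.
    lo = bisect_left(first_chars, char)
    hi = bisect_right(first_chars, char)
    result = []
    for offset in suffix_array[lo:hi]:
        start = text.rfind(SEP, 0, offset)  # separator before this word
        if offset - start == position:      # distance from $ gives the position
            end = text.index(SEP, start + 1)
            result.append(text[start + 1:end])
    return result

index = build_index(["stack", "tea", "atlas", "cat"])
print(words_with_char_at(index, "t", 2))
```

    The binary search narrows the candidates to occurrences of the char, and the distance to the preceding `$` filters out those at the wrong position.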

    Update 2: I've used the database approach in one of my utilities, where I needed to locate suffixes such as "ne", and I forgot to adapt (optimize) it for this specific problem.

    You can just store a single char as a suffix:

      Suffix   |   WordID   | SN
      ---------+------------+----   
      S                10      1
      t                10      2
      a                10      3
      c                10      4
      k                10      5
      ...
    

    This saves a lot of space. The query now becomes:

    SELECT WordID FROM suffixes
    WHERE suffix = 't' AND SN = 2
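    For a crossword pattern with several known letters, one possible way to extend the single-char table is to match every known (char, position) pair and keep only the words that satisfy all of them. The sketch below uses Python's sqlite3 module and rests on assumptions: a `words` table with a `length` column, `?` as the wildcard, and the function names are all invented for this example.

```python
import sqlite3

def build_letter_table(conn, words):
    """Store one row per (char, word, position); SN is 1-based."""
    conn.execute(
        "CREATE TABLE words (WordID INTEGER PRIMARY KEY, word TEXT, length INTEGER)"
    )
    conn.execute(
        "CREATE TABLE suffixes (suffix TEXT, WordID INTEGER, SN INTEGER, "
        "PRIMARY KEY (SN, suffix, WordID))"
    )
    for word_id, word in enumerate(words, start=1):
        conn.execute("INSERT INTO words VALUES (?, ?, ?)", (word_id, word, len(word)))
        conn.executemany(
            "INSERT INTO suffixes VALUES (?, ?, ?)",
            [(ch, word_id, i) for i, ch in enumerate(word, start=1)],
        )

def match_pattern(conn, pattern, wildcard="?"):
    """Return words matching a pattern such as 'st??e' (wildcard = unknown letter)."""
    known = [(ch, i) for i, ch in enumerate(pattern, start=1) if ch != wildcard]
    if not known:
        # Only the length is constrained.
        rows = conn.execute(
            "SELECT word FROM words WHERE length = ? ORDER BY word", (len(pattern),)
        )
        return [r[0] for r in rows]
    clauses = " OR ".join("(s.suffix = ? AND s.SN = ?)" for _ in known)
    params = [len(pattern)] + [v for pair in known for v in pair] + [len(known)]
    # A word qualifies only if every known letter matched, hence the HAVING count.
    sql = (
        "SELECT w.word FROM words w JOIN suffixes s ON s.WordID = w.WordID "
        f"WHERE w.length = ? AND ({clauses}) "
        "GROUP BY w.WordID, w.word HAVING COUNT(*) = ? ORDER BY w.word"
    )
    return [r[0] for r in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
build_letter_table(conn, ["stack", "stone", "atlas", "store", "tea"])
print(match_pattern(conn, "st??e"))
```

    The primary key guarantees at most one row per word and position, so the count in the HAVING clause equals the number of satisfied constraints.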
    
