I just tried for my first programming interview and one of the questions was to write a program that given a 7 digit telephone number, could print all possible combinations
This approach uses R and is based on first converting the dictionary to its corresponding digit representation, then using this as a look-up.
The conversion only takes 1 second on my machine (converting from the native Unix dictionary of about 100,000 words), and typical look-ups of up to 100 different digit inputs take a total of .1 seconds:
library(data.table)
#example dictionary
dict.orig = tolower(readLines("/usr/share/dict/american-english"))
#split each word into its constituent letters
#words shorter than the longest padded with "" for simpler retrieval
dictDT = setDT(tstrsplit(dict.orig, split = "", fill = ""))
#lookup table for conversion
#NB: the following are found in the dictionary and would need
# to be handled separately -- ignoring here
# (accents should just be appended to
# matches for unaccented version):
# c("", "'", "á", "â", "å", "ä",
# "ç", "é", "è", "ê", "í", "ñ",
# "ó", "ô", "ö", "û", "ü")
lookup = data.table(num = c(rep('2', 3), rep('3', 3), rep('4', 3),
rep('5', 3), rep('6', 3), rep('7', 4),
rep('8', 3), rep('9', 4)),
let = letters)
#using the lookup table, convert to numeric
for (col in names(dictDT)) {
dictDT[lookup, (col) := i.num, on = setNames("let", col)]
}
#back to character vector
dict.num = do.call(paste0, dictDT)
#sort both for faster vector search
idx = order(dict.num)
dict.num = dict.num[idx]
dict.orig = dict.orig[idx]
possibilities = function(input) dict.orig[dict.num == input]
#sample output:
possibilities('269')
# [1] "amy" "bmw" "cox" "coy" "any" "bow" "box" "boy" "cow" "cox" "coy"
possibilities('22737')
# [1] "acres" "bards" "barer" "bares" "barfs" "baser" "bases" "caper"
# [9] "capes" "cards" "cares" "cases"