Soundex algorithm in Python (homework help request)

不知归路 2020-12-11 14:12

The US Census Bureau uses a special encoding called “soundex” to locate information about a person. The soundex is an encoding of surnames (last names) based on the way a su

3 answers
  •  慢半拍i (original poster)
    2020-12-11 14:18

    import itertools  # groupby collapses runs of identical characters
    import re  # re.sub performs the letter removals and substitutions

    surname = input("Enter surname of the author: ")  # ask the user for the author's surname

    while surname != "":  # keep looping until the user enters an empty line

        str_ini = surname[0]  # first letter of the surname, kept as a letter
        mod_str1 = surname[1:]  # rest of the surname, without the first letter

        # drop every vowel, plus y, h and w, from the rest of the surname
        mod_str2 = re.sub(r'[aeiouyhwAEIOUYHW]', '', mod_str1)

        # replace the remaining consonants with the digits the soundex algorithm assigns them
        mod_str21 = re.sub(r'[bfpvBFPV]', '1', mod_str2)
        mod_str22 = re.sub(r'[cgjkqsxzCGJKQSXZ]', '2', mod_str21)
        mod_str23 = re.sub(r'[dtDT]', '3', mod_str22)
        mod_str24 = re.sub(r'[lL]', '4', mod_str23)
        mod_str25 = re.sub(r'[mnMN]', '5', mod_str24)
        mod_str26 = re.sub(r'[rR]', '6', mod_str25)

        mod_str3 = str_ini.upper() + mod_str26  # upper-case initial followed by the digit code

        # groupby yields one (character, run) pair per run of identical adjacent characters;
        # keeping only the character replaces each run of repeated digits with a single digit
        mod_str4 = ''.join(char for char, rep in itertools.groupby(mod_str3))
    
        mod_str5 = mod_str4[:4]  # keep at most four characters (the initial plus three digits)

        # pad with trailing zeros to exactly four characters, then print with a blank line after
        print(mod_str5.ljust(4, "0") + "\n")
    
        print("Press enter to exit")  # tell the user that an empty line ends the program
        surname = input("Enter surname of the author: ")  # ask for the next surname, if any

    exit(0)  # end the program once the loop has finished
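
If it helps with testing, here is a minimal sketch of the same steps wrapped in a function, so the encoding can be checked without typing names at a prompt. The name `simple_soundex` and the lookup table `_CODES` are my own illustrative choices, not part of the assignment, and the sketch assumes the simplified rules the script above follows (drop vowels and y/h/w after the first letter, map consonants to digits, collapse repeats, pad to four characters).

```python
import itertools

# Hypothetical helper table: consonant -> digit, same groups as the script above.
_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for letter in letters:
        _CODES[letter] = digit

# Letters that are dropped after the first character.
_DROPPED = set("aeiouyhw")


def simple_soundex(surname):
    """Encode a surname with the simplified steps used in the script above."""
    if not surname:
        raise ValueError("surname must not be empty")
    first, rest = surname[0].upper(), surname[1:]
    # Drop vowels/y/h/w, map known consonants to digits, keep anything else as is.
    coded = "".join(_CODES.get(ch.lower(), ch)
                    for ch in rest if ch.lower() not in _DROPPED)
    # Collapse runs of identical adjacent characters into one.
    collapsed = "".join(ch for ch, _ in itertools.groupby(first + coded))
    # Truncate to four characters and pad with trailing zeros.
    return collapsed[:4].ljust(4, "0")


print(simple_soundex("Robert"))  # R163
print(simple_soundex("Lee"))     # L000
```

One caveat: as far as I know, the full census rules also compare the first letter's own digit with the next letter and treat vowels as separators between repeated digits, so they give P236 for "Pfister" and T522 for "Tymczak", whereas this simplified version gives P123 and T520. Check which variant your assignment text specifies.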
    
