Is there a way to remove duplicate and continuous words/phrases in a string?

广开言路 2021-01-13 11:27

Is there a way to remove duplicate and continuous words/phrases in a string? E.g.

[in]: foo foo bar bar foo bar

6 Answers
  •  感动是毒
    2021-01-13 12:00

    Personally, I don't think we need any other modules for this (although I admit some of them are great). I managed it with simple looping, after first converting the string into a list of words. I tried it on all the examples listed above, and it works fine.

    sentence = input("Please enter your sentence:\n")

    word_list = sentence.split()

    def check_if_same(i, j):  # checks whether the phrase word_list[i:j] is immediately repeated

        end = (2 * j) - i  # end point of the second slice to compare (it is essentially j + phrase_len)
        is_same = word_list[i:j] == word_list[j:end]
        # The line below is just for debugging. Prints the lists being compared and whether they are equal
        # print("Comparing: " + ' '.join(word_list[i:j]) + " " + ' '.join(word_list[j:end]) + " " + str(is_same))

        return is_same

    phrase_len = 1

    while phrase_len <= len(word_list) // 2:  # checks the sentence for different phrase lengths

        curr_word_index = 0

        while curr_word_index < len(word_list):  # checks every start position for the current phrase length

            if check_if_same(curr_word_index, curr_word_index + phrase_len):
                # deletes one copy of the repeated phrase, then re-checks the same position
                del word_list[curr_word_index : curr_word_index + phrase_len]
            else:
                curr_word_index += 1

        phrase_len += 1

    print("Answer: " + ' '.join(word_list))
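
For reuse, the same looping idea can be wrapped in a function instead of reading from the prompt. This is only a sketch of the approach in this answer, written for Python 3; the name `collapse_repeats` is made up for illustration. Because it splits on whitespace, runs of spaces in the input come back as single spaces.

```python
def collapse_repeats(sentence):
    """Collapse immediately repeated words and phrases (hypothetical helper name)."""
    words = sentence.split()
    phrase_len = 1
    # A phrase can only repeat if it fits twice, so stop at half the length.
    while phrase_len <= len(words) // 2:
        i = 0
        while i < len(words):
            j = i + phrase_len
            # Compare the phrase at i with the phrase that directly follows it.
            if words[i:j] == words[j:j + phrase_len]:
                del words[i:j]  # drop one copy, then re-check the same position
            else:
                i += 1
        phrase_len += 1
    return ' '.join(words)


print(collapse_repeats("foo foo bar bar foo bar"))  # foo bar
```

Single words are collapsed first and longer phrases afterwards, so the order of the passes affects the result on some inputs; it is worth trying it on your own examples before relying on it.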
    
