Is there a way to remove duplicate and continuous words/phrases in a string?


广开言路 2021-01-13 11:27

Is there a way to remove duplicated, consecutive words or phrases in a string? E.g.

[in]: foo foo bar bar foo bar

6 answers

感动是毒 (OP)

2021-01-13 12:00

Personally, I don't think this needs any other modules (although I admit some of them are great). I managed it with simple loops, after first splitting the string into a list of words. I tried it on all the examples listed above and it works fine.

sentence = input("Please enter your sentence:\n")

word_list = sentence.split()

def check_if_same(i, j):  # checks whether the adjacent slices word_list[i:j] and word_list[j:end] are equal
    end = (2 * j) - i  # end point of the second slice to compare (it is essentially j + phrase_len)
    is_same = word_list[i:j] == word_list[j:end]
    # The line below is just for debugging. It prints the slices being compared and whether they are equal
    # print("Comparing: " + ' '.join(word_list[i:j]) + " " + ' '.join(word_list[j:end]) + " " + str(is_same))
    return is_same

phrase_len = 1

while phrase_len <= len(word_list) // 2:  # checks the sentence for different phrase lengths
    curr_word_index = 0
    while curr_word_index < len(word_list):  # checks all the words of the sentence for the specified phrase length
        if check_if_same(curr_word_index, curr_word_index + phrase_len):  # checks similarity
            del word_list[curr_word_index : curr_word_index + phrase_len]  # deletes the repeated phrase
        else:
            curr_word_index += 1
    phrase_len += 1

print("Answer: " + ' '.join(word_list))
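
For anyone who wants to try this without typing at a prompt, the same looping idea can be wrapped in a function. This is only a minimal sketch in Python 3 under my own assumptions: the name `collapse_repeats` is hypothetical and not from the answer above, and the inner loop stops as soon as two copies of the phrase no longer fit, which should give the same result as scanning to the end of the list.

```python
def collapse_repeats(sentence):
    """Collapse immediately repeated words/phrases, shortest phrases first.

    Hypothetical helper: the name and the function form are illustrative.
    """
    words = sentence.split()
    phrase_len = 1
    # A phrase can only repeat back-to-back if two copies fit in the list.
    while phrase_len <= len(words) // 2:
        i = 0
        while i + 2 * phrase_len <= len(words):
            first = words[i:i + phrase_len]
            second = words[i + phrase_len:i + 2 * phrase_len]
            if first == second:
                # Drop the first copy and re-check the same index,
                # so runs of three or more copies also collapse.
                del words[i:i + phrase_len]
            else:
                i += 1
        phrase_len += 1
    return " ".join(words)


print(collapse_repeats("foo foo bar bar foo bar"))  # foo bar
```

Because shorter phrases are collapsed before longer ones, "foo foo bar bar foo bar" first becomes "foo bar foo bar" and only then "foo bar".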
