Is there a way to remove duplicate and continuous words/phrases in a string? E.g.
[in]: foo foo bar bar foo bar
Personally, I do not think we need any other modules for this (although I admit some of them are GREAT). I managed it with simple looping, after first splitting the string into a list of words. I tried it on all the examples listed above, and it works fine.
sentence = raw_input("Please enter your sentence:\n")  # raw_input already returns a str
word_list = sentence.split()
def check_if_same(i, j):  # checks whether two adjacent slices of word_list are the same
    global word_list
    end = (2 * j) - i  # end point of the second slice to compare (it is essentially j + phrase_len)
    is_same = False
    if word_list[i:j] == word_list[j:end]:
        is_same = True
    # The line below is just for debugging. It prints the slices being compared and whether they are equal
    # print "Comparing: " + ' '.join(word_list[i:j]) + " " + ' '.join(word_list[j:end]) + " " + str(is_same)
    return is_same
phrase_len = 1
while phrase_len <= int(len(word_list) / 2):  # checks the sentence for different phrase lengths
    curr_word_index = 0
    while curr_word_index < len(word_list):  # checks all the words of the sentence for the specified phrase length
        result = check_if_same(curr_word_index, curr_word_index + phrase_len)  # checks similarity
        if result:
            del word_list[curr_word_index : curr_word_index + phrase_len]  # deletes the repeated phrase
        else:
            curr_word_index += 1
    phrase_len += 1
print "Answer: " + ' '.join(word_list)
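If you would rather have this as a reusable function than a script that reads from the prompt, here is a minimal sketch of the same looping idea. It is written for Python 3 (the script above is Python 2), and the function name `collapse_repeats` is just something I made up for illustration. It avoids the global list by keeping the words in a local variable.

```python
def collapse_repeats(sentence):
    """Collapse immediately repeated words or phrases, shortest phrases first."""
    words = sentence.split()
    phrase_len = 1
    # A phrase can only repeat if it fits into the list at least twice.
    while phrase_len <= len(words) // 2:
        i = 0
        while i < len(words):
            first = words[i:i + phrase_len]
            second = words[i + phrase_len:i + 2 * phrase_len]
            if first == second:
                # Drop the first copy and re-check the same position,
                # so runs of three or more copies collapse as well.
                del words[i:i + phrase_len]
            else:
                i += 1
        phrase_len += 1
    return " ".join(words)


print(collapse_repeats("foo foo bar bar foo bar"))  # foo bar
```

Like the script, this collapses single repeated words first and only then looks for longer repeated phrases, so the order in which phrase lengths are tried affects the result.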