I need to loop over a list of French words and find each asterisk, because I want to concatenate the word before the asterisk with the word after it every time an asterisk appears, and then continue to the next one. For example, in the sequence:

    ['les','engage', '*', 'ment', 'de','la']

I want to concatenate 'engage' and 'ment', and the result ('engagement') should be checked against a dictionary. If it is in the dictionary, it should be appended to a list.
With my code I only get the asterisk:

    import nltk
    from nltk.tokenize import word_tokenize
    import re

    with open ('text-test.txt') as tx:
        text =word_tokenize(tx.read().lower())
    with open ('Fr-dictionary.txt') as fr:
        dic = word_tokenize(fr.read().lower())

    ast=re.compile(r'[\*]+')
    regex=list(filter(ast.match,text))
    valid_words=[]
    invalid_words=[]
    last = None
    for w in text:
        if w in regex:
            last=w
            a=last + w[+1]
            break
        if a in dic:
            valid_words.append(a)
        else:
            continue

I wondered how to handle a (nonsense) list like this:

    words = ['Bien', '*', 'venue', 'pour', 'les','engage', '*', 'ment', 'trop', 'de', 'YIELD', 'peut','être','contre', '*', 'productif' ]

So I came up with a method like this:
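
    def join_asterisk(ary):
        i, size = 0, len(ary)
        while i < size-2:
            if ary[i+1] == '*':
                # join the words on either side of the asterisk, then skip past them
                yield ary[i] + ary[i+2]
                i += 2
            else:
                yield ary[i]
            i += 1
        # yield whatever is left (up to two trailing items) unchanged
        yield from ary[i:]
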
Which returns:

    print(list(join_asterisk(words)))
    #=> ['Bienvenue', 'pour', 'les', 'engagement', 'trop', 'de', 'YIELD', 'peut', 'être', 'contreproductif']

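The generator above assumes every asterisk sits between two separate words. As a rough sketch of one alternative (the name `join_asterisks_safe` and the `marker` parameter are made up for this illustration), the joined list can also be built by appending to the previous output item, which additionally copes with chained asterisks and with an asterisk at either end of the list:

```python
def join_asterisks_safe(tokens, marker="*"):
    """Hypothetical helper: glue the token after each marker onto the previous output item."""
    out = []
    it = iter(tokens)
    for tok in it:
        if tok == marker and out:
            nxt = next(it, None)      # consume the token that follows the marker
            if nxt is None:
                out.append(tok)       # trailing marker: nothing to join, keep it
            else:
                out[-1] += nxt        # extend the word already emitted
        else:
            out.append(tok)           # ordinary word, or a marker with nothing before it
    return out


print(join_asterisks_safe(['les', 'engage', '*', 'ment', 'de', 'la']))
# ['les', 'engagement', 'de', 'la']
print(join_asterisks_safe(['contre', '*', 'pro', '*', 'ductif']))
# ['contreproductif']
```

Because it only ever looks at the last output item and the next input token, this version needs no index arithmetic at all.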
Instead of thinking in terms of "time travel" (i.e. going back and forth), the Pythonic way would be to think functionally (time travel has its place in very resource-constrained environments).
One way is to use enumeration, as @Yosufsn showed. Another is to zip the list with itself, with padding added on either side, like this:

    words = ['les','engage', '*', 'ment', 'de','la']
    # a is the word before b, and c is the word after b
    for a,b,c in zip([None]*2+words, [None]+words+[None], words+[None]*2):
        if b == '*':
            print( a+c )

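A variant of the same idea, offered only as a sketch: zipping three shifted slices of the list instead of padding it with `None`. The function name `joined_around_asterisks` is an assumption made for this example. Since `zip` stops at the shortest input, an asterisk in the first or last position never appears as the middle element and is simply ignored rather than raising an error.

```python
def joined_around_asterisks(words, marker="*"):
    """Hypothetical helper: return prev + next for every marker that has both neighbours."""
    # Each triple is (words[k], words[k+1], words[k+2]); the middle item is the one tested.
    triples = zip(words, words[1:], words[2:])
    return [prev + nxt for prev, cur, nxt in triples if cur == marker]


words = ['les', 'engage', '*', 'ment', 'de', 'la']
print(joined_around_asterisks(words))
# ['engagement']
```

Returning a list instead of printing makes it easier to pass the joined words on to a later dictionary check.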
I think you need some simple code like this:

    words = ['les','engage', '*', 'ment', 'de','la']
    for n, word in enumerate(words):
        # assumes the asterisk is never the first or last item in the list
        if word == "*":
            exp = words[n-1] + words[n+1]
            print(exp)

Output:

    engagement

You can then check this output against your dictionary.
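To make the dictionary step concrete, here is a minimal sketch that combines the enumerate loop with the lookup. The function name `check_joined_words`, the tiny stand-in dictionary, and the extra test words are assumptions for illustration; in the question the dictionary would come from 'Fr-dictionary.txt'. A set is used for the lookup because membership tests on a set do not scan the whole collection the way a list does.

```python
def check_joined_words(words, dictionary, marker="*"):
    """Hypothetical helper: join around each marker and sort results by dictionary membership."""
    vocab = {entry.lower() for entry in dictionary}   # set for fast membership tests
    valid_words, invalid_words = [], []
    for n, word in enumerate(words):
        # only join when the marker has a word on both sides
        if word == marker and 0 < n < len(words) - 1:
            candidate = (words[n - 1] + words[n + 1]).lower()
            if candidate in vocab:
                valid_words.append(candidate)
            else:
                invalid_words.append(candidate)
    return valid_words, invalid_words


# Stand-in data, not the real dictionary file
words = ['les', 'engage', '*', 'ment', 'de', 'la', 'contre', '*', 'xyz']
dictionary = ['engagement', 'contreproductif']
valid, invalid = check_joined_words(words, dictionary)
print(valid)    # ['engagement']
print(invalid)  # ['contrexyz']
```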
Source: https://stackoverflow.com/questions/55196135/go-one-step-back-and-one-step-forward-in-a-loop-with-python