Question
Given a JSGF grammar, I would like to generate all the terminal strings it can produce. For example, given A (B | C) D [E], my desired output would be:
A B D E
A C D E
A B D
A C D
I decided to start with the easiest item, the optional brackets, but soon ran into a brick wall. It sort of works for 1 item, but not for an item with alternatives. Any advice would be appreciated.
What I have now:
import re

rule = raw_input('Enter the rule you want to test: ')
items = re.findall(r"\w[\w ]*\w|\w|\[|\]|\(|\)", rule)
for anitem in range(len(items)):
    bracketc = items[:anitem].count('[') - items[:anitem].count(']')
    if items[anitem] != '[' and items[anitem] != ']':
        if bracketc > 0:
            optional = True
        else:
            optional = False
        while optional == True:
            print ' '.join(items)
            it2 = items[:]
            it2.remove(it2[anitem])
            print ' '.join(it2)
            break
It works for 1 item, and given a string A B [C] D, returns:
A B [ C ] D
A B [ ] D
but breaks down at increasing complexity, so I am guessing I need something completely different.
Answer 1:
Based on your example, I've written the following piece of code:
rule = "A(B|C)D[E]FG"

def generate_strings(rule):
    # Each character is one token; groups must not be nested.
    if not rule:
        return [""]
    begin, end = rule[0], rule[1:]
    if begin == '[':
        # Optional group: emit the rest with and without its content.
        i = end.find(']')
        if i == -1:
            raise Exception("Unmatched '['")
        alt, end = end[0:i], end[i+1:]
        return [a + e for e in generate_strings(end) for a in [alt, ""]]
    if begin == '(':
        # Alternatives: emit the rest once per '|'-separated choice.
        i = end.find(')')
        if i == -1:
            raise Exception("Unmatched '('")
        alt, end = end[0:i].split('|'), end[i+1:]
        return [a + e for e in generate_strings(end) for a in alt]
    if begin in [']', ')', '|']:
        raise Exception("Unexpected " + begin)
    # Plain character: prepend it to every expansion of the rest.
    return [begin + e for e in generate_strings(end)]

print(generate_strings(rule))
Edit: This is an attempt to make things work with nested expressions. It does not always work, because the parsing is now more delicate: the first closing bracket found may belong to a nested expression rather than to the bracket being matched. The same applies to pipes and parentheses.
def flatten(list_of_lists):
    return [item for sublist in list_of_lists for item in sublist]

def generate_strings(rule):
    if not rule:
        return [""]
    begin, end = rule[0], rule[1:]
    if begin == '[':
        # find() returns the first ']', which may close a nested group.
        i = end.find(']')
        if i == -1:
            raise Exception("Unmatched '['")
        # Expand the group content recursively, plus the empty choice.
        alt = flatten([generate_strings(a) for a in [end[0:i], ""]])
        end = end[i+1:]
        return [a + e for e in generate_strings(end) for a in alt]
    if begin == '(':
        # Same caveat: the first ')' and every '|' are taken at face value.
        i = end.find(')')
        if i == -1:
            raise Exception("Unmatched '('")
        alt = flatten([generate_strings(a) for a in end[0:i].split('|')])
        end = end[i+1:]
        return [a + e for e in generate_strings(end) for a in alt]
    if begin in [']', ')', '|']:
        raise Exception("Unexpected " + begin)
    return [begin + e for e in generate_strings(end)]

# `rule` is the string defined with the first version above.
print(generate_strings(rule))
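One possible way around the limitation described in the edit is to look for the matching closing bracket by tracking nesting depth, and to split on pipes only at depth zero. The following is a minimal sketch under those assumptions, not a full JSGF parser. The names find_close, split_alternatives and expand are made up for this illustration, and each character is still treated as one token, as in the code above.

```python
PAIRS = {'[': ']', '(': ')'}

def find_close(text, start):
    """Return the index of the bracket that closes the opener at text[start]."""
    stack = []
    for i in range(start, len(text)):
        ch = text[i]
        if ch in PAIRS:
            stack.append(PAIRS[ch])      # remember which closer we expect
        elif ch in PAIRS.values():
            if not stack or stack.pop() != ch:
                raise ValueError("Unexpected %r at position %d" % (ch, i))
            if not stack:                # the opener at `start` is now closed
                return i
    raise ValueError("Unmatched %r" % text[start])

def split_alternatives(text):
    """Split on '|' only where the bracket depth is zero."""
    parts, depth, last = [], 0, 0
    for i, ch in enumerate(text):
        if ch in PAIRS:
            depth += 1
        elif ch in PAIRS.values():
            depth -= 1
        elif ch == '|' and depth == 0:
            parts.append(text[last:i])
            last = i + 1
    parts.append(text[last:])
    return parts

def expand(rule):
    """Return every string the rule can produce; groups may be nested."""
    if not rule:
        return [""]
    first = rule[0]
    if first in PAIRS:
        close = find_close(rule, 0)
        inner, rest = rule[1:close], rule[close + 1:]
        heads = [s for alt in split_alternatives(inner) for s in expand(alt)]
        if first == '[':
            heads.append("")             # an optional group may be skipped
        return [h + t for h in heads for t in expand(rest)]
    if first in '])|':
        raise ValueError("Unexpected " + first)
    return [first + t for t in expand(rule[1:])]
```

With this sketch, expand("A[B(C|D)]E") returns ['ABCE', 'ABDE', 'AE']. Note that, unlike the code above, it also accepts alternatives inside an optional group, such as [B|C].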
Answer 2:
Generator form of Josay's answer:
def generate_strings(rule):
    if not rule:
        yield ""
    else:
        begin, end = rule[0], rule[1:]
        if begin == '[':
            i = end.find(']')
            if i == -1:
                raise ValueError("Unmatched '['")
            optional, end = end[:i], end[i+1:]
            for e in generate_strings(end):
                yield e
                yield optional + e
        elif begin == '(':
            i = end.find(')')
            if i == -1:
                raise ValueError("Unmatched '('")
            parts, end = end[:i].split('|'), end[i+1:]
            for e in generate_strings(end):
                for p in parts:
                    yield p + e
        elif begin in '])|':
            raise ValueError("Unexpected " + begin)
        else:
            for e in generate_strings(end):
                yield begin + e
>>> list(generate_strings("A(B|C)D[E]FG"))
['ABDFG', 'ACDFG', 'ABDEFG', 'ACDEFG']
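The generator above treats every character as a token, so it does not directly handle the whitespace-separated rule from the question, A (B | C) D [E]. A rough sketch of the same approach applied to a list of word tokens might look like this. The names tokenize and expand_tokens are invented for the example, tokens are assumed to be plain \w+ words, and groups are assumed not to be nested, as in the original generator.

```python
import re

def tokenize(rule):
    """Split a rule into words and the structural characters ( ) [ ] |."""
    return re.findall(r"\w+|[()\[\]|]", rule)

def expand_tokens(tokens):
    """Yield each expansion as a list of words (non-nested groups only)."""
    if not tokens:
        yield []
        return
    first, rest = tokens[0], tokens[1:]
    if first == '[':
        if ']' not in rest:
            raise ValueError("Unmatched '['")
        i = rest.index(']')
        optional, rest = rest[:i], rest[i + 1:]
        for tail in expand_tokens(rest):
            yield optional + tail        # with the optional words
            yield tail                   # without them
    elif first == '(':
        if ')' not in rest:
            raise ValueError("Unmatched '('")
        i = rest.index(')')
        group, rest = rest[:i], rest[i + 1:]
        # Split the group's tokens into alternatives at each '|'.
        alternatives = [[]]
        for tok in group:
            if tok == '|':
                alternatives.append([])
            else:
                alternatives[-1].append(tok)
        for tail in expand_tokens(rest):
            for alt in alternatives:
                yield alt + tail
    elif first in (']', ')', '|'):
        raise ValueError("Unexpected " + first)
    else:
        for tail in expand_tokens(rest):
            yield [first] + tail

for words in expand_tokens(tokenize("A (B | C) D [E]")):
    print(' '.join(words))
```

For the rule from the question this prints A B D E, A C D E, A B D and A C D, one per line, which is the output the question asks for.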
Source: https://stackoverflow.com/questions/17174891/how-can-i-generate-all-possible-strings-given-a-grammar-rule