How can I generate all possible strings given a grammar rule?

Posted by 旧时模样 on 2019-12-11 04:29:59

Question


Given a JSGF grammar, I would like to generate all the terminal strings it maps to. For example, given A (B | C) D [E], my desired output would be:

A B D E
A C D E
A B D
A C D

I decided to start with the easiest item, the optional brackets, but soon ran into a brick wall. It sort of works for 1 item, but not for an item with alternatives. Any advice would be appreciated.

What I have now:

import re
rule = raw_input('Enter the rule you want to test: ')
items = re.findall(r"\w[\w ]*\w|\w|\[|\]|\(|\)", rule)
for anitem in range(len(items)):
    bracketc = items[:anitem].count('[') - items[:anitem].count(']')
    if items[anitem] != '[' and items[anitem] != ']':
        if bracketc > 0:
            optional = True
        else:
            optional = False
        while optional == True:
            print ' '.join(items)
            it2 = items[:]
            it2.remove(it2[anitem])
            print ' '.join(it2)
            break

It works for 1 item, and given a string A B [C] D, returns:

A B [ C ] D
A B [ ] D

but breaks down at increasing complexity, so I am guessing I need something completely different.


Answer 1:


Based on your example, I've written the following piece of code:

rule="A(B|C)D[E]FG"

def generate_strings(rule):
    if not rule:
        return [""]
    begin,end=rule[0],rule[1:]
    if begin=='[':
        i=end.find(']')
        if i==-1:
            raise Exception("Unmatched '['")
        alt,end=end[0:i],end[i+1:]
        return [a+e for e in generate_strings(end) for a in [alt,""]]
    if begin=='(':
        i=end.find(')')
        if i==-1:
            raise Exception("Unmatched '('")
        alt,end=end[0:i].split('|'),end[i+1:]
        return [a+e for e in generate_strings(end) for a in alt]
    if begin in [']',')','|']:
        raise Exception("Unexpected " + begin)
    return [begin + e for e in generate_strings(end)]

print generate_strings(rule)

Edit: This is an attempt to make things work with nested expressions. It does not work in every case, because the parsing is more delicate now: the first closing bracket we find might belong to a nested expression rather than to the bracket we are trying to match. The same goes for pipes and parentheses.

def flatten(l):
    return [item for sublist in l for item in sublist]

def generate_strings(rule):
    if not rule:
        return [""]
    begin,end=rule[0],rule[1:]
    if begin=='[':
        i=end.find(']')
        if i==-1:
            raise Exception("Unmatched '['")
        alt=flatten([generate_strings(a) for a in [end[0:i],""]])
        end=end[i+1:]
        return [a+e for e in generate_strings(end) for a in alt]
    if begin=='(':
        i=end.find(')')
        if i==-1:
            raise Exception("Unmatched '('")
        alt=flatten([generate_strings(a) for a in end[0:i].split('|')])
        end=end[i+1:]
        return [a+e for e in generate_strings(end) for a in alt]
    if begin in [']',')','|']:
        raise Exception("Unexpected " + begin)
    return [begin + e for e in generate_strings(end)]

print generate_strings(rule)
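One way around the bracket-matching problem described above is to stop searching for the closing bracket with find() and let a small recursive-descent parser consume the rule instead, so that each closing bracket is matched by the call that opened it. The following is only a sketch under that assumption, not the answer's own method. The names expand, _parse_alternatives, _parse_sequence and _expect are made up for illustration. As in the code above, every character (spaces included) is treated as a terminal.

```python
def expand(rule):
    """Return every string generated by `rule`, allowing nested () and []."""
    strings, pos = _parse_alternatives(rule, 0)
    if pos != len(rule):
        # Something like a stray ')' or ']' stopped the parse early.
        raise ValueError("Unexpected %r at position %d" % (rule[pos], pos))
    return strings


def _parse_alternatives(rule, pos):
    # alternatives := sequence ('|' sequence)*
    results, pos = _parse_sequence(rule, pos)
    while pos < len(rule) and rule[pos] == '|':
        more, pos = _parse_sequence(rule, pos + 1)
        results = results + more
    return results, pos


def _parse_sequence(rule, pos):
    # sequence := (group | optional | terminal)*
    results = [""]
    while pos < len(rule) and rule[pos] not in "|)]":
        ch = rule[pos]
        if ch == '(':
            inner, pos = _parse_alternatives(rule, pos + 1)
            pos = _expect(rule, pos, ')')
        elif ch == '[':
            inner, pos = _parse_alternatives(rule, pos + 1)
            pos = _expect(rule, pos, ']')
            inner = inner + [""]  # an optional part may also be left out
        else:
            inner, pos = [ch], pos + 1
        # Append every expansion of this item to every prefix built so far.
        results = [r + i for r in results for i in inner]
    return results, pos


def _expect(rule, pos, closer):
    # Check that the matching closing bracket is present and step past it.
    if pos >= len(rule) or rule[pos] != closer:
        raise ValueError("Expected %r at position %d" % (closer, pos))
    return pos + 1


print(expand("A(B|C)D[E]"))  # ['ABDE', 'ABD', 'ACDE', 'ACD']
print(expand("A(B[C]|D)"))   # ['ABC', 'AB', 'AD']
```

Because each level returns the position where it stopped, a nested ']' or ')' is consumed by the inner call and never confused with the outer one. Rules such as [[A]] can yield the same string more than once; wrap the result in set() if duplicates matter.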



Answer 2:


Generator form of Josay's answer:

def generate_strings(rule):
    if not rule:
        yield ""
    else:
        begin, end = rule[0], rule[1:]
        if begin == '[':
            i = end.find(']')
            if i == -1:
                raise ValueError("Unmatched '['")
            optional, end = end[:i], end[i+1:]
            for e in generate_strings(end):
                yield e
                yield optional + e
        elif begin == '(':
            i = end.find(')')
            if i == -1:
                raise ValueError("Unmatched '('")
            parts, end = end[:i].split('|'), end[i+1:]

            for e in generate_strings(end):
                for p in parts:
                    yield p + e
        elif begin in '])|':
            raise ValueError("Unexpected " + begin)
        else:
            for e in generate_strings(end):
                yield begin + e

Example:

>>> list(generate_strings("A(B|C)D[E]FG"))
['ABDFG', 'ACDFG', 'ABDEFG', 'ACDEFG']
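The code in both answers treats each character as a terminal, whereas the example in the question uses space-separated words. A minimal sketch of a word-based variant for flat (non-nested) rules is shown here, using itertools.product to combine the choices. The function name expand_tokens and the tokenizing regular expression are assumptions made for illustration. The sketch does no error checking, so unmatched brackets or stray pipes are silently skipped.

```python
import itertools
import re


def expand_tokens(rule):
    """Expand a flat (non-nested) rule whose terminals are whole words."""
    # Split the rule into "( ... )" groups, "[ ... ]" optionals and bare words.
    pieces = re.findall(r"\([^()\[\]]*\)|\[[^()\[\]]*\]|[^\s()\[\]|]+", rule)
    choices = []
    for piece in pieces:
        if piece[0] == '(':
            # A group contributes exactly one of its alternatives.
            choices.append([alt.strip() for alt in piece[1:-1].split('|')])
        elif piece[0] == '[':
            # An optional part contributes its content or nothing.
            choices.append([piece[1:-1].strip(), ""])
        else:
            choices.append([piece])
    for combo in itertools.product(*choices):
        # Drop the empty strings left by skipped optional parts.
        yield ' '.join(word for word in combo if word)


print(list(expand_tokens("A (B | C) D [E]")))
# ['A B D E', 'A B D', 'A C D E', 'A C D']
```

These are the four strings requested in the question, in a different order.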


Source: https://stackoverflow.com/questions/17174891/how-can-i-generate-all-possible-strings-given-a-grammar-rule
