Generative regular expressions

Submitted by 妖精的绣舞 on 2019-11-30 07:24:49

Microsoft offers "Rex", a free (MSRL-licensed), SMT-based tool for this: http://research.microsoft.com/en-us/downloads/7f1d87be-f6d9-495d-a699-f12599cea030/

From the Introduction section of the "Rex: Symbolic Regular Expression Explorer" paper:

We translate (extended) regular expressions or regexes [5] into a symbolic representation of finite automata called SFAs. In an SFA, moves are labeled by formulas representing sets of characters rather than individual characters. An SFA A is translated into a set of (recursive) axioms that describe the acceptance condition for the strings accepted by A and build on the representation of strings as lists.
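To make the idea of moves labeled by character sets concrete, here is a minimal Python sketch of a hand-built automaton for the pattern [a-z](123|456). It is not Rex's implementation or API: the state numbers, the MOVES table, and the accepts function are all hypothetical names chosen for illustration, and the labels are plain Python sets rather than solver formulas.

```python
import string

# Hypothetical hand-built SFA for the pattern [a-z](123|456).
# Each move is (source state, set of allowed characters, target state),
# so the single move out of state 0 covers all 26 lowercase letters.
LOWER = frozenset(string.ascii_lowercase)
MOVES = [
    (0, LOWER, 1),
    (1, frozenset("1"), 2), (2, frozenset("2"), 3), (3, frozenset("3"), 7),
    (1, frozenset("4"), 4), (4, frozenset("5"), 5), (5, frozenset("6"), 7),
]
START, ACCEPTING = 0, {7}


def accepts(text):
    """Return True if the automaton accepts text, tracking all reachable states."""
    states = {START}
    for ch in text:
        states = {dst for src, label, dst in MOVES
                  if src in states and ch in label}
        if not states:
            return False
    return bool(states & ACCEPTING)
```

With this table, accepts("a123") and accepts("z456") are true, while accepts("a126") and accepts("A123") are false. The point of the symbolic labels is that a large character class costs one move instead of one move per character.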

As the SMT solver can output all possible solutions within some size bound, this may be close to what you're looking for.
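The "all solutions within some size bound" idea can also be illustrated without a solver. The Python sketch below swaps in a plain breadth-first search over a small automaton, which is not how Rex works internally; the pattern [ab]c*, the MOVES table, and enumerate_matches are assumptions made up for this example.

```python
from collections import deque

# Hypothetical deterministic automaton for the pattern [ab]c*.
# Each state maps to a list of (character set, target state) moves.
MOVES = {
    0: [(frozenset("ab"), 1)],
    1: [(frozenset("c"), 1)],
}
START, ACCEPTING = 0, {1}


def enumerate_matches(max_len):
    """Return every accepted string of length <= max_len, shortest first."""
    results = []
    queue = deque([("", START)])
    while queue:
        prefix, state = queue.popleft()
        if state in ACCEPTING:
            results.append(prefix)
        if len(prefix) == max_len:
            continue  # the size bound keeps the c* loop from running forever
        for label, target in MOVES.get(state, []):
            for ch in sorted(label):
                queue.append((prefix + ch, target))
    return results


print(enumerate_matches(3))  # ['a', 'b', 'ac', 'bc', 'acc', 'bcc']
```

Because this automaton is deterministic, each string is reached by exactly one path and appears once. A nondeterministic automaton would need the results deduplicated.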

On a more statistical and less formal front, the Regexp::Genex module from CPAN can work as well: http://search.cpan.org/dist/Regexp-Genex/

You can use it with something like this:

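#!/usr/bin/env perl
use strict;
use warnings;
use Regexp::Genex ':all';

my $hits = 100;                  # maximum number of strings to print
my $re   = qr/[a-z](123|456)/;   # pattern to generate matches for
# Start from the length of the stringified pattern and grow from there.
local $Regexp::Genex::DEFAULT_LEN = length $re;
my %seen;
# Collect matches for about two seconds ($^T is the script's start time).
while ((time - $^T) < 2) {
    @seen{strings($re)} = ();    # hash slice removes duplicates
    $Regexp::Genex::DEFAULT_LEN++;
}
# Sort the keys only (sorting %seen would mix in the undef values),
# and print at most $hits of them.
my @found = sort keys %seen;
splice @found, $hits if @found > $hits;
print "$_\n" for @found;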

Adjust the time limit (2 seconds here) and the sample size ($hits) as needed. Hope this helps!

Take a look at Xeger (Google Code).

The Visual Studio Team System appears to have an inverse regex generator, too, but it doesn't look like the algorithm is open source.
