Is there any Regex optimizer written in Java?

佛祖请我去吃肉 2021-01-23 09:46

I wrote a Java program that generates a sequence of symbols, such as "abcdbcdefbcdbcdefg". What I need is a regex optimizer that can produce something like "a((bcd){2}ef)

3 Answers
  •  情深已故
    2021-01-23 10:21

    I assume you are trying to find a small regex to encode a finite set of input strings. If so, you haven't chosen the best possible subject line.

    I can't give you an existing program, but I can tell you how to approach writing one.

    There is no canonical minimum regex form, and finding a true minimum-size regex is NP-hard. Your sets are finite, though, so this may be a simpler problem. I'll have to think about it.

    But a good heuristic algorithm would be:

    1. Construct a trivial non-deterministic finite automaton (NFA) that accepts all your strings.
    2. Convert the NFA to a deterministic finite automaton (DFA) with the subset construction.
    3. Minimize the DFA with the standard algorithm.
    4. Use the construction from the proof of Kleene's theorem to get to a regex.

    Note that step 3 does give you a unique minimal DFA (unique up to renaming of states). That would probably be the best way to encode your string sets.
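To make the four steps concrete, here is a rough Java sketch for a finite set of strings. It is an illustration under some assumptions, not a finished optimizer. The class and method names (`FiniteSetRegex`, `buildTrie`, `minimize`, `toRegex`, `regexFor`) are made up for this example. Because the input set is finite, a trie stands in for steps 1 and 2 (it is already deterministic), and step 3 is done by merging states that accept the same suffixes, which is enough for an acyclic automaton. For step 4 the sketch swaps in a simple recursive expansion instead of the general construction from Kleene's theorem. That only works because there are no loops, and it factors shared prefixes but writes shared suffixes out again.

```java
import java.util.*;

final class FiniteSetRegex {
    /** One automaton state; a plain trie node until states are merged. */
    static final class State {
        final TreeMap<Character, State> next = new TreeMap<>();
        boolean accepting;
        int id; // assigned once the state is registered as canonical
    }

    /** Steps 1-2: for a finite set, a trie is already a deterministic automaton. */
    static State buildTrie(Collection<String> words) {
        State root = new State();
        for (String w : words) {
            State s = root;
            for (char c : w.toCharArray()) {
                s = s.next.computeIfAbsent(c, k -> new State());
            }
            s.accepting = true;
        }
        return root;
    }

    /** Step 3: bottom-up, merge states whose outgoing behaviour is identical. */
    static State minimize(State s, Map<String, State> registry) {
        // Canonicalize the children first, then describe this state by them.
        s.next.replaceAll((c, child) -> minimize(child, registry));
        StringBuilder sig = new StringBuilder(s.accepting ? "1" : "0");
        s.next.forEach((c, child) -> sig.append(c).append(child.id).append(';'));
        State canonical = registry.putIfAbsent(sig.toString(), s);
        if (canonical != null) {
            return canonical; // an equivalent state already exists
        }
        s.id = registry.size();
        return s;
    }

    /** Escape characters that are regex metacharacters outside a class. */
    static String escape(char c) {
        return "\\.[]{}()*+?^$|".indexOf(c) >= 0 ? "\\" + c : String.valueOf(c);
    }

    /** Step 4 (simplified): recursive expansion, valid only for acyclic automata. */
    static String toRegex(State s, Map<State, String> memo) {
        String cached = memo.get(s);
        if (cached != null) {
            return cached;
        }
        List<String> alts = new ArrayList<>();
        for (Map.Entry<Character, State> e : s.next.entrySet()) {
            alts.add(escape(e.getKey()) + toRegex(e.getValue(), memo));
        }
        String body;
        if (alts.isEmpty()) {
            body = "";
        } else if (alts.size() == 1 && !s.accepting) {
            body = alts.get(0);
        } else {
            body = "(?:" + String.join("|", alts) + ")";
        }
        if (s.accepting && !alts.isEmpty()) {
            body += "?"; // the match may also stop at this state
        }
        memo.put(s, body);
        return body;
    }

    static String regexFor(Collection<String> words) {
        if (words.isEmpty()) {
            throw new IllegalArgumentException("need at least one string");
        }
        State dfa = minimize(buildTrie(words), new HashMap<>());
        return toRegex(dfa, new HashMap<>());
    }

    public static void main(String[] args) {
        System.out.println(regexFor(Arrays.asList("abc", "abd"))); // prints ab(?:c|d)
    }
}
```

Note that this only helps with a set of strings. For a single string such as the one in the question, the result is just the string itself, since finding repeats like `(bcd){2}` is a different problem from DFA minimization.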
