Is there a regular expression to detect a valid regular expression?

天命终不由人 2020-11-22 09:48

Is it possible to detect a valid regular expression with another regular expression? If so, please give example code below.

8 answers
  •  借酒劲吻你
    2020-11-22 10:18

    Though it is perfectly possible to use a recursive regex, as MizardX has posted, a parser is much more useful for this kind of thing. Regexes were originally intended for regular languages; recursion and balancing groups are patches added on top.
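If the goal is only to check syntax in Java, one option is to let the regex engine's own parser do the work. The sketch below assumes the `java.util.regex` dialect is the one you care about; the class and method names (`RegexSyntaxCheck`, `isValidRegex`) are made up for illustration.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper: validates a pattern by asking java.util.regex to compile it.
final class RegexSyntaxCheck {
    private RegexSyntaxCheck() {}

    /** Returns true if java.util.regex accepts the syntax of the candidate pattern. */
    static boolean isValidRegex(String candidate) {
        try {
            Pattern.compile(candidate);
            return true;
        } catch (PatternSyntaxException e) {
            // The exception carries a description and an index if a message is needed.
            return false;
        }
    }
}
```

For example, `RegexSyntaxCheck.isValidRegex("(a|b)*c")` is accepted, while an unclosed group such as `"(a|b"` is rejected. This only answers the question for Java's dialect; other engines accept different syntax.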

    The language of valid regexes is context-free rather than regular, so it should be handled with a parser built from a context-free grammar. Here is an example from a university project that parses simple regexes (without most constructs). It uses JavaCC. And yes, the comments are in Spanish, though the method names are pretty self-explanatory.

    SKIP :
    {
        " "
    |   "\r"
    |   "\t"
    |   "\n"
    }
    TOKEN : 
    {
        < DIGITO: ["0" - "9"] >
    |   < MAYUSCULA: ["A" - "Z"] >
    |   < MINUSCULA: ["a" - "z"] >
    |   < LAMBDA: "LAMBDA" >
    |   < VACIO: "VACIO" >
    }
    
    IRegularExpression Expression() :
    {
        IRegularExpression r; 
    }
    {
        r=Alternation() { return r; }
    }
    
    // Matchea disyunciones: ER | ER  (matches alternations)
    IRegularExpression Alternation() :
    {
        IRegularExpression r1 = null, r2 = null; 
    }
    {
        r1=Concatenation() ( "|" r2=Alternation() )?
        { 
            if (r2 == null) {
                return r1;
            } else {
                return createAlternation(r1,r2);
            } 
        }
    }
    
    // Matchea concatenaciones: ER.ER  (matches concatenations)
    IRegularExpression Concatenation() :
    {
        IRegularExpression r1 = null, r2 = null; 
    }
    {
        r1=Repetition() ( "." r2=Repetition() { r1 = createConcatenation(r1,r2); } )*
        { return r1; }
    }
    
    // Matchea repeticiones: ER*  (matches repetitions)
    IRegularExpression Repetition() :
    {
        IRegularExpression r; 
    }
    {
        r=Atom() ( "*" { r = createRepetition(r); } )*
        { return r; }
    }
    
    // Matchea regex atomicas: (ER), Terminal, Vacio, Lambda
    IRegularExpression Atom() :
    {
        String t;
        IRegularExpression r;
    }
    {
        ( "(" r=Expression() ")" {return r;}) 
        | t=Terminal() { return createTerminal(t); }
        | <LAMBDA> { return createLambda(); }
        | <VACIO> { return createEmpty(); }
    }
    
    // Matchea un terminal (digito o minuscula) y devuelve su valor
    String Terminal() :
    {
        Token t;
    }
    {
        ( t=<DIGITO> | t=<MINUSCULA> ) { return t.image; }
    }
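
For readers without JavaCC at hand, the same grammar can be sketched as a hand-written recursive-descent recognizer in plain Java. This is only an illustration, not the project's code: it answers yes or no instead of building `IRegularExpression` objects, and it assumes that `Atom` accepts the `LAMBDA` and `VACIO` keywords and that `Terminal` accepts a digit or a lowercase letter, as the grammar's comments suggest. The class name `SimpleRegexRecognizer` is hypothetical.

```java
// Hypothetical recognizer mirroring the grammar rules above, one method per rule.
final class SimpleRegexRecognizer {
    private final String input;
    private int pos = 0;

    private SimpleRegexRecognizer(String input) { this.input = input; }

    /** Returns true if the whole text is a valid expression of the toy grammar. */
    static boolean isValid(String text) {
        SimpleRegexRecognizer p = new SimpleRegexRecognizer(text);
        return p.alternation() && p.atEnd();
    }

    // Alternation := Concatenation ( "|" Alternation )?
    private boolean alternation() {
        if (!concatenation()) return false;
        if (accept('|')) return alternation();
        return true;
    }

    // Concatenation := Repetition ( "." Repetition )*
    private boolean concatenation() {
        if (!repetition()) return false;
        while (accept('.')) {
            if (!repetition()) return false;
        }
        return true;
    }

    // Repetition := Atom ( "*" )*
    private boolean repetition() {
        if (!atom()) return false;
        while (accept('*')) {
            // consume every trailing star
        }
        return true;
    }

    // Atom := "(" Expression ")" | Terminal | "LAMBDA" | "VACIO"
    private boolean atom() {
        if (accept('(')) {
            return alternation() && accept(')');
        }
        skipWhitespace();
        if (input.startsWith("LAMBDA", pos)) { pos += 6; return true; }
        if (input.startsWith("VACIO", pos)) { pos += 5; return true; }
        if (pos < input.length()) {
            char c = input.charAt(pos);
            // Terminal := digit | lowercase letter
            if ((c >= '0' && c <= '9') || (c >= 'a' && c <= 'z')) { pos++; return true; }
        }
        return false;
    }

    // Consumes the expected character (after skipped whitespace) if it is next.
    private boolean accept(char expected) {
        skipWhitespace();
        if (pos < input.length() && input.charAt(pos) == expected) { pos++; return true; }
        return false;
    }

    // Same characters as the SKIP section: space, \r, \t, \n.
    private void skipWhitespace() {
        while (pos < input.length() && " \r\t\n".indexOf(input.charAt(pos)) >= 0) pos++;
    }

    private boolean atEnd() {
        skipWhitespace();
        return pos == input.length();
    }
}
```

With this sketch, `SimpleRegexRecognizer.isValid("(a|b)*.c")` is accepted, while `"ab"` is rejected because this grammar requires an explicit `.` for concatenation, and `"(a"` is rejected because the parenthesis is never closed. The nesting of parentheses is exactly what the recursive call in `atom()` handles and what a plain (non-recursive) regex cannot.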
    
