Question
As I described in "Antlr greedy-option", I am having problems with a language that can contain string literals inside a string literal, such as:
START: "img src="test.jpg""
Mr. Bart Kiers pointed out in that thread that it is not possible to write a grammar that solves this problem. I therefore decided to rewrite the input to:
START: "img src='test.jpg'"
before starting the lexer (and parser).
The file input could be:
START: "aaa"aaa" "aaa"aaaaa"
:END_START
START: "aaa"aaa" "aaa"aa a aa"
:END_START
START: "aaab"bbaaaa"
:END_START
I have a solution, but it does not work correctly. I have two questions about it (listed below the code). My code is:
public static void main(String[] args) {
    try{
        FileInputStream fis = new FileInputStream("src/file.txt");
        String preparedCode = preparingCode(fis);
        ANTLRStringStream in = new ANTLRStringStream(preparedCode);
        TestLexer lex = new TestLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lex);
        TestParser parser = new TestParser(tokens);
        parser.rule();
    }catch(IOException ex){
        ex.printStackTrace();
    } catch (RecognitionException e) {
        System.out.println(e.getMessage());
        System.exit(0);
    }
}
static String preparingCode(FileInputStream input){
    // BufferedReader (java.io) replaces the deprecated DataInputStream.readLine()
    BufferedReader data = new BufferedReader(new InputStreamReader(input));
    StringBuilder oldCode = new StringBuilder();
    StringBuffer newCode = new StringBuffer();
    // greedy (.+) without DOTALL: this is the pattern the question is about
    Pattern pattern = Pattern.compile("(START:\\s\")(.+)(\"\\n:END_START)");
    String strLine;
    try{
      while ((strLine = data.readLine()) != null)
          oldCode.append(strLine).append("\n");
    }
    catch(IOException ex){
      ex.printStackTrace();
    }
    Matcher matcher = pattern.matcher(oldCode);
    while (matcher.find()) {
      //eliminate quotes inside a string literal
      String stringLiteral = matcher.group(2).replaceAll("\"", "'");
      String replace = matcher.group(1) + stringLiteral + matcher.group(3);
      matcher.appendReplacement(newCode, Matcher.quoteReplacement(replace));
    }
    matcher.appendTail(newCode);
    System.out.println(newCode);
    return newCode.toString();
}
My questions are:
- Which pattern would be the correct one? It is important that the string literal can span more than one line, e.g. "aaaa"\n"bbb", but it always ends with a "\n:END_START" line. The result I want is:
START: "aaa'aaa' 'aaa'aaaaa"
:END_START
START: "aaa'aaa' 'aaa'aa a aa"
:END_START
START: "aaab'bbaaaa"
:END_START
I also experimented with the flag Pattern.DOTALL:
Pattern pattern = Pattern.compile("(START:\\s\")(.+)(\"\\n:END_START)", Pattern.DOTALL);
- Once the pattern is correct, is there another, more efficient way to do this replacement?
Answer 1:
Fix for the first question
I have to use a non-greedy (reluctant) quantifier together with the flag Pattern.DOTALL:
Pattern pattern = Pattern.compile("(START:\\s\")(.+?)(\"\\n:END_START)", Pattern.DOTALL);
The code:
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;

public static void main(String[] args) {
    try{
        FileInputStream fis = new FileInputStream("src/file.txt");
        String preparedCode = preparingCode(fis);
        ANTLRStringStream in = new ANTLRStringStream(preparedCode);
        TestLexer lex = new TestLexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lex);
        TestParser parser = new TestParser(tokens);
        parser.rule();
    }catch(IOException ex){
        ex.printStackTrace();
    } catch (RecognitionException e) {
        System.out.println(e.getMessage());
        System.exit(0);
    }
}
static String preparingCode(FileInputStream input){
    // BufferedReader replaces the deprecated DataInputStream.readLine()
    BufferedReader data = new BufferedReader(new InputStreamReader(input));
    StringBuilder oldCode = new StringBuilder();
    // Matcher.appendReplacement requires a StringBuffer before Java 9
    StringBuffer newCode = new StringBuffer();
    // reluctant (.+?) stops at the first "\n:END_START"; DOTALL lets . match newlines
    Pattern pattern = Pattern.compile("(START:\\s\")(.+?)(\"\\n:END_START)", Pattern.DOTALL);
    String strLine;
    try{
      while ((strLine = data.readLine()) != null)
          oldCode.append(strLine).append("\n");
    }
    catch(IOException ex){
      ex.printStackTrace();
    }
    Matcher matcher = pattern.matcher(oldCode);
    while (matcher.find()) {
      // debug output: the body of each matched string literal
      System.out.println("++++"+matcher.group(2));
      //eliminate quotes inside a string literal
      String stringLiteral = matcher.group(2).replace('"', '\'');
      String replace = matcher.group(1) + stringLiteral + matcher.group(3);
      // quoteReplacement keeps '$' and '\' in the literal from being interpreted
      matcher.appendReplacement(newCode, Matcher.quoteReplacement(replace));
    }
    matcher.appendTail(newCode);
    System.out.println(newCode);
    return newCode.toString();
}
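To see why the reluctant quantifier matters once Pattern.DOTALL is set, here is a small stand-alone sketch. The class name QuantifierDemo, the helper countMatches and the two-block sample input are made up for illustration; only the pattern itself comes from the code above. It counts how many START ... :END_START blocks each variant finds.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuantifierDemo {
    // Hypothetical sample: two blocks, each closed by ":END_START" on its own line.
    static final String INPUT =
        "START: \"aaa\"aaa\"\n:END_START\n" +
        "START: \"aaab\"bbaaaa\"\n:END_START\n";

    // Counts the blocks found when bodyRegex is used as the middle group.
    static int countMatches(String bodyRegex) {
        Pattern p = Pattern.compile(
            "(START:\\s\")" + bodyRegex + "(\"\\n:END_START)", Pattern.DOTALL);
        Matcher m = p.matcher(INPUT);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // With DOTALL the greedy group runs on to the last ":END_START" in the input.
        System.out.println("greedy (.+):     " + countMatches("(.+)"));
        // The reluctant group stops at the first closing line after each START.
        System.out.println("reluctant (.+?): " + countMatches("(.+?)"));
    }
}
```

With the greedy group the two blocks are swallowed as one match, so every quote between the first START and the last :END_START would be rewritten, including the quotes that delimit the second literal.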
So is there another way to solve this problem?
Source: https://stackoverflow.com/questions/10013170/efficiently-replacing-a-string-or-character-from-file-input-for-the-antlrinputst
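One possible alternative, assuming Java 9 or later is available, is Matcher.replaceAll with a replacement function, which removes the manual appendReplacement/appendTail loop and the StringBuffer. This is only a sketch: the class name QuoteNormalizer and the method normalize are hypothetical, and the pattern is the reluctant one from above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteNormalizer {
    // Reluctant body group; DOTALL so a literal may span several lines.
    private static final Pattern BLOCK =
        Pattern.compile("(START:\\s\")(.+?)(\"\\n:END_START)", Pattern.DOTALL);

    // Replaces every double quote inside a START literal with a single quote.
    static String normalize(String code) {
        return BLOCK.matcher(code).replaceAll(m ->
            // quoteReplacement: the function's result is still scanned for '$' and '\'
            Matcher.quoteReplacement(
                m.group(1) + m.group(2).replace('"', '\'') + m.group(3)));
    }

    public static void main(String[] args) {
        String input = "START: \"img src=\"test.jpg\"\"\n:END_START\n";
        System.out.print(normalize(input));
    }
}
```

The returned string can be passed to ANTLRStringStream exactly as preparedCode is in the code above.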