Why are multi-line comments in flex/bison so evasive?

假如想象 submitted on 2019-12-04 23:33:57

I think you need to declare ML_COMMENT as an exclusive start condition, so that only the ML_COMMENT rules are active while it is in effect: use %x ML_COMMENT instead of %s ML_COMMENT.

With an inclusive (%s) start condition, rules that have no start condition are active as well.
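
A minimal sketch of what the exclusive declaration could look like in the scanner file, assuming the comment text is simply discarded. The rule bodies and the error message are illustrative and not taken from the original question:

```lex
%x ML_COMMENT

%%

"/*"                  { BEGIN(ML_COMMENT);  /* enter the comment state */ }
<ML_COMMENT>"*/"      { BEGIN(INITIAL);     /* back to normal scanning */ }
<ML_COMMENT>.|\n      { /* discard comment text */ }
<ML_COMMENT><<EOF>>   { fprintf(stderr, "unterminated comment\n"); return 0; }
```

Because the state is exclusive, none of the ordinary token rules can fire between the opening and closing delimiters, so they do not need start conditions of their own.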

Parsing comments this way can lead to errors because:

  • you need to add conditions to all of your lex rules
  • it becomes even more complex if you also want to handle // comments
  • you still have the risk that lex/flex merges two comments into one match, including everything in between
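
The merging risk in the last point typically comes from a greedy pattern: a single regular expression for a whole comment matches as much as it can, so it stops at the last closing delimiter in the input rather than the first. A small C sketch of the difference, using two hypothetical helper functions that are not part of flex or of the original code:

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: offset just past the LAST closing delimiter in src.
// This is where a greedy, longest-match comment pattern would stop.
// Returns 0 if src contains no closing delimiter.
size_t greedy_end(const char *src)
{
    size_t end = 0;
    const char *p = src;
    while ((p = strstr(p, "*/")) != NULL) {
        p += 2;                      // step past this delimiter
        end = (size_t)(p - src);     // remember the latest one seen
    }
    return end;
}

// Hypothetical helper: offset just past the FIRST closing delimiter found
// at or after position `from`, which is where the comment really ends.
// Returns 0 if none is found.
size_t first_end(const char *src, size_t from)
{
    const char *p = strstr(src + from, "*/");
    return p ? (size_t)(p - src) + 2 : 0;
}
```

For the input `/* a */ x = 1; /* b */`, first_end(src, 2) gives 7, the end of the first comment, while greedy_end(src) gives 22, so the statement between the two comments would be swallowed.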

In my parser, I handle comments differently. First, define lex rules that match only the start of a comment:

\/\*     {
         if (!SkipComment())
            return(-1);
         }

\/\/     {
         if (!SkipLine())
            return(-1);
         }

Then write the SkipComment and SkipLine functions. They must consume all input up to the end of the comment (this is rather old code, so forgive the somewhat archaic constructions):

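bool SkipComment (void)
{
int Key;

Key=!EOF;   /* any value other than EOF, so the loop starts */
while (true)
   {
   if (Key==EOF)
      {
      /* yyerror("Unexpected EOF within comment."); */
      break;
      }
   switch ((char)Key)
      {
      case '*' :
         Key=input();
         if ((char)Key=='/') return true;   /* end of comment found */
         else                continue;      /* re-check Key, it may be another '*' */
      case '\n' :
         ++LineNr;
         break;
      }
   Key=input();
   }

return false;   /* EOF reached before the comment was closed */
}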

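bool SkipLine (void)
{
int Key;

Key=!EOF;   /* any value other than EOF, so the loop starts */
while (true)
   {
   if (Key==EOF)
      return true;      /* a line comment may end at EOF */
   switch ((char)Key)
      {
      case '\n' :
         unput('\n');   /* push the newline back so the normal rules see it */
         return true;
      }
   Key=input();
   }

return false;   /* not reached */
}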
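
To check the scanning logic outside a generated scanner, the same loop can be run over an in-memory string. In this sketch the Reader struct, next_char, and skip_comment are hypothetical stand-ins for flex's input() and the SkipComment function above; it assumes the opening delimiter has already been consumed, as the lex rule would do.

```c
#include <stdbool.h>
#include <stdio.h>   // EOF, size_t

// Hypothetical stand-in for the scanner's input stream.
typedef struct {
    const char *text;  // NUL-terminated source text
    size_t pos;        // index of the next unread character
    int line;          // current line number, like LineNr above
} Reader;

// Returns the next character, or EOF at the end of the string.
static int next_char(Reader *r)
{
    if (r->text[r->pos] == '\0')
        return EOF;
    return (unsigned char)r->text[r->pos++];
}

// Consumes input up to and including the closing delimiter of a block
// comment. Returns false if the input ends first.
bool skip_comment(Reader *r)
{
    int c = next_char(r);
    while (c != EOF) {
        if (c == '*') {
            c = next_char(r);
            if (c == '/')
                return true;
            continue;          // re-examine c: it may be another '*'
        }
        if (c == '\n')
            ++r->line;
        c = next_char(r);
    }
    return false;
}
```

The `continue` after reading the character that follows a '*' matters: without it, a comment ending in two stars followed by a slash would be missed, because the second star would be skipped before it could be tested.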

Besides the problem with %x vs %s, there is a second problem: inside a character class, the . in [.\n] matches only a literal . and not "any character other than newline" as a bare . does. You want a rule like this instead:

<ML_COMMENT>.|"\n"     { /* do nothing */ }

I found this description of the C language grammar (actually just the lexer) very useful. I think it is mostly the same as Patrick's answer, but slightly different.

http://www.lysator.liu.se/c/ANSI-C-grammar-l.html
