lex | 易学教程

lex & yacc get current position

Question: In lex & yacc there is a macro called YY_INPUT which can be redefined, for example like this:

    #define YY_INPUT(buf,result,maxlen) do { \
        const int n = gzread(gz_yyin, buf, maxlen); \
        if (n < 0) { \
            int errNumber = 0; \
            reportError( gzerror(gz_yyin, &errNumber)); } \
        \
        result = n > 0 ? n : YY_NULL; \
    } while (0)

One of my grammar rules invokes the YYACCEPT macro. If I call gztell (or ftell) after YYACCEPT, I get the wrong offset, because the parser has already read ahead past the data it needed. So

Non-Greedy Regular Expression Matching in Flex

I have just started with Flex and can't seem to figure out how to match the following expression:

    "Dog".*"Cat"

    Input : Dog Ca Cat Cc Cat
    Output: Dog Ca Cat Cc Cat

But I want non-greedy matching, with the following output:

    Output: Dog Ca Cat

How can this be achieved in Flex?

EDIT: I tried the following:

    %%
    Dog.*Cat/.*Cat  printf("Matched : ||%s||", yytext);
    dog.*cat        printf("Matched : ||%s||", yytext);
    dOg[^c]*cAt     printf("Matched : ||%s||", yytext);
    DOG.*?CAT       printf("Matched : ||%s||", yytext);
    %%

    Input : Dog Ca Cat Cc Cat dog Ca cat Cc cat dOg Ca cAt Cc
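The result the asker wants (the leftmost "Dog" followed by the nearest "Cat") can be illustrated outside Flex with a few lines of plain C. This is only a sketch of the intended matching behaviour, not a Flex rule; `shortest_span` is a hypothetical helper name introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of flex): find the leftmost occurrence of
 * `open` in `text`, then the first occurrence of `close` after it.
 * Returns a pointer to the start of the span and stores its length in *len,
 * or returns NULL when no such span exists.
 */
static const char *shortest_span(const char *text, const char *open,
                                 const char *close, size_t *len)
{
    assert(text != NULL && open != NULL && close != NULL && len != NULL);

    const char *start = strstr(text, open);
    if (start == NULL)
        return NULL;

    /* Search for the closing word only after the opening word ends. */
    const char *end = strstr(start + strlen(open), close);
    if (end == NULL)
        return NULL;

    *len = (size_t)(end - start) + strlen(close);
    return start;
}
```

Calling `shortest_span("Dog Ca Cat Cc Cat", "Dog", "Cat", &n)` returns a pointer to the start of the input and sets `n` to 10, which covers `Dog Ca Cat`.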

Emulation of lex like functionality in Perl or Python

Question: Here's the deal. Is there a way to tokenize the strings in a line based on multiple regexes? One example: I have to get all href tags, their corresponding text, and some other text based on a different regex. So I have 3 expressions and would like to tokenize the line and extract the tokens of text matching each expression. I have actually done this using flex (not to be confused with Adobe Flex), which is an implementation of the good old lex. lex provides an elegant way to do this by executing

Flex / Lex Encoding Strings with Escaped Characters

I'll refer to this question for some of the background: Regular expression for a string literal in flex/lex

The problem I am having is handling input with escaped characters in my lexer. I think it may be an issue with the encoding of the string, but I'm not sure. Here is how I am handling string literals in my lexer:

    \"(\\.|[^\\"])*\" {
        char* text1 = strndup(yytext + 1, strlen(yytext) - 2);
        char* text2 = "text\n";
        printf("value = <%s> <%x>\n", text1, text1);
        printf("value = <%s> <%x>\n", text2, text2);
    }

This outputs the following:

    value = <text\n"> <15a1bb0>
    value = <text >
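The difference between the two printed values is that the C compiler translates `\n` inside `"text\n"` into a newline, while the text matched by the lexer still contains a backslash followed by `n`. A minimal sketch of decoding the matched text by hand follows. `unescape` is a hypothetical helper, and it handles only a small set of escapes by assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: decode backslash escapes in src[0..len) into a newly
 * allocated, NUL-terminated string. Only \n and \t are translated; any other
 * escaped character (such as \\ or \") is copied without the backslash.
 * Returns NULL if allocation fails. The caller frees the result.
 */
static char *unescape(const char *src, size_t len)
{
    assert(src != NULL);

    char *out = malloc(len + 1); /* decoded text is never longer than src */
    if (out == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; i < len; i++) {
        char c = src[i];
        if (c == '\\' && i + 1 < len) {
            c = src[++i]; /* look at the character after the backslash */
            switch (c) {
            case 'n': c = '\n'; break;
            case 't': c = '\t'; break;
            default: break; /* keep the escaped character itself */
            }
        }
        out[j++] = c;
    }
    out[j] = '\0';
    return out;
}
```

In a lexer action, such a helper would be called on the matched text without its surrounding quotes, for example `unescape(yytext + 1, yyleng - 2)`.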

Regular expression to recognize variable declarations in C

Question: I'm working on a regular expression to recognize variable declarations in C, and this is what I have so far:

    [a-zA-Z_][a-zA-Z0-9]*

Is there a better solution?

Answer 1: A pattern to recognize variable declarations in C. A conventional declaration looks like this:

    int variable;

If that's the case, the pattern should test for the type keyword first, to avoid matching something else, such as a string or a constant defined with the preprocessor:

    (?:\w+\s+)([a-zA-Z_][a-zA-Z0-9]+)

The variable name is captured in \1.

Generating a compiler from lex and yacc grammar

I'm trying to generate a compiler so I can pass it a .c file afterwards. I've downloaded both the YACC and LEX grammars from http://www.quut.com/c/ANSI-C-grammar-y.html and named them clexyacc.l and clexyacc.y. To generate it, I ran the following in a terminal:

    yacc -d clexyacc.y
    lex clexyacc.l

All went fine. When I move on to the last part, I get a few errors. The last part is:

    cc lex.yy.c y.tab.c -oclexyacc.exe

But I get these errors:

    y.tab.c:2261:16: warning: implicit declaration of function 'yylex' is invalid in C99 [-Wimplicit-function-declaration]
          yychar = YYLEX;
                   ^
    y.tab.c:1617:16: note: expanded from macro

Boost.Spirit: Lex + Qi error reporting

I am writing a parser for quite complicated config files that make use of indentation etc. I decided to use Lex to break the input into tokens, as it seems to make life easier. The problem is that I cannot find any examples of using the Qi error reporting tools ( on_error ) with parsers that operate on a stream of tokens instead of characters. Error handler to be used in on_error takes some to be able to indicate exactly where the error is in the input stream. All examples just construct a std::string from the pair of iterators and print it. But if Lex is used, those iterators are iterators to the

Lex - How to run / compile a lex program on commandline

I am very new to Lex and Yacc. I have a Lex program, for example wordcount.l. I am using Windows and PuTTY, and I am just trying to run this file. Does the wordcount.l file go on the C drive? Do I compile the Lex program so that it generates a .c program, and then what do I run? I tried this on the command line:

    Lex wordcount.l

but I just get "file not found".

wordcount.l

    %{
    #include <stdlib.h>
    #include <stdio.h>
    int charCount=0;
    int wordCount=0;
    int lineCount=0;
    %}
    %%
    \n          {charCount++; lineCount++;}
    [^ \t\n]+   {wordCount++; charCount+=yyleng;}
    .           {charCount++;}
    %%
    main(argc, argv)
    int argc;
    char** argv;
    {
        if