lexer

How to handle multi-line comments in a live syntax highlighter?

与世无争的帅哥 submitted on 2019-11-28 05:58:35
Question: I'm writing my own text editor with syntax highlighting in Java, and at the moment it simply parses and highlights the current line every time the user enters a single character. While presumably not the most efficient way, it's good enough and doesn't cause any noticeable performance issues. In pseudo-Java, this would be the core concept of my code: public void textUpdated(String wholeText, int updateOffset, int updateLength) { int lineStart = getFirstLineStart(wholeText, updateOffset); int
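The per-line approach described in the question can be sketched as follows. This is a minimal Python illustration, not the asker's Java code: `line_bounds` is a hypothetical helper that plays the role of `getFirstLineStart` plus a matching end-of-line lookup, and it assumes lines are separated by `\n`.

```python
from typing import Tuple


def line_bounds(text: str, offset: int, length: int) -> Tuple[int, int]:
    """Return (start, end) covering every line touched by an edit.

    `offset` and `length` describe the edited region, mirroring the
    updateOffset/updateLength parameters in the question.
    """
    # rfind returns -1 when no earlier newline exists, so start becomes 0.
    start = text.rfind("\n", 0, offset) + 1
    # The range ends at the first newline at or after the end of the edit.
    end = text.find("\n", offset + length)
    if end == -1:
        end = len(text)
    return start, end


text = "int a;\nint b; /* c\nint d;"
start, end = line_bounds(text, 10, 1)
print(text[start:end])
```

Only the returned range is re-highlighted, which is why a `/*` opened on one line does not affect the lines after it under this scheme: they fall outside the range and are never re-examined.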

C++ parser generator [closed]

 ̄綄美尐妖づ submitted on 2019-11-28 05:55:33
Question: I'm writing my own scripting language and I need a software tool that generates C++ code for parsing my language. I need a lexical analyzer and a parser generator that generate C++ code. It would be nice to be able to generate a Visual C++ 2010 project as well. Suggestions? Answer 1: Try Flex and Bison.

How do I get the error messages from ANTLR parsing?

二次信任 submitted on 2019-11-28 04:31:39
Question: I wrote a grammar with ANTLR 4.4 like this: grammar CSV; file : row+ EOF ; row : value (Comma value)* (LineBreak | EOF) ; value : SimpleValueA | QuotedValue ; Comma : ',' ; LineBreak : '\r'? '\n' | '\r' ; SimpleValue : ~(',' | '\r' | '\n' | '"')+ ; QuotedValue : '"' ('""' | ~'"')* '"' ; Then I used ANTLR 4.4 to generate the parser and lexer, and this process was successful. After generating the classes, I wrote some Java code that uses the grammar: import org.antlr.v4.runtime.ANTLRInputStream; import org.antlr.v4

How do I get an Antlr Parser rule to read from both default AND hidden channel

眉间皱痕 submitted on 2019-11-27 23:13:08
I send whitespace to the hidden channel in the usual way, but I have one rule where I would like to include any whitespace for later processing, and every example I have found requires some very strange manual coding. Is there no easy option to read from multiple channels, like the option to put the whitespace there from the beginning? For example, this is the WhiteSpace lexer rule: WS : ( ' ' | '\t' | '\r' | '\n' ) {$channel=HIDDEN;} ; And this is my rule where I would like to include whitespace: raw : '{'? (~('{'))*; Basically it's a catch-all rule to capture any content that does not match other

ANTLR4 visitor pattern on simple arithmetic example

痴心易碎 submitted on 2019-11-27 20:08:07
I am a complete ANTLR4 newbie, so please forgive my ignorance. I ran into this presentation where a very simple arithmetic expression grammar is defined. It looks like: grammar Expressions; start : expr ; expr : left=expr op=('*'|'/') right=expr #opExpr | left=expr op=('+'|'-') right=expr #opExpr | atom=INT #atomExpr ; INT : ('0'..'9')+ ; WS : [ \t\r\n]+ -> skip ; This is great because it generates a very simple binary tree that can be traversed using the visitor pattern as explained in the slides, e.g., here's the function that visits the expr: public Integer visitOpExpr(OpExprContext
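To illustrate the idea of walking such a binary tree with a visitor, here is a minimal Python sketch. It does not use ANTLR or its generated classes: `Atom`, `Op`, and `EvalVisitor` are hypothetical stand-ins for the `atomExpr` and `opExpr` alternatives, and the tree is built by hand rather than by a parser.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Atom:
    """Stand-in for the #atomExpr alternative: a single integer."""
    value: int


@dataclass
class Op:
    """Stand-in for the #opExpr alternative: left, operator, right."""
    left: "Expr"
    op: str
    right: "Expr"


Expr = Union[Atom, Op]


class EvalVisitor:
    """Evaluates an expression tree by dispatching on the node type."""

    def visit(self, node: Expr) -> int:
        # Dispatch to visit_atom or visit_op based on the class name.
        method = getattr(self, "visit_" + type(node).__name__.lower())
        return method(node)

    def visit_atom(self, node: Atom) -> int:
        return node.value

    def visit_op(self, node: Op) -> int:
        left = self.visit(node.left)
        right = self.visit(node.right)
        if node.op == "+":
            return left + right
        if node.op == "-":
            return left - right
        if node.op == "*":
            return left * right
        if node.op == "/":
            # Assumption: floor division, which differs from Java's
            # truncating division when an operand is negative.
            return left // right
        raise ValueError(f"unknown operator {node.op!r}")


# Hand-built tree for 2 + 3 * 4.
tree = Op(Atom(2), "+", Op(Atom(3), "*", Atom(4)))
print(EvalVisitor().visit(tree))
```

Each visit method handles one node type and recurses into its children, which is the same shape as the `visitOpExpr` method the question quotes.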

Unable to compile output of lex

青春壹個敷衍的年華 submitted on 2019-11-27 18:53:02
Question: When I attempt to compile the output of this trivial lex program: # lex.l integer printf("found keyword INT"); using: $ gcc lex.yy.c I get: Undefined symbols: "_yywrap", referenced from: _yylex in ccMsRtp7.o _input in ccMsRtp7.o "_main", referenced from: start in crt1.10.6.o ld: symbol(s) not found collect2: ld returned 1 exit status lex --version tells me I'm actually using 'flex 2.5.35', although ls -fla `which lex` shows that it isn't a symlink. Any ideas why the output won't compile? Answer 1: From the Flex

Is it a Lexer's Job to Parse Numbers and Strings?

做~自己de王妃 submitted on 2019-11-27 13:28:59
Is it a lexer's job to parse numbers and strings? This may or may not sound dumb, given the fact that I'm asking whether a lexer should parse input. However, I'm not sure whether that's in fact the lexer's job or the parser's job, because in order to lex properly, the lexer needs to parse the string/number in the first place, so it would seem like code would be duplicated if the parser does this. Is it indeed the lexer's job? Or should the lexer simply break up a string like 123.456 into the strings 123, ., 456 and let the parser figure out the rest? Doing this wouldn't be so
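One way to picture the first option (the lexer recognizes whole numbers and strings) is the following Python sketch. It is an illustration under assumptions, not a recommendation from the thread: the token names and the `tokenize` function are hypothetical, and the token set is deliberately tiny.

```python
import re

# Each alternative is a named group; the name becomes the token kind.
TOKEN_RE = re.compile(r'''
    (?P<NUMBER>\d+(?:\.\d+)?)       # 123 or 123.456 as a single token
  | (?P<STRING>"(?:[^"\\]|\\.)*")   # double-quoted string with escapes
  | (?P<IDENT>[A-Za-z_]\w*)
  | (?P<OP>[-+*/=().])
  | (?P<WS>\s+)
''', re.VERBOSE)


def tokenize(source):
    """Return a list of (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x = 123.456"))
```

Here `123.456` reaches the parser as one `NUMBER` token, so the parser never has to reassemble `123`, `.`, and `456`. The lexer only recognizes the shape of the literal; converting the text to a numeric value can still happen later.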

When parsing Javascript, what determines the meaning of a slash?

独自空忆成欢 submitted on 2019-11-27 07:36:02
JavaScript has a tricky grammar to parse. Forward slashes can mean a number of different things: division operator, regular expression literal, comment introducer, or line-comment introducer. The last two are easy to distinguish: if the slash is followed by a star, it starts a multiline comment. If the slash is followed by another slash, it is a line comment. But the rules for disambiguating division and regex literal are escaping me. I can't find it in the ECMAScript standard. There the lexical grammar is explicitly divided into two parts, InputElementDiv and InputElementRegExp, depending on
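The two easy cases described above can be sketched in a few lines of Python. `classify_slash` is a hypothetical helper; it assumes the scanner is positioned at a token boundary (not inside a string, comment, or regex literal) and it deliberately leaves the hard case, division versus regex, unresolved.

```python
def classify_slash(source, i):
    """Classify the '/' at index i using only the next character."""
    if source[i] != "/":
        raise ValueError(f"no slash at index {i}")
    # Slicing avoids an IndexError when the slash is the last character.
    following = source[i + 1:i + 2]
    if following == "*":
        return "block_comment"
    if following == "/":
        return "line_comment"
    # One character of lookahead cannot separate these two cases.
    return "division_or_regex"


print(classify_slash("a / b", 2))
print(classify_slash("/* note */", 0))
print(classify_slash("// note", 0))
```

The last branch is the point of the question: the character after the slash says nothing about whether `a / b` or `/b/.test(a)` is meant, so the decision has to come from surrounding context rather than from lookahead.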

ANTLR: What is the simplest way to implement a Python-like indentation-dependent grammar?

给你一囗甜甜゛ submitted on 2019-11-27 07:27:47
I am trying to implement a Python-like indentation-dependent grammar. Source example: ABC QWE CDE EFG EFG CDE ABC QWE ZXC As I see it, what I need is to implement two tokens, INDENT and DEDENT, so I could write something like: grammar mygrammar; text: (ID | block)+; block: INDENT (ID|block)+ DEDENT; INDENT: ????; DEDENT: ????; Is there any simple way to implement this using ANTLR? (I'd prefer, if it's possible, to use the standard ANTLR lexer.) Answer: I don't know what the easiest way to handle it is, but the following is a relatively easy way. Whenever you match a line break in your lexer, optionally match one or more
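The INDENT/DEDENT idea can be sketched independently of ANTLR with an indentation stack. The Python below is an illustration under assumptions, not the answer's ANTLR code: `indent_tokens` is a hypothetical function, indentation is assumed to use spaces only, and the sample input is invented because the original example lost its indentation.

```python
def indent_tokens(text):
    """Turn indented lines into ID, INDENT and DEDENT tokens."""
    tokens = []
    stack = [0]  # indentation widths of the currently open blocks
    for line in text.splitlines():
        stripped = line.lstrip(" ")
        if not stripped:
            continue  # blank lines do not change the block structure
        width = len(line) - len(stripped)
        if width > stack[-1]:
            # Deeper than the current block: open a new one.
            stack.append(width)
            tokens.append(("INDENT", ""))
        while width < stack[-1]:
            # Shallower: close blocks until the widths line up.
            stack.pop()
            tokens.append(("DEDENT", ""))
        if width != stack[-1]:
            raise IndentationError(f"inconsistent dedent: {line!r}")
        for word in stripped.split():
            tokens.append(("ID", word))
    # Close every block still open at end of input.
    while len(stack) > 1:
        stack.pop()
        tokens.append(("DEDENT", ""))
    return tokens


sample = "ABC\n  QWE\n    CDE\n  EFG\nZXC\n"
for kind, value in indent_tokens(sample):
    print(kind, value)
```

Because every INDENT is eventually matched by a DEDENT, the token stream fits a rule shaped like `block: INDENT (ID|block)+ DEDENT;` from the question.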

Lexer written in Javascript?

本小妞迷上赌 submitted on 2019-11-27 06:41:35
I have a project where a user needs to define a set of instructions for a UI that is written entirely in JavaScript. I need to be able to parse a string of instructions and then translate them into instructions. Are there any parsing libraries out there that are 100% JavaScript? Or a generator that will generate JavaScript? Thanks! Answer (Stobor): Something like http://jscc.phorward-software.com/, maybe? JS/CC is the first available parser development system for JavaScript and ECMAScript-derivates. It has been developed, both, with the intention of building a productive compiler