Question
I'm looking at writing a lexer using boost::spirit::lex, but all the examples I can find seem to assume that you've read the entire file into RAM first. I'd like to write a lexer that doesn't require the whole string to be in RAM. Is that possible, or do I need to use something else?
I tried using istream_iterator, but Boost gives me a compile error unless I use const char* as the iterator type.
For example, all the examples I can find basically do this:
lex_functor_type< lex::lexertl::lexer<> > lex_functor;
// assumes entire file is in memory
char const* first = str.c_str();
char const* last = &first[str.size()];
bool r = lex::tokenize(first, last, lex_functor,
boost::bind(lex_callback_functor(), _1, ... ));
Also, is it possible to determine line/column numbers from lex tokens somehow?
Thanks!
Answer 1:
Spirit Lex works with any iterator as long as it conforms to the requirements of standard forward iterators. That means you can feed the lexer (invoke lex::tokenize()) with any conforming iterator. For instance, if you want to use a std::istream, you could wrap it into a boost::spirit::istream_iterator:
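bool tokenize(std::istream& is, ...)
{
    // boost::spirit::istream_iterator needs whitespace skipping turned off
    is.unsetf(std::ios::skipws);
    // the token type must name the iterator type; lexer<> defaults to char const*
    typedef lex::lexertl::token<boost::spirit::istream_iterator> token_type;
    lex_functor_type< lex::lexertl::lexer<token_type> > lex_functor;
    boost::spirit::istream_iterator first(is);
    boost::spirit::istream_iterator last;
    return lex::tokenize(first, last, lex_functor,
        boost::bind(lex_callback_functor(), _1, ... ));
}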
and it would work.
For the second part of your question (the line/column number of the input): yes, it is possible to track the input position using the lexer. It's not trivial, though. You need to create your own token type that stores the line/column information and use it instead of the predefined token type. Many people have been asking for this, so I might just go ahead and create an example.
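To make the position-tracking idea concrete, here is a minimal sketch that uses only the standard library. It is not Spirit Lex code and not the example promised above: PositionedToken, PositionTracker, and tokenize_words are hypothetical names invented for illustration, and the "lexer" just splits on whitespace. It shows a token that carries its starting line/column, filled in while the stream is consumed one character at a time, with each token handed to a callback much like lex::tokenize() does.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical token record: the matched text plus the 1-based
// line/column at which it starts.
struct PositionedToken {
    std::string text;
    std::size_t line;
    std::size_t column;
};

// Hypothetical helper: keeps a 1-based line/column pair up to date
// as characters are consumed.
struct PositionTracker {
    std::size_t line = 1;
    std::size_t column = 1;

    void advance(char c) {
        if (c == '\n') { ++line; column = 1; }
        else           { ++column; }
    }
};

inline bool is_space(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

// Splits the stream into whitespace-separated words and passes each one
// to the callback. Characters are pulled from the stream buffer one at a
// time, so the input is never copied into a single in-memory string.
template <typename Callback>
void tokenize_words(std::istream& is, Callback callback) {
    PositionTracker pos;
    std::istreambuf_iterator<char> it(is), end;

    while (it != end) {
        char c = *it;
        if (is_space(c)) {          // skip separators, but still count them
            pos.advance(c);
            ++it;
            continue;
        }
        // Remember where this token starts before consuming it.
        PositionedToken tok{std::string(), pos.line, pos.column};
        while (it != end && !is_space(*it)) {
            c = *it;
            tok.text.push_back(c);
            pos.advance(c);
            ++it;
        }
        assert(!tok.text.empty());
        callback(tok);
    }
}
```

A caller would pass any std::istream (for example a std::ifstream) together with a function object taking a PositionedToken const&. In Spirit Lex the same line/column pair would live in the custom token type described above rather than in a separate struct.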
Source: https://stackoverflow.com/questions/4715829/how-to-use-boostspiritlex-to-lex-a-file-without-reading-the-whole-file-into