Scala Parser Combinators: Parsing in a stream

Posted by 落爺英雄遲暮 on 2019-12-05 21:14:51

There is no easy or built-in way to accomplish this with Scala's parser combinators, which provide a facility for implementing parsing expression grammars.

Operators such as ||| (longest match) are largely incompatible with a stream parsing model, because they require extensive backtracking. To accomplish what you are trying to do, you would need to reformulate your grammar so that no backtracking is ever required. This is generally much harder than it sounds.

As mentioned by others, your best bet would be to look into a preliminary phase where you chunk your input (e.g. by line) so that you can handle a portion of the stream at a time.

One easy way of doing it is to grab an Iterator from the Source object and then walk through the lines like so:

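import scala.io.Source

val source = Source.fromFile("myFile")
try {
  // getLines() is lazy: lines are read from the file as the loop consumes them
  for (line <- source.getLines()) {
    // Do magic with the line-value
  }
} finally {
  source.close() // Close the file, even if processing throws
}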

Of course, this only works if your parser can consume the input one line at a time.
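To make the line-by-line idea concrete, here is a minimal sketch. It assumes the scala-parser-combinators module is on the classpath and that every line can be parsed on its own. The grammar (lines of the form `name = 42`) and the names LineParser, entry and parseLine are made up for illustration; they are not from the original question.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical per-line grammar: each line looks like `name = 42`.
object LineParser extends RegexParsers {
  def key: Parser[String] = """[A-Za-z_]\w*""".r
  def number: Parser[Int] = """-?\d+""".r ^^ (_.toInt)
  def entry: Parser[(String, Int)] = (key <~ "=") ~ number ^^ { case k ~ v => (k, v) }

  // Parse a single line; Left carries the parser's error message.
  def parseLine(line: String): Either[String, (String, Int)] =
    parseAll(entry, line) match {
      case Success(result, _) => Right(result)
      case failure: NoSuccess => Left(failure.msg)
    }
}

object LineByLine {
  def main(args: Array[String]): Unit = {
    // Stand-in for source.getLines(); any Iterator[String] works.
    val lines = Iterator("width = 80", "height = 24", "oops")
    // map on an Iterator is lazy, so lines are parsed one at a time as they are consumed.
    lines.map(LineParser.parseLine).foreach(println)
  }
}
```

Because each call to parseAll only ever sees one line, any backtracking stays within that line, and the rest of the file is never held in memory by the parser.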

Source: https://groups.google.com/forum/#!topic/scala-user/LPzpXo3sUVE

You might try the StreamReader class from the scala.util.parsing.input package.

You would use it something like:

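import scala.io.Source.fromFile
import scala.util.parsing.input.StreamReader

// Inside your parser object, where `parser` is the top-level production
val f = StreamReader(fromFile("myfile", "UTF-8").reader())
parseAll(parser, f)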

As one poster above mentioned, longest match (|||), combined with regex parsers that call source.subSequence(0, source.length), means that even StreamReader doesn't help: asking for the length of the source forces the whole input to be read.

The best (kludgy) answer I have is to use getLines, as others have mentioned, and to chunk the input, as the accepted answer suggests. My particular input required me to chunk two lines at a time. You could build an iterator out of the chunks to make it slightly less ugly.
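One way to build such an iterator of chunks, sketched with the standard library only. The names ChunkedLines and chunks are hypothetical, and the sample lines stand in for source.getLines(). Each chunk is joined back into a single string so that it could be handed to parseAll.

```scala
object ChunkedLines {
  // Lazily groups lines into chunks of `size` lines, joined with '\n'.
  // The last chunk may be shorter if the input runs out.
  def chunks(lines: Iterator[String], size: Int): Iterator[String] =
    lines.grouped(size).map(_.mkString("\n"))

  def main(args: Array[String]): Unit = {
    // Stand-in for source.getLines()
    val lines = Iterator("header 1", "body 1", "header 2", "body 2", "header 3")
    // Each element is one two-line chunk, ready to be parsed on its own.
    chunks(lines, 2).foreach { chunk =>
      println(chunk)
      println("--")
    }
  }
}
```

Since grouped works on the underlying iterator, only one chunk is materialized at a time, which keeps memory use bounded for large files.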
