Use Scala parser combinator to parse CSV files

陌清茗 2020-11-30 21:09

I'm trying to write a CSV parser using Scala parser combinators. The grammar is based on RFC 4180. I came up with the following code. It almost works, but I cannot get it to

3 answers
  •  情歌与酒
    2020-11-30 21:11

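    What you missed is whitespace: by default, RegexParsers skips any whitespace (\s+) before every token, and that includes the CR and LF that separate records. I also threw in a couple of bonus improvements.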

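    import scala.util.parsing.combinator._
    import scala.language.postfixOps  // enables the postfix * used below on recent Scala 2 versions
    
    object CSV extends RegexParsers {
      // Skip only a single space or tab before each token. The default, \s+,
      // would also swallow the CR and LF that separate records.
      override protected val whiteSpace = """[ \t]""".r
    
      def COMMA   = ","
      def DQUOTE  = "\""
      def DQUOTE2 = "\"\"" ^^ { case _ => "\"" }  // "" inside a quoted field is one literal quote
      def CR      = "\r"
      def LF      = "\n"
      def CRLF    = "\r\n"
      def TXT     = "[^\",\r\n]".r                // any single character that needs no quoting
    
      // Records separated by CRLF, with an optional trailing CRLF.
      def file: Parser[List[List[String]]] = repsep(record, CRLF) <~ opt(CRLF)
      def record: Parser[List[String]] = rep1sep(field, COMMA)
      def field: Parser[String] = (escaped|nonescaped)
      // A quoted field may contain commas, bare CR or LF, and doubled quotes.
      def escaped: Parser[String] = (DQUOTE~>((TXT|COMMA|CR|LF|DQUOTE2)*)<~DQUOTE) ^^ { case ls => ls.mkString("")}
      def nonescaped: Parser[String] = (TXT*) ^^ { case ls => ls.mkString("") }
    
      // Returns the parsed records, or an empty list if the input does not parse.
      def parse(s: String) = parseAll(file, s) match {
        case Success(res, _) => res
        case _ => List[List[String]]()
      }
    }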
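
The `whiteSpace` override above still lets the parser drop a space or tab in front of each token, so spaces inside fields are not preserved exactly. If that matters, one possible variation (a sketch, not the answerer's code) is to switch off whitespace skipping entirely with `skipWhitespace`. The object name `CsvKeepSpaces` is hypothetical, and the sketch assumes the `scala-parser-combinators` module is on the classpath, since it ships separately from the standard library in Scala 2.11 and later.

```scala
import scala.util.parsing.combinator._

// Hypothetical variant for illustration; the name CsvKeepSpaces is an assumption.
object CsvKeepSpaces extends RegexParsers {
  // Assumption: disable implicit whitespace skipping altogether, so spaces and
  // tabs stay part of a field and "\r\n" is consumed only by CRLF.
  override def skipWhitespace: Boolean = false

  def COMMA: Parser[String]   = ","
  def DQUOTE: Parser[String]  = "\""
  def DQUOTE2: Parser[String] = "\"\"" ^^^ "\""  // doubled quote becomes one quote
  def CR: Parser[String]      = "\r"
  def LF: Parser[String]      = "\n"
  def CRLF: Parser[String]    = "\r\n"
  def TXT: Parser[String]     = "[^\",\r\n]".r

  def file: Parser[List[List[String]]] = repsep(record, CRLF) <~ opt(CRLF)
  def record: Parser[List[String]]     = rep1sep(field, COMMA)
  def field: Parser[String]            = escaped | nonescaped
  def escaped: Parser[String] =
    (DQUOTE ~> rep(TXT | COMMA | CR | LF | DQUOTE2) <~ DQUOTE) ^^ (_.mkString)
  def nonescaped: Parser[String] = rep(TXT) ^^ (_.mkString)

  // Same fallback as the answer: an empty list when the input does not parse.
  def parse(s: String): List[List[String]] = parseAll(file, s) match {
    case Success(res, _) => res
    case _               => Nil
  }
}
```

With this variant, `CsvKeepSpaces.parse("a, b c,d")` keeps the leading space of the second field and yields `List(List("a", " b c", "d"))`. Using `rep(...)` instead of the postfix `*` also avoids the need for `scala.language.postfixOps`.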
    
