How to parse CSV using boost::spirit

借酒劲吻你 2020-12-02 00:21

I have this csv line

std::string s = R"(1997,Ford,E350,"ac, abs, moon","some "rusty" parts",3000.00)";

I can parse it using

2 Answers
  •  死守一世寂寞
    2020-12-02 00:49

    For a background on parsing (optionally) quoted delimited fields, including different quoting characters (', "), see here:

    • Parse quoted strings with boost::spirit

    For a much more complete example, with support for partially quoted values and a

    splitInto(input, output, ' ');
    

    method that takes 'arbitrary' output containers and delimiter expressions, see here:

    • How to make my split work only on one real line and be capable to skip quoted parts of string?

    Addressing your exact question, assuming either quoted or unquoted fields (no partial quotes inside field values), using Spirit V2:

    Let's take the simplest 'abstract datatype' that could possibly work:

    using Column  = std::string;
    using Columns = std::vector<Column>;
    using CsvLine = Columns;
    using CsvFile = std::vector<CsvLine>;
    

    Given the semantics that a repeated double quote escapes a double quote (as I pointed out in the comment), you should be able to use something like:

    static const char colsep = ',';
    
    start  = -line % eol;
    line   = column % colsep;
    column = quoted | *~char_(colsep);
    quoted = '"' >> *("\"\"" | ~char_('"')) >> '"';
    

    The following complete test program prints

    [1997][Ford][E350][ac, abs, moon][rusty][3001.00]
    

    (Note the BOOST_SPIRIT_DEBUG define for easy debugging). See it Live on Coliru

    Full Demo

    //#define BOOST_SPIRIT_DEBUG
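    #include <boost/spirit/include/qi.hpp>
    #include <iostream>
    #include <string>
    #include <vector>
    
    namespace qi = boost::spirit::qi;
    
    using Column  = std::string;
    using Columns = std::vector<Column>;
    using CsvLine = Columns;
    using CsvFile = std::vector<CsvLine>;
    
    // It is the input iterator type; qi::blank_type skips spaces and tabs but not newlines
    template <typename It>
    struct CsvGrammar : qi::grammar<It, CsvFile(), qi::blank_type>
    {
        CsvGrammar() : CsvGrammar::base_type(start)
        {
            using namespace qi;
    
            static const char colsep = ',';
    
            start  = -line % eol;               // optional lines separated by newlines
            line   = column % colsep;
            column = quoted | *~char_(colsep);
            quoted = '"' >> *("\"\"" | ~char_('"')) >> '"';
    
            BOOST_SPIRIT_DEBUG_NODES((start)(line)(column)(quoted));
        }
      private:
        qi::rule<It, CsvFile(), qi::blank_type> start;
        qi::rule<It, CsvLine(), qi::blank_type> line;
        qi::rule<It, Column(),  qi::blank_type> column;
        qi::rule<It, std::string()> quoted;     // no skipper: whitespace inside quotes is kept
    };
    
    int main()
    {
        const std::string s = R"(1997,Ford,E350,"ac, abs, moon","""rusty""",3001.00)";
    
        auto f(begin(s)), l(end(s));
        CsvGrammar<std::string::const_iterator> p;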
    
        CsvFile parsed;
        bool ok = qi::phrase_parse(f,l,p,qi::blank,parsed);
    
        if (ok)
        {
            for(auto& line : parsed) {
                for(auto& col : line)
                    std::cout << '[' << col << ']';
                std::cout << std::endl;
            }
        } else
        {
            std::cout << "Parse failed\n";
        }
    
        if (f!=l)
            std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
    }
    
