How to parse csv using boost::spirit

借酒劲吻你 2020-12-02 00:21

I have this csv line

std::string s = R"(1997,Ford,E350,"ac, abs, moon","some "rusty" parts",3000.00)";

I can parse it using

2 Answers
  •  刺人心 (original poster)
    2020-12-02 01:13

    Sehe's post looks a fair bit cleaner than mine, but I was putting this together for a bit, so here it is anyways:

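    #include <boost/spirit/include/qi.hpp>
    #include <boost/tokenizer.hpp>

    #include <cassert>
    #include <iostream>
    #include <string>
    #include <vector>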
    
    namespace qi = boost::spirit::qi;
    
    int main() {
        const std::string s = R"(1997,Ford,E350,"ac, abs, moon",""rusty"",3000.00)";
    
        // Tokenizer
        typedef boost::tokenizer< boost::escaped_list_separator<char> , std::string::const_iterator, std::string> Tokenizer;
        boost::escaped_list_separator<char> seps('\\', ',', '\"');
        Tokenizer tok(s, seps);
        for (auto i : tok)
            std::cout << i << "\n";
        std::cout << "\n";
    
        // Boost Spirit Qi
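        qi::rule<std::string::const_iterator, std::string()> quoted_string = '"' >> *(qi::char_ - '"') >> '"';
        qi::rule<std::string::const_iterator, std::string()> valid_characters = qi::char_ - '"' - ',';
        qi::rule<std::string::const_iterator, std::string()> item = *(quoted_string | valid_characters );
        qi::rule<std::string::const_iterator, std::vector<std::string>()> csv_parser = item % ',';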
    
        std::string::const_iterator s_begin = s.begin();
        std::string::const_iterator s_end = s.end();
        std::vector<std::string> result;
    
        bool r = boost::spirit::qi::parse(s_begin, s_end, csv_parser, result);
        assert(r == true);
        assert(s_begin == s_end);
    
        for (auto i : result)
            std::cout << i << std::endl;
        std::cout << "\n";
    }   
    

    And this outputs:

    1997
    Ford
    E350
    ac, abs, moon
    rusty
    3000.00
    
    1997
    Ford
    E350
    ac, abs, moon
    rusty
    3000.00
    

    Worth noting: this isn't a full CSV parser. You'd also need to handle escape characters and whatever else your input format requires.
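
    To give an idea of what "escape characters" can involve, here is a minimal sketch in plain standard C++ (deliberately not Spirit, and not part of Boost). It assumes the common convention that a doubled quote inside a quoted field stands for one literal quote. The name split_csv_line is made up for illustration, and the function ignores things like embedded newlines and malformed input.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits one CSV record into its fields.
// Assumed convention: inside a quoted field, "" means one literal quote.
std::vector<std::string> split_csv_line(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;

    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (in_quotes) {
            if (c == '"' && i + 1 < line.size() && line[i + 1] == '"') {
                current += '"';     // doubled quote -> one literal quote
                ++i;                // skip the second quote of the pair
            } else if (c == '"') {
                in_quotes = false;  // closing quote
            } else {
                current += c;       // commas are ordinary characters here
            }
        } else if (c == '"') {
            in_quotes = true;       // opening quote
        } else if (c == ',') {
            fields.push_back(current);  // field separator
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);  // the last field has no trailing comma
    return fields;
}
```

    Under that assumption, the line from the question would have to be written with doubled inner quotes, i.e. "some ""rusty"" parts", and the fifth field then comes back as some "rusty" parts.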

    Also: If you're looking into the documentation, just so you know, in Qi, 'a' is equivalent to boost::spirit::qi::lit('a') and "abc" is equivalent to boost::spirit::qi::lit("abc").

    On double quotes: as Sehe notes in a comment above, it isn't clear what a "" in the input text is supposed to mean. If you want every "" that is not within a quoted string to be converted to a single ", then something like the following would work.

    qi::rule<std::string::const_iterator, char()> double_quote_char = "\"\"" >> qi::attr('"');
    qi::rule<std::string::const_iterator, std::string()> item = *(double_quote_char | quoted_string | valid_characters );
    
