C++ string parsing (python style)

白昼怎懂夜的黑 提交于 2019-11-28 05:27:01
klew

I`d do something like this:

ifstream f("data.txt");
string str;
while (getline(f, str)) {
    Point p;
    sscanf(str.c_str(), "%f, %f, %f\n", &p.x, &p.y, &p.z); 
    points.push_back(p);
}

x,y,z must be floats.

And include:

#include <iostream>
#include <fstream>

The C++ String Toolkit Library (StrTk) has the following solution to your problem:

#include <string>
#include <deque>
#include "strtk.hpp"

struct point { double x,y,z; }

int main()
{
   std::deque<point> points;
   point p;
   strtk::for_each_line("data.txt",
                        [&points,&p](const std::string& str)
                        {
                           strtk::parse(str,",",p.x,p.y,p.z);
                           points.push_back(p);
                        });
   return 0;
}

More examples can be found Here

All these good examples aside, in C++ you would normally override the operator >> for your point type to achieve something like this:

point p;
while (file >> p)
    points.push_back(p);

or even:

copy(
    istream_iterator<point>(file),
    istream_iterator<point>(),
    back_inserter(points)
);

The relevant implementation of the operator could look very much like the code by j_random_hacker.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>     // For replace()

using namespace std;

struct Point {
    double a, b, c;
};

int main(int argc, char **argv) {
    vector<Point> points;

    ifstream f("data.txt");

    string str;
    while (getline(f, str)) {
        replace(str.begin(), str.end(), ',', ' ');
        istringstream iss(str);
        Point p;
        iss >> p.a >> p.b >> p.c;
        points.push_back(p);
    }

    // Do something with points...

    return 0;
}

This answer is based on the previous answer by j_random_hacker and makes use of Boost Spirit.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <boost/spirit.hpp>

using namespace std;
using namespace boost;
using namespace boost::spirit;

struct Point {
    double a, b, c;
};

int main(int argc, char **argv) 
{
    vector<Point> points;

    ifstream f("data.txt");

    string str;
    Point p;
    rule<> point_p = 
           double_p[assign_a(p.a)] >> ',' 
        >> double_p[assign_a(p.b)] >> ',' 
        >> double_p[assign_a(p.c)] ; 

    while (getline(f, str)) 
    {
        parse( str, point_p, space_p );
        points.push_back(p);
    }

    // Do something with points...

    return 0;
}

Fun with Boost.Tuples:

#include <boost/tuple/tuple_io.hpp>
#include <vector>
#include <fstream>
#include <iostream>
#include <algorithm>

int main() {
    using namespace boost::tuples;
    typedef boost::tuple<float,float,float> PointT;

    std::ifstream f("input.txt");
    f >> set_open(' ') >> set_close(' ') >> set_delimiter(',');

    std::vector<PointT> v;

    std::copy(std::istream_iterator<PointT>(f), std::istream_iterator<PointT>(),
             std::back_inserter(v)
    );

    std::copy(v.begin(), v.end(), 
              std::ostream_iterator<PointT>(std::cout)
    );
    return 0;
}

Note that this is not strictly equivalent to the Python code in your question because the tuples don't have to be on separate lines. For example, this:

1,2,3 4,5,6

will give the same output than:

1,2,3
4,5,6

It's up to you to decide if that's a bug or a feature :)

You could read the file from a std::iostream line by line, put each line into a std::string and then use boost::tokenizer to split it. It won't be quite as elegant/short as the python one but a lot easier than reading things in a character at a time...

Its nowhere near as terse, and of course I didn't compile this.

float atof_s( std::string & s ) { return atoi( s.c_str() ); }
{ 
ifstream f("data.txt")
string str;
vector<vector<float>> data;
while( getline( f, str ) ) {
  vector<float> v;
  boost::algorithm::split_iterator<string::iterator> e;
  std::transform( 
     boost::algorithm::make_split_iterator( str, token_finder( is_any_of( "," ) ) ),
     e, v.begin(), atof_s );
  v.resize(3); // only grab the first 3
  data.push_back(v);
}

One of Sony Picture Imagework's open-source projects is Pystring, which should make for a mostly direct translation of the string-splitting parts:

Pystring is a collection of C++ functions which match the interface and behavior of python’s string class methods using std::string. Implemented in C++, it does not require or make use of a python interpreter. It provides convenience and familiarity for common string operations not included in the standard C++ library

There are a few examples, and some documentation

all these are good examples. yet they dont answer the following:

  1. a CSV file with different column numbers (some rows with more columns than others)
  2. or when some of the values have white space (ya yb,x1 x2,,x2,)

so for those who are still looking, this class: http://www.codeguru.com/cpp/tic/tic0226.shtml is pretty cool... some changes might be needed

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!