C++, Boost: what is the fastest way to parse a string like tcp://adr:port/ into an address string and an int for the port?

南旧 2020-12-22 01:31

We have a std::string A containing tcp://adr:port/. How can it be parsed into an address std::string and an int for the port?

5 answers
  • 2020-12-22 02:20
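    #include <boost/regex.hpp>
    #include <string>

    // Splits "tcp://address:port/" into address and service (the port, kept as a string).
    // The outputs are left untouched if ip does not match the pattern.
    void extract(std::string const& ip, std::string& address, std::string& service)
    {
       // static: compile the pattern once instead of on every call
       static boost::regex const e("tcp://(.+):(\\d+)/");
       boost::smatch what;
       if(boost::regex_match(ip, what, e))
       {
         // what[0] is the whole match, so the captures start at index 1
         address = what[1].str();
         service = what[2].str();
       }
    }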
    

    EDIT: the reason service is a string here is that you'll need it as a string for the resolver! ;)
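    Since the question asks for the port as an int, the captured service string still has to be converted. A minimal sketch, assuming C++17 and a hypothetical helper named to_port that is not part of the answer above:

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical helper (an assumption, not from the answer above): converts
// the captured service string to a port, rejecting anything that is not a
// plain number in the range 0..65535.
std::optional<int> to_port(std::string const& service)
{
    int value = 0;
    char const* first = service.data();
    char const* last = first + service.size();
    auto [ptr, ec] = std::from_chars(first, last, value);
    // Fail on parse errors, trailing characters, or out-of-range values.
    if (ec != std::errc() || ptr != last || value < 0 || value > 65535)
        return std::nullopt;
    return value;
}
```

    The string form can still be handed to a resolver, while the optional makes a bad port explicit instead of silently yielding 0.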

  • 2020-12-22 02:21

    Although some wouldn't consider it particularly kosher C++, probably the easiest way would be to use sscanf:

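    char addr[256]; int port;  // addr must be a char buffer, not a std::string
    sscanf(A.c_str(), "tcp://%255[^:]:%d", addr, &port);  // width keeps %[ within addr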
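
    sscanf returns the number of fields it converted, so malformed input can be rejected by checking for 2. A minimal sketch, assuming a hypothetical wrapper named parse_with_sscanf and an arbitrary 255-character limit on the address:

```cpp
#include <cstdio>
#include <string>

// Hypothetical wrapper; the name and the 255-character address limit are
// assumptions made for this sketch.
bool parse_with_sscanf(std::string const& url, std::string& address, int& port)
{
    char addr[256];
    int p = 0;
    // %255[^:] reads up to 255 non-colon characters, leaving room for '\0'.
    // Both conversions must succeed for the input to count as valid.
    if (std::sscanf(url.c_str(), "tcp://%255[^:]:%d", addr, &p) != 2)
        return false;
    address = addr;
    port = p;
    return true;
}
```

    The outputs are only written on success, so a caller can keep defaults in place when parsing fails.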
    

    Another possibility would be to put the string into a stringstream, imbue the stream with a locale whose ctype facet treats most alphabetic characters and punctuation as whitespace, and just read the address and port like:

    std::istringstream buffer(A);
    // imbue takes a std::locale; the locale takes ownership of the facet
    buffer.imbue(std::locale(std::locale(), new digits_only));
    buffer >> addr >> port;
    

    The facet would look something like this:

    struct digits_only: std::ctype<char> 
    {
        digits_only(): std::ctype<char>(get_table()) {}
    
        static std::ctype_base::mask const* get_table()
        {
            // everything is white-space:
            static std::vector<std::ctype_base::mask> 
                rc(std::ctype<char>::table_size,std::ctype_base::space);
    
            // except digits, which are digits (the range is half-open, so end one past '9')
            std::fill(&rc['0'], &rc['9'] + 1, std::ctype_base::digit);
    
            // and '.', which we'll call punctuation:
            rc['.'] = std::ctype_base::punct;
            return &rc[0];
        }
    };
    

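    operator>> treats whitespace as separators between "fields", so this will treat something like 192.168.1.1:25 as two strings: "192.168.1.1" and "25". Because letters count as whitespace too, this only works for numeric addresses, not for host names.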

  • 2020-12-22 02:21

    Nowadays one may also encounter IPv6 addresses, whose host part already contains a variable number of colons and possibly dots. URLs should then be split according to RFC 3986, which encloses IPv6 literals in square brackets. See the Wikipedia article on IPv6.
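    To illustrate the bracket rule, here is a minimal sketch that splits the string by hand with std::string::find, without regex or streams. It assumes C++17; the helper name split_endpoint is hypothetical, and it only covers the tcp://host:port/ shape from the question, not the full RFC 3986 grammar:

```cpp
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: splits "tcp://host:port/" into host and port.
// Accepts bracketed IPv6 literals such as "tcp://[::1]:8080/".
// This is a sketch for one URL shape, not a complete URI parser.
bool split_endpoint(std::string const& url, std::string& host, int& port)
{
    std::string const scheme = "tcp://";
    if (url.compare(0, scheme.size(), scheme) != 0)
        return false;

    std::size_t host_begin = scheme.size();
    std::size_t host_end = 0;  // one past the last host character
    std::size_t colon = 0;     // position of the ':' before the port

    if (host_begin < url.size() && url[host_begin] == '[') {
        // IPv6 literal: the host is everything between '[' and ']'.
        std::size_t close = url.find(']', host_begin);
        if (close == std::string::npos)
            return false;
        ++host_begin;
        host_end = close;
        colon = close + 1;
    } else {
        // Host name or IPv4: the host runs up to the first ':'.
        colon = url.find(':', host_begin);
        if (colon == std::string::npos)
            return false;
        host_end = colon;
    }
    if (host_end == host_begin || colon >= url.size() || url[colon] != ':')
        return false;

    // Parse the digits after ':'; only '/' or end of input may follow.
    char const* first = url.data() + colon + 1;
    char const* last = url.data() + url.size();
    int value = 0;
    auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc() || value < 0 || value > 65535)
        return false;
    if (ptr != last && *ptr != '/')
        return false;

    host = url.substr(host_begin, host_end - host_begin);
    port = value;
    return true;
}
```

    The brackets are stripped from the returned host, and the outputs are only written when the whole string parses.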

  • 2020-12-22 02:26

    Fastest as in computer time or programmer time? I can't speak to benchmarks, but the uri library in the cpp-netlib framework works very well and is easy and straightforward to use.

    http://cpp-netlib.github.com/0.8-beta/uri.html

  • 2020-12-22 02:26

    You could use a tool like re2c to create a fast custom scanner. I'm also unclear on what you consider to be "fastest" -- for the processor or development time or both?
