Read until a string delimiter in boost::asio::streambuf

Submitted by 余生长醉 on 2019-11-29 04:33:27

The async_read_until() operation commits all data read into the streambuf's input sequence, and the bytes_transferred value will contain the number of bytes up to and including the first delimiter. While the operation may read more data beyond the delimiter, one can use the bytes_transferred and delimiter size to extract only the desired data. For example, if cmd1\r\n\r\ncmd2 is available to be read from a socket, and an async_read_until() operation is initiated with a delimiter of \r\n\r\n, then the streambuf's input sequence could contain cmd1\r\n\r\ncmd2:

    ,--------------- buffers_begin(streambuf.data())
   /   ,------------ buffers_begin(streambuf.data()) + bytes_transferred
  /   /                - delimiter.size()
 /   /       ,------ buffers_begin(streambuf.data()) + bytes_transferred
/   /       /   ,--  buffers_end(streambuf.data())
cmd1\r\n\r\ncmd2

As such, one could extract cmd1 into a string from the streambuf via:

// Extract up to the first delimiter.
std::string command{
  boost::asio::buffers_begin(streambuf.data()),
  boost::asio::buffers_begin(streambuf.data()) + bytes_transferred
    - delimiter.size()};
// Consume through the first delimiter.
streambuf.consume(bytes_transferred);
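The offset arithmetic can be checked without a socket. The following is only a sketch: a std::string stands in for the streambuf's input sequence, and the helper name extract_command is hypothetical, not part of Boost.Asio.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: `input` plays the role of the streambuf's input
// sequence, and `bytes_transferred` is the value a read-until operation
// would report (bytes up to and including the first delimiter).
std::string extract_command(std::string& input,
                            std::size_t bytes_transferred,
                            const std::string& delimiter)
{
  assert(bytes_transferred >= delimiter.size());
  assert(bytes_transferred <= input.size());

  // Copy everything before the delimiter.
  const auto payload_size =
      static_cast<std::ptrdiff_t>(bytes_transferred - delimiter.size());
  std::string command(input.begin(), input.begin() + payload_size);

  // Drop the command and its delimiter, as streambuf.consume() would.
  input.erase(0, bytes_transferred);
  return command;
}
```

With an input of "cmd1\r\n\r\ncmd2" and bytes_transferred equal to 8, the helper returns "cmd1" and leaves "cmd2" in the input.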

Here is a complete example demonstrating constructing std::string directly from the streambuf's input sequence:

#include <cassert>    // assert
#include <functional> // std::bind
#include <iostream>
#include <string>
#include <boost/asio.hpp>

const auto noop = std::bind([]{});

int main()
{
  using boost::asio::ip::tcp;
  boost::asio::io_service io_service;

  // Create all I/O objects.
  tcp::acceptor acceptor(io_service, tcp::endpoint(tcp::v4(), 0));
  tcp::socket socket1(io_service);
  tcp::socket socket2(io_service);

  // Connect sockets.
  acceptor.async_accept(socket1, noop);
  socket2.async_connect(acceptor.local_endpoint(), noop);
  io_service.run();
  io_service.reset();

  const std::string delimiter = "\r\n\r\n";

  // Write two commands from socket1 to socket2.
  boost::asio::write(socket1, boost::asio::buffer("cmd1" + delimiter));
  boost::asio::write(socket1, boost::asio::buffer("cmd2" + delimiter));

  // Read a single command from socket2.
  boost::asio::streambuf streambuf;
  boost::asio::async_read_until(socket2, streambuf, delimiter,
    [delimiter, &streambuf](
      const boost::system::error_code& error_code,
      std::size_t bytes_transferred)
    {
      // Verify that the streambuf contains more data beyond the delimiter
      // (i.e. async_read_until read past the delimiter).
      assert(streambuf.size() > bytes_transferred);

      // Extract up to the first delimiter.
      std::string command{
        buffers_begin(streambuf.data()),
        buffers_begin(streambuf.data()) + bytes_transferred
          - delimiter.size()};

      // Consume through the first delimiter so that subsequent async_read_until
      // will not reiterate over the same data.
      streambuf.consume(bytes_transferred);

      assert(command == "cmd1");
      std::cout << "received command: " << command << "\n"
                << "streambuf contains " << streambuf.size() << " bytes."
                << std::endl;
    }
  );
  io_service.run();
}

Output:

received command: cmd1
streambuf contains 8 bytes.
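The 8 bytes left over are the second command and its delimiter, so a later read-until on the same streambuf can complete from the buffered data alone. As a rough sketch of that bookkeeping, with a std::string standing in for the streambuf and a hypothetical helper named drain_commands, every complete command can be pulled out while any trailing partial data is kept for the next read:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: removes every complete, delimiter-terminated
// command from `input` and returns them in order. Bytes after the last
// delimiter (an incomplete command) stay in `input`.
std::vector<std::string> drain_commands(std::string& input,
                                        const std::string& delimiter)
{
  std::vector<std::string> commands;
  if (delimiter.empty()) return commands;  // avoid looping forever

  std::size_t pos = 0;
  while ((pos = input.find(delimiter)) != std::string::npos) {
    commands.push_back(input.substr(0, pos));
    // Equivalent of consume(): drop the command and its delimiter.
    input.erase(0, pos + delimiter.size());
  }
  return commands;
}
```

The early return for an empty delimiter is a choice made for this sketch; the original example always uses "\r\n\r\n".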

To answer your questions first:

the buffer is supposed to have the exact and full data isn't it?

Yes, it will have all the data, including "\r\n\r\n".

What do you recommend to read until I get \r\n\r\n?

What you are doing is fine. You just need to ignore the additional '\r' at the end of each command. You can do this either while reading from the stream or in the command processor (or whatever does the command processing for you). My recommendation would be to defer the removal of the additional '\r' to the command processor.

You probably need something along the lines of:

#include <iostream>
#include <string>
#include <sstream>

void handle_read()
{
  std::stringstream oss;
  oss << "key : value\r\nkey2: value2\r\nkey3: value3\r\n\r\n";
  std::string parsed;

  while (std::getline(oss, parsed)) {
    // Check if it's an empty line.
    if (parsed == "\r") break;
    // Remove the additional '\r' here or in the command processor code.
    if (!parsed.empty() && parsed.back() == '\r') parsed.pop_back();
    std::cout << parsed << std::endl;
    std::cout << parsed.length() << std::endl;
  }

}

int main() {
    handle_read();
    return 0;
}

If your protocol allows empty commands, you will have to change this logic and look for two consecutive empty lines instead.

What do you actually wish to parse?

Of course, you could just use knowledge from your domain and say

std::getline(iss, msg, '\r');

At a higher level, consider parsing what you need:

std::istringstream linestream(msg);
std::string command;
int arg;
if (linestream >> command >> arg) {
    // ...
}
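Put together, the two snippets above might look like the following sketch. The ParsedCommand struct and the parse_command function are hypothetical names introduced for illustration, and the "command followed by one integer" shape is only an assumed message format.

```cpp
#include <sstream>
#include <string>

// Hypothetical result type for this sketch.
struct ParsedCommand {
  std::string name;
  int arg = 0;
  bool ok = false;
};

// Takes the raw bytes of one message (possibly still carrying "\r\n\r\n")
// and tries to read "<command> <int>" from the text before the first '\r'.
ParsedCommand parse_command(const std::string& raw)
{
  ParsedCommand result;

  std::istringstream iss(raw);
  std::string msg;
  std::getline(iss, msg, '\r');  // stop at the first '\r'

  std::istringstream linestream(msg);
  result.ok = static_cast<bool>(linestream >> result.name >> result.arg);
  return result;
}
```

For "my_cmd1 42\r\n\r\n" this yields the name "my_cmd1" with the argument 42; a message without an integer argument is reported through ok == false.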

Even better, consider a parser generator:

std::string command;
int arg;

if (qi::phrase_parse(msg.begin(), msg.end(), command_ >> qi::int_, qi::space, command, arg))
{
    // ...
}

Here, command_ could be something like:

qi::rule<std::string::const_iterator> command_ = qi::no_case [ 
     qi::lit("my_cmd1") | qi::lit("my_cmd2") 
  ];