C++ Socket Server - Unable to saturate CPU

后端 未结 6 1028
刺人心
刺人心 2020-11-29 23:18

I\'ve developed a mini HTTP server in C++, using boost::asio, and now I\'m load testing it with multiple clients and I\'ve been unable to get close to saturating the CPU. I

6条回答
  •  庸人自扰
    2020-11-29 23:46

    ASIO is fine for small to medium tasks but it isn't very good at leveraging the power of the underlying system. Neither are raw socket calls, or even IOCP on Windows but if you are experienced you will always be better than ASIO. Either way there is a lot of overhead with all of those methods, just more with ASIO.

    For what it is worth. using raw socket calls on my custom HTTP can serve 800K dynamic requests per second with a 4 core I7. It is serving from RAM, which is where you need to be for that level of performance. At this level of performance the network driver and OS are consuming about 40% of the CPU. Using ASIO I can get around 50 to 100K requests per second, its performance is quite variable and mostly bound in my app. The post by @cmeerw mostly explains why.

    One way to improve performance is by implementing a UDP proxy. Intercepting HTTP requests and then routing them over UDP to your backend UDP-HTTP server you can bypass a lot of TCP overhead in the operating system stacks. You can also have front ends which pipe through on UDP themselves, which shouldn't be too hard to do yourself. An advantage of a HTTP-UDP proxy is that it allows you to use any good frontend without modification, and you can swap them out at will without any impact. You just need a couple more servers to implement it. This modification on my example lowered the OS CPU usage to 10%, which increased my requests per second to just over a million on that single backend. And FWIW You should always have a frontend-backend setup for any performant site because the frontends can cache data without slowing down the more important dynamic requests backend.

    The future seems to be writing your own driver that implements its own network stack so you can get as close to the requests as possible and implement your own protocol there. Which probably isn't what most programmers want to hear as it is more complicated. In my case I would be able to use 40% more CPU and move to over 1 million dynamic requests per second. The UDP proxy method can get you close to optimal performance without needing to do this, however you will need more servers - though if you are doing this many requests per second you will usually need multiple network cards and multiple frontends to handle the bandwidth so having a couple lightweight UDP proxies in there isn't that big a deal.

    Hope some of this can be useful to you.

提交回复
热议问题