Qi Symbols slow performance?

Posted by 半腔热情 on 2019-12-04 08:07:37

On 12-01-18 14:15, Stephan Menzel wrote:

So I came up with two different implementations. Please find the source attached.

I've looked at it. A few superficial observations first:

  1. You're comparing apples and pears, since Beast uses zero-copy string-views, whereas the Qi version doesn't.

  2. Also, the sample selection invokes UB: the bounds of uniform_int_distribution are inclusive, so (0,10) can index one past the end of the 10-element sample array (it should be (0, 9)).

  3. Lastly, the map approach didn't have a mapping for the .txt extension.
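
To make the second point concrete, here is a minimal sketch of picking a random sample without going out of bounds. The table `kSamples` and the helper `pick_sample` are made-up names for illustration, not part of the original program; the only point is that the upper bound is derived from the array size.

```cpp
#include <array>
#include <cstddef>
#include <random>
#include <string_view>

// Hypothetical sample table; the entries are illustrative only.
constexpr std::array<std::string_view, 3> kSamples = {"a.txt", "b.html", "c.png"};

// uniform_int_distribution produces values in the CLOSED range [a, b],
// so the upper bound has to be size() - 1, not size().
template <typename Rng>
std::string_view pick_sample(Rng& rng) {
    std::uniform_int_distribution<std::size_t> dist(0, kSamples.size() - 1);
    return kSamples[dist(rng)];
}
```

Deriving the bound from the container means the distribution stays correct when samples are added or removed.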

With these out of the way I simplified/structured the test program to the following:

Live On Coliru

Prints the following on my system:

Lambda runs took 2319 nanoseconds
Qi     runs took 2841 nanoseconds
Map    runs took 193 nanoseconds

Now, the biggest culprit is (obviously?) that you're constructing the grammar each time through the loop (compiling the rules). Of course, there's no need. Removing that yields:

Live On Coliru

Lambda runs took 2676 nanoseconds
Qi     runs took 98 nanoseconds
Map    runs took 189 nanoseconds
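
The "construct once, use many times" point is not specific to Qi. As a rough sketch using only the standard library (a std::map standing in for the grammar, and a build counter `g_builds` that exists purely to make the effect observable, both my own assumptions), the difference looks like this:

```cpp
#include <map>
#include <string_view>

using Table = std::map<std::string_view, std::string_view>;

// Counts how often the table gets built (illustration only).
inline int g_builds = 0;

inline Table make_table() {
    ++g_builds;
    return {{".html", "text/html"}, {".png", "image/png"}, {".txt", "text/plain"}};
}

// Slow variant: rebuilds the whole table on every call,
// analogous to constructing the grammar inside the loop.
inline std::string_view lookup_rebuilding(std::string_view ext) {
    Table const table = make_table();
    auto it = table.find(ext);
    return it != table.end() ? it->second : std::string_view{"application/text"};
}

// Fast variant: the function-local static is constructed once, on first use.
inline std::string_view lookup_cached(std::string_view ext) {
    static Table const table = make_table();
    auto it = table.find(ext);
    return it != table.end() ? it->second : std::string_view{"application/text"};
}
```

Calling `lookup_rebuilding` three times bumps `g_builds` three times, while any number of `lookup_cached` calls adds only one more build.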

That's already faster, even though you're still copying strings when there's no actual need for it. Using the inspiration from the answer linked above, I'd probably write it like:

// string_view is an alias for std::string_view or boost::string_view (see the full listing)
#include <boost/spirit/include/qi.hpp>
namespace qi_impl {
    namespace qi = boost::spirit::qi;

    struct mimetype_symbols_type : qi::symbols<char, char const*> {
        mimetype_symbols_type() {
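            // Keys are stored reversed so that a path can be matched back-to-front, starting at its last character.
            auto rev = [](string_view s) -> std::string { return { s.rbegin(), s.rend() }; };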

            this->add
                (rev(".htm"),  "text/html")
                (rev(".html"), "text/html")
                (rev(".php"),  "text/html")
                (rev(".css"),  "text/css")
                (rev(".txt"),  "text/plain")
                (rev(".js"),   "application/javascript")
                (rev(".json"), "application/json")
                (rev(".xml"),  "application/xml")
                (rev(".swf"),  "application/x-shockwave-flash")
                (rev(".flv"),  "video/x-flv")
                (rev(".png"),  "image/png")
                (rev(".jpe"),  "image/jpeg")
                (rev(".jpeg"), "image/jpeg")
                (rev(".jpg"),  "image/jpeg")
                (rev(".gif"),  "image/gif")
                (rev(".bmp"),  "image/bmp")
                (rev(".ico"),  "image/vnd.microsoft.icon")
                (rev(".tiff"), "image/tiff")
                (rev(".tif"),  "image/tiff")
                (rev(".svg"),  "image/svg+xml")
                (rev(".svgz"), "image/svg+xml")
                ;
        }
    } static const mime_symbols;

    char const* using_spirit(const string_view &n_path) {
        char const* result = "application/text";
        // Parse the path in reverse; on a match, result is overwritten with the mapped MIME type.
        qi::parse(n_path.crbegin(), n_path.crend(), qi::no_case[mime_symbols], result);
        return result;
    }
}

There is no more need to muck with finding "the last dot" in the first place, no need to "check for the match being at the end", and you get the value directly from the symbols. You are free to assign to a string_view or a std::string as desired.
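
If the reverse-matching idea seems opaque, here is a standard-library-only sketch of the same principle. It is a plain linear scan over a small, illustrative subset of the table, not what Qi does internally; `kTypes` and `mime_by_suffix` are names I made up for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <string_view>
#include <utility>

// Small illustrative subset of the extension table.
constexpr std::pair<std::string_view, std::string_view> kTypes[] = {
    {".htm", "text/html"},
    {".html", "text/html"},
    {".svg", "image/svg+xml"},
    {".svgz", "image/svg+xml"},
};

// Compare each known extension against the END of the path by walking both
// backwards. No rfind('.') and no separate "is the match at the end" check.
inline std::string_view mime_by_suffix(std::string_view path) {
    auto ieq = [](unsigned char a, unsigned char b) {
        return std::tolower(a) == std::tolower(b);
    };
    for (auto const& [ext, mime] : kTypes) {
        if (ext.size() <= path.size() &&
            std::equal(ext.rbegin(), ext.rend(), path.rbegin(), ieq)) {
            return mime;
        }
    }
    return "application/text";
}
```

Because every key starts with a dot and contains no other dot, no key can be a proper suffix of another, so at most one entry matches a given path.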

Full Listing

Using string_views (both std::string_view and boost::string_view supported/shown) throughout.

Note also that this listing uses a custom case-insensitive comparator for the map<> approach, just to prove that there is indeed a benefit to knowing that the map keys are all lower-case. (The benefit does not, in fact, come from having "cached the lowercase", since that is only used once!)
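
For readers without Boost at hand, a case-insensitive map comparator can be sketched with the standard library alone. This is an assumption-laden stand-in for the `CiCmp` in the listing (which uses boost::algorithm::ilexicographical_compare); the names `CiLess`, `kMimeByExt` and `mime_for` are mine, and the table is a small subset.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string_view>

// Orders strings while ignoring ASCII case, so ".HTM" and ".htm" are equivalent keys.
struct CiLess {
    bool operator()(std::string_view a, std::string_view b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// Illustrative subset of the extension table.
inline std::map<std::string_view, std::string_view, CiLess> const kMimeByExt = {
    {".htm", "text/html"},
    {".png", "image/png"},
    {".txt", "text/plain"},
};

inline std::string_view mime_for(std::string_view path) {
    auto pos = path.rfind('.');
    if (pos == std::string_view::npos) return "application/text";
    auto it = kMimeByExt.find(path.substr(pos));
    return it != kMimeByExt.end() ? it->second : std::string_view{"application/text"};
}
```

Every comparison inside `find` now lower-cases both operands character by character, which is the extra per-lookup work that a plain comparator over already-lowercase keys avoids.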

Live On Coliru

#include <boost/chrono.hpp>
#include <string>

#ifdef BOOST_STRING_VIEW
    #include <boost/utility/string_view.hpp>
    using string_view = boost::string_view;
#else
    #include <string_view>
    using string_view = std::string_view;
#endif

static auto constexpr npos = string_view::npos;

#include <boost/spirit/include/qi.hpp>
namespace qi_impl {
    namespace qi = boost::spirit::qi;

    struct mimetype_symbols_type : qi::symbols<char, char const*> {
        mimetype_symbols_type() {
            auto rev = [](string_view s) -> std::string { return { s.rbegin(), s.rend() }; };

            this->add
                (rev(".htm"),  "text/html")
                (rev(".html"), "text/html")
                (rev(".php"),  "text/html")
                (rev(".css"),  "text/css")
                (rev(".txt"),  "text/plain")
                (rev(".js"),   "application/javascript")
                (rev(".json"), "application/json")
                (rev(".xml"),  "application/xml")
                (rev(".swf"),  "application/x-shockwave-flash")
                (rev(".flv"),  "video/x-flv")
                (rev(".png"),  "image/png")
                (rev(".jpe"),  "image/jpeg")
                (rev(".jpeg"), "image/jpeg")
                (rev(".jpg"),  "image/jpeg")
                (rev(".gif"),  "image/gif")
                (rev(".bmp"),  "image/bmp")
                (rev(".ico"),  "image/vnd.microsoft.icon")
                (rev(".tiff"), "image/tiff")
                (rev(".tif"),  "image/tiff")
                (rev(".svg"),  "image/svg+xml")
                (rev(".svgz"), "image/svg+xml")
                ;
        }
    } static const mime_symbols;

    char const* using_spirit(const string_view &n_path) {
        char const* result = "application/text";
        qi::parse(n_path.crbegin(), n_path.crend(), qi::no_case[mime_symbols], result);
        return result;
    }
}

#include <boost/algorithm/string.hpp>
namespace impl {
    string_view using_iequals(const string_view &n_path) {

        using boost::algorithm::iequals;

        auto const ext = [&n_path] {
            auto pos = n_path.rfind(".");
            return pos != npos? n_path.substr(pos) : string_view {};
        }();

        if (iequals(ext, ".htm"))  return "text/html";
        if (iequals(ext, ".html")) return "text/html";
        if (iequals(ext, ".php"))  return "text/html";
        if (iequals(ext, ".css"))  return "text/css";
        if (iequals(ext, ".txt"))  return "text/plain";
        if (iequals(ext, ".js"))   return "application/javascript";
        if (iequals(ext, ".json")) return "application/json";
        if (iequals(ext, ".xml"))  return "application/xml";
        if (iequals(ext, ".swf"))  return "application/x-shockwave-flash";
        if (iequals(ext, ".flv"))  return "video/x-flv";
        if (iequals(ext, ".png"))  return "image/png";
        if (iequals(ext, ".jpe"))  return "image/jpeg";
        if (iequals(ext, ".jpeg")) return "image/jpeg";
        if (iequals(ext, ".jpg"))  return "image/jpeg";
        if (iequals(ext, ".gif"))  return "image/gif";
        if (iequals(ext, ".bmp"))  return "image/bmp";
        if (iequals(ext, ".ico"))  return "image/vnd.microsoft.icon";
        if (iequals(ext, ".tiff")) return "image/tiff";
        if (iequals(ext, ".tif"))  return "image/tiff";
        if (iequals(ext, ".svg"))  return "image/svg+xml";
        if (iequals(ext, ".svgz")) return "image/svg+xml";
        return "application/text";
    }
}

#include <boost/algorithm/string.hpp>
#include <map>

namespace impl {
    struct CiCmp {
        template <typename R1, typename R2>
        bool operator()(R1 const& a, R2 const& b) const {
            return boost::algorithm::ilexicographical_compare(a, b);
        }
    };

    static const std::map<string_view, string_view, CiCmp> s_mime_exts_map  {
        { ".txt", "text/plain" },
        { ".htm",  "text/html" },
        { ".html", "text/html" },
        { ".php",  "text/html" },
        { ".css",  "text/css"  },
        { ".js",   "application/javascript" },
        { ".json", "application/json" },
        { ".xml",  "application/xml" },
        { ".swf",  "application/x-shockwave-flash" },
        { ".flv",  "video/x-flv" },
        { ".png",  "image/png" },
        { ".jpe",  "image/jpeg" },
        { ".jpeg", "image/jpeg" },
        { ".jpg",  "image/jpeg" },
        { ".gif",  "image/gif" },
        { ".bmp",  "image/bmp" },
        { ".ico",  "image/vnd.microsoft.icon" },
        { ".tif",  "image/tiff" },
        { ".tiff", "image/tiff" },
        { ".svg",  "image/svg+xml"},
        { ".svgz", "image/svg+xml"},
    };

    string_view using_map(const string_view& n_path) {
        auto const ext = [](string_view n_path) {
            auto pos = n_path.rfind(".");
            return pos != npos? n_path.substr(pos) : string_view {};
        };

        auto i = s_mime_exts_map.find(ext(n_path));

        if (i != s_mime_exts_map.cend()) {
            return i->second;
        } else {
            return "application/text";
        }
    }
}

#include <boost/range/size.hpp> // boost::size
#include <random>
namespace samples {

    static string_view const s_samples[] = {
    "test.txt",
    "test.html",
    "longer/test.tiff",
    "www.webSite.de/ico.ico",
    "www.websIte.de/longEr/path/ico.bmp",
    "www.TEST.com/longer/path/ico.svg",
    "googlecom/shoRT/path/index.HTM",
    "googlecom/bild.jpg",
    "WWW.FLASH.COM/app.swf",
    "WWW.FLASH.COM/BILD.GIF"
    };

    std::mt19937 s_random_generator(std::random_device{}());
    std::uniform_int_distribution<> s_dis(0, boost::size(s_samples) - 1);

    string_view random_sample() {
        return s_samples[s_dis(s_random_generator)];
    }
}

#include <boost/functional/hash.hpp>
#include <iostream>
template <typename F>
int generic_test(F f) {
    auto sample = samples::random_sample();
    string_view result = f(sample);

    //std::cout << "DEBUG " << sample << " -> " << result << "\n";

    return boost::hash_range(result.begin(), result.end());
}

#include <boost/serialization/array_wrapper.hpp> // missing include in boost version on coliru
#include <boost/accumulators/accumulators.hpp>
#include <boost/accumulators/statistics.hpp>

template <typename F>
auto benchmark(F f) {
    using C = boost::chrono::high_resolution_clock;
    using duration = C::duration;

    const unsigned int loops = 100000;

    namespace ba = boost::accumulators;
    ba::accumulator_set<duration, ba::features<ba::tag::mean>> times;

    for (unsigned int i = 0; i < loops; i++) {
        auto start = C::now();
        generic_test(f);
        times(C::now() - start);
    }

    return ba::mean(times);
}


int main() {
    std::cout << std::unitbuf;
    std::cout << "Lambda runs took " << benchmark(impl::using_iequals)   << std::endl;
    std::cout << "Qi     runs took " << benchmark(qi_impl::using_spirit) << std::endl;
    std::cout << "Map    runs took " << benchmark(impl::using_map)       << std::endl;
}

Prints

Lambda runs took 2470 nanoseconds
Qi     runs took 119 nanoseconds
Map    runs took 2239 nanoseconds // see Note above