How to count recurring characters at the beginning of a QString?

Question

I'm dealing with a list of lines, and I need to count the hashes that occur at the beginning of each one.

#  item 1
## item 1, 1
## item 1, 2
#  item 2

and so on.

If each line is a QString, how can I return the number of hashes occurring at the beginning of the string?

QString s("### foo # bar ");
int numberOfHashes = s.count("#"); // Answer should be 3, not 4

Answer 1:

Trivially:

int number_of_hashes(const QString &s) {
    int i, l = s.size();
    for(i = 0; i < l && s[i] == '#'; ++i);
    return i;
}

In other languages (mostly interpreted ones) you have to fear iteration over characters as it's slow, and delegate everything to library functions (generally written in C). In C++ iteration is perfectly fine performance-wise, so a down-to-earth for loop will do.

Just for fun, I made a small benchmark comparing this trivial method with the QRegularExpression one from OP, possibly with the RE object cached.

#include <QCoreApplication>
#include <QString>
#include <vector>
#include <QElapsedTimer>
#include <stdlib.h>
#include <iostream>
#include <QRegularExpression>

int number_of_hashes(const QString &s) {
    int i, l = s.size();
    for(i = 0; i < l && s[i] == '#'; ++i);
    return i;
}

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    const int count = 100000;
    std::vector<QString> ss;
    for(int i = 0; i < 100; ++i) ss.push_back(QString(rand() % 10, '#') + " foo ## bar ###");
    QElapsedTimer t;
    t.start();
    unsigned tot = 0;
    for(int i = 0; i < count; ++i) {
        for(const QString &s: ss) tot += number_of_hashes(s);
    }
    std::cerr<<"plain loop: "<<t.elapsed()*1000./count<<" ns\n";
    t.restart();
    for(int i = 0; i < count; ++i) {
        for(const QString &s: ss) tot += QRegularExpression("^[#]*").match(s).capturedLength();
    }
    std::cerr<<"QRegularExpression, rebuilt every time: "<<t.elapsed()*1000./count<<" ns\n";

    QRegularExpression re("^[#]*");
    t.restart();
    for(int i = 0; i < count; ++i) {
        for(const QString &s: ss) tot += re.match(s).capturedLength();
    }
    std::cerr<<"QRegularExpression, cached: "<<t.elapsed()*1000./count<<" ns\n";
    return tot;    
}

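As expected, the QRegularExpression-based versions are much slower: roughly 35 times slower with a cached object, and about 100 times slower when the object is rebuilt on every call: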

plain loop: 0.7 ns
QRegularExpression, rebuilt every time: 75.66 ns
QRegularExpression, cached: 24.5 ns

Answer 2:

Here I use the standard algorithm find_if_not to get an iterator to the first character that is not a hash. I then return the distance from the start of the string to that iterator.

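#include <algorithm>  // std::find_if_not
#include <iterator>   // std::begin, std::end, std::distance
#include <QString>

int number_of_hashes(QString const& s)
{
    // Iterator to the first character that is not a hash (or end if all are).
    auto it = std::find_if_not(std::begin(s), std::end(s), [](QChar c){return c == '#';});
    return std::distance(std::begin(s), it);
}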

EDIT: the find_if_not function only takes a unary predicate, not a value, so you have to pass a lambda predicate.
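
The same pattern carries over to any container that exposes iterators. Below is a minimal sketch under a few assumptions: the helper name leading_count is made up for illustration, and std::string is used so that it compiles without Qt. With Qt available it should accept a QString and a QChar in the same way, since QString provides begin() and end(), but that is not exercised here.

```cpp
#include <algorithm>  // std::find_if_not
#include <iterator>   // std::begin, std::end, std::distance
#include <string>

// Hypothetical helper (name not from the answer): counts how many times
// `c` repeats at the start of `s`.
template <class Str, class Ch>
int leading_count(const Str &s, Ch c)
{
    // Iterator to the first element that differs from c (or end if none does).
    auto it = std::find_if_not(std::begin(s), std::end(s),
                               [c](const auto &x) { return x == c; });
    return static_cast<int>(std::distance(std::begin(s), it));
}
```

The generic lambda requires C++14 or later. Only the leading run is counted, so later occurrences of the character do not affect the result.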

Answer 3:

int numberOfHashes = 0;
int size = s.size();
QChar ch('#');
for(int i = 0; (i < size) && (s[i] == ch); ++i) {
    ++numberOfHashes;
}

Answer 4:

Solution without a for-loop:

QString s("### foo # bar ");
int numberOfHashes = QRegularExpression("^[#]*").match(s).capturedLength();
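
If this one-liner runs over many lines, building the pattern once avoids recompiling it on every call, which the benchmark in the first answer found to be noticeably cheaper. Here is a rough sketch of that idea, using std::regex as a stand-in for QRegularExpression so that it compiles without Qt. The function name leading_hashes_regex is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: length of the run of '#' at the start of s.
int leading_hashes_regex(const std::string &s)
{
    // Compiled once on the first call and reused afterwards.
    static const std::regex re("^#*");
    std::smatch m;
    // "^#*" can match the empty string, so the search succeeds for any input;
    // the check is kept only as a defensive fallback.
    if (!std::regex_search(s, m, re)) return 0;
    return static_cast<int>(m.length(0));
}
```

With QRegularExpression the equivalent would be to keep one pattern object alive across calls, as the cached variant in the benchmark does.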

Answer 5:

Yet another way:

int beginsWithCount(const QString &s, const QChar c) {
  int n = 0;
  for (auto ch : s)
    if (c == ch) n++; else break;
  return n;
}

Answer 6:

A Qt approach, making use of QString::indexOf(..):

QString s("### foo # bar ");
int numHashes = 0;

// Advance while the next '#' found from numHashes sits exactly at numHashes.
while (s.indexOf("#", numHashes) == numHashes) {
    ++numHashes;
} // numHashes == 3

int QString::indexOf(const QString &str, int from = 0, 
                     Qt::CaseSensitivity cs = Qt::CaseSensitive) const
Returns the index position of the first occurrence of the string str in this string, searching forward from index position from. Returns -1 if str is not found.

Starting at index 0, the string s is searched for the first occurrence of #, and the loop condition checks whether that occurrence sits exactly at the current index. If it does, the search is repeated from index 1, and so on.

This does not avoid one final, possibly full, string search. When no hash sits at the expected position, the last indexOf call scans the rest of the string (or up to the next hash at a later position) before the condition fails.
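
The loop described above can be sketched with std::string::find standing in for QString::indexOf, so that it compiles without Qt. The helper name leading_hashes_find is made up here. Note that find returns std::string::npos rather than -1 when nothing is found; that value never equals a valid position, so the loop ends the same way.

```cpp
#include <string>

// Hypothetical helper mirroring the indexOf-based loop.
int leading_hashes_find(const std::string &s)
{
    std::string::size_type n = 0;
    // Keep going while the next '#' found from position n sits exactly at n.
    // On the final iteration find() may scan ahead to a later '#' or to the
    // end of the string before the comparison fails.
    while (s.find('#', n) == n) {
        ++n;
    }
    return static_cast<int>(n);
}
```

For "### foo # bar " the last call starts at position 3 and finds the hash at position 8, which is the extra scan mentioned above.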

Source: https://stackoverflow.com/questions/52064588/how-to-count-recurring-characters-at-the-beginning-of-a-qstring

Tags

c++

qt5

qstring

qregularexpression