convert a string to int

Submitted by ╄→尐↘猪︶ㄣ on 2020-01-23 01:09:47

Question


I have a large file where each line contains space-separated integers. The task is to parse this file line by line. For the string-to-int conversion I have three solutions. The first uses atoi:

static int stringToIntV1(const string& str) {
    return (atoi(str.c_str()));
}

However, if I pass a malformed string, it doesn't produce any error. For instance the string "123error" is converted to 123.

Second solution:

static int stringToIntV2(const string& str)
{
    int result;
    istringstream myStream(str);

    if (myStream >> result) {
        return result;
    }
    // else
    throw domain_error(str + " is not an int!");
}

I have the same problem here: a malformed string such as "123error" doesn't raise an error.
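
One way the stream version could be made to reject trailing garbage is to check that nothing but whitespace remains after the extraction. The following is only a sketch; the name stringToIntStrictStream is hypothetical, and it is likely to be no faster than the original stream version.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical variant of stringToIntV2 that also rejects trailing garbage.
int stringToIntStrictStream(const std::string& str)
{
    std::istringstream stream(str);
    int result = 0;

    // The extraction must succeed, and after skipping trailing whitespace
    // the stream must have reached end-of-input.
    if ((stream >> result) && (stream >> std::ws).eof()) {
        return result;
    }
    throw std::domain_error("'" + str + "' is not an int!");
}
```

With this check, "123" and " -7 " convert, while "123error" and the empty string throw.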

Third solution with Boost (found at Boost Library):

static int stringToIntV3(const string& str)
{
    int iResult = 0;
    try {
        iResult = lexical_cast<int>(str);
    }
    catch(bad_lexical_cast &) {
        throw domain_error(str + " is not an int!");
    }
    return iResult;
}

This one gives the correct result.

However, there is a significant difference in the execution time. Testing on a large text file (32 MB), I got the following times:

  • (1) with atoi: 4.522s (winner)
  • (2) with istringstream: 15.303s (very slow)
  • (3) with lexical_cast: 10.958s (between the two)

My question: do you know how to notice malformed strings with atoi? It would give the fastest solution. Or do you know a better solution?

Update: Thanks for the answers. Following the tips, I came up with this solution:

static int stringToIntV4(const string& str)
{
    char * pEnd;
    const char * c_str = str.c_str();
    int result = strtol(c_str, &pEnd, 10);
    if (pEnd == c_str+str.length()) {
        return result;
    }
    // else
    throw domain_error("'" + str + "'" + " is not an int!");
}

The good news is that it throws if there is a problem and is as efficient as the atoi version.
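
Note that the version above still accepts an empty string (strtol consumes nothing, so pEnd equals the end of the string) and silently truncates values that do not fit in an int. A stricter sketch, assuming those cases should also throw, might look like this; the name stringToIntChecked is hypothetical.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical stricter variant of stringToIntV4.
int stringToIntChecked(const std::string& str)
{
    const char* begin = str.c_str();
    char* end = nullptr;

    errno = 0;  // strtol signals overflow of long only through errno
    const long value = std::strtol(begin, &end, 10);

    const bool noDigits   = (end == begin);                 // "" or "abc"
    const bool trailing   = (end != begin + str.length());  // "123error"
    const bool outOfRange = (errno == ERANGE) ||
                            value < INT_MIN || value > INT_MAX;

    if (noDigits || trailing || outOfRange) {
        throw std::domain_error("'" + str + "' is not an int!");
    }
    return static_cast<int>(value);
}
```

The extra comparisons are cheap, so the cost should stay close to the plain strtol call, though I have not timed it.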


Answer 1:


I'd use strtol. It takes a parameter that it sets to point at the first character it couldn't convert, so you can use that to determine whether the entire string was converted.

Edit: as far as speed goes, I'd expect it to be slightly slower than atoi, but faster than the others you tried.




Answer 2:


The strtol function gives you back a pointer to the next character in the string. You could check that character after conversion to see if it is not whitespace, indicating error. I ran a simple test and the performance of strtol seems to be comparable to atoi.
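
Since each line holds several space-separated integers, the end pointer can also be used to walk the whole line without building a separate string per number. This is a sketch under that assumption; parseIntLine is a hypothetical name, and it treats any token that does not end in whitespace or at the end of the line as an error.

```cpp
#include <cctype>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: parse every whitespace-separated int on one line.
std::vector<int> parseIntLine(const std::string& line)
{
    std::vector<int> values;
    const char* p = line.c_str();

    while (true) {
        // Skip separators; the terminating '\0' is not whitespace, so this stops there.
        while (std::isspace(static_cast<unsigned char>(*p))) ++p;
        if (*p == '\0') break;

        char* end = nullptr;
        errno = 0;
        const long value = std::strtol(p, &end, 10);

        // Valid only if digits were consumed, the value fits in int, and the
        // character that stopped the conversion is whitespace or the end.
        const bool stoppedCleanly =
            (*end == '\0') || std::isspace(static_cast<unsigned char>(*end));
        if (end == p || !stoppedCleanly || errno == ERANGE ||
            value < INT_MIN || value > INT_MAX) {
            throw std::domain_error("'" + line + "' contains a malformed int");
        }
        values.push_back(static_cast<int>(value));
        p = end;  // continue right after the number just parsed
    }
    return values;
}
```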




Answer 3:


I know this one has been covered with existing functions. If performance is a paramount concern, it is not much more work to write your own conversion to do exactly what you need. For example, if you know there will not be any leading spaces or all your numbers are positive, don't deal with those cases. The compiler should be able to inline this function too.

#include <ctype.h>
#include <stdexcept>
#include <string>

inline int customStringToInt( const std::string &str ) {
  const char *p = str.c_str(), *pEnd = p + str.size();

  // Remove leading whitespace (remove if no leading whitespace).
  // The unsigned char cast avoids undefined behavior for negative char values.
  while( p != pEnd && isspace(static_cast<unsigned char>(*p)) ) ++p;

  // Handle neg/pos sign (remove if no sign).
  int sign = 1;
  if( p != pEnd ) {
    if(      *p == '-' ) { ++p; sign = -1; }
    else if( *p == '+' ) { ++p; }
  }

  // String without any digits is not OK (blank != 0) (remove if it is OK).
  if( p == pEnd ) throw std::domain_error("'" + str + "'" + " has no digits.");

  // Handle digits (no overflow check, to keep the loop minimal).
  int i = 0;
  while( p != pEnd )
    if( isdigit(static_cast<unsigned char>(*p)) ) i = i * 10 + (*p++ - '0');
    else throw std::domain_error("'" + str + "'" + " contains non-digits.");

  return sign * i;
}



Answer 4:


Can you simply check the string before calling atoi?

isdigit() is a wonderful little function, and you should have no problem writing isalldigits() to check a string in a tight, fast loop.

(If you happen to work with decimals or +/- signs, you might want to handle those too.)

So, simply verify that the string is all digits, then call atoi. Should be pretty fast.
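
A minimal sketch of that idea, assuming unsigned decimal input only: isAllDigits and stringToIntPrechecked are hypothetical names, a leading sign is rejected, and overflow is still not detected because atoi does not report it.

```cpp
#include <cctype>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: true only for a non-empty string of decimal digits.
bool isAllDigits(const std::string& str)
{
    if (str.empty()) return false;
    for (std::string::size_type i = 0; i < str.size(); ++i) {
        // Cast to unsigned char: isdigit is undefined for negative char values.
        if (!std::isdigit(static_cast<unsigned char>(str[i]))) return false;
    }
    return true;
}

// Validate first, then let atoi do the conversion.
int stringToIntPrechecked(const std::string& str)
{
    if (!isAllDigits(str)) {
        throw std::domain_error("'" + str + "' is not an int!");
    }
    return std::atoi(str.c_str());
}
```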




Answer 5:


I did my own small test. It seems to take about the same amount of time to use atoi() as to use the stream operators. I have a file with 2,000,000 numbers, 10 numbers on each line (though the code does not use this fact).

The first version uses atoi(), which I admit took me a while to get correct and could be more efficient. Updates that make it more efficient are welcome.

The stream one took 20 seconds to write and worked out of the box.

Timing results:

> time ./a.exe 1
AtoI()
6.084u 0.156s 0:06.33 98.4%     0+0k 0+0io 8619pf+0w
> time ./a.exe
Iterator()
4.680u 0.296s 0:05.05 98.4%     0+0k 0+0io 6304pf+0w

Code:

#include <algorithm>
#include <cctype>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

int main(int argc,char* argv[])
{
    std::vector<int>    data;
    std::ifstream       vals("rand.txt");

    if (argc > 1)
    {
        std::cout << "AtoI()\n";

        std::string line;
        while(std::getline(vals,line))
        {
            std::string::size_type loop = 0;
            while(loop < line.size())
            {
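                // Skip whitespace; test the bound before indexing.
                while((loop < line.size()) && isspace(static_cast<unsigned char>(line[loop])))
                {   ++loop;
                }
                if (loop == line.size()) break;   // only trailing whitespace was left
                std::string::size_type end = line.find_first_not_of("0123456789",loop);
                if (end == loop) { ++end; }       // not a digit: skip it instead of looping forever
                else { data.push_back(atoi(line.substr(loop,end - loop).c_str())); }

                loop = end;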
            }

        }
    }
    else
    {
        std::cout << "Iterator()\n";
        std::copy(  std::istream_iterator<int>(vals),
                    std::istream_iterator<int>(),
                    std::back_inserter(data)
                 );
    }
}



Answer 6:


Are there any leading zeroes in the numbers? If atoi returns zero and the first digit character of the number is not '0', then we have an error.




Answer 7:


Why not convert it while you have a valid file pointer rather than loading it into a string then parsing the integer? Basically you're wasting memory and time.

In my opinion you should do this:

// somewhere in your code
ifstream fin("filename");
if(!fin) {
 // handle error
}

int number;
while(getNextNumber(fin, number))
{
 cout << number << endl;
}
fin.close();
// done

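// some function somewhere (it must be declared before the loop above)
bool getNextNumber(ifstream& fin, int& number)
{
 while(!(fin >> number))
 {
  if(fin.eof())
   return false;   // no more input: stop instead of retrying forever
  // malformed token: reset the error state and skip the rest of the line
  fin.clear();
  fin.ignore(numeric_limits<streamsize>::max(),'\n');
 }
 return true;
}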


Source: https://stackoverflow.com/questions/2023519/convert-a-string-to-int