Question
I have the following code, which parses a text file and indexes its words and lines:
bool Database::addFromFileToListAndIndex(string path, BSTIndex* & index, list<Line *> & myList)
{
    bool result = false;
    ifstream txtFile;
    txtFile.open(path, ifstream::in);
    char line[200];
    Line * ln;
    //if path is valid AND is not already in the list then add it
    if(txtFile.is_open() && (find(textFilePaths.begin(), textFilePaths.end(), path) == textFilePaths.end())) //the path is valid
    {
        //Add the path to the list of file paths
        textFilePaths.push_back(path);
        int lineNumber = 1;
        while(!txtFile.eof())
        {
            txtFile.getline(line, 200);
            ln = new Line(line, path, lineNumber);
            if(ln->getLine() != "")
            {
                lineNumber++;
                myList.push_back(ln);
                vector<string> words = lineParser(ln);
                for(unsigned int i = 0; i < words.size(); i++)
                {
                    index->addWord(words[i], ln);
                }
            }
        }
        result = true;
    }
    return result;
}
My code works flawlessly and fairly quickly until I give it a HUGE text file. Then I get a stack overflow error from Visual Studio. When I switch to the "Release" configuration, the code runs without a hitch. Is there something wrong with my code, or is there some kind of limitation when running in the "Debug" configuration? Am I trying to do too much in one function? If so, how can I break it up so it doesn't crash while debugging?
EDIT: Per request, here is my implementation of addWord:
void BSTIndex::addWord(BSTIndexNode *& pCurrentRoot, string word, Line * pLine)
{
    if(pCurrentRoot == NULL) //BST is empty
    {
        BSTIndexNode * nodeToAdd = new BSTIndexNode();
        nodeToAdd->word = word;
        nodeToAdd->pData = pLine;
        pCurrentRoot = nodeToAdd;
        return;
    }
    //BST not empty
    if (word < (pCurrentRoot->word)) //Go left
    {
        addWord(pCurrentRoot->pLeft, word, pLine);
    }
    else //Go right
    {
        addWord(pCurrentRoot->pRight, word, pLine);
    }
}
And lineParser:
vector<string> Database::lineParser(Line * ln) //Parses a line and returns a vector of the words it contains
{
    vector<string> result;
    string word;
    string line = ln->getLine();
    //Regular Expression, matches anything that is not a letter, number, whitespace, or apostrophe
    tr1::regex regEx("[^A-Za-z0-9\\s\\']");
    //Using regEx above, replaces all non matching characters with nothing, essentially removing them.
    line = tr1::regex_replace(line, regEx, std::string(""));
    istringstream iss(line);
    while(iss >> word)
    {
        word = getLowercaseWord(word);
        result.push_back(word);
    }
    return result;
}
Answer 1:
A stack overflow means that you've run out of stack space (probably obvious, but just in case). Typical causes are non-terminating or excessively deep recursion, or very large objects being copied onto the stack. Funnily enough, it might be either in this case.
It's likely that in Release your compiler is performing tail call optimization, which keeps the stack from growing with each recursive call.
It's also likely that in Release your compiler is eliding the copy of the vector returned from lineParser.
So you need to find out which of these is overflowing the stack in Debug. I would start with the recursion as the most likely culprit. Try changing the string parameter to a reference, i.e.
void BSTIndex::addWord(BSTIndexNode *& pCurrentRoot, const string & word, Line * pLine)
This should stop you from copying the word object on each nested invocation of addWord.
Also consider adding a statement such as std::cout << "recursing addWord" << std::endl; to addWord so that you can see how deep it's going and whether it's terminating correctly.
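An alternative to printing on every call is to pass a depth counter along and record the maximum. The following is only a minimal sketch, not the asker's code: Node, the depth and maxDepth parameters, freeTree, and maxDepthForSortedInput are hypothetical names introduced for illustration. It also takes word by const reference, in the spirit of the signature change suggested above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for BSTIndexNode (the real one also holds a Line*).
struct Node {
    std::string word;
    Node* left;
    Node* right;
    explicit Node(const std::string& w) : word(w), left(nullptr), right(nullptr) {}
};

// Recursive insert that records the deepest level reached in maxDepth.
// word is passed by const reference, so it is not copied on each call.
void addWord(Node*& root, const std::string& word,
             std::size_t depth, std::size_t& maxDepth) {
    maxDepth = std::max(maxDepth, depth);
    if (root == nullptr) {
        root = new Node(word);
        return;
    }
    if (word < root->word)
        addWord(root->left, word, depth + 1, maxDepth);
    else
        addWord(root->right, word, depth + 1, maxDepth);
}

// Frees the tree without recursion, using an explicit stack of nodes.
void freeTree(Node* root) {
    std::vector<Node*> pending;
    if (root != nullptr) pending.push_back(root);
    while (!pending.empty()) {
        Node* n = pending.back();
        pending.pop_back();
        if (n->left != nullptr) pending.push_back(n->left);
        if (n->right != nullptr) pending.push_back(n->right);
        delete n;
    }
}

// Inserts n keys in ascending order and returns the deepest level reached.
std::size_t maxDepthForSortedInput(std::size_t n) {
    assert(n <= 100000000);  // keys are zero-padded to 8 digits
    Node* root = nullptr;
    std::size_t maxDepth = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::string key = std::to_string(i);
        // Pad so that lexicographic order matches numeric order.
        key = std::string(8 - key.size(), '0') + key;
        addWord(root, key, 0, maxDepth);
    }
    freeTree(root);
    return maxDepth;
}
```

With 1000 keys inserted in ascending order, maxDepthForSortedInput(1000) returns 999: each key lands one level below the previous one, so the recursion depth grows with the number of words rather than with its logarithm.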
Answer 2:
The problem is almost certainly the recursive call in addWord. In a non-optimized build every level of recursion gets its own stack frame (including its own copy of word), so a deep tree consumes lots of stack space, while in an optimized build the compiler can turn the call into a tail call, which reuses the same stack frame.
You could manually transform the recursive call into a loop pretty easily:
//Note: takes a pointer to the root pointer, so callers pass &root
void BSTIndex::addWord(BSTIndexNode ** pCurrentRoot, string word, Line * pLine)
{
    //Walk down until we reach an empty link
    while (*pCurrentRoot != NULL) {
        if (word < (*pCurrentRoot)->word) //Go left
        {
            pCurrentRoot = &(*pCurrentRoot)->pLeft;
        }
        else //Go right
        {
            pCurrentRoot = &(*pCurrentRoot)->pRight;
        }
    }
    //Found the empty link (or the tree was empty): attach the new node here
    BSTIndexNode * nodeToAdd = new BSTIndexNode();
    nodeToAdd->word = word;
    nodeToAdd->pData = pLine;
    *pCurrentRoot = nodeToAdd;
}
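To try the loop-based insert in isolation, here is a self-contained sketch of the same pointer-to-pointer technique. It assumes a simplified Node type in place of BSTIndexNode; Node, rightSpineLength, and spineLengthForSortedInput are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for BSTIndexNode (the real one also holds a Line*).
struct Node {
    std::string word;
    Node* left;
    Node* right;
    explicit Node(const std::string& w) : word(w), left(nullptr), right(nullptr) {}
};

// Iterative insert: slot always points at the link being examined
// (the root pointer at first, then some node's left or right pointer).
void addWord(Node** slot, const std::string& word) {
    assert(slot != nullptr);
    while (*slot != nullptr) {
        slot = (word < (*slot)->word) ? &(*slot)->left : &(*slot)->right;
    }
    *slot = new Node(word);  // first empty link on the search path
}

// Counts the nodes reachable by following right pointers from root.
std::size_t rightSpineLength(const Node* root) {
    std::size_t count = 0;
    for (; root != nullptr; root = root->right) ++count;
    return count;
}

// Inserts n ascending keys, which yields a right-only chain, and
// returns the length of that chain.
std::size_t spineLengthForSortedInput(std::size_t n) {
    assert(n <= 100000000);  // keys are zero-padded to 8 digits
    Node* root = nullptr;
    for (std::size_t i = 0; i < n; ++i) {
        std::string key = std::to_string(i);
        key = std::string(8 - key.size(), '0') + key;
        addWord(&root, key);
    }
    const std::size_t length = rightSpineLength(root);
    // Sorted input never creates left children, so a simple loop frees all nodes.
    while (root != nullptr) {
        Node* next = root->right;
        delete root;
        root = next;
    }
    return length;
}
```

Called as addWord(&root, word), the insert uses a constant amount of stack no matter how tall the tree gets. A degenerate tree still makes each insert linear in the number of nodes, so this addresses the overflow but not the running time.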
Answer 3:
You should also post your call stack; that would show what actually caused the overflow. It looks fairly obvious that the recursion in addWord is consuming a significant amount of stack memory.
If you just want it to work, go into your compiler/linker settings and increase the size reserved for your stack. By default it's only 1 MB; crank it up to 32 MB or so, and whatever extra counters or probes the debugging build adds, you'll have enough stack to handle them.
Answer 4:
You can increase the size of the stack to an appropriate number of bytes. The value is the amount to reserve, in bytes (about 1 GB in this example):
#pragma comment(linker, "/STACK:1000000000")
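As a worked example of the size arithmetic, the sketch below uses the 32 MB figure suggested in the previous answer instead of 1 GB. The constant name kStackReserveBytes is hypothetical. The pragma is specific to the Microsoft toolchain, so it is guarded here; the same value can also be given to the linker directly through its /STACK option or the project's Stack Reserve Size setting.

```cpp
#include <cstddef>

// 32 MiB expressed in bytes, the unit the /STACK linker option expects.
constexpr std::size_t kStackReserveBytes = 32u * 1024u * 1024u;

// The pragma needs a literal, so check that it matches the computed value.
static_assert(kStackReserveBytes == 33554432u,
              "keep the literal in the pragma in sync with this constant");

#ifdef _MSC_VER
// MSVC only: ask the linker to reserve 32 MiB for the main thread's stack.
#pragma comment(linker, "/STACK:33554432")
#endif
```

A larger stack only raises the limit. If the tree can become arbitrarily deep, the loop-based insert remains the more robust fix.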
Source: https://stackoverflow.com/questions/5670904/stack-overflow-when-debugging-but-not-in-release