String Builder vs Lists

Submitted by 孤人 on 2019-12-01 13:10:18

Question


I am reading in multiple files with millions of lines, and I am creating a list of all line numbers that have a specific issue, for example if a specific field is left blank or contains an invalid value.

So my question is: what would be the most efficient data type to keep track of a list of numbers that could grow to upwards of a million entries? Would using StringBuilder, Lists, or something else be more efficient?

My end goal is to output a message like "Specific field is blank on 1-32, 40, 45, 47, 49-51", etc. So in the case of a StringBuilder, I would check the previous value: if the new number is only 1 more, I would change it from 1 to 1-2, and if it is more than 1 higher, I would separate it with a comma. With the List, I would just add each number to the list and then combine them once the file has been completely read. However, in this case I could have multiple lists containing millions of numbers.

Here is the current code I am using to combine a list of numbers using String Builder:

// Snapshot of the text built so far, e.g. "1-32, 40, 45"
string currentLine = sbCurrentLineNumbers.ToString();
string currentLineSub;

StringBuilder subCurrentLine = new StringBuilder();
StringBuilder subCurrentLineSub = new StringBuilder();

int indexLastSpace = currentLine.LastIndexOf(' ');
int indexLastDash = currentLine.LastIndexOf('-');

int currentStringInt = 0;

if (sbCurrentLineNumbers.Length == 0)
{
    sbCurrentLineNumbers.Append(lineCount);
}
else if (indexLastSpace == -1 && indexLastDash == -1)
{
    // Only a single number has been recorded so far
    currentStringInt = Convert.ToInt32(currentLine);

    if (currentStringInt == lineCount - 1)
        sbCurrentLineNumbers.Append("-" + lineCount);
    else
    {
        sbCurrentLineNumbers.Append(", " + lineCount);
        commaCounter++;
    }
}
else if (indexLastSpace > indexLastDash)
{
    // The text ends with a standalone number, e.g. "1-32, 40"
    currentLineSub = currentLine.Substring(indexLastSpace);
    currentStringInt = Convert.ToInt32(currentLineSub);

    if (currentStringInt == lineCount - 1)
        sbCurrentLineNumbers.Append("-" + lineCount);
    else
    {
        sbCurrentLineNumbers.Append(", " + lineCount);
        commaCounter++;
    }
}
else if (indexLastSpace < indexLastDash)
{
    // The text ends with a range, e.g. "1-32, 49-51"
    currentLineSub = currentLine.Substring(indexLastDash + 1);
    currentStringInt = Convert.ToInt32(currentLineSub);

    string charOld = currentLineSub;
    string charNew = lineCount.ToString();

    // Extend the range by swapping its end value; Replace scans the whole builder
    if (currentStringInt == lineCount - 1)
        sbCurrentLineNumbers.Replace(charOld, charNew);
    else
    {
        sbCurrentLineNumbers.Append(", " + lineCount);
        commaCounter++;
    }
}

Answer 1:


My end goal is to output a message like "Specific field is blank on 1-32, 40, 45, 47, 49-51"

If that's the end goal, there is no point in going through an intermediary representation such as a List<int>. Just go with a StringBuilder. You will save on memory and CPU that way.
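One way to build the message directly is sketched below. It is only an illustration, not the asker's code: the class name LineRangeCollector and its members are hypothetical, and it assumes line numbers are added in strictly increasing order. The idea is to keep the open range as two ints and write to the StringBuilder only when a range closes, so the text never has to be converted back to a string and re-parsed on every line.

```csharp
using System.Text;

// Hypothetical helper (not from the original post): collects ascending line
// numbers and renders them as "1-32, 40, 45" without re-parsing the builder.
public sealed class LineRangeCollector
{
    private readonly StringBuilder _closed = new StringBuilder(); // finished runs
    private int _start;    // first line of the run still being extended
    private int _end;      // last line of that run
    private bool _hasRun;  // false until the first line is added

    // Assumes line numbers arrive in strictly increasing order.
    public void Add(int line)
    {
        if (_hasRun && line == _end + 1)
        {
            _end = line;   // extend the open run; nothing is written yet
            return;
        }
        if (_hasRun) AppendRun(_closed, _start, _end);
        _start = _end = line;
        _hasRun = true;
    }

    // Writes one run as "n" or "n-m", preceded by ", " when needed.
    private static void AppendRun(StringBuilder sb, int start, int end)
    {
        if (sb.Length > 0) sb.Append(", ");
        sb.Append(start);
        if (end > start) sb.Append('-').Append(end);
    }

    // Returns the finished runs plus the run that is still open.
    public override string ToString()
    {
        if (!_hasRun) return string.Empty;
        var result = new StringBuilder(_closed.ToString());
        AppendRun(result, _start, _end);
        return result.ToString();
    }
}
```

In the reading loop this would be a single call such as collector.Add(lineCount) whenever the field is blank, followed by collector.ToString() once the file has been read.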




Answer 2:


Depends on how you can / want to break the code up.

Given that you are reading it in line order, I'm not sure you need a list at all. Your current desired output implies that you can't output anything until the file is completely scanned. The size of the file suggests a one-pass analysis phase would be a good idea as well, given you are going to use buffered input as opposed to reading the entire thing into memory.

I'd be tempted to use an enum to describe the issue, e.g. Field??? is blank, and then use that as the key in a dictionary of string builders.

As a first thought, anyway.
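A rough sketch of that idea follows. The enum members, the IssueLog class, and its method names are all made up for illustration, since the post does not name the actual fields. Each issue kind gets its own StringBuilder plus the range that is still open, and line numbers are assumed to increase within each issue kind.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical issue kinds; the original post does not name specific fields.
public enum IssueKind { FieldBlank, FieldInvalid }

public sealed class IssueLog
{
    // Per-issue state: text of the closed runs plus the run still open.
    private sealed class State
    {
        public readonly StringBuilder Closed = new StringBuilder();
        public int Start;
        public int End;
    }

    private readonly Dictionary<IssueKind, State> _states =
        new Dictionary<IssueKind, State>();

    // Assumes lines are reported in increasing order for each issue kind.
    public void Report(IssueKind kind, int line)
    {
        State s;
        if (!_states.TryGetValue(kind, out s))
        {
            _states[kind] = new State { Start = line, End = line };
            return;
        }
        if (line == s.End + 1)
        {
            s.End = line;  // extend the open run
            return;
        }
        AppendRun(s.Closed, s.Start, s.End);
        s.Start = s.End = line;
    }

    // Returns e.g. "1-2, 5" for the given issue, or "" if it never occurred.
    public string Describe(IssueKind kind)
    {
        State s;
        if (!_states.TryGetValue(kind, out s)) return string.Empty;
        var text = new StringBuilder(s.Closed.ToString());
        AppendRun(text, s.Start, s.End);
        return text.ToString();
    }

    private static void AppendRun(StringBuilder sb, int start, int end)
    {
        if (sb.Length > 0) sb.Append(", ");
        sb.Append(start);
        if (end > start) sb.Append('-').Append(end);
    }
}
```

With this shape, a single pass over the file can report any number of issue kinds per line, and the messages are produced only after the scan has finished.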




Answer 3:


StringBuilder serves your purpose, so stick with that. If you ever need the line numbers, you can easily change the code then.
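If the individual line numbers are needed after all, the list-based variant from the question might look like the sketch below. The LineRanges class and its Compress method are hypothetical names, and the input is assumed to be sorted ascending with no duplicates, which holds when lines are recorded in reading order.

```csharp
using System.Collections.Generic;
using System.Text;

public static class LineRanges
{
    // Collapses consecutive numbers into "a-b" ranges, e.g. "1-3, 7, 9-10".
    // Assumes `lines` is sorted ascending and contains no duplicates.
    public static string Compress(IReadOnlyList<int> lines)
    {
        var sb = new StringBuilder();
        int i = 0;
        while (i < lines.Count)
        {
            int start = lines[i];
            // Advance i to the last element of the current consecutive run.
            while (i + 1 < lines.Count && lines[i + 1] == lines[i] + 1) i++;

            if (sb.Length > 0) sb.Append(", ");
            sb.Append(start);
            if (lines[i] > start) sb.Append('-').Append(lines[i]);
            i++;
        }
        return sb.ToString();
    }
}
```

For scale, a List<int> holding a million line numbers stores about 4 MB of int data plus some unused capacity, so the list approach costs memory mainly when many such lists are alive at once.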




Answer 4:


Is your output supposed to be human readable? If so, you'll hit the limit of what is reasonable to read, long before you have any performance/memory issues from your data structure. Use whatever is easiest for you to work with.

If the output is supposed to be machine readable, then that output might suggest an appropriate data structure.




Answer 5:


As others have pointed out, I would probably use StringBuilder. The List may have to resize many times, copying its contents each time; the new implementation of StringBuilder (a chain of chunks since .NET 4) grows by adding a chunk and does not have to resize.



Source: https://stackoverflow.com/questions/10076454/string-builder-vs-lists
