How do you search a large text file for a string without going line by line in C#?

灰色年华 2020-12-15 08:23

I have a large text file that I need to search for a specific string. Is there a fast way to do this without reading line by line?

This method is extremely slow beca

14 answers
  •  遥遥无期
    2020-12-15 09:11

    Here is my solution, which reads the file through a stream one character at a time. A custom class checks each character as it arrives and keeps track of partial matches until the entire value is found.

    I ran some tests with a 100 MB file saved on a network drive, and the speed depended entirely on how fast the file could be read. If the file was already buffered by Windows, a search of the entire file took less than 3 seconds. Otherwise it could take anywhere from 7 to 60 seconds, depending on network speed.

    The search itself took less than a second when run against a String in memory with no matching characters. If many of the leading characters find matches, the search can take a lot longer, because every partial match is tracked until it either completes or fails.

    public static int FindInFile(string fileName, string value)
    {   // returns complement of number of characters in file if not found
        // else returns index where value found
        int index = 0;
        using (System.IO.StreamReader reader = new System.IO.StreamReader(fileName))
        {
            if (String.IsNullOrEmpty(value))
                return 0;
            StringSearch valueSearch = new StringSearch(value);
            int readChar;
            while ((readChar = reader.Read()) >= 0)
            {
                ++index;
                if (valueSearch.Found(readChar))
                    return index - value.Length;
            }
        }
        return ~index;
    }
    public class StringSearch
    {   // Call Found one character at a time until string found
        private readonly string value;
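        // Each entry is the position in value that the next character must match
        // for one partial match currently in progress.
        private readonly System.Collections.Generic.List<int> indexList = new System.Collections.Generic.List<int>();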
        public StringSearch(string value)
        {
            this.value = value;
        }
        public bool Found(int nextChar)
        {
            for (int index = 0; index < indexList.Count; )
            {
                int valueIndex = indexList[index];
                if (value[valueIndex] == nextChar)
                {
                    ++valueIndex;
                    if (valueIndex == value.Length)
                    {
                        indexList[index] = indexList[indexList.Count - 1];
                        indexList.RemoveAt(indexList.Count - 1);
                        return true;
                    }
                    else
                    {
                        indexList[index] = valueIndex;
                        ++index;
                    }
                }
                else
                {   // next char does not match
                    indexList[index] = indexList[indexList.Count - 1];
                    indexList.RemoveAt(indexList.Count - 1);
                }
            }
            if (value[0] == nextChar)
            {
                if (value.Length == 1)
                    return true;
                indexList.Add(1);
            }
            return false;
        }
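        // Clears all partial matches. Call this before reusing the instance after
        // Found returns true, because the remaining partial matches are left stale.
        public void Reset()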
        {
            indexList.Clear();
        }
    }
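
    The slowdown described above, when many leading characters match, comes from tracking several partial matches at once. One possible way around it is the Knuth-Morris-Pratt (KMP) algorithm, which keeps a single match state and a precomputed failure table. This is a different technique from the code in this answer, shown only as a minimal sketch. The names `KmpStreamSearch`, `BuildFailureTable` and `IndexOf` are hypothetical, and the return convention is assumed to mirror `FindInFile`.

    ```csharp
    using System;
    using System.IO;

    public static class KmpStreamSearch
    {
        // Builds the KMP failure table: failure[i] is the length of the longest
        // proper prefix of pattern[0..i] that is also a suffix of it.
        private static int[] BuildFailureTable(string pattern)
        {
            int[] failure = new int[pattern.Length];
            int k = 0;
            for (int i = 1; i < pattern.Length; i++)
            {
                while (k > 0 && pattern[i] != pattern[k])
                    k = failure[k - 1];
                if (pattern[i] == pattern[k])
                    k++;
                failure[i] = k;
            }
            return failure;
        }

        // Returns the zero-based character index of the first match, or the
        // bitwise complement of the number of characters read if not found
        // (assumed to follow the same convention as FindInFile).
        public static int IndexOf(TextReader reader, string pattern)
        {
            if (string.IsNullOrEmpty(pattern))
                return 0;
            int[] failure = BuildFailureTable(pattern);
            int matched = 0;   // number of pattern characters currently matched
            int index = 0;     // number of characters read so far
            int readChar;
            while ((readChar = reader.Read()) >= 0)
            {
                ++index;
                // On a mismatch, fall back to the longest shorter prefix that still fits.
                while (matched > 0 && pattern[matched] != readChar)
                    matched = failure[matched - 1];
                if (pattern[matched] == readChar)
                    matched++;
                if (matched == pattern.Length)
                    return index - pattern.Length;
            }
            return ~index;
        }
    }
    ```

    To search a file, pass a `System.IO.StreamReader` opened on it as the `reader` argument. A `StringReader` works the same way for in-memory tests. Each character is examined a bounded number of times on average, so repeated leading characters should not degrade the search the way they can with a list of partial matches.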
    
