Remove Duplicate Lines From Text File?

孤街浪徒 2020-12-09 05:58

Given an input file of text lines, I want duplicate lines to be identified and removed. Please show a simple snippet of C# that accomplishes this.

5 answers

生来不讨喜 (OP)

2020-12-09 06:39

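Here's a streaming approach that should incur less overhead than reading all unique strings into memory: it keeps only the hash code of each line it has seen. The trade-off is that two different lines with the same hash code are treated as duplicates, so a distinct line can occasionally be dropped.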

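    using System.Collections.Generic;
    using System.IO;

    // Stream the input line by line; only hash codes of seen lines are kept in memory.
    using (var sr = new StreamReader(@"C:\Temp\in.txt"))
    using (var sw = new StreamWriter(@"C:\Temp\out.txt")) // overwrites an existing file
    {
        var lines = new HashSet<int>();
        string line;
        while ((line = sr.ReadLine()) != null)
        {
            // Add returns false when this hash code was already recorded.
            if (lines.Add(line.GetHashCode()))
                sw.WriteLine(line);
        }
    } // both streams are flushed and closed here, even if an exception is thrown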
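
If dropping a distinct line on a hash collision is not acceptable, one possible variant stores the lines themselves instead of their hash codes. This is only a sketch, not part of the answer above: the class name `LineDeduplicator` and the method `WriteDistinctLines` are hypothetical, and it gives up the memory saving, since every unique line stays in memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, named here for illustration only.
public static class LineDeduplicator
{
    // Copies each distinct line from reader to writer, keeping the first
    // occurrence and preserving the original order. Returns the number of
    // lines written.
    public static int WriteDistinctLines(TextReader reader, TextWriter writer)
    {
        // Ordinal comparison: lines are duplicates only if they match exactly.
        var seen = new HashSet<string>(StringComparer.Ordinal);
        int written = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Add returns false when the exact line was already seen.
            if (seen.Add(line))
            {
                writer.WriteLine(line);
                written++;
            }
        }
        return written;
    }
}
```

To run it on files, pass a `StreamReader` opened on the input path and a `StreamWriter` opened on the output path, each wrapped in a `using` statement so both are closed afterwards.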
