Remove Duplicate Lines From Text File?

Backend · Unresolved · 5 answers · 2228 views
孤街浪徒 2020-12-09 05:58

Given an input file of text lines, I want duplicate lines to be identified and removed. Please show a simple snippet of C# that accomplishes this.

5 Answers
  •  生来不讨喜
    2020-12-09 06:39

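    Here's a streaming approach that should incur less overhead than reading all unique strings into memory: it reads one line at a time and remembers only each line's hash code. The trade-off is that two distinct lines that happen to share a hash code would be treated as duplicates.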

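        // Requires: using System.Collections.Generic; using System.IO;
        // File.Create truncates an existing output file; File.OpenWrite would leave stale bytes behind.
        using (var sr = new StreamReader(File.OpenRead(@"C:\Temp\in.txt")))
        using (var sw = new StreamWriter(File.Create(@"C:\Temp\out.txt")))
        {
            // Store only each line's hash code to keep memory use low.
            // Caveat: distinct lines can share a hash code, in which case the later one is dropped.
            var lines = new HashSet<int>();
            string line;
            while ((line = sr.ReadLine()) != null)
            {
                // Add returns false if the hash code was already present.
                if (lines.Add(line.GetHashCode()))
                    sw.WriteLine(line);
            }
        }   // the using blocks flush and close both streams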
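
    If dropping a line because of a hash collision is not acceptable, one possible variant keeps the lines themselves in the set. This is only a sketch: it gives up the memory saving described above in exchange for exact comparison. The names `LineDeduper` and `WriteUniqueLines` are made up for illustration and do not come from the original answer.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.IO;

    // Hypothetical helper, named here for illustration only.
    public static class LineDeduper
    {
        // Copies lines from reader to writer, skipping any line already seen.
        // Returns the number of lines written.
        public static int WriteUniqueLines(TextReader reader, TextWriter writer)
        {
            // Ordinal comparison: lines count as duplicates only if they match exactly.
            var seen = new HashSet<string>(StringComparer.Ordinal);
            int written = 0;
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                // HashSet<T>.Add returns false when the item is already present.
                if (seen.Add(line))
                {
                    writer.WriteLine(line);
                    written++;
                }
            }
            return written;
        }
    }
    ```

    It could be called with a `StreamReader` over `C:\Temp\in.txt` and a `StreamWriter` over `C:\Temp\out.txt`, each wrapped in a `using` statement. Taking `TextReader` and `TextWriter` rather than file paths also lets the method be tried out with `StringReader` and `StringWriter`.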
    
