Deleting duplicate lines in a file using Java

予麋鹿 2020-12-14 01:39

As part of a project I'm working on, I'd like to clean up a file I generate of duplicate line entries. These duplicates often won't occur near each other, however. I came

14 Answers
  •  遥遥无期
    2020-12-14 01:59

    The HashSet approach is OK, but you can tweak it so that it does not have to store all the Strings in memory. Instead, store a logical pointer to each line's location in the file, so you can go back and read the actual value only when you need it.
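
A minimal sketch of that idea, under some assumptions of mine: the class name `OffsetDedup` and the method `dedupe` are hypothetical, the file is read with `RandomAccessFile`, and the in-memory index maps each line's hash code to the byte offsets of the distinct lines already seen with that hash. A line is re-read from disk only when its hash matches an earlier one. Output line endings are normalized to `\n`.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not from the original answer.
final class OffsetDedup {
    static void dedupe(Path in, Path out) throws IOException {
        // Hash of a line -> file offsets of the distinct lines seen so far with that hash.
        Map<Integer, List<Long>> seen = new HashMap<>();
        try (RandomAccessFile reader = new RandomAccessFile(in.toFile(), "r");
             RandomAccessFile lookup = new RandomAccessFile(in.toFile(), "r");
             OutputStream writer = new BufferedOutputStream(Files.newOutputStream(out))) {
            while (true) {
                long offset = reader.getFilePointer(); // where the current line starts
                String line = reader.readLine();       // one char per byte, no decoding
                if (line == null) {
                    break;
                }
                List<Long> candidates =
                        seen.computeIfAbsent(line.hashCode(), k -> new ArrayList<>());
                boolean duplicate = false;
                for (long candidate : candidates) {
                    // Same hash: go back to the file to compare the actual text.
                    lookup.seek(candidate);
                    if (line.equals(lookup.readLine())) {
                        duplicate = true;
                        break;
                    }
                }
                if (!duplicate) {
                    candidates.add(offset);
                    // ISO-8859-1 restores the exact bytes that readLine() widened to chars.
                    writer.write(line.getBytes(StandardCharsets.ISO_8859_1));
                    writer.write('\n');
                }
            }
        }
    }
}
```

The index still costs a boxed hash and offset per distinct line, so the saving mostly shows up when lines are long. `RandomAccessFile.readLine()` is unbuffered, so this favors memory over speed.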

    Another creative approach is to append its line number to each line, sort all the lines, remove the duplicates (ignoring the last token, which is the number), and then sort the file again by that last token, stripping it out when writing the output.
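
The sort-based steps can be sketched as follows. This is an in-memory illustration only: the names `SortDedup`, `Numbered`, and `dedupe` are hypothetical, and the line number is kept as a separate field rather than appended as a text token. For a file that does not fit in memory, the two sorts would presumably be done with an external sort instead.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not from the original answer.
final class SortDedup {
    // A line paired with its original position in the file.
    private static final class Numbered {
        final String text;
        final int lineNo;

        Numbered(String text, int lineNo) {
            this.text = text;
            this.lineNo = lineNo;
        }
    }

    static List<String> dedupe(List<String> lines) {
        // Step 1: tag every line with its line number.
        List<Numbered> tagged = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            tagged.add(new Numbered(lines.get(i), i));
        }
        // Step 2: sort by text (then by number) so duplicates become adjacent.
        tagged.sort(Comparator.comparing((Numbered n) -> n.text)
                .thenComparingInt((Numbered n) -> n.lineNo));
        // Step 3: keep only the first (earliest) entry of each run of equal text.
        List<Numbered> unique = new ArrayList<>();
        for (Numbered n : tagged) {
            if (unique.isEmpty() || !unique.get(unique.size() - 1).text.equals(n.text)) {
                unique.add(n);
            }
        }
        // Step 4: restore the original order and strip the numbers.
        unique.sort(Comparator.comparingInt((Numbered n) -> n.lineNo));
        List<String> result = new ArrayList<>();
        for (Numbered n : unique) {
            result.add(n.text);
        }
        return result;
    }
}
```

Because the first sort breaks ties by line number, the surviving copy of each line is its earliest occurrence, so the output keeps first-seen order.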
