`uniq` without sorting an immense text file?

我在风中等你 2020-12-18 07:01

I have a stupidly large text file (40 gigabytes as of today) that I would like to filter for unique lines without sorting the file.

The file ha

6 answers
  •  轻奢々 (original poster)
    2020-12-18 07:15

    Maybe not the answer you've been looking for, but here goes: use a Bloom filter (https://en.wikipedia.org/wiki/Bloom_filter). This sort of problem is one of the main reasons they exist.
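
To make the Bloom filter suggestion concrete, here is a minimal sketch in Python. It is not taken from the answer: the names `BloomFilter`, `add_if_new`, and `unique_lines`, the use of BLAKE2b with double hashing, and the default sizing parameters are all assumptions chosen for illustration. One caveat worth knowing: a Bloom filter can report a line as already seen when it has not been, so a small fraction of genuinely unique lines may be dropped, while true duplicates are always removed. With the standard sizing formulas, a false-positive rate of 1e-6 costs roughly 29 bits of memory per distinct line.

```python
import hashlib
import math


class BloomFilter:
    """Fixed-size Bloom filter over byte strings (illustrative, not tuned)."""

    def __init__(self, expected_items: int, false_positive_rate: float) -> None:
        # Standard sizing formulas: m = -n ln p / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        n, p = expected_items, false_positive_rate
        self.num_bits = max(8, math.ceil(-n * math.log(p) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Double hashing (an assumed design choice): derive all k bit
        # positions from the two 64-bit halves of a single BLAKE2b digest.
        digest = hashlib.blake2b(item, digest_size=16).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:], "little") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add_if_new(self, item: bytes) -> bool:
        """Set the item's bits; return True if any of them was previously unset."""
        is_new = False
        for pos in self._positions(item):
            byte_index, mask = pos >> 3, 1 << (pos & 7)
            if not self.bits[byte_index] & mask:
                is_new = True
                self.bits[byte_index] |= mask
        return is_new


def unique_lines(lines, expected_items=1_000_000, false_positive_rate=1e-6):
    """Yield lines in input order, skipping those the filter reports as seen."""
    seen = BloomFilter(expected_items, false_positive_rate)
    for line in lines:
        if seen.add_if_new(line):
            yield line
```

Something like `sys.stdout.buffer.writelines(unique_lines(sys.stdin.buffer))` would stream a file through it in a single pass while preserving line order. `expected_items` should be set to a rough estimate of the number of distinct lines, since an undersized filter produces more false positives than the requested rate.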
