What is the best compression algorithm that allows random reads/writes in a file?

Backend · Unresolved · 6 answers · 971 views
名媛妹妹 · 2020-12-13 14:21

I know that any adaptive compression algorithm would be out of the question.

6 Answers
  •  醉话见心
    2020-12-13 15:08

    I think Stephen Denne might be onto something here. Imagine:

    • zip-like compression of sequences to codes
    • a dictionary mapping code -> sequence
    • the file behaves like a filesystem
      • each write generates a new "file" (a sequence of bytes, compressed according to dictionary)
      • "filesystem" keeps track of which "file" belongs to which bytes (start, end)
      • each "file" is compressed according to dictionary
      • reads work filewise, uncompressing and retrieving bytes according to "filesystem"
      • writes make "files" invalid, new "files" are appended to replace the invalidated ones
    • this system will need:
      • a defragmentation mechanism for the "filesystem"
      • periodic compaction of the dictionary (removing unused codes)
    • done properly, housekeeping could be done when nobody is looking (idle time) or by creating a new file and "switching" eventually

    One positive effect would be that the dictionary applies to the whole file. If you can spare the CPU cycles, you could periodically check for sequences that overlap "file" boundaries and regroup them.

    This idea is for truly random reads. If you only ever read fixed-size records, parts of this scheme become simpler.
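    The bookkeeping described above (append-only compressed "files", an extent table, invalidation on write, idle-time defragmentation) can be sketched roughly as follows. This is a toy model under simplifying assumptions: fixed-size chunks stand in for "files", per-chunk zlib replaces the shared code→sequence dictionary, and all names (`ChunkStore`, `write_chunk`, etc.) are hypothetical, not from any library.

    ```python
    import zlib

    CHUNK = 4096  # fixed logical chunk size (an assumption; the answer leaves sizing open)

    class ChunkStore:
        """Toy model of the scheme: each write appends a newly compressed
        "file" (chunk) to an append-only log; the extent table maps a
        logical chunk index to the offset of its latest version. Old
        versions become garbage until a defragmentation pass reclaims them."""

        def __init__(self):
            self.log = bytearray()   # append-only backing store
            self.extents = {}        # chunk index -> (offset, compressed length)

        def write_chunk(self, index, data):
            assert len(data) == CHUNK
            blob = zlib.compress(data)
            off = len(self.log)
            self.log += blob                        # append the new "file"
            self.extents[index] = (off, len(blob))  # invalidate the old version

        def read(self, pos, n):
            """Random read: decompress only the chunks the range touches."""
            out = bytearray()
            while n > 0:
                idx, within = divmod(pos, CHUNK)
                off, length = self.extents[idx]
                chunk = zlib.decompress(bytes(self.log[off:off + length]))
                take = min(n, CHUNK - within)
                out += chunk[within:within + take]
                pos += take
                n -= take
            return bytes(out)

        def defragment(self):
            """Housekeeping: copy only live extents into a fresh log."""
            new_log, new_extents = bytearray(), {}
            for idx, (off, length) in self.extents.items():
                new_extents[idx] = (len(new_log), length)
                new_log += self.log[off:off + length]
            self.log, self.extents = new_log, new_extents
    ```

    Writes stay O(one chunk) and reads decompress only the chunks a range touches; the trade-off, as in the answer, is that the log grows until `defragment` runs (e.g. during idle time, or by writing a new file and switching over).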
