I think Stephen Denne might be onto something here. Imagine:
- zip-like compression of sequences to codes
- a dictionary mapping code -> sequence
- the file will be structured like a filesystem
- each write generates a new "file": a sequence of bytes compressed according to the dictionary
- the "filesystem" keeps track of which "file" holds which byte range (start, end)
- reads work file-wise: look up the range in the "filesystem", uncompress the relevant "files", and extract the bytes
- writes invalidate the "files" they overlap; new "files" are appended to replace the invalidated ones
- this system will need:
  - a defragmentation mechanism for the "filesystem" (reclaiming the space held by invalidated "files")
  - periodic compaction of the dictionary (removing unused codes)
  - done properly, housekeeping could run when nobody is looking (idle time), or by building a new file in the background and switching over to it eventually (a rough sketch of the whole scheme follows this list)
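Here is a minimal sketch in Python of what I mean, using zlib's preset-dictionary support (`zdict`) to stand in for the code→sequence dictionary; `CompressedStore`, the extent layout, and all the names are made up for illustration, not a finished design:

```python
import zlib

class CompressedStore:
    """Append-only log of compressed "files" (chunks) plus an extent
    table (the "filesystem") mapping logical byte ranges onto them."""

    def __init__(self, shared_dict=b""):
        # zlib's preset dictionary stands in for the code->sequence
        # dictionary; a real implementation would build and compact its own.
        self.zdict = shared_dict
        self.log = []      # chunk_id -> compressed bytes, append-only
        self.extents = []  # (start, end, chunk_id, offset) -- live data only

    def _compress(self, data):
        c = zlib.compressobj(zdict=self.zdict) if self.zdict else zlib.compressobj()
        return c.compress(data) + c.flush()

    def _decompress(self, chunk_id):
        d = zlib.decompressobj(zdict=self.zdict) if self.zdict else zlib.decompressobj()
        return d.decompress(self.log[chunk_id]) + d.flush()

    def write(self, start, data):
        """Append a new "file" and invalidate whatever it overwrites."""
        end, kept = start + len(data), []
        for s, e, cid, off in self.extents:
            if e <= start or s >= end:      # no overlap: extent stays live
                kept.append((s, e, cid, off))
                continue
            if s < start:                   # left remainder survives
                kept.append((s, start, cid, off))
            if e > end:                     # right remainder survives
                kept.append((end, e, cid, off + (end - s)))
        kept.append((start, end, len(self.log), 0))
        self.log.append(self._compress(data))
        self.extents = sorted(kept)

    def read(self, start, length):
        """Decompress just the "files" covering [start, start+length)."""
        end, out = start + length, bytearray(length)   # holes read as zeros
        for s, e, cid, off in self.extents:
            if e <= start or s >= end:
                continue
            data = self._decompress(cid)
            lo, hi = max(s, start), min(e, end)
            out[lo - start:hi - start] = data[off + lo - s:off + hi - s]
        return bytes(out)

store = CompressedStore(shared_dict=b"hello world")
store.write(0, b"hello world, hello world")
store.write(6, b"WORLD")
assert store.read(0, 24) == b"hello WORLD, hello world"
```

Note that overwrites never touch old chunks: they only trim the extent table, which is what makes the log append-only and leaves the invalidated chunks for defragmentation to reclaim.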
One positive effect is that the dictionary would apply across the whole file. If you can spare the CPU cycles, you could periodically check for sequences that overlap "file" boundaries and regroup them, as in the sketch below.
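The regrouping pass could look something like this, building on the `CompressedStore` sketch above; the `regroup` name and the 4 KiB merge limit are arbitrary choices, not requirements:

```python
def regroup(store, max_merged=4096):
    """Coalesce runs of contiguous live extents into single "files" so
    sequences that straddled old boundaries compress together; invalidated
    chunks are dropped along the way, so this doubles as defragmentation."""
    fresh = CompressedStore(shared_dict=store.zdict)
    run_start, buf = None, bytearray()
    for s, e, cid, off in store.extents:
        piece = store._decompress(cid)[off:off + (e - s)]
        contiguous = run_start is not None and s == run_start + len(buf)
        if (not contiguous or len(buf) + len(piece) > max_merged) and buf:
            fresh.write(run_start, bytes(buf))   # flush the finished run
            run_start, buf = None, bytearray()
        if run_start is None:
            run_start = s
        buf += piece
    if buf:
        fresh.write(run_start, bytes(buf))
    return fresh  # the "switching" step: swap this in for the old store
```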
This idea is aimed at truly random reads. If you are only ever going to read fixed-size records, some parts of it become simpler; a sketch of that case follows.
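With fixed-size records there is no extent splitting at all: the "filesystem" collapses to a flat slot table from record index to the latest chunk, something like this (again illustrative only):

```python
import zlib

class FixedRecordStore:
    """One compressed "file" per record version; the "filesystem" is just
    a slot table mapping each record index to its latest chunk id."""

    def __init__(self, record_size, shared_dict=b""):
        self.record_size, self.zdict = record_size, shared_dict
        self.log = []   # append-only compressed record versions
        self.slot = {}  # record index -> chunk id; old ids become garbage

    def write(self, index, record):
        assert len(record) == self.record_size
        c = zlib.compressobj(zdict=self.zdict) if self.zdict else zlib.compressobj()
        self.slot[index] = len(self.log)
        self.log.append(c.compress(record) + c.flush())

    def read(self, index):
        d = zlib.decompressobj(zdict=self.zdict) if self.zdict else zlib.decompressobj()
        return d.decompress(self.log[self.slot[index]]) + d.flush()
```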