How does one store history of edits effectively?

情到浓时终转凉″ 提交于 2019-11-28 16:27:35

问题


I was just wondering for sites like stackoverflow and wikipedia, they stores history of edits indefinitely and allows user to roll back the edits. Can someone recommend any resources/books/articles regarding how to do this using any suitable technology (such as databases etc)

Thanks a lot!


回答1:


There are a number of options; the simplest, of course, being to simply record all versions independently. For a site like Stack Overflow, where posts aren't usually edited very many times, this is appropriate. However for something like Wikipedia, one needs to be more clever to save space.

In the case of Wikipedia, pages are initially stored with each version separate, in the text table. Periodically, a number of older revisions are compressed together, then packed into a single field. Since there will be a lot of repetition, you save a lot of space this way.

You might also want to look into how some version control systems do it - for example, subversion uses skip deltas, where revisions are stored as a difference from a revision halfway down the history. This means that one will have to examine at most O(lg n) revisions to reconstruct one's revision of interest.

Git, on the other hand, uses something more similar to Wikipedia's approach. Revisions are stored as individually compressed 'loose' objects at first, then periodically git takes all of the loose objects, sorts them according to a somewhat complex heuristic, then builds compressed deltas between 'nearby' objects and dumps the result as a packfile. The number of revisions that need to be read to reconstruct a file is bounded by an argument to the pack building process. This has the interesting property that deltas can be built between objects that are unrelated, in some cases.



来源:https://stackoverflow.com/questions/1219608/how-does-one-store-history-of-edits-effectively

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!