Consider this simple Python code, which demonstrates a very simple version control design for a dictionary:
def build_current(history):
    current = {}
    for a
You mentioned these 3 methods of storing (file)-history:
Subversion uses forward deltas (aka patches) in FSFS repositories and backward deltas in BDB repositories. Note that these implementations have different performance characteristics:
forward deltas are fast to commit but slow to check out (the "current" version must be reconstructed from the deltas)
backward deltas are fast to check out but slow to commit, because the new "current" version must be stored in full and the previous "current" must be rewritten as a delta against it
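To make the trade-off concrete, here is a minimal sketch for dictionary versions. It is not how Subversion stores data; the names `ForwardStore`, `BackwardStore`, `diff`, `apply_delta` and the `REMOVED` marker are all made up for illustration, and a delta is assumed to be a plain dict mapping each changed key to its new value.

```python
# Hypothetical sketch, not Subversion's format: a delta maps key -> new value,
# and REMOVED marks a key that the delta deletes.
REMOVED = object()

def diff(old, new):
    """Return the delta that turns `old` into `new`."""
    delta = {k: v for k, v in new.items() if k not in old or old[k] != v}
    delta.update({k: REMOVED for k in old if k not in new})
    return delta

def apply_delta(base, delta):
    """Return a copy of `base` with `delta` applied."""
    result = dict(base)
    for key, value in delta.items():
        if value is REMOVED:
            result.pop(key, None)
        else:
            result[key] = value
    return result

class ForwardStore:
    """Full text of revision 0 plus one forward delta per later revision."""
    def __init__(self, first):
        self.base = dict(first)
        self.deltas = []

    def commit(self, delta):
        # Cheap: just append the patch.
        self.deltas.append(dict(delta))

    def checkout(self, rev):
        # Expensive for recent revisions: replay every delta up to `rev`.
        result = dict(self.base)
        for delta in self.deltas[:rev]:
            result = apply_delta(result, delta)
        return result

class BackwardStore:
    """Full text of the newest revision plus one backward delta per older one."""
    def __init__(self, first):
        self.head = dict(first)
        self.backward = []  # backward[i] turns revision i+1 into revision i

    def commit(self, delta):
        # More work: build the new head and re-express the old head as a delta.
        new_head = apply_delta(self.head, delta)
        self.backward.append(diff(new_head, self.head))
        self.head = new_head

    def checkout(self, rev):
        # Cheap for the newest revision; older ones walk backwards from head.
        result = dict(self.head)
        for i in range(len(self.backward) - 1, rev - 1, -1):
            result = apply_delta(result, self.backward[i])
        return result
```

Both stores return the same content for every revision; they differ only in which operation has to do the reconstruction work.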
Also note that FSFS uses a "skipping delta" algorithm, which minimizes the number of jumps needed to reconstruct a specific version. Unlike Mercurial's snapshots, this skipping delta is not optimized for size; it only minimizes the number of "revisions" you need to read to build a full version, regardless of their overall size.
Here is a small ASCII diagram (copied from the specification) of a file with 9 revisions:
    0 <- 1    2 <- 3    4 <- 5    6 <- 7
    0 <------ 2         4 <------ 6
    0 <---------------- 4
    0 <------------------------------------ 8 <- 9
where "0 <- 1" means that the delta base for revision 1 is revision 0.
The number of jumps needed to reconstruct a revision grows only logarithmically: it is O(log N) for N revisions.
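The delta bases in the diagram can be reproduced with a small sketch. It assumes the base of revision N is N with its lowest set bit cleared, which matches every arrow in the diagram; the function names `delta_base` and `delta_chain` are invented for this example, and real FSFS repositories may apply additional tuning on top of this rule.

```python
def delta_base(rev):
    """Assumed skip-delta rule: clear the lowest set bit of `rev`."""
    return rev & (rev - 1)

def delta_chain(rev):
    """List the revisions read to rebuild `rev`, ending at revision 0."""
    chain = [rev]
    while rev > 0:
        rev = delta_base(rev)
        chain.append(rev)
    return chain

for rev in (1, 3, 6, 7, 9):
    chain = delta_chain(rev)
    # The number of jumps equals the number of 1 bits in `rev`.
    print(rev, chain, len(chain) - 1)
```

Under this rule, rebuilding revision 9 reads 9, 8 and 0, while revision 7 reads 7, 6, 4 and 0. Because each jump clears one bit, the chain can never be longer than the number of binary digits in the revision number.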
Another nice property of FSFS is that older revisions are written only once; after that, later operations only read them. That's why Subversion repositories are very stable: as long as you do not have a hardware failure on your hard disk, you should be able to recover a working repository even if some corruption occurred in the last commit, because you still have all older revisions.
With the BDB backend, the current revision is constantly rewritten on every check-in/commit, which makes this process prone to data corruption. Also, because the full text is stored only in the current revision, corrupting the data on commit will likely destroy large parts of your history.