Merge made by 'recursive' strategy

Question

As I understand it, git's recursive merge strategy comes into play when there is more than one common ancestor: it creates a virtual commit by merging those common ancestors, and then uses it as the base for merging the more recent commits (sorry, I am not sure whether there is a proper term for this).

I have been trying to find more information on how the recursive merge strategy actually works in detail, but there is not much to be found.

Can anyone explain in detail how git's recursive merge really works, with examples and possibly flow charts to help visualize it?

Answer 1:

You can find a description here (see also part 2):

When is merge recursive needed?

What if we find "two common ancestors"? The branch explorer view below shows an alternative in which there are two possible "common ancestors".

Please note: the example is a little bit forced since there's not a good reason – initially – for the developer merging from changeset 11 into 16 instead of merging from changeset 15 (the latest from the branch main at the point of the merge).
But let's assume it has to be done for a reason, let's say, changeset 11 was stable and 13 and 15 weren't at the time, for instance.

The point is: between 15 and 16 there's not a single unique ancestor, but rather, two ancestors at the same "distance": 12 and 11.

While this won't happen frequently, it is really likely to happen with long-lived branches or complex branch topologies. (The case depicted above is the shortest one leading to the "multiple ancestor" problem, but it can also happen with several changesets and branches in between the "crossed" merges.)

One solution is to "select" one of the ancestors as the valid one for the merge (which is the option Mercurial takes) but it has many drawbacks.

How merge recursive works?

When more than one valid ancestor is found, the recursive-merge strategy will create a new unique "virtual ancestor" merging the ones initially found.

The following image depicts the algorithm:

A new ancestor 2 will be used as "ancestor" to merge the "src" and "dst".

The "merge recursive strategy" is able to find a better solution than just "selecting one of the two" as I'll describe below.
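To make the idea concrete, here is a minimal Python sketch. It is a toy model, not Git's implementation, and it does not reproduce the article's changeset numbers: snapshots are plain dicts mapping hypothetical file names to values, every snapshot is assumed to contain the same files, and the names `merge_tree`, `root`, `anc0`, `anc1`, `src` and `dst` are made up for illustration. It compares merging `src` and `dst` against each single ancestor with merging against a virtual ancestor built from both.

```python
CONFLICT = "<conflict>"

def merge_value(base, ours, theirs):
    """Classic three-way rule for one value."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs
    if theirs == base:
        return ours
    return CONFLICT

def merge_tree(base, ours, theirs):
    """Three-way merge of snapshots (assumption: all share the same keys)."""
    return {path: merge_value(base[path], ours[path], theirs[path])
            for path in sorted(base)}

# Hypothetical criss-cross history.
root = {"f": 1, "g": 1}   # oldest shared commit
anc0 = {"f": 2, "g": 1}   # first candidate ancestor (changed f)
anc1 = {"f": 1, "g": 2}   # second candidate ancestor (changed g)
src = {"f": 3, "g": 2}    # has both changes, then changed f again
dst = {"f": 2, "g": 3}    # has both changes, then changed g again

# Picking a single ancestor produces a spurious conflict either way.
with_anc0 = merge_tree(anc0, src, dst)
with_anc1 = merge_tree(anc1, src, dst)

# Merging the two ancestors first gives a base that knows both changes.
virtual = merge_tree(root, anc0, anc1)
with_virtual = merge_tree(virtual, src, dst)

print(with_anc0)     # {'f': 3, 'g': '<conflict>'}
print(with_anc1)     # {'f': '<conflict>', 'g': 3}
print(with_virtual)  # {'f': 3, 'g': 3}
```

In this toy history, either single ancestor is missing one of the two earlier changes, so the later edit on that file looks like a two-sided change. The virtual ancestor contains both, and the merge is clean.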

Note: the merge recursive strategy was initially the merge "fredrik" strategy (see commit e4cf17c, Sept. 2005, Git v0.99.7a), after Fredrik Kuivinen.
It was a python script, initiated in commit 720d150, and it illustrates the original algorithm.

For more details, consider "Current Concepts in Version Control Systems" by Petr Baudiš (2009-09-11), page 17.

$|B| = 1: \quad b(B) = B_0$
$|B| = 2: \quad b(B) = M(\mathrm{LCA}(B_0, B_1), B_0, B_1)$
$M(B, x, y) = \Delta^{-1}(b(B), x \cup y)$
$m(x, y) = M(\mathrm{LCA}(x, y), x, y)$

(Yes, I don't know either how to read this)

In case of conflict, the main idea of the algorithm is to simply leave the conflict markers in place when using the result as a base for further merges.
This means that earlier conflicts are properly propagated as well as conflicting changes in newer revisions.

This refers to revctrl.org/CrissCrossMerge, which describes the context of a recursive merge in a criss-cross merge.

A criss-cross merge is an ancestry graph in which minimal common ancestors are not unique.
The simplest example with scalars is something like:

   a
  / \
 b1  c1
 |\ /|
 | X |
 |/ \|
 b2  c2

The story one can tell here is that Bob and Claire made some change independently, then each merged the changes together.
They conflicted, and Bob (of course) decided his change was better, while Claire (typically) picked her version.
Now, we need to merge again. This should be a conflict.

Note that this can happen equally well with a textual merger -- they have each edited the same place in the file, and when resolving the conflict they each choose to make the resulting text identical to their original version (i.e., they don't munge the two edits together somehow, they just pick one to win).
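The "minimal common ancestors are not unique" condition can be checked mechanically. The following Python sketch encodes the criss-cross diagram above as a parent map and computes the merge bases of two commits. The function names `ancestors` and `merge_bases` are illustrative and this is not Git's code, although `git merge-base --all` answers the same question on a real repository.

```python
# Parent links for the criss-cross diagram (child -> parents).
PARENTS = {
    "a": [],
    "b1": ["a"],
    "c1": ["a"],
    "b2": ["b1", "c1"],
    "c2": ["c1", "b1"],
}

def ancestors(commit, parents=PARENTS):
    """Return the commit itself plus everything reachable via parent links."""
    seen = set()
    stack = [commit]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen

def merge_bases(x, y, parents=PARENTS):
    """Common ancestors of x and y that are not ancestors of another common ancestor."""
    common = ancestors(x, parents) & ancestors(y, parents)
    return sorted(
        c for c in common
        if not any(c != other and c in ancestors(other, parents)
                   for other in common)
    )

print(merge_bases("b1", "c1"))  # ['a']
print(merge_bases("b2", "c2"))  # ['b1', 'c1']
```

'a' is also a common ancestor of 'b2' and 'c2', but it is dropped because it is reachable from 'b1' (and from 'c1'). That leaves two equally good bases, which is exactly the criss-cross situation.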

So:

Another possible solution is to first merge 'b1' and 'c1' to a temporary node (basically, imagine that the 'X' in the diagram is actually a revision, not just edges crossing) and then use that as a base for merging 'b2' and 'c2'.

The interesting part is when merging 'b1' and 'c1' results in conflicts - the trick is that in that case, 'X' is included with the conflicts recorded inside (e.g. using the classical conflict markers).

Since both 'b2' and 'c2' had to resolve the same conflict, in the case they resolved it the same way they both remove the conflicts from 'X' in the same way and a clean merge results; if they resolved it in different ways, the conflicts from 'X' get propagated to the final merge result.
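A small Python sketch of those two outcomes, under strong simplifying assumptions: each revision is one whole-file string, the merge is a naive whole-file three-way rule rather than a line-based one, and the file contents and the helper names `three_way` and `has_conflict` are invented for the example.

```python
def three_way(base, ours, theirs, labels=("ours", "theirs")):
    """Whole-file three-way merge; returns marker text when both sides differ."""
    if ours == theirs:
        return ours
    if ours == base:
        return theirs
    if theirs == base:
        return ours
    return (f"<<<<<<< {labels[0]}\n{ours}\n=======\n"
            f"{theirs}\n>>>>>>> {labels[1]}")

def has_conflict(text):
    """True when the text is a conflict record produced by three_way."""
    return text.startswith("<<<<<<<")

a = "colour = red"
b1 = "colour = blue"    # Bob's change
c1 = "colour = green"   # Claire's change

# The temporary node 'X': b1 and c1 conflict, so X keeps the markers.
x = three_way(a, b1, c1, labels=("b1", "c1"))

# Case 1: both later merges resolved the conflict the same way.
same = three_way(x, "colour = blue", "colour = blue", labels=("b2", "c2"))

# Case 2: Bob kept his version in b2, Claire kept hers in c2.
different = three_way(x, b1, c1, labels=("b2", "c2"))

print(has_conflict(x))          # True
print(same)                     # colour = blue
print(has_conflict(different))  # True
```

Because this toy merges whole files, the second case simply reports a fresh conflict between 'b2' and 'c2'. A line-based merge would instead compare both sides against the marker lines stored in 'X', which is how the earlier conflict text can resurface in the final result.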

That is what torek described in "git merge: how did I get a conflict in BASE file?" as an "asymmetric result":

"These asymmetric results were harmless, except for the time bomb itself plus the fact that you later ran a recursive merge.
You get to see the conflict. It's up to you to resolve it — again — but this time there's no easy ours/theirs trick, if that worked for persons C and D."

Resuming from revctrl.org/CrissCrossMerge:

If a merge would result in more than two bases ('b1', 'c1', 'd1'), they are merged consecutively - first 'b1' with 'c1' and then the result with 'd1'.
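The consecutive merging of bases can be sketched as a fold over the list of merge bases, where each pairwise merge is itself a recursive merge. The Python below is a toy model under stated assumptions: commits carry flat dicts with identical keys, all commits share a root (so the list of bases is never empty), and the `History` class, its methods and the commit names 'p' and 'q' are hypothetical, not Git's API.

```python
from functools import reduce

CONFLICT = "<conflict>"

def three_way(base, ours, theirs):
    """Per-key three-way merge of flat dicts that share the same keys."""
    merged = {}
    for key in base:
        b, o, t = base[key], ours[key], theirs[key]
        if o == t or t == b:
            merged[key] = o
        elif o == b:
            merged[key] = t
        else:
            merged[key] = CONFLICT
    return merged

class History:
    def __init__(self):
        self.parents = {}   # commit name -> list of parent names
        self.content = {}   # commit name -> dict snapshot
        self.counter = 0    # used to name commits created by merge()

    def commit(self, name, content, *parents):
        self.parents[name] = list(parents)
        self.content[name] = content
        return name

    def ancestors(self, name):
        seen, stack = set(), [name]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.parents[node])
        return seen

    def merge_bases(self, x, y):
        common = self.ancestors(x) & self.ancestors(y)
        return sorted(c for c in common
                      if not any(c != o and c in self.ancestors(o)
                                 for o in common))

    def merge(self, x, y):
        """Merge x and y after folding all merge bases into one virtual base."""
        base = reduce(self.merge, self.merge_bases(x, y))
        self.counter += 1
        merged = three_way(self.content[base], self.content[x], self.content[y])
        return self.commit(f"merged{self.counter}", merged, x, y)

h = History()
h.commit("a", {"x": 0, "y": 0, "z": 0})
h.commit("b1", {"x": 1, "y": 0, "z": 0}, "a")
h.commit("c1", {"x": 0, "y": 1, "z": 0}, "a")
h.commit("d1", {"x": 0, "y": 0, "z": 1}, "a")
h.commit("p", {"x": 2, "y": 1, "z": 1}, "b1", "c1", "d1")
h.commit("q", {"x": 1, "y": 1, "z": 2}, "b1", "c1", "d1")

print(h.merge_bases("p", "q"))         # ['b1', 'c1', 'd1']
result = h.merge("p", "q")
print(h.content[result])               # {'x': 2, 'y': 1, 'z': 2}
```

`reduce` returns a single base unchanged, merges two bases once, and for three bases merges the first two and then merges that result with the third, matching the order described above. The intermediate results are stored as commits with parents so that their own merge bases can be looked up in the same way.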

This is what "Git"'s "recursive merge" strategy does.

Note that Git 2.22 (Q2 2019) improves the recursive merge strategy: the "git merge-recursive" backend had recently (Git 2.18) learned a new heuristic to infer file movement based on how other files in the same directory moved.

As this is an inherently less robust heuristic than the one based on the content similarity of the file itself (rather than on what its neighbours are doing), it sometimes gives an outcome unexpected by the end users. This has been toned down to leave the renamed paths in higher/conflicted stages in the index so that the user can examine and confirm the result.

See commit 8c8e5bd, commit e62d112, commit 6d169fd, commit e0612a1, commit 8daec1d, commit e2d563d, commit c336ab8, commit 3f9c92e, commit e9cd1b5, commit 967d6be, commit 043622b, commit 93a02c5, commit e3de888, commit 259ccb6, commit 5ec1e72 (05 Apr 2019) by Elijah Newren (newren).
(Merged by Junio C Hamano -- gitster -- in commit 96379f0, 08 May 2019)

merge-recursive: switch directory rename detection default

When all of x/a, x/b, and x/c have moved to z/a, z/b, and z/c on one branch, there is a question about whether x/d added on a different branch should remain at x/d or appear at z/d when the two branches are merged.
There are different possible viewpoints here:

A) The file was placed at x/d; it's unrelated to the other files in x/ so it doesn't matter that all the files from x/ moved to z/ on one branch; x/d should still remain at x/d.

B) x/d is related to the other files in x/, and x/ was renamed to z/; therefore x/d should be moved to z/d.

Since there was no ability to detect directory renames prior to Git 2.18, users experienced (A) regardless of context.
Choice (B) was implemented in Git 2.18, with no option to go back to (A), and has been in use since.
However, one user reported that the merge results did not match their expectations, making the change of default problematic, especially since there was no notice printed when directory rename detection moved files.

Note that there is also a third possibility here:

C) There are different answers depending on the context and content that cannot be determined by Git, so this is a conflict.
Use a higher stage in the index to record the conflict and notify the user of the potential issue instead of silently selecting a resolution for them.

Add an option for users to specify their preference for whether to use directory rename detection, and default to (C).
Even when directory rename detection is on, add notice messages about files moved into new directories.
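The three viewpoints can be compared with a short Python sketch. It is only an illustration of the decision described in the commit message, not Git's heuristic: the directory is treated as renamed to wherever most of its files went, the policy letters "A", "B" and "C" are taken from the text above, and the function names and the `(path, state)` return shape are assumptions made for the example.

```python
from collections import Counter
import posixpath

def renamed_directory(renames, directory):
    """Return the directory most files of `directory` moved to, or None."""
    targets = Counter(
        posixpath.dirname(new)
        for old, new in renames.items()
        if posixpath.dirname(old) == directory
    )
    if not targets:
        return None
    return targets.most_common(1)[0][0]

def place_added_file(path, renames, policy):
    """Decide where a file added on the other branch ends up.

    policy "A": ignore directory renames, keep the path.
    policy "B": follow the directory rename silently.
    policy "C": follow it, but record the path as conflicted.
    """
    target = renamed_directory(renames, posixpath.dirname(path))
    if policy == "A" or target is None:
        return path, "clean"
    moved = posixpath.join(target, posixpath.basename(path))
    return moved, "clean" if policy == "B" else "conflict"

# One branch renamed x/a, x/b, x/c into z/; the other branch added x/d.
renames = {"x/a": "z/a", "x/b": "z/b", "x/c": "z/c"}

print(place_added_file("x/d", renames, "A"))  # ('x/d', 'clean')
print(place_added_file("x/d", renames, "B"))  # ('z/d', 'clean')
print(place_added_file("x/d", renames, "C"))  # ('z/d', 'conflict')
```

Under policy "C" the user still sees the suggested location, but has to confirm it, which is the behaviour the commit message chooses as the default.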

Source: https://stackoverflow.com/questions/55998614/merge-made-by-recursive-strategy

Tags

git

merge
