Git version 2.19 introduces git range-diff, which is meant for comparing two commit ranges. I have been reading the documentation, but I cann
A "range" in Git parlance is a pair of revision identifiers (start and end).
The first form of usage for git range-diff is <range1> <range2>. Since we know a range is a pair of revision identifiers, some possible examples are:
abc1234..def5678 9876foo..5432bar
HEAD..def5678 my_release_1_1..my_release_1_2
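To make the notion of a range concrete, here is a minimal sketch in a throwaway repository. The tag names c1 to c4 and the file names are made up for illustration. It shows that a range written as A..B selects the commits reachable from B but not from A, so A itself is not part of the range.

```shell
#!/bin/sh
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# Four linear commits, each tagged c1..c4.
for n in 1 2 3 4; do
    echo "$n" > "file$n.txt"
    git add "file$n.txt"
    git commit -q -m "commit $n"
    git tag "c$n"
done

# c1..c4 = reachable from c4 but not from c1, i.e. c2, c3 and c4.
in_range=$(git rev-list --count c1..c4)
subjects=$(git log --reverse --format=%s c1..c4)
echo "$in_range commits in c1..c4:"
echo "$subjects"
```

This is why a range that should include its first commit is usually written starting from that commit's parent.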
The other two forms of usage are for convenience when some of the four revision identifiers are the same as each other. Namely:
Instead of abc..def def..abc, you can simply specify def...abc.
Instead of abc..def abc..xyz, you can specify abc def xyz. This seems like a common case to me: you want to compare two ranges which start at the same point.

Range diff becomes very useful right after resolving merge conflicts (after rebase, cherry-pick, etc.), especially when you had multiple conflicting commits and you want to make sure that you haven't accidentally broken something during the process.
Here is a scenario that usually happens when you have multiple commits in a branch.
Let's say we have a branch called "our", and our branch is behind the master branch:
m1-m2-m3-m4 <- "master" branch
  \
   o1-o2-o3 <- "our" current branch
Before rebasing, we back up our branch (by creating a copy of it named "our_bkp"):
git branch our_bkp
And now we start rebasing onto master:
git rebase master
And resolve some merge conflicts on the commit "o1"...
Note: If the files that conflicted on "o1" were also changed in "o2" or "o3", then we'll have to re-resolve the same merge conflicts on them as well.
Now, let's say, after an exhausting rebase process we have something like this:
             _<- branch "master"
            /
m1-m2-m3-m4-o1'-o2'-o3' <- branch "our" (after rebase)
  \
   o1-o2-o3 <- branch "our_bkp"
Since there were many merge conflicts, it's not obvious whether we've missed something.
And this is where range-diff shines.
To make sure that we haven't missed any change or accidentally damaged anything, we can simply compare our commits from the old version of the branch with the newer version:
git range-diff our_bkp~3..our_bkp our~3..our
or
git range-diff o1..o3 o1'..o3'
If the diffs are only related to the conflicting parts, then we are good, and haven't changed anything other than them.
But if we see some other unexpected diffs, then we or git did something wrong, and we'll need to fix them.
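The whole workflow can be tried end to end in a scratch repository. The following is only a sketch under simplifying assumptions: the branch names match the scenario above, the file names m.txt and o.txt are made up, and the rebase is deliberately conflict-free, so every commit pair should be reported as unchanged (=). Instead of hard-coding ~3, it counts the backed-up commits with git rev-list --count.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # name the first branch "master" regardless of defaults
git config user.name "Example"
git config user.email "example@example.com"

commit() {  # commit <file> <subject>: append a line to <file> and commit it
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit m.txt m1
git checkout -q -b our
commit o.txt o1; commit o.txt o2; commit o.txt o3
git checkout -q master
commit m.txt m2; commit m.txt m3; commit m.txt m4

git checkout -q our
git branch our_bkp            # backup before rebasing
git rebase -q master          # no conflicts here: different files are touched

# Number of commits that were on our branch (3 in this sketch).
n=$(git rev-list --count master..our_bkp)
result=$(git range-diff "our_bkp~$n..our_bkp" "our~$n..our")
echo "$result"
```

With real conflicts, the interesting lines are the ones marked with ! and the diff-of-diffs printed below them.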
Notes
By writing o1..o3 I mean the hashes of those commits, e.g.: 0277a5883d132bebdb34e35ee228f4382dd2bb7..e415aee3fa53a213dc53ca6a7944301066b72f24. Keep in mind that a range A..B excludes A itself, so to include o1 the range has to start at its parent (o1~..o3).
The ~3 in our_bkp~3 tells git to take the commit that is 3 commits before the last one on the our_bkp branch. Replace the number with the number of commits you had on your branch, and of course don't forget to replace the branch name our_bkp with your backup branch's name.

I have not actually used them yet, but they are meant as an improvement over the old git cherry* flow for analysing / comparing some upstream or downstream change-set vs what you have now. To make the range-sets useful we want some set of "here are my commits" and "here are theirs", expressed as simply as possible.
A range1 range2 set would be written as, e.g.:
git range-diff theirs~5..theirs ours~4..ours
if you had, e.g.:
          T1--T2--T3--T4--T5   <-- theirs
         /
...--o--*   <-- base
         \
          O1--O2--O3--O4   <-- ours
where the O commits are "ours" and the T commits are "theirs".
Given this exact same configuration, however, we could also write:
git range-diff theirs...ours # or ours...theirs
(note the three dots). (This is the syntax used with git rev-list --cherry-mark --left-right, for instance.)
Or, again given this same situation, we could write:
git range-diff base theirs ours # or base ours theirs
Here base is the stop point for both theirs and ours, and avoids having to count back 5.
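As a quick sanity check, the three spellings can be compared in a scratch repository shaped like the graph above. This is a sketch only: the branch names base, theirs and ours follow the text, while the file names are made up. All three invocations should print the same thing.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

commit() {  # commit <file> <subject>: append a line to <file> and commit it
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit b.txt "fork point"
git branch base
git checkout -q -b theirs base
for n in 1 2 3 4 5; do commit t.txt "T$n"; done
git checkout -q -b ours base
for n in 1 2 3 4; do commit o.txt "O$n"; done

explicit=$(git range-diff theirs~5..theirs ours~4..ours)   # <range1> <range2>
three_dot=$(git range-diff theirs...ours)                  # <rev1>...<rev2>
with_base=$(git range-diff base theirs ours)               # <base> <rev1> <rev2>
echo "$explicit"
```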
If the situation is more complicated—as in the graph:
          X1--T1--T2--T3   <-- theirs
         /
...--o--*   <-- base
         \
          Y1--Y2--O1--O2--O3--O4   <-- ours
neither the three-dot nor the base ours theirs kind of syntax quite works, so the two sets of ranges (theirs~3..theirs ours~4..ours) would be best.
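To see why the shortcuts fall short here, one can rebuild this graph in a scratch repository and count what each range selects. Again a sketch with made-up file names: the shortcut forms would sweep in X1, Y1 and Y2, while the explicit ranges leave them out.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

commit() {  # commit <file> <subject>: append a line to <file> and commit it
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit b.txt "fork point"
git branch base
git checkout -q -b theirs base
commit x.txt X1
for n in 1 2 3; do commit t.txt "T$n"; done
git checkout -q -b ours base
commit y.txt Y1; commit y.txt Y2
for n in 1 2 3 4; do commit o.txt "O$n"; done

wide=$(git rev-list --count ours..theirs)        # what theirs...ours would use: X1 + T1..T3
narrow=$(git rev-list --count theirs~3..theirs)  # T1..T3 only
result=$(git range-diff theirs~3..theirs ours~4..ours)
echo "shortcut range: $wide commits, explicit range: $narrow commits"
echo "$result"
```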
The command git range-diff, which compares two versions of a patch series, was revisited in Git 2.23 (Q3 2019) to make it easier to identify which part of which file a shown patch is about.
See commit 499352c, commit 444e096, commit b66885a, commit 430be36, commit e1db263, commit 44b67cb, commit 1ca6922, commit ef283b3, commit 80e1841 (11 Jul 2019), and commit 877a833, commit 570fe99, commit 85c3713, commit d6c88c4, commit 5af4087 (08 Jul 2019) by Thomas Gummerer (tgummerer).
(Merged by Junio C Hamano -- gitster -- in commit 43ba21c, 25 Jul 2019)
range-diff: add filename to inner diff

In a range-diff it's not always clear which file a certain funcname of the inner diff belongs to, because the diff header (or section header as added in a previous commit) is not always visible in the range-diff.

Add the filename to the inner diffs header, so it's always visible to users.
This also allows us to add the filename + the funcname to the outer diffs hunk headers using a custom userdiff pattern, which will be done in the next commit.

range-diff: add headers to the outer hunk header

Add the section headers/hunk headers we introduced in the previous commits to the outer diff's hunk headers.
This makes it easier to understand which change we are actually looking at. For example an outer hunk header might now look like:

@@ Documentation/config/interactive.txt

while previously it would have only been

@@

which doesn't give a lot of context for the change that follows.
See t3206-range-diff.sh as an example.
And:
range-diff: add section header instead of diff header

Currently range-diff keeps the diff header of the inner diff intact (apart from stripping lines starting with index).
This diff header is somewhat useful, especially when files get different names in different ranges.

However there is no real need to keep the whole diff header for that.
The main reason we currently do that is probably because it is easy to do.

Introduce a new range diff hunk header, that's enclosed by "##", similar to how line numbers in diff hunks are enclosed by "@@", and give human readable information of what exactly happened to the file, including the file name.

This improves the readability of the range-diff by giving more concise information to the users.
For example if a file was renamed in one iteration, but not in another, the diff of the headers would be quite noisy.
However the diff of a single line is concise and should be easier to understand.
Again, t3206-range-diff.sh provides an example:
git range-diff --no-color --submodule=log topic...renamed-file >actual &&
sed s/Z/\ /g >expected <<-EOF &&
1:  4de457d = 1:  f258d75 s/5/A/
2:  fccce22 ! 2:  017b62d s/4/A/
    @@ Metadata
    ZAuthor: Thomas Rast <trast@inf.ethz.ch>
    Z
    Z ## Commit message ##
    -    s/4/A/
    +    s/4/A/ + rename file
    Z
    - ## file ##
    + ## file => renamed-file ##
    Z@@
    Z 1
    Z 2
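To see these headers outside of Git's test suite, here is a small sketch, assuming Git 2.23 or later. The branch names v1 and v2, the file name file.txt and its contents are all made up. The two branches carry two iterations of the same patch that differ in a single line, so range-diff should pair them, mark the pair with !, and name the file in the outer hunk header.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# A 20-line file as the common starting point.
n=1
while [ "$n" -le 20 ]; do echo "line $n"; n=$((n + 1)); done > file.txt
git add file.txt
git commit -q -m "fork point"
git branch base

make_version() {  # make_version <branch> <replacement for line 15>
    git checkout -q -b "$1" base
    sed -e 's/^line 3$/changed 3/' -e "s/^line 15\$/$2/" file.txt > file.tmp
    mv file.tmp file.txt
    git commit -q -a -m "change two lines"
}
make_version v1 "changed 15"
make_version v2 "changed 15 differently"

result=$(git range-diff base v1 v2)
echo "$result"
```

On older Git versions the outer hunk header would be a bare @@, as described in the commit message above.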
But: beware of the diff.noprefix config setting: a git range-diff would segfault with Git before 2.24 (Q4 2019)!
"git range-diff" segfaulted when the diff.noprefix configuration was used, as it blindly expected the patch it internally generates to have the standard a/ and b/ prefixes.
The command now forces the internal patch to be built without any prefix, so that it is not affected by any end-user configuration.
See commit 937b76e (02 Oct 2019) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 159cdab, 11 Oct 2019)
range-diff: internally force diff.noprefix=true

When parsing the diffs, range-diff expects to see the prefixes a/ and b/ in the diff headers.

These prefixes can be forced off via the config setting diff.noprefix=true.
As range-diff is not prepared for that situation, this will cause a segmentation fault.

Let's avoid that by passing the --no-prefix option to the git log process that generates the diffs that range-diff wants to parse.
And of course expect the output to have no prefixes, then.
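A quick way to check a given installation is to run the same comparison with and without the setting. This sketch assumes Git 2.24 or later and uses made-up branch and file names. With the fix in place, both runs should print identical output.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > f.txt
git add f.txt
git commit -q -m "fork point"
git branch base

git checkout -q -b v1 base
echo "two" >> f.txt
git commit -q -a -m "add a line"

git checkout -q -b v2 base
echo "two, reworded" >> f.txt
git commit -q -a -m "add a line"

plain=$(git range-diff base v1 v2)
# -c sets the configuration for this single invocation only.
noprefix=$(git -c diff.noprefix=true range-diff base v1 v2)
echo "$noprefix"
```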
And "git range-diff" failed to handle mode-only changes, which has been corrected with Git 2.24 (Q4 2019):
See commit 2b6a9b1 (08 Oct 2019) by Thomas Gummerer (tgummerer).
(Merged by Junio C Hamano -- gitster -- in commit b6d712f, 15 Oct 2019)
range-diff: don't segfault with mode-only changes

Reported-by: Uwe Kleine-König
Signed-off-by: Thomas Gummerer
Acked-by: Johannes Schindelin

In ef283b3699 ("apply: make parse_git_diff_header public", 2019-07-11, Git v2.23.0-rc0 -- merge listed in batch #7) the 'parse_git_diff_header' function was made public and useable by callers outside of apply.c.

However it was missed that its (then) only caller, 'find_header' did some error handling, and completing 'struct patch' appropriately.

range-diff then started using this function, and tried to handle this appropriately itself, but fell short in some cases.

This in turn would lead to range-diff segfaulting when there are mode-only changes in a range.

Move the error handling and completing of the struct into the 'parse_git_diff_header' function, so other callers can take advantage of it.

This fixes the segfault in 'git range-diff'.
With Git 2.25 (Q1 2020), "git range-diff" learned to take the "--notes=<ref>" and the "--no-notes" options to control the commit notes included in the log message that gets compared.
See commit 5b583e6, commit bd36191, commit 9f726e1, commit 3bdbdfb, commit 75c5aa0, commit 79f3950, commit 3a6e48e, commit 26d9485 (20 Nov 2019), and commit 9d45ac4, commit 828e829 (19 Nov 2019) by Denton Liu (Denton-L).
(Merged by Junio C Hamano -- gitster -- in commit f3c7bfd, 05 Dec 2019)
range-diff: output ## Notes ## header

Signed-off-by: Denton Liu

When notes were included in the output of range-diff, they were just mashed together with the rest of the commit message. As a result, users wouldn't be able to clearly distinguish where the commit message ended and where the notes started.

Output a ## Notes ## header when notes are detected so that notes can be compared more clearly.

Note that we handle case of Notes (<ref>): -> ## Notes (<ref>) ## with this code as well. We can't test this in this patch, however, since there is currently no way to pass along different notes refs to git log. This will be fixed in a future patch.
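The effect of notes on the comparison can be reproduced with a small sketch, assuming Git 2.25 or later. The branch names old and new, the file names and the note text are made up. Both branches end in the same patch, but only the copy on new carries a note, so the default output should flag the pair and show a ## Notes ## section, while --no-notes should report the pair as unchanged.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

commit() {  # commit <file> <subject>: append a line to <file> and commit it
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

commit a.txt "fork point"
git checkout -q -b old
commit f.txt "add feature"

git checkout -q -b new old~1
commit b.txt "unrelated"            # gives the copied commit a different parent
git cherry-pick old >/dev/null      # same patch as on "old"
git notes add -m "reviewed" new     # note in the default refs/notes/commits

default_out=$(git range-diff old~1..old new~1..new)
no_notes_out=$(git range-diff --no-notes old~1..old new~1..new)
echo "$default_out"
echo "$no_notes_out"
```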
And:
See commit abcf857, commit f867534, commit 828765d (06 Dec 2019) by Denton Liu (Denton-L).
(Merged by Junio C Hamano -- gitster -- in commit d1c0fe8, 16 Dec 2019)
range-diff: clear other_arg at end of function

Signed-off-by: Denton Liu

We were leaking memory by not clearing other_arg after we were done using it.
Clear it after we've finished using it.

Note that this isn't strictly necessary since the memory will be reclaimed once the command exits.
However, since we are releasing the strbufs, we should also clear other_arg for consistency.
With Git 2.27 (Q2 2020), "git range-diff" is more robust.
See commit 8d1675e, commit 8cf5156 (15 Apr 2020) by Vasil Dimov (vasild).
(Merged by Junio C Hamano -- gitster -- in commit 93d1f19, 28 Apr 2020)
range-diff: fix a crash in parsing git-log output

Signed-off-by: Vasil Dimov

git range-diff calls git log internally and tries to parse its output.

But git log output can be customized by the user in their git config and for certain configurations either an error will be returned by git range-diff or it will crash.

To fix this explicitly set the output format of the internally executed git log with --pretty=medium.
Because that cancels --notes, add explicitly --notes at the end.

Also, make sure we never crash in the same way - trying to dereference util which was never created and has remained NULL.
It would happen if the first line of git log output does not begin with 'commit '.

Alternative considered but discarded - somehow disable all git configs and behave as if no config is present in the internally executed git log, but that does not seem to be possible.
GIT_CONFIG_NOSYSTEM is the closest to it, but even with that we would still read .git/config.
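The commit message does not name a specific setting, so the following sketch picks format.pretty=oneline as one plausible example of a git log customization (an assumption on my part), and assumes Git 2.27 or later. Branch and file names are made up. Since the internal git log now gets an explicit --pretty=medium, the customized run should print the same output as the plain one.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

echo "one" > f.txt
git add f.txt
git commit -q -m "fork point"
git branch base

git checkout -q -b v1 base
echo "two" >> f.txt
git commit -q -a -m "add a line"

git checkout -q -b v2 base
echo "two, reworded" >> f.txt
git commit -q -a -m "add a line"

plain=$(git range-diff base v1 v2)
# Hypothetical user customization of git log output, applied to this call only.
custom=$(git -c format.pretty=oneline range-diff base v1 v2)
echo "$custom"
```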