Can somebody explain the usage of git range-diff?

清酒与你 2020-12-17 19:25

Git version 2.19 introduces git range-diff which is supposed to be used in order to compare two commit ranges. I have been reading the documentation, but I cann

4 Answers
  • 2020-12-17 19:31

    A "range" in Git parlance is a pair of revision identifiers (start and end).

    The first form of usage for git range-diff is <range1> <range2>. Since we know a range is a pair of revision identifiers, some possible examples are:

    abc1234..def5678 9876foo..5432bar
    HEAD..def5678 my_release_1_1..my_release_1_2
    

    The other two forms of usage are for convenience when some of the four revision identifiers are the same as each other. Namely:

    1. For a case like abc..def def..abc, you can simply specify def...abc.
    2. For a case like abc..def abc..xyz, you can specify abc def xyz. This seems like a common case to me: you want to compare two ranges which start at the same point.
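
    The equivalence of the three forms can be checked in a throwaway repository. This is only a sketch: the file name, commit messages, and the use of abc as a tag and def/xyz as branches are made up for illustration. It assumes Git 2.19 or later. Here def...xyz stands for xyz..def def..xyz, which selects the same commits as abc..def abc..xyz because abc is the merge base of the two branches.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" >notes.txt
git add notes.txt
git commit -q -m "base"
git tag abc                       # common starting point of both ranges

git checkout -q -b def abc
echo "first attempt" >>notes.txt
git commit -q -am "first attempt"

git checkout -q -b xyz abc
echo "second attempt" >>notes.txt
git commit -q -am "second attempt"

# The same comparison, written in each of the three forms.
two_ranges=$(git range-diff --no-color abc..def abc..xyz)
three_args=$(git range-diff --no-color abc def xyz)
three_dots=$(git range-diff --no-color def...xyz)

[ "$two_ranges" = "$three_args" ] && echo "abc def xyz matches abc..def abc..xyz"
[ "$two_ranges" = "$three_dots" ] && echo "def...xyz matches abc..def abc..xyz"
```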
  • 2020-12-17 19:40

    Range diff becomes very useful right after resolving merge conflicts (after rebase, cherry-pick, etc.),
    especially when you had multiple conflicting commits, and you want to make sure that you haven't accidentally broken something during the process.

    Here is a scenario that often comes up when you have several commits on a branch.
    Let's say we have a branch called "our", and our branch is behind the master branch:

    m1-m2-m3-m4  <- "master" branch
      \
       o1-o2-o3  <- "our" current branch
    

    Before rebasing, we back up our branch by creating a copy of it named "our_bkp":

    git branch our_bkp
    

    Now we rebase our branch onto master:

    git rebase master
    

    We then resolve some merge conflicts on the commit "o1"...
    Note: If the conflicting files in "o1" were also changed in "o2" or "o3",
    then we'll have to resolve the same merge conflicts again for those commits.

    Now, let's say, after an exhausting rebase process we have something like this:

                 _<- branch "master"
                /
     m1-m2-m3-m4-o1'-o2'-o3'  <- branch "our" (after rebase)
      \
       o1-o2-o3   <- branch "our_bkp"
       
    

    Since there were many merge conflicts, it's not clearly visible whether we've missed something or not.

    And this is where the range-diff shines.

    To make sure that we haven't missed any change or accidentally damaged anything, we can simply compare our commits from the old version of the branch with the newer version:

    git range-diff our_bkp~3..our_bkp our~3..our
    

    or

    git range-diff o1^..o3 o1'^..o3'
    

    If the only differences shown relate to the conflicting parts, then we are good: nothing else has changed.
    But if we see other, unexpected differences, then either we or git did something wrong, and we'll need to fix them.

    Notes

    • Both of the range-diff commands above do exactly the same thing. The ^ is needed because a range A..B excludes A itself, so a plain o1..o3 would cover only o2 and o3.
    • Here o1, o3, etc. stand for the hashes of those commits, so o1^..o3 is something like: 0277a5883d132bebdb34e35ee228f4382dd2bb7^..e415aee3fa53a213dc53ca6a7944301066b72f24
    • The ~3 in our_bkp~3 tells git to take the commit that is 3 commits before the tip of the our_bkp branch. Replace the number with the number of commits on your branch, and of course replace the branch name our_bkp with the name of your own backup branch.
    • You can think of range-diff as a diff of two diffs. That makes it easier to remember and understand what it does.
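
    The whole workflow can be tried out in a scratch repository. This is a rough sketch, not a recipe: the file names and contents are invented, only two commits (o1, o2) are used instead of three, and the rebase is conflict-free. To imitate a mistake made while resolving conflicts, the last commit is amended after the rebase. It assumes Git 2.19 or later.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# master has m1; branch "our" adds o1 and o2, each creating its own file.
echo "base" >base.txt
git add base.txt
git commit -q -m "m1"
git checkout -q -b our
seq 1 20 >feature.txt
git add feature.txt
git commit -q -m "o1"
seq 1 20 >tests.txt
git add tests.txt
git commit -q -m "o2"

# master moves ahead with m2 (an unrelated file, so the rebase stays clean).
git checkout -q master
echo "unrelated" >other.txt
git add other.txt
git commit -q -m "m2"

# Back up, rebase, then amend o2 to mimic an accidental change.
git checkout -q our
git branch our_bkp
git rebase -q master
echo "oops" >>tests.txt
git commit -q -a --amend --no-edit

# Compare the old two commits with the rebased two commits.
result=$(git range-diff --no-color our_bkp~2..our_bkp our~2..our)
echo "$result"
```

    In the output, o1 should be marked with = (unchanged patch) and o2 with ! (patch differs), followed by a diff of the two patches in which the stray "oops" line shows up.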
  • 2020-12-17 19:41

    I have not actually used them yet, but they are meant as an improvement over the old git cherry* flow for analysing / comparing some upstream or downstream change-set vs what you have now. To make the range-sets useful we want some set of "here are my commits" and "here are theirs", expressed as simply as possible.

    A range1 range2 set would be written as, e.g.:

    git range-diff theirs~5..theirs ours~4..ours
    

    if you had, e.g.:

              T1--T2--T3--T4--T5   <-- theirs
             /
    ...--o--*   <-- base
             \
              O1--O2--O3--O4   <-- ours
    

    where the O commits are "ours" and the T commits are "theirs".

    Given this exact same configuration, however, we could also write:

    git range-diff theirs...ours    # or ours...theirs
    

    (note the three dots). (This is the syntax used with git rev-list --cherry-mark --left-right, for instance.)

    Or, again given this same situation, we could write:

    git range-diff base theirs ours   # or base ours theirs
    

    Here base is the stop point for both theirs and ours, and avoids having to count back 5.

    If the situation is more complicated—as in the graph:

              X1--T1--T2--T3   <-- theirs
             /
    ...--o--*   <-- base
             \
              Y1--Y2--O1--O2--O3--O4   <-- ours
    

    neither the three-dot nor the base ours theirs kind of syntax quite works, so the two sets of ranges (theirs~3..theirs ours~4..ours) would be best.
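
    To see why the shorthand forms pick up too much in this graph, one can list what each range selects. A small sketch using empty commits whose subjects match the labels in the graph; the commits helper is a made-up convenience, not a git command.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper: create one empty commit per subject given.
commits() { for s in "$@"; do git commit -q --allow-empty -m "$s"; done; }

commits "base commit"
git tag base
git checkout -q -b theirs base
commits X1 T1 T2 T3
git checkout -q -b ours base
commits Y1 Y2 O1 O2 O3 O4

# What "base theirs ours" (or the three-dot form) would compare:
since_base_theirs=$(git log --reverse --format=%s base..theirs | xargs)
since_base_ours=$(git log --reverse --format=%s base..ours | xargs)
# What the explicit ranges compare:
explicit_theirs=$(git log --reverse --format=%s theirs~3..theirs | xargs)
explicit_ours=$(git log --reverse --format=%s ours~4..ours | xargs)

echo "base..theirs:     $since_base_theirs"   # X1 T1 T2 T3
echo "base..ours:       $since_base_ours"     # Y1 Y2 O1 O2 O3 O4
echo "theirs~3..theirs: $explicit_theirs"     # T1 T2 T3
echo "ours~4..ours:     $explicit_ours"       # O1 O2 O3 O4
```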

  • 2020-12-17 19:53

    The command git range-diff, which compares two versions of a patch series, was revisited in Git 2.23 (Q3 2019) to make it easier to identify which part of which file a given patch is about.

    See commit 499352c, commit 444e096, commit b66885a, commit 430be36, commit e1db263, commit 44b67cb, commit 1ca6922, commit ef283b3, commit 80e1841 (11 Jul 2019), and commit 877a833, commit 570fe99, commit 85c3713, commit d6c88c4, commit 5af4087 (08 Jul 2019) by Thomas Gummerer (tgummerer).
    (Merged by Junio C Hamano -- gitster -- in commit 43ba21c, 25 Jul 2019)

    range-diff: add filename to inner diff

    In a range-diff it's not always clear which file a certain funcname of the inner diff belongs to, because the diff header (or section header as added in a previous commit) is not always visible in the range-diff.

    Add the filename to the inner diffs header, so it's always visible to users.

    This also allows us to add the filename + the funcname to the outer diffs hunk headers using a custom userdiff pattern, which will be done in the next commit.

    range-diff: add headers to the outer hunk header

    Add the section headers/hunk headers we introduced in the previous commits to the outer diff's hunk headers.
    This makes it easier to understand which change we are actually looking at. For example an outer hunk header might now look like:

    @@  Documentation/config/interactive.txt
    

    while previously it would have only been

    @@
    

    which doesn't give a lot of context for the change that follows.

    See t3206-range-diff.sh as an example.

    And:

    range-diff: add section header instead of diff header

    Currently range-diff keeps the diff header of the inner diff intact (apart from stripping lines starting with index).
    This diff header is somewhat useful, especially when files get different names in different ranges.

    However there is no real need to keep the whole diff header for that.
    The main reason we currently do that is probably because it is easy to do.

    Introduce a new range diff hunk header, that's enclosed by "##", similar to how line numbers in diff hunks are enclosed by "@@", and give human readable information of what exactly happened to the file, including the file name.

    This improves the readability of the range-diff by giving more concise information to the users.
    For example if a file was renamed in one iteration, but not in another, the diff of the headers would be quite noisy.
    However the diff of a single line is concise and should be easier to understand.

    Again, t3206-range-diff.sh provides an example:

    git range-diff --no-color --submodule=log topic...renamed-file >actual &&
    sed s/Z/\ /g >expected <<-EOF &&
    1:  4de457d = 1:  f258d75 s/5/A/
    2:  fccce22 ! 2:  017b62d s/4/A/
        @@ Metadata
        ZAuthor: Thomas Rast <trast@inf.ethz.ch>
        Z
        Z ## Commit message ##
        -    s/4/A/
        +    s/4/A/ + rename file
        Z
        - ## file ##
        + ## file => renamed-file ##
        Z@@
        Z 1
        Z 2
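
    A similar situation can be reproduced outside the test suite. The sketch below is not taken from t3206; the branch names topic and renamed, the tag base, and the file contents are assumptions made for illustration. It assumes Git 2.23 or later (for the ## section headers) and default rename detection.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

seq 1 20 >file
git add file
git commit -q -m "base"
git tag base

# Range 1: change one line in place.
git checkout -q -b topic base
sed 's/^10$/ten/' file >tmp
mv tmp file
git commit -q -am "s/10/ten/"

# Range 2: the same change, but the file is renamed as well.
git checkout -q -b renamed base
sed 's/^10$/ten/' file >renamed-file
git rm -q file
git add renamed-file
git commit -q -m "s/10/ten/ + rename file"

out=$(git range-diff --no-color base topic renamed)
echo "$out"
```

    The rename should then appear as a single changed section header line (## file => renamed-file ##) rather than as a noisy diff of full diff headers.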
    

    But: beware of the diff.noprefix config setting: A git range-diff would segfault with Git before 2.24 (Q4 2019)!

    "git range-diff" segfaulted when diff.noprefix configuration was used, as it blindly expected the patch it internally generates to have the standard a/ and b/ prefixes.
    The command now forces the internal patch to be built without any prefix, not to be affected by any end-user configuration.

    See commit 937b76e (02 Oct 2019) by Johannes Schindelin (dscho).
    (Merged by Junio C Hamano -- gitster -- in commit 159cdab, 11 Oct 2019)

    range-diff: internally force diff.noprefix=true

    When parsing the diffs, range-diff expects to see the prefixes a/ and b/ in the diff headers.

    These prefixes can be forced off via the config setting diff.noprefix=true.
    As range-diff is not prepared for that situation, this will cause a segmentation fault.

    Let's avoid that by passing the --no-prefix option to the git log process that generates the diffs that range-diff wants to parse.
    And of course expect the output to have no prefixes, then.
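
    The diff.noprefix situation is easy to set up in a scratch repository. In the sketch below, the names base, v1 and v2 and the file contents are invented. The plain call assumes Git 2.24 or later. The git -c diff.noprefix=false override is only a guess at a workaround for older versions, on the assumption that -c settings are passed on to the internal git log; it is not mentioned in the commit above and has not been tested against an affected version.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

seq 1 20 >file
git add file
git commit -q -m "base"
git tag base

git checkout -q -b v1 base
sed 's/^10$/ten/' file >tmp
mv tmp file
git commit -q -am "first version"

git checkout -q -b v2 base
sed 's/^10$/TEN/' file >tmp
mv tmp file
git commit -q -am "second version"

# Simulate a user who has switched the a/ and b/ prefixes off.
git config diff.noprefix true

# Works as-is on Git 2.24 or later (affected older versions would crash here).
plain=$(git range-diff --no-color base v1 v2)

# Assumed workaround for older versions: override the setting for one call.
overridden=$(git -c diff.noprefix=false range-diff --no-color base v1 v2)

echo "$plain"
```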


    And "git range-diff" failed to handle mode-only change, which has been corrected with Git 2.24 (Q4 2019):

    See commit 2b6a9b1 (08 Oct 2019) by Thomas Gummerer (tgummerer).
    (Merged by Junio C Hamano -- gitster -- in commit b6d712f, 15 Oct 2019)

    range-diff: don't segfault with mode-only changes

    Reported-by: Uwe Kleine-König
    Signed-off-by: Thomas Gummerer
    Acked-by: Johannes Schindelin

    In ef283b3699 ("apply: make parse_git_diff_header public", 2019-07-11, Git v2.23.0-rc0 -- merge listed in batch #7) the 'parse_git_diff_header' function was made public and useable by callers outside of apply.c.

    However it was missed that its (then) only caller, 'find_header' did some error handling, and completing 'struct patch' appropriately.

    range-diff then started using this function, and tried to handle this appropriately itself, but fell short in some cases.

    This in turn would lead to range-diff segfaulting when there are mode-only changes in a range.

    Move the error handling and completing of the struct into the 'parse_git_diff_header' function, so other callers can take advantage of it.

    This fixes the segfault in 'git range-diff'.


    With Git 2.25 (Q1 2020), "git range-diff" learned to take the "--notes=<ref>" and the "--no-notes" options to control the commit notes included in the log message that gets compared.

    See commit 5b583e6, commit bd36191, commit 9f726e1, commit 3bdbdfb, commit 75c5aa0, commit 79f3950, commit 3a6e48e, commit 26d9485 (20 Nov 2019), and commit 9d45ac4, commit 828e829 (19 Nov 2019) by Denton Liu (Denton-L).
    (Merged by Junio C Hamano -- gitster -- in commit f3c7bfd, 05 Dec 2019)

    range-diff: output ## Notes ## header

    Signed-off-by: Denton Liu

    When notes were included in the output of range-diff, they were just mashed together with the rest of the commit message. As a result, users wouldn't be able to clearly distinguish where the commit message ended and where the notes started.

    Output a ## Notes ## header when notes are detected so that notes can be compared more clearly.

    Note that we handle case of Notes (<ref>): -> ## Notes (<ref>) ## with this code as well. We can't test this in this patch, however, since there is currently no way to pass along different notes refs to git log. This will be fixed in a future patch.
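
    The effect of notes on the comparison can be seen with a small experiment. This is a sketch under assumptions: the names base, v1 and v2, the note text, and the fixed committer date (used only to force a different hash for an otherwise identical commit) are all made up. It assumes Git 2.25 or later for --no-notes.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

seq 1 20 >file
git add file
git commit -q -m "base"
git tag base

git checkout -q -b v1 base
sed 's/^10$/ten/' file >tmp
mv tmp file
git commit -q -am "s/10/ten/"

# v2 carries the same patch under a different hash, plus a note.
git checkout -q -b v2 v1
GIT_COMMITTER_DATE="1577836800 +0000" git commit -q --amend --no-edit
git notes add -m "reviewed by a colleague" v2

with_notes=$(git range-diff --no-color base v1 v2)
without_notes=$(git range-diff --no-color --no-notes base v1 v2)

echo "$with_notes"       # notes are compared by default
echo "$without_notes"    # notes ignored, so only the patches are compared
```

    With notes included, the pair should be reported as changed (!) with the note shown under a ## Notes ## header; with --no-notes the two commits should compare as equal (=).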

    And:

    See commit abcf857, commit f867534, commit 828765d (06 Dec 2019) by Denton Liu (Denton-L).
    (Merged by Junio C Hamano -- gitster -- in commit d1c0fe8, 16 Dec 2019)

    range-diff: clear other_arg at end of function

    Signed-off-by: Denton Liu

    We were leaking memory by not clearing other_arg after we were done using it.
    Clear it after we've finished using it.

    Note that this isn't strictly necessary since the memory will be reclaimed once the command exits.
    However, since we are releasing the strbufs, we should also clear other_arg for consistency.


    With Git 2.27 (Q2 2020), "git range-diff" is more robust.

    See commit 8d1675e, commit 8cf5156 (15 Apr 2020) by Vasil Dimov (vasild).
    (Merged by Junio C Hamano -- gitster -- in commit 93d1f19, 28 Apr 2020)

    range-diff: fix a crash in parsing git-log output

    Signed-off-by: Vasil Dimov

    git range-diff calls git log internally and tries to parse its output.

    But git log output can be customized by the user in their git config and for certain configurations either an error will be returned by git range-diff or it will crash.

    To fix this explicitly set the output format of the internally executed git log with --pretty=medium.
    Because that cancels --notes, add explicitly --notes at the end.

    Also, make sure we never crash in the same way - trying to dereference util which was never created and has remained NULL.
    It would happen if the first line of git log output does not begin with 'commit '.

    Alternative considered but discarded - somehow disable all git configs and behave as if no config is present in the internally executed git log, but that does not seem to be possible.
    GIT_CONFIG_NOSYSTEM is the closest to it, but even with that we would still read .git/config.
