How to make 'git diff' ignore comments

闹比i 2020-12-05 02:38

I am trying to produce a list of the files that were changed in a specific commit. The problem is that every file has the version number in a comment at the top of the file.

6 Answers
  •  臣服心动
    2020-12-05 03:12

    Here is a solution that is working well for me. I've written up the solution and some additional missing documentation on the git (log|diff) -G option.

    It is basically using the same solution as in previous answers, but specifically for comments that start with a * or a #, and sometimes a space before the *... But it still needs to allow #ifdef, #include, etc. changes.

    Look-ahead and look-behind do not seem to be supported by the -G option, nor does ? seem to work in general, and I have had problems using *, too. + seems to work well, though.

    (Note, tested on Git v2.7.0)

    Multi-Line Comment Version

    git diff -w -G'(^[^\*# /])|(^#\w)|(^\s+[^\*#/])'
    
    • -w ignore whitespace
    • -G only show diff lines that match the following regex
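    • (^[^\*# /]) any line that does not start with a star, a hash, a space, or a slash
    • (^#\w) any line that starts with # followed immediately by a word character (letter, digit, or underscore), such as #ifdef or #include
    • (^\s+[^\*#/]) any line that starts with some whitespace followed by a character that is not a star, a hash, or a slash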
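
    One way to sanity-check the pattern before handing it to git is to run sample lines through grep -E, which also reads POSIX extended syntax. This is only a sketch: it assumes GNU grep (for \s and \w), the classify helper and the sample lines are made up for illustration, and grep is a stand-in for git's matcher rather than the same code.

```shell
# The pattern from the command above, unchanged.
pattern='(^[^\*# /])|(^#\w)|(^\s+[^\*#/])'

# Hypothetical helper: report whether one changed line matches the pattern.
classify() {
    if printf '%s\n' "$1" | grep -Eq -- "$pattern"; then
        echo "match"
    else
        echo "no match"
    fi
}

classify 'int total = 0;'        # code at column 0            -> match
classify '#include <stdio.h>'    # preprocessor directive      -> match
classify '# version 1.2'         # hash comment                -> no match
classify ' * $Revision: 42 $'    # comment, one leading space  -> no match
classify '    return total;'     # indented code               -> match
classify '  * $Revision: 42 $'   # comment, two leading spaces -> match
```

    The last sample is worth noticing. The third group does not exclude whitespace, so \s+ can stop one character early, and a line with two or more leading whitespace characters matches whether or not it is a comment.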

    Basically, an SVN hook currently modifies every file on the way in and out, rewriting the multi-line comment block in each one. Now I can diff my changes against SVN without the FYI information that SVN drops into the comments.

    Technically this will allow for Python and Bash comments like #TODO to be shown in the diff, and if a division operator started on a new line in C++ it could be ignored:

    a = b
        / c;
    

    Also the documentation on -G in Git seemed pretty lacking, so the information here should help:

    git diff -G

    -G<regex>

    Look for differences whose patch text contains added/removed lines that match <regex>.

    To illustrate the difference between -S --pickaxe-regex and -G, consider a commit with the following diff in the same file:

    +    return !regexec(regexp, two->ptr, 1, &regmatch, 0);
    ...
    -    hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);
    

    While git log -G"regexec\(regexp" will show this commit, git log -S"regexec\(regexp" --pickaxe-regex will not (because the number of occurrences of that string did not change).
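
    The difference can be reproduced in a scratch repository. The sketch below is an illustration under assumptions: pickaxe_demo, the helper g, and the file name grep.c are invented here, and it assumes mktemp and a git recent enough to support -C are on the PATH.

```shell
# Hypothetical helper: build a throwaway repository with two commits that
# mirror the diff above, then count the commits each search reports.
pickaxe_demo() {
    repo=$(mktemp -d) || return 1
    # Identity is passed inline so the sketch does not depend on global config.
    g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com "$@"; }

    g init -q || return 1
    printf '%s\n' 'hit = !regexec(regexp, mf2.ptr, 1, &regmatch, 0);' > "$repo/grep.c"
    g add grep.c
    g commit -q -m 'add call'
    printf '%s\n' 'return !regexec(regexp, two->ptr, 1, &regmatch, 0);' > "$repo/grep.c"
    g commit -q -a -m 'rewrite call'

    # -G: commits whose added/removed lines match the regex.
    by_g=$(g log --format=%s -G'regexec\(regexp' | wc -l)
    # -S --pickaxe-regex: commits where the number of matches changed.
    by_s=$(g log --format=%s -S'regexec\(regexp' --pickaxe-regex | wc -l)

    echo "commits found by -G: $((by_g)), by -S: $((by_s))"
    rm -rf "$repo"
}

pickaxe_demo
```

    Going by the documentation quoted above, this should report 2 commits for -G and 1 for -S: both commits touch a matching line, but only the first changes how many times the pattern occurs.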

    See the pickaxe entry in gitdiffcore(7) for more information.

    (Note, tested on Git v2.7.0)

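    • -G compiles its pattern as a POSIX extended regular expression (see the REG_EXTENDED flag in the update below), not a basic one.
    • In my testing, ?, *, !, {, and } did not behave as expected, even though POSIX extended syntax defines ?, *, and {}.
    • Grouping with () and alternation between groups with | both work.
    • Character-class shorthands such as \s, \W, etc. are supported.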
    • Look-ahead and look-behind are not supported.
    • Beginning and ending line anchors ^$ work.
    • Feature has been available since Git 1.7.4.
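
    Individual pieces of syntax can be probed outside git with grep -E, which reads the same POSIX extended syntax family. This is a rough sketch: probe is a made-up helper, the sample lines are invented, GNU grep is assumed for \w, and a result from grep is a hint rather than proof of what a particular git build does.

```shell
# Hypothetical helper: does the extended regex in $1 match the line in $2?
probe() {
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then
        echo yes
    else
        echo no
    fi
}

probe '(FIXME)|(TODO)' 'x = 1;  /* TODO: tidy up */'   # grouping and alternation -> yes
probe '^#\w'           '#ifdef DEBUG'                   # anchor plus \w           -> yes
probe '^#\w'           '# just a comment'               #                          -> no
probe '^ ?\*'          ' * comment body'                # optional space via ?     -> yes
probe '^ ?\*'          'a * b'                          #                          -> no
```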

    Excluded Files vs. Excluded Diffs

    Note that the -G option only filters which files are diffed; it does not filter lines within a file.

    Once a file is selected, its whole diff is shown, including changed lines that do not match the regex.
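
    A scratch repository makes the file-level behaviour visible. This is a sketch under assumptions: g_filter_demo, the helper g, and the two file names are invented for illustration, and it assumes mktemp and a git that supports -C.

```shell
# Hypothetical helper: two files get a new version comment, one also gets a
# code change; then see what -G'^[^#]' selects and what it prints.
g_filter_demo() {
    repo=$(mktemp -d) || return 1
    # Identity is passed inline so the sketch does not depend on global config.
    g() { git -C "$repo" -c user.name=demo -c user.email=demo@example.com "$@"; }

    g init -q || return 1
    printf '%s\n' '# version 1' 'x = 1' > "$repo/both.py"
    printf '%s\n' '# version 1' 'y = 1' > "$repo/comment_only.py"
    g add . && g commit -q -m 'initial'

    # Bump the version comment in both files; change code only in both.py.
    printf '%s\n' '# version 1.1' 'x = 20' > "$repo/both.py"
    printf '%s\n' '# version 1.1' 'y = 1' > "$repo/comment_only.py"

    # Files with at least one changed line that does not start with '#'.
    g diff --name-only -G'^[^#]'
    # The selected file's diff is not trimmed: count its changed '#' lines.
    g diff -G'^[^#]' | grep -c '^[-+]#'
    rm -rf "$repo"
}

g_filter_demo
```

    Only both.py should be listed, yet its diff still carries the removed and added version comment lines, which is why the second command counts 2.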

    Examples

    Only show file differences with at least one line that mentions foo.

    git diff -G'foo'
    

    Show differences for files that have at least one changed line not starting with a # (the diff of each such file still includes its changed # lines)

    git diff -G'^[^#]'
    

    Show files that have differences mentioning FIXME or TODO

    git diff -G'(FIXME)|(TODO)'
    

    See also git log -G, git grep, git log -S, --pickaxe-regex, and --pickaxe-all

    UPDATE: Which regular expression tool is in use by the -G option?

    https://github.com/git/git/search?utf8=%E2%9C%93&q=regcomp&type=

    https://github.com/git/git/blob/master/diffcore-pickaxe.c

    if (opts & (DIFF_PICKAXE_REGEX | DIFF_PICKAXE_KIND_G)) {
        int cflags = REG_EXTENDED | REG_NEWLINE;
        if (DIFF_OPT_TST(o, PICKAXE_IGNORE_CASE))
            cflags |= REG_ICASE;
        regcomp_or_die(&regex, needle, cflags);
        regexp = &regex;
    
    // and in the regcomp_or_die function
    regcomp(regex, needle, cflags);
    

    http://man7.org/linux/man-pages/man3/regexec.3.html

       REG_EXTENDED
              Use POSIX Extended Regular Expression syntax when interpreting
              regex.  If not set, POSIX Basic Regular Expression syntax is
              used.
    

    // ...

       REG_NEWLINE
              Match-any-character operators don't match a newline.
    
              A nonmatching list ([^...])  not containing a newline does not
              match a newline.
    
              Match-beginning-of-line operator (^) matches the empty string
              immediately after a newline, regardless of whether eflags, the
              execution flags of regexec(), contains REG_NOTBOL.
    
              Match-end-of-line operator ($) matches the empty string
              immediately before a newline, regardless of whether eflags
              contains REG_NOTEOL.
    
