I was looking for a definition of "hunk" while reading some git documentation.
I know it means a description of the difference between two files and that it has a well-defined format, but I couldn't call to mind a succinct definition.
I tried searching with Google, but there were a lot of spurious hits.
Eventually I found this in the GNU diffutils manual:
"When comparing two files, diff finds sequences of lines common to both files, interspersed with groups of differing lines called hunks."
Source: http://www.gnu.org/software/diffutils/manual/html_node/Hunks.html
Which was exactly the kind of succinct definition I was looking for. Hopefully this helps someone else out!
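To make that definition concrete, here is a minimal sketch using Python's standard `difflib` module (not Git or GNU diff itself). The file names, the sample lines, and the helper `parse_hunk_header` are made up for illustration. Two changes that are far enough apart end up in two separate hunks, each introduced by an `@@ -start,length +start,length @@` header.

```python
import difflib
import re

# Two hypothetical 20-line files that differ in two separate places.
old = [f"line {i}\n" for i in range(1, 21)]
new = list(old)
new[2] = "line 3 (changed)\n"     # first area of difference
new[16] = "line 17 (changed)\n"   # second area, far from the first

diff = list(difflib.unified_diff(old, new,
                                 fromfile="a/file.txt",
                                 tofile="b/file.txt"))

# Unified-diff hunk header: @@ -old_start[,old_len] +new_start[,new_len] @@
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_len, new_start, new_len), or None if no match."""
    m = HUNK_HEADER.match(line)
    if m is None:
        return None
    old_start, old_len, new_start, new_len = m.groups()
    # An omitted length means the range covers exactly one line.
    return (int(old_start), int(old_len or 1),
            int(new_start), int(new_len or 1))


hunks = [parse_hunk_header(line) for line in diff if line.startswith("@@")]

print("".join(diff))
print(hunks)  # one tuple per hunk
```

Each tuple corresponds to one hunk: the changed line plus up to three lines of surrounding context on each side, which is `difflib`'s default.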
The term "hunk" is indeed not specific to Git; it comes from GNU diffutils. Even more succinctly:
Each hunk shows one area where the files differ.
But the challenge for Git is to determine the right boundaries for a hunk.
The rest of the answer helps illustrate what a hunk looks like in Git:
After trying various heuristics (such as the compaction heuristic, which was removed in Git 2.12), the Git maintainers settled on the indent heuristic, which was introduced by commit 433860f (Sept. 2016) and released with Git 2.11. From its commit message:
Some groups of added/deleted lines in diffs can be slid up or down, because lines at the edges of the group are not unique.
Picking good shifts for such groups is not a matter of correctness but definitely has a big effect on aesthetics.
For example, consider the following two diffs.
The first is what standard Git emits:
--- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
+++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
@@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {
 }
 
 if (!$smtp_server) {
+    $smtp_server = $repo->config('sendemail.smtpserver');
+}
+if (!$smtp_server) {
     foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
         if (-x $_) {
             $smtp_server = $_;
The following diff is equivalent, but is obviously preferable from an aesthetic point of view:
--- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
+++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
@@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) {
     $initial_reply_to =~ s/(^\s+|\s+$)//g;
 }
 
+if (!$smtp_server) {
+    $smtp_server = $repo->config('sendemail.smtpserver');
+}
 if (!$smtp_server) {
     foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
         if (-x $_) {
This patch teaches Git to pick better positions for such "diff sliders" using heuristics that take the positions of nearby blank lines and the indentation of nearby lines into account.
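To see why the two placements in the example diffs are equivalent, here is a small Python sketch. It is not Git's implementation; the helpers `apply_insertion`, `can_slide_up`, and `can_slide_down` are hypothetical names, and the file is reduced to four lines. The added group can slide because the line just outside one edge of the group is identical to the line at the opposite edge inside it.

```python
def apply_insertion(old_lines, index, added):
    """Return a copy of old_lines with `added` inserted before position `index`."""
    return old_lines[:index] + added + old_lines[index:]


def can_slide_up(new_lines, start, end):
    """True if the added group new_lines[start:end] can move up by one line."""
    return start > 0 and new_lines[start - 1] == new_lines[end - 1]


def can_slide_down(new_lines, start, end):
    """True if the added group new_lines[start:end] can move down by one line."""
    return end < len(new_lines) and new_lines[start] == new_lines[end]


# Simplified excerpt of the old file around the change.
old = [
    "}",
    "",
    "if (!$smtp_server) {",
    "    foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {",
]

# Placement from the first diff: the group is added after the existing `if` line.
first = apply_insertion(old, 3, [
    "    $smtp_server = $repo->config('sendemail.smtpserver');",
    "}",
    "if (!$smtp_server) {",
])

# Placement from the second diff: the same group slid up by one line.
second = apply_insertion(old, 2, [
    "if (!$smtp_server) {",
    "    $smtp_server = $repo->config('sendemail.smtpserver');",
    "}",
])

# Both placements produce exactly the same resulting file.
assert first == second

print(can_slide_up(first, 3, 6))     # group of the first diff can move up
print(can_slide_down(second, 2, 5))  # group of the second diff can move down
```

Since both results are identical, choosing between them is purely a question of which diff reads better, which is what the heuristic tries to decide.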
With Git 2.14 (Q3 2017), that indent heuristic became the default.
See commit 1fa8a66 (08 May 2017) by Jeff King (peff).
See commit 33de716 (08 May 2017) by Stefan Beller (stefanbeller).
See commit 37590ce, commit cf5e772 (08 May 2017) by Marc Branchaud.
(Merged by Junio C Hamano -- gitster -- in commit 53083f8, 05 Jun 2017)
diff: enable indent heuristic by default
The feature was included in v2.11 (released 2016-11-29) and we got no negative feedback. Quite the opposite, all feedback we got was positive.
Turn it on by default. Users who dislike the feature can turn it off by setting diff.indentHeuristic.
This simple explanation may also help: https://mvtechjourney.wordpress.com/2014/08/01/git-stage-hunk-and-discard-hunk-sourcetree/
Source: https://stackoverflow.com/questions/37620729/in-the-context-of-git-and-diff-what-is-a-hunk