Git: How can I find a commit that most closely matches a directory?

Submitted by 前提是你 on 2019-11-26 15:39:44

Question


Someone took a version (unknown to me) of Moodle, applied many changes within a directory, and released it (tree here).

How can I determine which commit of the original project was most likely edited to form this tree?

This would allow me to create a branch at the appropriate commit and apply this patch to it. Surely it came from either the 1.8 or 1.9 branch, probably from a release tag, but diffing between particular commits doesn't help me much.

Postmortem Update: knittl's answer got me as close as I'm going to get. I first added my patch repo as the remote "foreign" (no commits in common, that's OK), then ran diffs in loops with a couple of format options. The first used the --shortstat format:

for REV in $(git rev-list v1.9.0^..v1.9.5); do 
    git diff --shortstat "$REV" f7f7ad53c8839b8ea4e7 -- mod/assignment >> ~/rdiffs.txt; 
    echo "$REV" >> ~/rdiffs.txt; 
done;

The second just counted the line changes in a unified diff with no context:

for REV in $(git rev-list v1.9.0^..v1.9.5); do 
    git diff -U0 "$REV" f7f7ad53c8839b8ea4e7 -- mod/assignment | wc -l >> ~/rdiffs2.txt;
    echo "$REV" >> ~/rdiffs2.txt; 
done;

There were thousands of commits to dig through, but this one seems to be the closest match.
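
The result file from the second loop alternates between a count line and the revision it belongs to, so it still has to be sorted to find the smallest diff. A minimal sketch of that step, assuming exactly that two-line layout; `rank_pairs` is a hypothetical helper name, not something from the original post:

```shell
#!/bin/sh
# rank_pairs FILE [N]: FILE alternates "count" and "revision" lines,
# as written by the `wc -l` loop. Prints the N smallest counts (default 5).
rank_pairs() {
    # paste - - joins each pair of lines into "count<TAB>revision";
    # sort -n then orders the pairs by the leading count.
    paste - - < "$1" | sort -n | head -n "${2:-5}"
}
```

For example, `rank_pairs ~/rdiffs2.txt 5` would list the five revisions with the fewest diff lines. This does not work reliably on the --shortstat file, because `git diff --shortstat` prints nothing for identical trees and the pairs would fall out of step.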


Answer 1:


You could write a script that diffs the given tree against a range of revisions in your repository.

Assume we first fetch the changed tree (without history) into our own repository:

git remote add foreign git://…
git fetch foreign

We then output the diffstat (in short form) for each revision we want to match against:

for REV in $(git rev-list 1.8^..1.9); do
    # Print the revision first so each diffstat can be traced back to it
    echo "$REV"
    git diff --shortstat foreign/master "$REV"
done

Look for the commit with the smallest number of changes (or use some sorting mechanism).
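
One possible sorting mechanism, as a sketch: reduce each `--shortstat` line to a single number (insertions plus deletions) and sort on it. The function names `shortstat_total` and `rank_revisions` are hypothetical, and the target ref and revision range are passed in as arguments rather than assumed:

```shell
#!/bin/sh
# shortstat_total: read one `git diff --shortstat` line on stdin and print
# insertions + deletions. Empty input (identical trees) prints 0.
shortstat_total() {
    awk '{
        for (i = 2; i <= NF; i++) {
            # The number sits in the field just before "insertion(s)"/"deletion(s)"
            if ($i ~ /^insertion/ || $i ~ /^deletion/) total += $(i - 1)
        }
    }
    END { print total + 0 }'
}

# rank_revisions TARGET RANGE: print "changed-lines revision" for every
# revision in RANGE, closest match to TARGET first.
rank_revisions() {
    for rev in $(git rev-list "$2"); do
        printf '%s %s\n' \
            "$(git diff --shortstat "$1" "$rev" | shortstat_total)" "$rev"
    done | sort -n
}
```

With the remote set up as above, `rank_revisions foreign/master 1.8^..1.9 | head -5` would show the five nearest candidates. Summing insertions and deletions is only one choice of distance; counting changed files would be another.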




Answer 2:


This was my solution:

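#!/bin/sh
# Rank commits in a date range by how many files differ from a reference tree.

start_date="2012-03-01"
end_date="2012-06-01"
needle_ref="aaa"   # ref of the tree to match against

: > /tmp/script.out   # start with an empty results file
shas=$(git log --oneline --all --after="$start_date" --until="$end_date" | cut -d' ' -f 1)
for sha in $shas
do
    # Count the files that differ between the reference and this commit
    wc=$(git diff --name-only "$needle_ref" "$sha" | wc -l)
    echo "$wc $sha" >> /tmp/script.out
done
# Show the five commits with the fewest differing files
sort -n /tmp/script.out | head -5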



Answer 3:


How about using git to create a patch from each version of 1.8 and 1.9 to this new release? Then you could see which patch makes the most 'sense'.

For example, if the patch 'removes' many methods, then it is probably not this release, but one before. If the patch has many sections that don't make sense as a single edit, then it probably isn't this release either.

And so on... In reality, unfortunately, no algorithm exists that does this perfectly. It will have to be some heuristic.
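
As a rough aid for that heuristic, the added and deleted line counts can be totalled separately for each candidate release, so that a patch which mostly deletes code stands out. This is only a sketch: `numstat_totals` and `compare_tags` are hypothetical names, and the target ref and tag patterns are whatever fits your repository.

```shell
#!/bin/sh
# numstat_totals: read `git diff --numstat` output on stdin
# ("added<TAB>deleted<TAB>path" per file) and print "added deleted" totals.
# Binary files are reported as "-" and count as 0 here.
numstat_totals() {
    awk '{ added += $1; deleted += $2 } END { print added + 0, deleted + 0 }'
}

# compare_tags TARGET PATTERN...: for each tag matching the patterns, print
# "tag added deleted" for the patch that turns that tag into TARGET.
compare_tags() {
    target=$1
    shift
    for tag in $(git tag --list "$@"); do
        printf '%s %s\n' "$tag" \
            "$(git diff --numstat "$tag" "$target" | numstat_totals)"
    done
}
```

A call such as `compare_tags foreign/master 'v1.8*' 'v1.9*'` would print one line per release tag. The numbers only narrow the field; the patches for the few best candidates still need to be read by hand.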




Answer 4:


How about using 'git blame'? It will show you, for each line, who changed it, and in which revision.



Source: https://stackoverflow.com/questions/6388283/git-how-can-i-find-a-commit-that-most-closely-matches-a-directory
