How can I recover HEAD^'s tree?

Posted by 情到浓时终转凉″ on 2020-01-01 10:15:10

Question


tl;dr: Is it possible to recover HEAD^'s tree if it was deleted before being pushed, and everything else is intact?

I accidentally deleted part of my .git. I'm not entirely sure what's missing.

Upon discovering that git push didn't work, I ran a git fsck:

Checking object directories: 100% (256/256), done.
Checking objects: 100% (1265/1265), done.
broken link from  commit f3419f630546ba02baf43f4ca760b02c0f4a0e6d
              to    tree 29616dfefd2bff59b7fb3177e99b4a1efc7132fa
broken link from  commit ccfe9502e24d2b5195008005d83155197a2dca25
              to    tree 0580c3675560cbfd3f989878a9524e35f53f08e9
broken link from  commit ccfe9502e24d2b5195008005d83155197a2dca25
              to  commit 0bca9b3a9f1dd9106922f5b4ec59cdc00dd6c049
broken link from    tree 6d33d35870281340c7c2f86c6d48c8f133b836bb
              to    blob 226d8a10a623acd943bb8eddd080a5929f3ccb2c
broken link from  commit db238d4a52ee8f18a04c038809bc6587d7643438
              to    tree 0b69ab3f6940a04684ee8c0c423ae7da89de749c
missing tree 0580c3675560cbfd3f989878a9524e35f53f08e9
dangling commit 05512f9ac09d932e7d9a11d490c8a2f117c0ca11
missing tree 29616dfefd2bff59b7fb3177e99b4a1efc7132fa
dangling commit 578464dde7d7b8628f77e536b4076cfa491d7602
missing blob 5d351b568abb734605ca4bf446e13cfd87ca9ce8
missing tree 0b69ab3f6940a04684ee8c0c423ae7da89de749c
missing commit 0bca9b3a9f1dd9106922f5b4ec59cdc00dd6c049
dangling blob d53a9d0f3364b648edbc4beede022e4594a84c35
missing blob 23db34f729a88c5f5f7fe6e281921f1334f493d1
dangling commit 8dcbde55462ca0c29e0ca339a49db95b43188ef1
dangling blob e59b25b9675625d0e6b8abfa37e955ab46493fd9
missing blob 226d8a10a623acd943bb8eddd080a5929f3ccb2c
dangling commit 85fdaaa579cf1ae2a8874e3e1f3c65d68b478179
dangling commit 075e9d72e90cc8bf3d960edd8376aaae0847f916
missing blob 83fec2ff8cfcaaa06c96917b6973ace96301e932
dangling commit a88e18e1c102d909361738fd70137b3f4a1c7496
dangling blob 9c6f61e0acffe2a1f5322cd2b72c181e95e9de75
dangling commit ca9fe0dd3123a731fc310b2a2285b00ef673de79

So my assumption is that I'm merely missing some information that can be recovered from GitHub. My knee-jerk reaction was to run git fetch, but that returns with no output, because it thinks there's nothing new to fetch.

I tried unpacking .git/objects/pack/pack-ea43d1db155e4502c2250ec1d4608843715c8b1f.pack, several ways, but it never worked. For example:

% git clone --mirror git://github.com/strugee/dots.git # returns bare repo
Cloning into bare repository 'dots.git'...
remote: Counting objects: 1331, done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 1331 (delta 12), reused 0 (delta 0)
Receiving objects: 100% (1331/1331), 402.31 KiB | 197.00 KiB/s, done.
Resolving deltas: 100% (454/454), done.
Checking connectivity... done.
% ls dots.git
config  description  HEAD  hooks  info  objects  packed-refs  refs
% mkdir git-tmp; cd git-tmp
% git init
% git unpack-objects < ../dots.git/objects/pack/pack-ea43d1db155e4502c2250ec1d4608843715c8b1f.pack
error: inflate: data stream error (incorrect data check)
error: inflate returned -3

I got this error every time. (Keep in mind: it's a --mirror, so it's an exact copy of what GitHub has - right? How could it be corrupt then?)

Eventually I realized that I didn't actually need to unpack the packfile. I could just copy it back into the original repo, and Git would pick it up just fine. So:

% cd ../configs
% cp ../dots.git/objects/pack/pack-ea43d1db155e4502c2250ec1d4608843715c8b1f.* .git/objects/pack/

And that seemed to do the trick. Mostly.

% git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (2596/2596), done.
broken link from  commit db238d4a52ee8f18a04c038809bc6587d7643438
              to    tree 0b69ab3f6940a04684ee8c0c423ae7da89de749c
dangling commit 05512f9ac09d932e7d9a11d490c8a2f117c0ca11
dangling commit 578464dde7d7b8628f77e536b4076cfa491d7602
missing blob 5d351b568abb734605ca4bf446e13cfd87ca9ce8
missing tree 0b69ab3f6940a04684ee8c0c423ae7da89de749c
dangling blob d53a9d0f3364b648edbc4beede022e4594a84c35
dangling commit 8dcbde55462ca0c29e0ca339a49db95b43188ef1
dangling commit 85fdaaa579cf1ae2a8874e3e1f3c65d68b478179
dangling commit 075e9d72e90cc8bf3d960edd8376aaae0847f916
missing blob 83fec2ff8cfcaaa06c96917b6973ace96301e932
dangling commit a88e18e1c102d909361738fd70137b3f4a1c7496
dangling commit ca9fe0dd3123a731fc310b2a2285b00ef673de79

As you can see, that repaired all but one missing link. As it turns out, db238d is the id of a commit (which happens to be HEAD^) that I had not yet pushed. Am I correct in assuming that the last two commits in this repository are unrecoverable, and I will need to recreate the contents of those commits? Did I make the right decisions in this scenario?


Answer 1:


Try git fetch-pack to recover missing objects available from another repository. Instructions below.

For recovery of unpushed commits, specifically HEAD^1, I would start with:

git diff-tree -r HEAD~2^{tree} HEAD^{tree}

You'll get a list of all trees/blobs that have changed and their SHAs (which would include the changes from both HEAD and HEAD^1). Depending on how much information is available, you may be able to recreate some or all of the missing tree. Missing blobs are more problematic, though.
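To judge how much can be rebuilt, one option is to feed each object ID from the diff-tree output to git cat-file -e, which exits non-zero when the object is absent. The following is a minimal sketch under stated assumptions: it builds a throwaway repository so it can run on its own, and the file names and the check_objects helper are hypothetical, not part of Git.

```shell
#!/bin/sh
# Sketch only: the throwaway repo, file names, and check_objects are hypothetical.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

echo one > a.txt;   git add a.txt; git commit -q -m "first"
echo two > b.txt;   git add b.txt; git commit -q -m "second"
echo three > a.txt; git add a.txt; git commit -q -m "third"

# Simulate the damage: delete the loose object that holds b.txt's blob.
lost=$(git rev-parse HEAD:b.txt)
rm -f ".git/objects/$(echo "$lost" | cut -c1-2)/$(echo "$lost" | cut -c3-)"

# check_objects: read raw diff-tree lines on stdin and report, for the
# new-side object of each changed path, whether it still exists.
check_objects() {
    while read -r _oldmode _newmode _oldsha newsha _status path; do
        case "$newsha" in
            0000000*) continue ;;  # path was deleted; no object to look up
        esac
        if git cat-file -e "$newsha" 2>/dev/null; then
            echo "present $newsha $path"
        else
            echo "missing $newsha $path"
        fi
    done
}

# Same comparison as in the answer: tree of HEAD~2 against tree of HEAD.
report=$(git diff-tree -r "HEAD~2^{tree}" "HEAD^{tree}" | check_objects)
echo "$report"
```

In a real repository you would skip the setup and the simulated deletion and run only the last pipeline. Paths reported as missing are the ones whose contents have to come from somewhere else, such as the working tree or another clone.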

Use of git fetch-pack

Intentionally corrupt repository

me@myvm:/scratch/corrupt/.git  (GIT_DIR!)$ cd objects/
me@myvm:/scratch/corrupt/.git/objects  (GIT_DIR!)$ ll
total 20
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:03 20
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:03 22
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:03 25
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:03 info
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:03 pack
me@myvm:/scratch/corrupt/.git/objects  (GIT_DIR!)$ rm -rf 22

Verify HEAD is in a bad state

me@myvm:/scratch/corrupt/.git/objects  (GIT_DIR!)$ cd ../../
me@myvm:/scratch/corrupt  (master)$ git status
fatal: bad object HEAD

Recover missing objects

me@myvm:/scratch/corrupt  (master)$ git fetch-pack --all $(git config --get remote.origin.url)
error: refs/heads/master does not point to a valid object!
error: refs/remotes/origin/HEAD does not point to a valid object!
error: refs/remotes/origin/master does not point to a valid object!
error: refs/heads/master does not point to a valid object!
error: refs/remotes/origin/HEAD does not point to a valid object!
error: refs/remotes/origin/master does not point to a valid object!
remote: Counting objects: 3, done.
remote: Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
22ecde746be79c65b27a5cf1dc421764d8ff6e17 HEAD
22ecde746be79c65b27a5cf1dc421764d8ff6e17 refs/heads/master
me@myvm:/scratch/corrupt  (master)$ git status
On branch master
Your branch is up-to-date with 'origin/master'.
nothing to commit, working directory clean

Missing objects restored

me@myvm:/scratch/corrupt  (master)$ ll .git/objects/
total 20
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:03 20
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:05 22
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:03 25
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:03 info
drwxrwxr-x 2 andrewc warp 4096 Oct  7 06:03 pack

If you end up in a state where you can identify a broken tree object and a broken blob object, you can recover those manually. Running git cat-file -p BLOB_SHA on any blob that is still present dumps its contents; if you can tell from the contents which file it is, that helps you recover the file. Likewise, git cat-file -p TREE_SHA dumps a tree, listing file names and blob SHAs. At this point you would be attempting to manually construct tree and commit objects from presumably partial data. If your HEAD commit is OK, then you are only missing history and should at least have the most recent state covered.
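As a rough illustration of that manual reconstruction, the sketch below assumes the lost file's contents still exist in the working tree. It uses git hash-object -w to write the blob back and git mktree to rebuild the tree from an ls-tree style listing. The throwaway repository, the file name notes.txt, and the obj_path helper are hypothetical. Because Git objects are content-addressed, rebuilding identical content yields the original IDs.

```shell
#!/bin/sh
# Sketch only: demo repo, notes.txt, and obj_path are hypothetical.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

printf 'hello\n' > notes.txt
git add notes.txt
git commit -q -m "add notes"

blob=$(git rev-parse HEAD:notes.txt)
tree=$(git rev-parse "HEAD^{tree}")

# Inspect the objects while they still exist.
git cat-file -p "$tree"   # mode, type, blob SHA, and file name per entry
git cat-file -p "$blob"   # raw file contents

# obj_path: map an object ID to its loose-object file path.
obj_path() {
    echo ".git/objects/$(echo "$1" | cut -c1-2)/$(echo "$1" | cut -c3-)"
}

# Simulate the damage: remove both the tree and the blob.
rm -f "$(obj_path "$blob")" "$(obj_path "$tree")"

# Rebuild the blob from the file that survived in the working tree.
rebuilt_blob=$(git hash-object -w notes.txt)

# Rebuild the tree from a listing in ls-tree format: mode, type, SHA, TAB, path.
rebuilt_tree=$(printf '100644 blob %s\tnotes.txt\n' "$rebuilt_blob" | git mktree)

echo "blob: $rebuilt_blob"
echo "tree: $rebuilt_tree"
```

If the rebuilt tree ID matches the one the broken commit points to, the commit becomes valid again without any rewriting. If it does not match, the reconstructed contents differ from what was originally committed.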




Answer 2:


So my assumption is that I'm merely missing some information that can be recovered from GitHub.

Generally true, but it helps if you can identify exactly where that broken link comes from.

That is what Git 2.10 (Q3 2016) introduced with:

git fsck --name-objects

See commit 90cf590, commit 1cd772c, commit 7b35efd, commit 993a21b (17 Jul 2016) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 9db3979, 25 Jul 2016)

fsck: optionally show more helpful info for broken links

When "git fsck" reports a broken link (e.g. a tree object contains a blob that does not exist), both containing object and the object that is referred to were reported with their 40-hex object names.
The command learned the "--name-objects" option to show the path to the containing object from existing refs (e.g. "HEAD~24^2:file.txt").
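To see the option in action, here is a small sketch that damages a throwaway repository on purpose and then runs the check. The repository and the file name notes.txt are hypothetical, and the exact wording of the names fsck prints varies between Git versions, so the output is shown rather than asserted.

```shell
#!/bin/sh
# Sketch only: demo repo and notes.txt are hypothetical. Needs Git 2.10 or later.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"
git config commit.gpgsign false

printf 'hello\n' > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Simulate the damage: delete the loose object behind notes.txt.
blob=$(git rev-parse HEAD:notes.txt)
rm -f ".git/objects/$(echo "$blob" | cut -c1-2)/$(echo "$blob" | cut -c3-)"

# fsck exits non-zero when it finds problems, so capture the status
# explicitly instead of letting "set -e" abort the script.
status=0
out=$(git fsck --name-objects 2>&1) || status=$?

echo "$out"
echo "fsck exit status: $status"
```

Each reported object ID should be followed by a parenthesized name derived from a ref, which points to the commit and path that need repairing.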


Three years later, git fsck was refactored in Git 2.25 (Q1 2020): crufty code and logic accumulated over time around the object parsing and low-level object access used in "git fsck" have been cleaned up.

This, in turn, fixes how fsck decorates its entries.

See commit b2f2039, commit c5b4269, commit 103fb6d, commit f648ee7, commit cc57900, commit 7854399, commit b8b00f1, commit 6da40b2, commit 3837025, commit f597937, commit 5afc4b1, commit 82ef89b, commit 7339029, commit d40bbc1, commit a59cfb3, commit 23a173a, commit 2175a0c, commit ec65231, commit 1de6007, commit 78d5014, commit 12736d2, commit c78fe00 (18 Oct 2019), and commit 228c78f (25 Oct 2019) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit 0e07c1c, 01 Dec 2019)

fsck: unify object-name code

Signed-off-by: Jeff King

Commit 90cf590f53 ("fsck: optionally show more helpful info for broken links", 2016-07-17, Git v2.10.0-rc0 -- merge listed in batch #7) added a system for decorating objects with names. The code is split across builtin/fsck.c (which gives the initial names) and fsck.c (which adds to the names as it traverses the object graph).
This leads to some duplication, where both sites have near-identical describe_object() functions (the difference being that the one in builtin/fsck.c uses a circular array of buffers to allow multiple calls in a single printf).

Let's provide a unified object_name API for fsck.

That lets us drop the duplication, as well as making the interface boundaries more clear (which will let us refactor the implementation more in a future patch).

We'll leave describe_object() in builtin/fsck.c as a thin wrapper around the new API, as it relies on a static global to make its many callers a bit shorter.

We'll also convert the bare add_decoration() calls in builtin/fsck.c to put_object_name().

This fixes two minor bugs:

  1. We leak many small strings. add_decoration() has a last-one-wins approach: it updates the decoration to the new string and returns the old one. But we ignore the return value, leaking the old string.
    This is quite common to trigger, since we look at reflogs: the tip of any ref will be described both by looking at the actual ref, as well as the latest reflog entry.
    So we'd always end up leaking one of those strings.

  2. The last-one-wins approach gives us lousy names.
    For instance, we first look at all of the refs, and then all of the reflogs.
    So rather than seeing "refs/heads/master", we're likely to overwrite it with "HEAD@{12345678}".
    We're generally better off using the first name we find.

And indeed, the test in t1450 expects this ugly HEAD@{} name.
After this patch, we've switched to using fsck_put_object_name()'s first-one-wins semantics, and we output the more human-friendly "refs/tags/julius" (and the test is updated accordingly).
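The difference between the two policies can be shown with a toy sketch. This is not Git's C implementation: plain shell variables stand in for fsck's internal name table, and the two candidate names are the ones quoted in the commit message.

```shell
#!/bin/sh
# Sketch only: shell variables stand in for fsck's object-name table.
set -e
last_wins=""
first_wins=""

# Candidate names arrive in the order described above: refs, then reflogs.
for candidate in "refs/heads/master" "HEAD@{12345678}"; do
    last_wins=$candidate              # last-one-wins: always overwrite
    if [ -z "$first_wins" ]; then
        first_wins=$candidate         # first-one-wins: keep the first name seen
    fi
done

echo "last-one-wins:  $last_wins"
echo "first-one-wins: $first_wins"
```

With first-one-wins the ref name survives, which is why the updated test expects a name like "refs/tags/julius" instead of a reflog entry.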



Source: https://stackoverflow.com/questions/26228281/how-can-i-recover-heads-tree
