Question
When I read about git-rebase, I understood the rebased commits should get lost. I say should because I noticed that, knowing the rebased commit sha, I can recall it.
Suppose I have the following three commits:
A -> B -> C
where C's SHA is cshaid. Then, if I interactively rebase, fixing up C into B with git rebase -i HEAD~2, and then check the result with git log, I obtain the expected result, namely
A -> B'
where the SHA of B' is different from the SHA of B.
However, running git log cshaid again shows
A -> B -> C
Questions: is this a known behavior? I tried reading git rebase --help but couldn't find related info. Why are rebased commits not simply forgotten? I mean, rebase is a fairly dangerous operation, to be performed only if you know what you are doing and are allowed to do it, so what is the point of having a dirty index (or wherever these useless commits are kept)? Am I missing something?
Steps to reproduce (and to better understand my doubts). If you are willing to reproduce the situation, try:
1. mkdir sampledir && cd sampledir && git init
2. touch file && git add -A . && git commit -m "Initial"
3. edit file, then git commit -am "First modification"
4. edit file, then git commit -am "Second modification"
5. git log: you will see three commits; remember the SHA for Second modification
6. git rebase -i HEAD~2, then fixup Second modification into First modification
7. git log: you will see two commits, where the SHA for First modification is now different from the one in step 5
8. however, git log sha-for-"Second modification" will show the exact same history as in step 5
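The steps above can also be run without an editor. The following is a minimal sketch, not part of the original question: it assumes a POSIX shell with git and sed available, uses a throwaway identity and temporary directory, and relies on the GIT_SEQUENCE_EDITOR environment variable to rewrite the rebase todo list so that the second "pick" becomes "fixup". The variable names (old_first, old_second, new_first) are made up for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway identity and repository, so nothing real is touched (assumed values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q .

touch file && git add -A . && git commit -q -m "Initial"
echo "one" >> file && git commit -q -am "First modification"
echo "two" >> file && git commit -q -am "Second modification"

old_first=$(git rev-parse HEAD~1)   # B
old_second=$(git rev-parse HEAD)    # C

# Change the second "pick" line of the todo list to "fixup" instead of
# opening an interactive editor.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/fixup/'" git rebase -q -i HEAD~2

new_first=$(git rev-parse HEAD)     # B'

git --no-pager log --oneline                 # two commits: Initial and B'
git --no-pager log --oneline "$old_second"   # still three commits: A, B, C
```

The second log call starts from the remembered SHA rather than from the branch, which is why the old history is still listed.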
Answer 1:
Yes, this is the expected behavior. Unreferenced commits will eventually be garbage collected and thereby purged from disk. Unreachable objects are pruned only once they are older than a grace period (2 weeks by default), and a commit does not count as unreachable while a reflog entry still refers to it (reflog entries for unreachable commits expire after 30 days by default).
Related StackOverflow questions:
- git reflog expire and git fsck --unreachable
- Listing and deleting Git commits that are under no branch (dangling?)
Answer 2:
... I understood the rebased commits should get lost
They're not lost, they're (deliberately) "abandoned" (my term).
It's true that rebase copies (the contents of) the old commits. In fact, except for special optimizations and such, it's basically identical to doing git cherry-pick (and the interactive rebase script uses git cherry-pick for each "pick" operation, and amend-style commits for "squash" and "fixup" operations).
When and whether commits in a repository are visible, however, is decided by something else entirely. Normally git log starts with the name of the branch you're on, as recorded in HEAD (there's a file in your .git directory called HEAD, which contains the string ref: refs/heads/master, and that's how git knows that you're "on branch master").[1]
Given a branch name, git turns that into a (single) commit by "reading the reference":
$ git rev-parse master # note: you can also rev-parse HEAD directly
676699a0e0cdfd97521f3524c763222f1c30a094
The log command can then read the commit object by its SHA-1. That commit object has some parent SHA-1s, and git log reads those too, and so on, until it reaches a commit that has no parents (a "root" commit).
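The walk described above can be imitated by hand. This is only a rough sketch under stated assumptions: a throwaway repository with three linear commits whose subjects are A, B and C, and a loop that follows the parent hash printed by git log --format=%P. It is not how git log is implemented internally, and the variable names are hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway identity and repository (assumed values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q .

for msg in A B C; do
  echo "$msg" >> file
  git add file
  git commit -q -m "$msg"
done

git symbolic-ref HEAD        # the branch name recorded in HEAD
commit=$(git rev-parse HEAD) # the single commit that name resolves to

# Follow parent links until a commit with no parent (the root) is reached.
# The history here is linear, so %P yields at most one hash.
walked=""
while [ -n "$commit" ]; do
  walked="$walked$(git log -1 --format=%s "$commit") "
  commit=$(git log -1 --format=%P "$commit")
done

echo "$walked"               # prints: C B A
```

Starting the loop from any other SHA would list whatever is reachable from that commit instead, regardless of where the branch label points.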
So, given a root commit A, and second and third commits B and C, plus a label, master, that points to C:
A <-- B <-- C <-- HEAD=master
(the arrows here show who points to whom, which is the opposite direction from your drawing!), git can find (reach) commits A through C, starting at C and working backwards.
The rebase copies B and folds in C, giving B' as you expected:
A <-- B <-- C
^
 \
  B'
What makes B' show up with git log is that the label, master, is "peeled off" of commit C and "pasted onto" commit B'. More precisely, the file for branch master (.git/refs/heads/master)[2] gets rewritten with the new SHA-1 for B':
A <-- B <-- C   [no label, "abandoned"]
^
 \
  B' <-- HEAD=master
As the answers that beat me to it noted, the "abandoned" commits (along with any other abandoned objects in the repository) are eventually removed for real by the "garbage collector", git gc.
The claim that there's "no" label is a little overblown, though. There's at least one label, hidden away in the "reflog", that keeps commit C from being garbage-collected. And, if you create a branch or tag label that refers to C, either before or after the rebase moves the master label, that label will also keep C in the repository, accessible by an "ordinary" name, and you'll see it with git log --all (which looks at all branch and tag names, rather than just the one in HEAD).
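The effect of the reflog and of an extra label can be checked in a scratch repository. The sketch below makes two assumptions that are not in the answer: it uses git reset --hard as a simple stand-in for the rebase (both move the branch label away from C), and it uses a hypothetical branch name, keep-c.

```shell
#!/bin/sh
set -eu

# Throwaway identity and repository (assumed values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q .

for msg in A B C; do
  echo "$msg" >> file
  git add file
  git commit -q -m "$msg"
done
c=$(git rev-parse HEAD)

# Stand-in for the rebase: move the current branch label off C.
git reset -q --hard HEAD~1

# No branch or tag names C any more, so --all does not find it ...
before=$(git rev-list --all | grep -c "$c" || true)

# ... but the reflog still remembers it.
git --no-pager reflog

# Attaching an ordinary label makes C visible to --all again.
git branch keep-c "$c"       # keep-c is a hypothetical name
after=$(git rev-list --all | grep -c "$c" || true)

echo "listed by --all before: $before, after: $after"
```

Deleting the keep-c branch again would return C to being held only by the reflog.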
[1] The HEAD file can instead contain a raw SHA-1. In this case you have what git calls a "detached HEAD": you're at a commit by its SHA-1, rather than its branch name.
[2] Branch and tag names (really, any reference at all) can be "packed", in which case the separate file goes away. This saves space, and you're not supposed to depend on the existence of the separate file. However, once a branch becomes "active" (being updated a lot), the separate file will reappear, since it's faster and easier to update that one file than to update the packed-refs file.
Answer 3:
git rebase, like other commands that alter history in a destructive way, removes the reference to the obsolete commit, but doesn't cause it to be immediately deleted. git gc, which is automatically executed periodically during the course of normal operation, will (eventually) delete the actual commit data from .git/objects (although the reflog will keep a reference to the commit alive for some time).
This is a safety feature; it makes it quite difficult to actually lose data with git. If you really want to make sure something is gone -- for example, if you've accidentally committed a gigantic file and you want to get back the disk space -- you need to expire the reflog entries and run git gc manually:
git reflog expire --expire=now --all
git gc --aggressive --prune=now
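Whether the object is really gone can be checked with git cat-file -e, which exits non-zero when an object does not exist. The following is a sketch in a scratch repository, with one assumption that is not in the answer: git commit --amend is used as a simple stand-in for the rebase, since it also replaces a commit and abandons the old one. The variable names are hypothetical.

```shell
#!/bin/sh
set -eu

# Throwaway identity and repository (assumed values).
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo "A" > file && git add file && git commit -q -m "A"
echo "B" >> file && git commit -q -am "B"
old=$(git rev-parse HEAD)

# Stand-in for the rebase: rewrite B, which abandons the old B.
echo "more" >> file
git commit -q -a --amend -m "B"

# The abandoned commit is still in the object store at this point.
if git cat-file -e "$old" 2>/dev/null; then before=present; else before=gone; fi

# Drop every reflog entry, then prune unreachable objects immediately.
git reflog expire --expire=now --all
git gc -q --aggressive --prune=now

if git cat-file -e "$old" 2>/dev/null; then after=present; else after=gone; fi
echo "old commit before gc: $before, after gc: $after"
```

Running only git gc, without expiring the reflog first, would leave the old commit in place because the reflog entry still refers to it.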
Answer 4:
I understood the rebased commits should get lost.
Nope. Commits don't get lost. In a busy repo, git will eventually garbage-collect things that are completely unreachable from any ref and have been for a month or more, but other than git gc, git operations only add to the history graph.
Moving labels around has no effect at all on the actual histories in your repo.
Source: https://stackoverflow.com/questions/21384773/git-rebase-rebased-commit-still-in-index