What's the difference between git-worktree and git-subtree?

Question

Just when I thought Git couldn't get any more complicated, I discovered git worktree. Either this is a synonym for subtree or it is a feature I never knew about. Is worktree the same as subtree, or are they different? If they are different, how are they different, and what problem does worktree solve?

Answer 1:

These are very different. To understand them properly, let's define work-tree (or "work tree" or "working tree" or pretty much any variant of these spellings), with respect to the index and to commits.

You already know that commits save snapshots, and that each commit has a unique hash ID that names that one particular commit. There can be many other names (branch and/or tag names, for instance) for that same commit, but there's just the one hash ID. You probably also know that commits have metadata: who made them (name and email address), when (timestamp), and why (message for git log to show). Each commit also has a parent hash ID—or rather, a list of parents, usually with just one entry. The parent is the commit that comes just before this one, so that Git can walk backwards through a chain of commits, to show things over time. (A commit that has two parent hash IDs is a merge commit. A commit with no parent hash IDs is a root commit, and there's at least one in any non-empty repository, since the first commit ever made has no commits before it.)
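As a rough illustration of the metadata and parent links just described, here is a minimal sketch that builds a throwaway repository and inspects two commits. The temporary directory, the file name, and the author identity are all made up for the example.

```shell
set -e

# Throwaway repository; the path and identity are illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "first commit"

echo two >> file.txt
git commit -q -a -m "second commit"

# %P prints the parent hash IDs: none for the root commit, one for the second.
root_parents=$(git log -1 --format=%P HEAD~1)
second_parent=$(git log -1 --format=%P HEAD)
first_hash=$(git rev-parse HEAD~1)

# The raw commit object: tree, parent, author, committer, then the message.
git cat-file -p HEAD
```

The last command prints the commit's own text, which is where the name, email address, timestamp, and log message live.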

Everything—including the files—inside a commit is totally frozen for all time. You can't change any of it, not one bit, and the reason for this is that the hash ID is actually a cryptographic checksum of all of the commit's contents. If you were to somehow change just one bit, the checksum would be different, so that would be a different commit with a different hash ID.
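One way to see the checksum property for yourself is to feed a commit's raw contents back through git hash-object. This is only a sketch in a scratch repository (the names and message text are invented); it recomputes the hash of an unmodified commit and then of a copy with one extra character in the message.

```shell
set -e

# Scratch repository; everything here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo data > file.txt
git add file.txt
git commit -q -m "frozen snapshot"

original=$(git rev-parse HEAD)

# Hashing the unmodified commit contents reproduces the commit's hash ID.
recomputed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)

# Changing a single character of the contents yields a different hash ID.
altered=$(git cat-file commit HEAD \
  | sed 's/frozen snapshot/frozen snapshot!/' \
  | git hash-object -t commit --stdin)

echo "original:   $original"
echo "recomputed: $recomputed"
echo "altered:    $altered"
```

Without the -w option, git hash-object only computes the ID and writes nothing, so the existing commit is untouched.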

This means that all the files stored inside any commit are frozen. They are also compressed, into a special Git-only format, that only Git can read. That's great for history, but how will we ever get any work done? This is where the work-tree enters the picture.

To work on files, we have to have Git copy them out of a commit. This puts the files back into their everyday form, where they're readable by everything—editors, compilers, whatever you have on your computer—and of course writable / changeable. That place where you work on / with your files is your work-tree.

Between the current commit (chosen however), and the work-tree, there are therefore two copies of every file: the frozen copy in the commit, and the useful one in the work-tree.

Git could stop here, and other version control systems such as Mercurial do just that. But for various reasons, many of them having to do with "go really fast", Git adds a third copy of every file. This third copy goes into what Git calls, variously, the index, the staging area, or the cache. (Which name you see depends on who or which part of Git is doing the calling.) Files in the index are pretty much in the same form they have in commits, except that in the index, they are not frozen. They're more ready-to-freeze, or "slushy", if you will.

The index also keeps tabs on the work-tree, so that they are closely paired: the index "knows" what's in the work-tree, or if it doesn't—if the cache aspect of the index is out of date—it knows that, which helps Git be quick about figuring out what's changed, if anything. Moreover, when you run git commit, Git doesn't really even look at the work-tree (except to add some comments to the file you'll edit for your log message). It just freezes the ready-to-go files out of the index, which is where the index gets its name staging area, to make the new commit.

In the end, when you are working with a commit in Git, you have three active copies at all times:

The HEAD commit copy is frozen and Git-only.
The index copy is slushy: Git-only, but not quite frozen. Initially it matches the HEAD copy but you can overwrite it with git add.
The work-tree copy is normal and fluid, and you can do anything with it.
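The three copies can be made visibly different from one another. The sketch below uses a scratch repository with invented file contents: it commits one version, stages a second, leaves a third in the work-tree only, and then shows that git commit freezes the index copy rather than the work-tree copy.

```shell
set -e

# Scratch repository; names and contents are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "version 1"     # HEAD copy: version 1

echo "version 2" > file.txt
git add file.txt                  # index copy: version 2

echo "version 3" > file.txt       # work-tree copy: version 3

head_copy=$(git show HEAD:file.txt)   # read from the current commit
index_copy=$(git show :file.txt)      # a leading colon reads from the index
worktree_copy=$(cat file.txt)         # ordinary file on disk

# Committing freezes what is in the index, not what is in the work-tree.
git commit -q -m "freeze the index copy"
new_head_copy=$(git show HEAD:file.txt)

echo "HEAD before: $head_copy / index: $index_copy / work-tree: $worktree_copy"
echo "HEAD after:  $new_head_copy"
```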

The index and work-tree are paired up. Moreover, the index takes on an expanded role during merge conflicts: it winds up holding copies of files from three commits, these being the three inputs to the merge. While it's in this expanded mode, you can't even git stash or otherwise get away from a modified index-and-work-tree state, without either completing or aborting the merge.
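To look at the expanded index during a conflict, git ls-files --unmerged lists the extra entries. The following sketch (branch names main and topic and the file contents are assumptions for the example) forces a conflict on a single file and counts the index entries for it.

```shell
set -e

# Scratch repository; branch names and contents are illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main   # name the initial branch explicitly
git config user.name "Example Author"
git config user.email "author@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b topic
echo "topic change" > file.txt
git commit -q -a -m "change on topic"

git checkout -q main
echo "main change" > file.txt
git commit -q -a -m "change on main"

# The merge is expected to stop with a conflict, so tolerate its failure.
git merge topic >/dev/null 2>&1 || true

# One entry per merge input: stage 1 (merge base), 2 (ours), 3 (theirs).
git ls-files --unmerged
stage_count=$(git ls-files --unmerged | wc -l | tr -d ' ')
```

Running git merge --abort afterwards returns the index to its ordinary one-entry-per-file state.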

This leaves us with a problem to solve: what if, in the middle of working on something, we need to fix, rather urgently, some bug in some other branch? We could make another clone, and that was the traditional answer. If we're not in the middle of a conflicted merge, we could use git stash; that was the other answer. One is not terribly satisfactory, and the other is useless if we're in the middle of a merge.

So, enter git worktree add. Using git worktree add, you can add another pair of index-and-work-tree to your existing repository. There is one very strong constraint (for good implementation-specific reasons): every added work-tree must be on its own branch, or else use "detached HEAD" mode. That is, if your main work-tree is on branch feature/short, no added work-tree can use this branch. They can use master or hotfix or develop, but not feature/short. (Or, they can use a detached HEAD at any commit anywhere in the repository.)
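Here is a small sketch of that constraint in action. The directory names (main-tree, hotfix-tree, second-tree, detached-tree) are invented for the example; the branch names follow the ones used above.

```shell
set -e

# Scratch area holding the main work-tree and the added ones side by side.
base=$(mktemp -d)
git init -q "$base/main-tree"
cd "$base/main-tree"
git symbolic-ref HEAD refs/heads/master   # name the initial branch explicitly
git config user.name "Example Author"
git config user.email "author@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "initial"

# The main work-tree is now on feature/short.
git checkout -q -b feature/short

# Add a second index-and-work-tree pair on a new branch, hotfix, cut from master.
git worktree add -b hotfix ../hotfix-tree master >/dev/null 2>&1
hotfix_branch=$(git -C ../hotfix-tree rev-parse --abbrev-ref HEAD)

# A second work-tree on feature/short is refused: that branch is already in use.
if git worktree add ../second-tree feature/short >/dev/null 2>&1; then
  same_branch=allowed
else
  same_branch=refused
fi

# A detached HEAD at the very same commit is fine.
git worktree add --detach ../detached-tree feature/short >/dev/null 2>&1

git worktree list
worktree_count=$(git worktree list | wc -l | tr -d ' ')
```

With this layout the urgent fix can be made and committed inside hotfix-tree while the half-finished work in main-tree stays exactly as it was.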

When you're done with any of the added, secondary work-trees, you can simply rm -rf it, and then run git worktree prune from one of the other secondary work-trees, or the main work-tree, to have Git search for and not-find the added work-tree. That "unlocks" whatever branch the added work-tree had checked-out.
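The cleanup can be sketched the same way, again with made-up directory names. The example checks that the hotfix branch is held while the added work-tree exists, and that it becomes available again after the directory is deleted and pruned.

```shell
set -e

# Scratch area; directory names are illustrative.
base=$(mktemp -d)
git init -q "$base/main-tree"
cd "$base/main-tree"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

echo hello > file.txt
git add file.txt
git commit -q -m "initial"

git worktree add -b hotfix ../hotfix-tree >/dev/null 2>&1

# While the added work-tree exists, the main work-tree cannot check out hotfix.
if git checkout -q hotfix 2>/dev/null; then before=allowed; else before=refused; fi

# Delete the directory, then let Git notice that it is gone.
rm -rf ../hotfix-tree
git worktree prune

# The branch is free again.
if git checkout -q hotfix 2>/dev/null; then after=allowed; else after=refused; fi

echo "before prune: $before, after prune: $after"
```

Newer Git versions also provide git worktree remove, which deletes the directory and its bookkeeping in one step.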

Meanwhile, the git subtree command is a fancy shell script that lets you extract some part of your existing repository to a new one that you will use elsewhere, or take the existing one you're using elsewhere and try to bring stuff back from it. So this is a repository-to-repository transfer—or at least the setup for it, in some cases.
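For comparison, here is a hedged sketch of the extraction half of git subtree. Because git subtree ships as a contributed script, it may not be installed everywhere, so the example guards for that case; the lib directory, the file names, and the lib-only branch are all invented.

```shell
set -e

# Scratch repository with an application file and a lib/ subdirectory.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

mkdir lib
echo "library code" > lib/util.txt
echo "application code" > app.txt
git add .
git commit -q -m "add app and lib"

# Synthesize a history containing only lib/, rooted at the top level,
# and store it on a new branch. Skip gracefully if subtree is unavailable.
if git subtree split --prefix=lib -b lib-only >/dev/null 2>&1; then
  split_files=$(git ls-tree -r --name-only lib-only)
else
  split_files="(git subtree not installed)"
fi

echo "$split_files"
```

The resulting branch could then be pushed to a separate repository, which is the repository-to-repository transfer described above. Nothing here adds a work-tree.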

(RomainValeri has also mentioned the git-merge-subtree merge strategy, which is sort of related to git subtree in that it aims to handle subtree renaming in one or two of the three inputs to a merge.)

Answer 2:

These concepts aren't similar, and the comparison seems odd beyond the similar-sounding names.

git worktree is a proper Git command (whereas subtree is a contribution, thanks to Chris for the info) that helps you manage multiple work-trees on the same repository, with several subcommands (list, add, etc.).

subtree, in addition to being the aforementioned contributed command, is also the name of one of the available merge strategies.

But as I said, these two aren't especially related, even if one could use a subtree merge in the context of a multi-worktree repo... which, I guess, is not part of your question.

Source: https://stackoverflow.com/questions/54622999/whats-the-difference-between-git-worktree-and-git-subtree

Tags

git

subtree

git-worktree