Question
I'm trying to move some files between two git repositories, repo1 and repo2. I have a short list of files I'd like to move (preserving history).
Three files to move from repo1:
libraryname/file1
libraryname/file2
tests/libraryname/file3
There are other files in libraryname/ and tests/libraryname/. There are other folders in / and tests/.
My plan is to check out repo1, then modify the history tree until it only contains history for the files I'm interested in. Then check out repo2, and merge in the output of the previous operation. It seems like git filter-branch is the right tool for the first step.
So far I've tried git filter-branch --index-filter 'git rm -r --cached <FILES>', where <FILES> lists every unwanted whole folder or file.
But this leaves a lot of folders which no longer exist at HEAD, but have existed at some point in this repository's lifetime. It seems quite tedious to figure out everything that has existed in the history of this repo; there must be a better way.
How do I end up with a git commit tree which only includes these three files?
Is there a better way than the one I'm suggesting?
Or, is there a way to remove traces of all files which don't currently exist at HEAD?
Answer 1:
You said it leaves behind folders; I assume you mean it leaves behind files in those folders (because git doesn't preserve empty folders)...
It seems like you might want to take the approach of clearing the index and then re-adding the entries you want.
git filter-branch ...
--index-filter 'git rm -r --cached * && git reset $GIT_COMMIT -- libraryname/file1 libraryname/file2 tests/libraryname/file3'
...
Since you're thinning out the content so much, don't forget that you may want to include the --prune-empty option, which drops commits that become empty after filtering.
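To make the idea concrete, here is a minimal end-to-end sketch in a throwaway repository. Everything about the demo repository is hypothetical (the demo_repo variable, the "other" folder, the "unwanted" file, the commit messages); only the three wanted paths come from the question. It swaps the bare * for -- . with -q and --ignore-unmatch, a commonly used variant of the same "empty the index, then restore what you want" idea, so treat it as one possible spelling rather than the exact command above.

```shell
#!/bin/sh
# Throwaway demonstration: all names in the demo repository are hypothetical.
set -e

# Identity for the demo commits; also skip filter-branch's deprecation pause.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

demo_repo=$(mktemp -d)
git init -q "$demo_repo"

(
  cd "$demo_repo"
  mkdir -p libraryname tests/libraryname other
  echo one   > libraryname/file1
  echo two   > libraryname/file2
  echo three > tests/libraryname/file3
  echo junk  > libraryname/unwanted
  echo junk  > other/old
  git add .
  git commit -q -m "initial import"

  # After this commit, other/ exists only in history, not at HEAD.
  git rm -q -r other
  git commit -q -m "drop other/"

  echo more >> libraryname/file1
  git commit -q -a -m "edit file1"

  # For every commit: empty the index, then restore only the wanted
  # paths from that commit. --prune-empty drops commits left empty.
  git filter-branch --prune-empty --index-filter '
    git rm -r -q --cached --ignore-unmatch -- . &&
    git reset -q "$GIT_COMMIT" -- libraryname/file1 libraryname/file2 tests/libraryname/file3
  ' HEAD
)

# Every path that still appears anywhere in the rewritten history.
remaining_paths=$(git -C "$demo_repo" log --name-only --pretty=format: | sed '/^$/d' | sort -u)
commit_count=$(git -C "$demo_repo" rev-list --count HEAD)
printf '%s\n' "$remaining_paths"
```

In this demo the "drop other/" commit touches none of the kept files, so --prune-empty removes it, and neither other/old nor libraryname/unwanted shows up in the rewritten history even though one of them no longer existed at HEAD.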
Answer 2:
With Git 2.24 (Q4 2019), git filter-branch is deprecated.
The equivalent would use newren/git-filter-repo. From its examples section:
If you have a long list of files, directories, globs, or regular expressions to filter on, you can stick them in a file and use --paths-from-file; for example, with a file named stuff-i-want.txt with contents of
README.md
guides/
tools/releases
glob:*.py
regex:^.*/.*/[0-9]{4}-[0-9]{2}-[0-9]{2}.txt$
tools/==>scripts/
regex:(.*)/([^/]*)/([^/]*)\.text$==>\2/\1/\3.txt
then you could run
git filter-repo --paths-from-file stuff-i-want.txt
In your case, stuff-i-want.txt would be:
libraryname/file1
libraryname/file2
tests/libraryname/file3
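Whichever tool does the filtering, the second half of the plan in the question (merging the result into repo2) could look roughly like the sketch below. It builds two stand-in repositories in a temporary directory, so the variables work, filtered and target, the file contents and the commit messages are all hypothetical. It assumes Git 2.9 or later, where git merge needs --allow-unrelated-histories to join two histories that share no common ancestor.

```shell
#!/bin/sh
# Throwaway demonstration: both repositories below are hypothetical stand-ins.
set -e

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
filtered="$work/repo1-filtered"   # stands in for repo1 after filtering
target="$work/repo2"              # stands in for repo2

# Stand-in for the already-filtered repo1: only the wanted file remains.
git init -q "$filtered"
mkdir -p "$filtered/libraryname"
echo one > "$filtered/libraryname/file1"
git -C "$filtered" add .
git -C "$filtered" commit -q -m "file1 history from repo1"

# Stand-in for repo2 with its own, unrelated history.
git init -q "$target"
echo readme > "$target/README"
git -C "$target" add .
git -C "$target" commit -q -m "repo2 initial"

# Fetch the filtered history into repo2 and merge it.
git -C "$target" fetch -q "$filtered" HEAD
git -C "$target" merge -q --allow-unrelated-histories \
    -m "Import libraryname files from repo1" FETCH_HEAD

# The imported file keeps the commits that touched it in repo1.
file1_history=$(git -C "$target" log --pretty=format:%s -- libraryname/file1)
printf '%s\n' "$file1_history"
```

Because the merge keeps the original commits as ancestors, git log on a moved file in repo2 still shows the history it had in repo1.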
Answer 3:
Here is a whitelist-based approach which might be faster (because it only needs to compare whole lines of pre-sorted lists) and easier if a large number of files is involved.
Create a sorted list of all files in all commits of your branch:
$ export LC_COLLATE=C whitelist="$(mktemp)" && git log --name-status | sed 's/^[A-Z][[:space:]]\{1,\}//; t; d' | sort -u > "$whitelist"
Edit that list with your favorite text editor and remove every file you do not want to keep, i.e. turn it into a whitelist of files to keep.
$ "$EDITOR" -- "$whitelist" # remove from list what you don't want to keep
Perform the actual filter operation:
$ git filter-branch -f --index-filter 'git ls-files -c | sort | comm -23 -- - "$whitelist" | while IFS= read -r f; do git rm --cached -- "$f"; done' --prune-empty
Remove the whitelist once the filter operation has completed without problems.
$ rm -- "$whitelist" && unset LC_COLLATE whitelist
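The core of the filter step is comm -23, which prints only the lines unique to its first input; both inputs must be sorted the same way, which is why LC_COLLATE=C is exported up front. The sketch below isolates that comparison with two small files standing in for the output of git ls-files -c and for the whitelist. The scratch directory and the unwanted paths are hypothetical.

```shell
#!/bin/sh
# Isolated illustration of the comm -23 step; no git repository is needed.
set -e
export LC_COLLATE=C   # same collation for sort and comm, as in the answer

scratch=$(mktemp -d)

# Hypothetical whitelist: the three files to keep, sorted.
printf '%s\n' \
    libraryname/file1 \
    libraryname/file2 \
    tests/libraryname/file3 | sort > "$scratch/whitelist"

# Hypothetical stand-in for `git ls-files -c | sort` in one commit.
printf '%s\n' \
    libraryname/file1 \
    libraryname/unwanted \
    other/old \
    tests/libraryname/file3 | sort > "$scratch/index"

# -2 hides lines found only in the whitelist, -3 hides lines common to both,
# leaving the index entries that are NOT whitelisted.
to_remove=$(comm -23 -- "$scratch/index" "$scratch/whitelist")
printf '%s\n' "$to_remove"

rm -r -- "$scratch"
```

These leftover lines are what the while loop in the filter command hands to git rm --cached. A whitelist entry that is missing from a given commit (libraryname/file2 here) is simply ignored.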
Source: https://stackoverflow.com/questions/45633033/remove-history-for-everything-except-a-list-of-files-using-git-filter-branch