What I want is similar to this question. However, I want the directory that is split into a separate repo to remain a subdirectory in that repo:
I have this:
I wanted to do something similar, but the list of files I wanted to keep was pretty long, so it didn't make sense to do this with countless greps. Instead, I wrote a script that reads the list of files to keep from a file:
#!/bin/bash
# usage:
# git filter-branch --prune-empty --index-filter \
# 'this-script file-with-list-of-files-to-be-kept' -- --all
if [ -z "$1" ]; then
    echo "Too few arguments."
    echo "Please specify an absolute path to the file"
    echo "which contains the list of files that should"
    echo "remain in the repository after filtering."
    exit 1
fi
# save a list of files present in the commit
# which is currently being modified.
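git ls-tree -r --name-only --full-tree "$GIT_COMMIT" > files.txt
# drop from the list every file that should be kept.
# each line of the keep-list is used as a grep pattern, so it is
# matched as a regular expression anywhere in the path.
while read -r string; do
    grep -v -- "$string" files.txt > files.txt.temp
    mv -f files.txt.temp files.txt
done < "$1"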
# remove unwanted files (i.e. everything that remained in the list).
# warning: 'git rm' will exit with non-zero status if it gets
# an invalid (non-existent) filename OR if it gets no arguments.
# If something exits with non-zero status, filter-branch will abort.
# That's why we have to check carefully what is passed to git rm.
if [ -s files.txt ]; then
    # enclose filenames in "" in case they contain spaces
    sed -e 's/^/"/' -e 's/$/"/' files.txt | \
    xargs git rm --cached --quiet
fi
Quite surprisingly, this turned out to be much more work than I initially expected, so I decided to post it here.
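
One thing to watch out for: the script hands every line of the keep-list to grep -v as a pattern, so an entry also keeps any other path that happens to contain it. Here is a minimal sketch of that filtering step on its own, outside of git, next to an exact-match variant. The file names, keep.txt and the variable names are made up for illustration, and the exact-match variant is my assumption about what some people may prefer, not part of the script above.

```shell
#!/bin/bash
# Sketch only: the keep-list filtering step in isolation.
# All paths and names below are hypothetical.

workdir=$(mktemp -d)

# Stand-in for the output of 'git ls-tree -r --name-only'.
printf '%s\n' 'src/main.c' 'src/main.c.orig' 'docs/readme.txt' 'build/out.o' \
    > "$workdir/files.txt"

# Hypothetical keep-list: one path per line.
printf '%s\n' 'src/main.c' 'docs/readme.txt' > "$workdir/keep.txt"

# Pattern matching (same effect as the loop in the script for this input):
# a path survives if it contains any keep-list entry,
# so 'src/main.c.orig' is kept along with 'src/main.c'.
to_remove_pattern=$(grep -v -f "$workdir/keep.txt" "$workdir/files.txt")

# Exact matching: -F takes entries literally, -x requires the whole
# line to match, so only the listed paths themselves are kept.
to_remove_exact=$(grep -v -x -F -f "$workdir/keep.txt" "$workdir/files.txt")

echo "pattern match would remove:"
echo "$to_remove_pattern"
echo "exact match would remove:"
echo "$to_remove_exact"

rm -rf "$workdir"
```

If exact paths are what you want, the grep -v -x -F -f call could replace the whole while loop in the script, with "$1" in place of keep.txt.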