I'm looking for a way to set up git repositories that include subsets of files from a larger repository, and inherit the history from that main repository. My primary motivat
As I understand your question
git subtree or git submodules
One way to extract the history of just a subset of the files into a dedicated branch (which you can then push into a dedicated repository) is using git filter-branch:
# regex to match the files included in this subproject, used below
file_list_regex='^subproject1/|^shared_file1$|^lib/shared_lib2$'
git checkout -b subproject1 # create new branch from current HEAD
git filter-branch --prune-empty \
--index-filter "git ls-files --cached | grep -v -E '$file_list_regex' | xargs -r git rm --cached" \
HEAD
This will
- create a new branch subproject1 based on the current HEAD (git checkout -b subproject1)
- rewrite the history of that branch (git filter-branch [...] HEAD)
- remove all files (xargs -r git rm --cached) that are not part of the subproject (git ls-files --cached | grep -v -E '$file_list_regex')
- drop commits that become empty after the removal (--prune-empty)
- operate on the index only, without checking out each commit (--index-filter/--cached).

This is a one-time operation, though, and as I understand your question you want to continuously update the extracted subproject repositories/branches with new commits.
The good news is you could simply repeat this command since git filter-branch will always produce the same commits/history for your subproject branches - given that you don't manually alter them or rewrite your master branch.
The drawback is that this rewrites the complete history each time, and for each subproject, again and again.
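Because every run removes whatever the pattern does not match, it can be worth checking the regex against a list of paths before rewriting anything. The following is only a minimal sketch and not part of the commands above: the sample paths are made up and stand in for the output of git ls-files --cached, and the variable names removed and kept are purely illustrative.

```shell
# Regex from the answer: files that belong to the subproject
file_list_regex='^subproject1/|^shared_file1$|^lib/shared_lib2$'

# Hypothetical paths standing in for the output of `git ls-files --cached`
sample_paths='subproject1/main.c
subproject2/main.c
shared_file1
shared_file1.bak
lib/shared_lib2
lib/other_lib'

# Paths the index filter would hand to `git rm --cached` (everything NOT matching)
removed="$(printf '%s\n' "$sample_paths" | grep -v -E "$file_list_regex")"

# Paths that would survive in the subproject history
kept="$(printf '%s\n' "$sample_paths" | grep -E "$file_list_regex")"

printf 'removed:\n%s\n\nkept:\n%s\n' "$removed" "$kept"
```

Note that shared_file1.bak ends up in the removed list because the pattern anchors shared_file1 with $. In a real repository the same check would read from git ls-files --cached instead of the sample list.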
If you only want to add the most recent commits of the master branch (in this example, those after master~6) to the tip of your existing subproject1 branch, you could adapt the commands like this:
# get the full commit ids for the commits we consider
# to be equivalent in master and subproject1 branch
common_base_commit="$(git rev-parse master~6)"
subproject_tip="$(git rev-parse subproject1)"
# checkout a detached HEAD so we don't change the master branch
git checkout --detach master
git filter-branch --prune-empty \
--index-filter "git ls-files --cached | grep -v -E '$file_list_regex' | xargs -r git rm --cached" \
--parent-filter "sed s/${common_base_commit}/${subproject_tip}/g" \
${common_base_commit}..HEAD
# force reset subproject1 branch to current HEAD
git branch -f subproject1
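To look at the --parent-filter step in isolation: git filter-branch passes the parent string of each commit to the filter on stdin, in the form -p <commit id> (one -p per parent). The sketch below runs the same sed expression on hand-written parent strings. The commit ids are placeholders rather than real commits, and the helper name rewrite_parents is made up for this example.

```shell
# Placeholder ids for illustration only (real ids come from `git rev-parse`)
common_base_commit='1111111111111111111111111111111111111111'
subproject_tip='2222222222222222222222222222222222222222'
other_commit='3333333333333333333333333333333333333333'

# Same substitution as in the --parent-filter of the commands above
rewrite_parents() {
    sed "s/${common_base_commit}/${subproject_tip}/g"
}

# First new commit: its parent is the common base, so it gets re-parented
first_new="$(printf '%s\n' "-p ${common_base_commit}" | rewrite_parents)"

# Later commit: its parent is some other commit, so the string passes through unchanged
later="$(printf '%s\n' "-p ${other_commit}" | rewrite_parents)"

printf '%s\n%s\n' "$first_new" "$later"
```

Only the commit whose parent is the common base is touched; all other parent strings pass through as they are.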
Explanation:
- Only the new commits are rewritten (git filter-branch [...] ${common_base_commit}..HEAD), back to master~6, which we consider to be the equivalent of subproject1's current tip.
- The parent of the first rewritten commit is changed from master~6 to subproject1 (--parent-filter "sed s/${common_base_commit}/${subproject_tip}/g"), effectively rebasing the rewritten commits on top of subproject1.
- The branch subproject1 is reset to include the new commits on top of it (git branch -f subproject1).

Further optimization/automation:
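- Generate the list of files to include ($file_list_regex) or actually to exclude (git ls-files --cached | grep -v -E '$file_list_regex') for a given subproject
- Make the list depend on the commit being rewritten ($GIT_COMMIT), or check the list in to the repository itself, in case the files to include per subproject may change over time
- Wrap the commands in a script or alias so that an update becomes a single call such as git update-project subproject1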
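As one possible take on keeping the file list in the repository, here is a sketch that turns a plain list of paths into the $file_list_regex used above. The list format (one path per line, a trailing slash for a whole directory) and the function name build_file_list_regex are assumptions made for this example, not something git prescribes.

```shell
# Turn a list of paths (stdin, one per line) into an anchored alternation regex.
# Assumed convention: a trailing "/" selects a whole directory.
build_file_list_regex() {
    # 1. escape regex metacharacters, 2. anchor the start of each path,
    # 3. anchor the end unless the entry is a directory, 4. join with "|"
    sed -e 's/[][\.*^$+?(){}|]/\\&/g' \
        -e 's/^/^/' \
        -e '/\/$/!s/$/$/' |
        paste -s -d '|' -
}

# Hypothetical list; in practice it could be read from a checked-in file
file_list_regex="$(printf '%s\n' 'subproject1/' 'shared_file1' 'lib/shared_lib2' | build_file_list_regex)"

printf '%s\n' "$file_list_regex"
```

For these three entries the result is the same pattern that was written by hand at the top of this answer. Escaping the metacharacters first means a name such as README.md only matches itself and not, say, READMExmd.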