I have to create some code review from unmerged branches.
In finding solutions, let's not get into the local-branch context problem, as this will run on a server; there wil
There is no correct answer to this question because it is underspecified.
Git history is simply a directed acyclic graph (DAG), and it's generally impossible to determine semantic relationships between two arbitrary nodes in a DAG unless the nodes are sufficiently labeled. Unless you can guarantee that the commit messages in your example graph follow a reliable, machine-parseable pattern, the commits are not sufficiently labeled—it's impossible to automatically identify the commits you are interested in without additional context (e.g., guarantees that your developers follow certain best practices).
Here's an example of what I mean. You say that commit a1 is associated with branch1, but this can't be determined with certainty just by looking at the nodes of your example graph. It's possible that once upon a time your example repository history looked like this:
* merge branch1 into branch2 - branch2's head
|\
_|/
/ * b1
| |
| |
_|_/
/ |
| * a1
* / m1
|/
|
* start - master's head
Note that branch1 doesn't even exist yet in the above graph. The above graph could have arisen from the following sequence of events:
1. branch2 is created at start in the shared repository
2. user#1 creates a1 on his/her local branch2 branch
3. user#2 creates m1 and b1 on his/her local branch2 branch
4. user#1 pushes his/her branch2 branch to the shared repository, causing the branch2 ref in the shared repository to point to a1
5. user#2 tries to push his/her branch2 branch to the shared repository, but this fails with a non-fast-forward error (branch2 currently points to a1 and can't be fast-forwarded to b1)
6. user#2 runs git pull, merging a1 into b1
7. user#2 runs git commit --amend -m "merge branch1 into branch2" for some inexplicable reason
Some time later, user#1 creates branch1 off of a1 and creates a2, while user#2 fast-forward merges m1 into master, resulting in the following commit history:
* merge a1 into b1 - branch2's head
* |\ a2 - branch1's head
| _|/
|/ * b1
| |
| |
_|_/
/ |
| * a1
* / m1 - master's head
|/
|
* start
Given that this sequence of events is technically possible (although unlikely), how can a human, let alone Git, tell you which commits "belong" to which branch?
If you can guarantee that users don't change merge commit messages (they always accept the Git default), and that Git has never changed and will never change the default merge commit message format, then the merge commit's commit message can be used as a clue that a1 started off on branch1. You'll have to write a script to parse the commit messages. There are no simple Git one-liners to do this for you.
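As a rough sketch of what such a script might start from, the function below pulls the source branch name out of merge subjects that look like Git's default "Merge branch 'X' ..." message. The name parse_merge_subjects is made up for this example, and the pattern is an assumption about the message format: it is only a heuristic, and it will also match the merges that git pull creates (whose subjects start the same way), so treat the output as a clue rather than proof.

```shell
#!/bin/sh
# Hypothetical helper: reads "<sha> <subject>" lines on stdin and prints
# "<sha> <source-branch>" for every subject of the form
#   Merge branch '<source-branch>' ...
# Lines that do not match that pattern are silently dropped.
parse_merge_subjects() {
    sed -n "s/^\([0-9a-f][0-9a-f]*\) Merge branch '\([^']*\)'.*/\1 \2/p"
}

# Possible usage inside a repository (branch2 is the branch from the example):
#   git log --merges --format='%H %s' branch2 | parse_merge_subjects
```

Any merge whose message was edited by hand (like the amended one in the sequence of events above) either slips through or, worse, reports the wrong branch, which is why this only works under the guarantees just described.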
Alternatively, if your developers follow best practices (each merge is intentional and is meant to bring in a differently-named branch, resulting in a repository without those stupid merge commits created by git pull), and you are not interested in the commits from a completed child branch, then the commits you're interested in are on the first-parent path. If you know which branch is the parent of the branch you are analyzing, you can do the following:
git rev-list --first-parent --no-merges parent-branch-ref..branch-ref
This command lists the SHA1 identifiers for the commits that are reachable from branch-ref excluding the commits reachable from parent-branch-ref and the commits that were merged in from child branches.
In your example graph above, assuming parent order is determined by your annotations and not by the order of the lines going into a merge commit, git rev-list --first-parent --no-merges master..branch1 would print the SHA1 identifiers for commits a4, a3, a2, and a1 (in that order; use --reverse if you want the opposite order), and git rev-list --first-parent --no-merges master..branch2 would print the SHA1 identifiers for commits b4, b3, b2, and b1 (again, in that order).
If your developers do not follow best practices and your branches are littered with those stupid merges created by git pull (or an equivalent operation), but you have clear parent/child branch relationships, then writing a script to perform the following algorithm may work for you:
1. Find all commits reachable from the branch of interest excluding all commits from its parent branch, its parent's parent branch, its parent's parent's parent branch, etc., and save the results. For example:
git rev-list master..branch1 >commit-list
2. Do the same for all child, grandchild, etc. branches of the branch of interest. For example, assuming branch2 is considered to be a child of branch1:
git rev-list ^master ^branch1 branch2 >commits-to-filter-out
3. Filter out the results of step #2 from the results of step #1. For example:
grep -Fv -f commits-to-filter-out commit-list
The trouble with this approach is that once a child branch is merged into its parent, the child's commits become reachable from the parent, so the child-branch listing no longer includes them and they are considered to be part of the parent even if development on the child branch continues. Although this makes sense semantically, it does not produce the result you say you want.
Here are some best practices to make this particular problem easier to solve in the future. Most if not all of these can be enforced via clever use of hooks in the shared repository.
--no-ff merges of the children branches (this is trickier than it should be).
If all of your developers follow these rules, then a simple:
git rev-list --first-parent --no-merges parent-branch..child-branch
is all you need to see the commits that were made on that branch minus the commits made on its children branches.