git merge-distinct: staging multiple branches with octopus-merge

January 15th 2015 Tim Pettersen in Git, CD

git merge-distinct is a little tool that merges multiple branches containing non-conflicting changes into a single head, using git's octopus merge strategy. Why would you want such a thing? Because while developing in the isolation of a feature branch is useful, you often need to combine branches for testing or for deployment to a staging server.

Due to logical or literal conflicts, this strategy won't work for all branches. But there are a few use cases where this can be really handy. In fact, you're looking at one right now.

Our developer blog

Atlassian's developer blog has two environments: production (you're looking at it) and staging, which is mostly used for review. We treat our blog like code: each article is written on a feature branch and reviewed with a pull request. We also practice continuous deployment at the branch level, which means that any time a feature branch is created or updated, our Bamboo server rebuilds the site and deploys it. It's common practice to link to the staged version of your blog post in your pull request, so reviewers can read a rendered version instead of the raw markdown.

Rendered vs. Raw

This worked great! Right up until our pool of authors grew and multiple articles were being developed concurrently. Then it became a game of "last update wins": with a single staging server and multiple branches, only the most recently modified branch would be deployed to the server. If your branch was staged and awaiting review, too bad! It would get clobbered by the next push:

It's clobbering time!

This meant that links to staged articles would often 404, forcing reviewers to build the branch in question locally. Building the site locally was not only a time suck, it also excluded non-technical users from the review process.

The last time one of my staged articles was clobbered by a co-worker, I decided it was time to fix the problem. I came up with three possible solutions that would keep my articles available for review:

Option 1: Stand up a new staging server for each branch

If we had multiple staging servers (one for each branch) we'd no longer clobber each other's changes:

Multiple staging servers

However, this would quickly blow out our AWS bill. git branch --no-merged shows eight other blog posts in development, so we'd need eight staging servers just to deal with the current pool of authors.

Option 2: Write a script that continually pushes updates to my branch

If I push frequently enough, I'll clobber everyone else's changes! Muahahahaha!

Instant clobbering through gratuitous pushing

While this would be a quick and easy fix for my problem, it's in clear violation of our fourth company value. And, quite frankly, a bit of a jerk move.

Option 3: Octopus merge the branches, then stage the result

Git supports a merge strategy named octopus that allows you to merge more than two branches together (in some cases a lot more). I figured that when there was more than one outstanding branch, I could merge them together and then deploy the result to the staging server:

Octopus merge
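To make this concrete, here is a minimal sketch of an octopus merge in a throwaway repository. The branch names, file names and identity settings are placeholders invented for the demo and are not taken from the blog's real repository.

```shell
#!/bin/sh
# Sketch: an octopus merge of three non-conflicting branches in a scratch repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo Author"         # placeholder identity for the scratch repo
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)     # whatever the default branch is called

# Each hypothetical branch adds one distinct file, so nothing can conflict.
for n in 1 2 3; do
  git checkout -q -b "feature/post-$n" "$base"
  echo "Post $n" > "post-$n.md"
  git add "post-$n.md"
  git commit -q -m "Add post $n"
done

# HEAD is now feature/post-3. Naming more than one other head makes
# git merge pick the octopus strategy.
git merge -q --no-edit feature/post-1 feature/post-2

# Print the parents of the merge commit: one hash per merged head, HEAD included.
git log -1 --format=%P
```

After the merge, the working tree contains all three post files, and the merge commit lists three parents.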

While it might look complicated, performing an octopus merge is relatively simple (the command is just git merge <branch0> <branch1> ... <branchN>). However, there are a few special requirements for our developer blog use case:

  1. The merge must never fail with conflicts. Because the staging job runs non-interactively, there won't be anyone around to resolve them.
  2. The merge must only include branches that contain changes to static content. Code is too dangerous to automatically merge. Even if changes don't literally conflict in a way that git recognizes, you may end up with a logical conflict resulting in a compilation failure or subtler bugs.
  3. There should be a way to opt out of the merge if your content isn't yet ready for review.
  4. Rather than hacking a solution directly into our build script, I wanted to build a general purpose tool for solving similar problems in the future.

With these requirements in mind, I created git merge-distinct. It's written in Node.js and packaged with npm because it exceeds my personal complexity threshold for a shell script. When run with no arguments it will create a new merge commit from your current HEAD and all of the other local branches in your repository that contain non-conflicting changes:

$ git merge-distinct
Merged 3 parents:
  feature/current-branch
  feature/another-branch
  feature/yet-another-branch

$ git log -n 1
commit 2d04b8bd51e3883b0af60defe39a90e568289b1b
Merge: a51aba5 06a467e 8263654
Author: Tim Pettersen <tim@atlassian.com>
Date:   Tue Jan 13 14:28:03 2015 -0800

    Merge result of:
      feature/current-branch
      feature/another-branch
      feature/yet-another-branch

Avoiding git conflicts

The reason git merge-distinct will never fail with conflicts is that it never tries to merge branches that modify the same path. Under the hood, it runs git branch --no-merged to determine which branches are candidates for merging into the current HEAD. It then iterates over them, ignoring any branch that contains changes to the same path as a branch that has already been considered.
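The selection step described above can be sketched in shell. This is not the tool's actual Node.js source, just an illustration under stated assumptions: the function name select_distinct_branches is made up, a branch's "changes" are taken to be the paths that differ since it diverged from HEAD, and only candidate branches are compared against each other (paths changed on HEAD itself are not checked here).

```shell
#!/bin/sh
# Hypothetical helper: print the unmerged local branches that can be merged
# into HEAD together without any two of them touching the same path.
select_distinct_branches() {
  seen=$(mktemp)   # paths claimed by the branches accepted so far
  git for-each-ref --format='%(refname:short)' --no-merged=HEAD refs/heads/ |
  while read -r branch; do
    # Assumption: "changes" means paths modified since the branch left HEAD.
    paths=$(git diff --name-only "HEAD...$branch")
    [ -n "$paths" ] || continue
    # Skip the branch if any of its paths was claimed by an earlier branch.
    if printf '%s\n' "$paths" | grep -Fxq -f "$seen"; then
      continue
    fi
    printf '%s\n' "$paths" >> "$seen"
    printf '%s\n' "$branch"
  done
  rm -f "$seen"
}
```

The branches it prints could then be handed to a single git merge invocation. Branches are visited in the order git for-each-ref lists them, so when two branches touch the same path, the one listed first wins.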

Avoiding logical conflicts

To ensure only static content is merged, the --include and --exclude options let the user specify which paths a candidate branch is allowed to modify. For example, the following command would merge all branches containing only changes under app/posts/ that didn't modify any .js files:

$ git merge-distinct --include 'app/posts/**' --exclude '**/*.js'
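As a rough illustration of that rule, the sketch below accepts a list of changed paths only if every path matches the include pattern and none matches the exclude pattern. The function name paths_allowed is hypothetical, and it relies on plain shell case patterns, where * also matches /. The article doesn't say which glob dialect the tool uses, so app/posts/* and *.js stand in here for app/posts/** and **/*.js.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every path read from stdin matches the
# include pattern and none matches the exclude pattern.
# Usage: paths_allowed INCLUDE_PATTERN EXCLUDE_PATTERN < list-of-paths
paths_allowed() {
  include=$1
  exclude=$2
  while IFS= read -r path; do
    # Unquoted variables in a case pattern are treated as shell patterns.
    case $path in
      $exclude) return 1 ;;   # touches an excluded path: reject
    esac
    case $path in
      $include) ;;            # inside the allowed area: keep checking
      *) return 1 ;;          # strays outside the allowed area: reject
    esac
  done
  return 0
}
```

Fed with the output of git diff --name-only for a candidate branch, its exit status indicates whether that branch qualifies for the merge.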

Selectively merging branches

To allow developers to opt out of having their changes merged (and subsequently staged), I decided to let the user provide a pattern specifying which branches to include. For example, the following command would merge all branches starting with feature/:

$ git merge-distinct 'feature/**'

git merge-distinct also supports a couple of other options for customizing the merge commit:

$ git-merge-distinct --help

Usage: git merge-distinct [<options>] [<branch glob>]

Options:

-i, --include <path glob>   only branches with changes modifying paths 
                            matching this pattern will be included
-x, --exclude <path glob>   any branches with changes modifying paths 
                            matching this pattern will be excluded
-n, --no-commit             perform the merge but do not autocommit, to give 
                            the user a chance to inspect and further tweak 
                            the merge result before committing.
-m, --message <message>     override the default commit message

We've incorporated it into the Developer Blog build process using the Bamboo Node.js plugin, and now we're no longer clobbering each other's changes with every push.

git merge-distinct is generic enough that it should work for other projects that are wholly or partially static, and possibly for other use cases where multiple branches need to be combined in an automated fashion. You can check out the source or install it locally (assuming you have git, Node.js and npm installed) with:

$ npm install -g git-merge-distinct

Git is smart enough to recognize executables on your PATH whose names start with git-, so you can invoke it just like a standard git command using git merge-distinct.
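The same mechanism works for any executable. As a small sketch, the script below creates a made-up git-hello command in a temporary directory (both the name and the directory are placeholders) and runs it through git:

```shell
#!/bin/sh
# Sketch: any executable named git-<name> on PATH can be run as "git <name>".
bindir=$(mktemp -d)

# Write a tiny hypothetical subcommand called git-hello.
cat > "$bindir/git-hello" <<'EOF'
#!/bin/sh
echo "hello from a custom git subcommand"
EOF
chmod +x "$bindir/git-hello"

# Put the directory on PATH for this one invocation and call it via git.
PATH="$bindir:$PATH" git hello
```

Running it prints the greeting from git-hello.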

If you have any feedback, issues or other use cases you think it'd be useful for, let me know on Twitter (I'm @kannonboy).

If you found this article useful, you may also enjoy Reverting an Octopus Merge.