Working with Git and Perforce: integration workflow

February 13th 2015 Nicola Paolucci in Git, Perforce, Workflow

So here's the scenario: your team works independently on a Git repository but a part of the organization still uses Perforce to manage parts of the same code base. Those teams on Perforce are not planning to migrate but your team has already transitioned to Git (for lots of good reasons). It's important that you can keep a two-way code-sharing process ongoing between the code-bases so that improvements developed in either version can be shared, hopefully without costing too much time or bogging down your team too much.

Here is where this guide comes in. At TomTom this sync operation is done once every new release - which in their case happens once every month and a half.

-- This post has been written in collaboration with Andrea Carlevato, Adolfo Bulfoni and Krzysztof Karczewski who were kind enough to share the git process used at TomTom's NavApps Unit. --

Assumptions before we start

We'll assume that you are already acquainted with basic git usage and familiar with a feature branch workflow. If not, take the time to watch a hands-on tutorial or a webinar. Come back when you're ready, I'll wait.

Because there are a few subtleties to keep in mind in the process, we suggest being very careful when performing these integrations. Let's dive in when you're ready!

Installing git p4

The first step is to install the bridge. Check if you already have it installed by typing at a command line:

    git p4

If the system complains that git p4 is not installed, download git-p4.py and put it in a folder in your PATH, for example ~/bin (obviously you will need Python to be installed too for it to work).

Make it executable with:

    chmod +x git-p4.py

Edit your ~/.gitconfig file by adding:

    [alias]
    p4 = !~/bin/git-p4.py

Then run the git p4 command again; you should get no errors. The tool is installed.

Overview of workflow

It helps to start with an overview of the workflow. The following branches are involved in the process:

  • develop: Where the git development team actually works, merging to master completed pieces of work (features or bug fixes).
  • master: The branch that is kept in sync with P4 (and is an ancestor of develop)
  • p4-integ: The integration branch that contains the latest completed sync with P4 (not an ancestor of develop or master)

The "Eagle Eye" view of the process is the following (after the initial clone and sync):

  • A marker tag named last-p4-integ is used to track the last commit that was sent to P4 (on master) in the previous integration round.
  • When a new integration process starts, commits that are more recent than the last-p4-integ tag are cherry-picked to the p4-integ branch (which syncs to P4).
  • The branch p4-integ is synchronized with the real P4 repository using a fast-sync technique (see below).
  • Tag last-p4-integ is updated.
  • At this point new changes coming from P4 to p4-integ are merged back to develop using a squash merge strategy.
  • Tag last-p4-integ is updated.
  • Rinse and repeat.

Now let's walk through the process with more detail starting from the initial clone.

Initial clone

Because projects in P4 can grow huge histories, the team can pick a cut-off point from which to synchronize, saving a lot of space and time. git p4 allows you to choose the change-list from which to start tracking:

git p4 clone //depot/path/project@<earlier-cutoff-point>,<latest-changelist>
  • Now we can run a sync and verify that we have received all the change-sets locally:
    git p4 sync

The sync command finds new changes in P4 and imports them as Git commits.

  • We name the branch we'll use to directly interface with Perforce p4-integ. At this point we just want to branch it off remotes/p4/master:
    git checkout -b p4-integ remotes/p4/master

Follow-up fast synchronization (aka "bait and switch")

After the first import has been completed, subsequent synchronizations with P4 can be done with the following commands:

git checkout p4-integ
git p4 sync

The above works but it can be slow. A much faster way to execute the sync is by recreating identical refs to the ones used in the latest integration. This is also a neat way to make sure that any new developer tasked with the integration starts at the right commit/change-list.

Here's how to go about it:

  • Remove the old original (or stale) refs for the p4 remote (optional):
git symbolic-ref -d refs/remotes/p4/HEAD
git update-ref -d refs/remotes/p4/master
  • Create artificial (aka fake) remote refs that point to the last commit on p4-integ on origin:
git update-ref refs/remotes/p4/master remotes/origin/p4-integ
git symbolic-ref refs/remotes/p4/HEAD refs/remotes/p4/master

The only drawback of this much faster sync is that we need to tell git p4 explicitly which branch to use. So here is the final command:

git p4 sync --branch=refs/remotes/p4/master
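To see what the fake refs do without touching a real Perforce server, here is a minimal sketch that replays the two ref commands in a throwaway repository. The temporary directory, the demo identity and the hand-made refs/remotes/origin/p4-integ ref are assumptions made for illustration; in a real checkout that ref comes from fetching origin.

```shell
set -e

# Throwaway repository (assumption: nothing here touches a real project)
repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "imported from P4" > file.txt
git add file.txt
git commit -q -m "last integrated change"

# Stand-in for the p4-integ branch as fetched from origin
git update-ref refs/remotes/origin/p4-integ HEAD

# The two commands from the steps above: recreate the p4 remote refs
git update-ref refs/remotes/p4/master remotes/origin/p4-integ
git symbolic-ref refs/remotes/p4/HEAD refs/remotes/p4/master

# Both refs should now resolve to the same commit
fake=$(git rev-parse refs/remotes/p4/master)
real=$(git rev-parse refs/remotes/origin/p4-integ)
head_target=$(git symbolic-ref refs/remotes/p4/HEAD)
echo "p4/master -> $fake"
echo "p4/HEAD   -> $head_target"
```

Once the refs match, git p4 sync --branch=refs/remotes/p4/master should have a known commit to continue from, which is the point of the trick.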

The way git p4 tracks the mapping between git commit ids and P4's is by annotating commits with meta-data:

Merge pull request #340 in MOB/project from bugfix/PRJ-3185 to develop

    Squashed commit of the following:

    commit c2843b424fb3f5be1ba64be51363db63621162b4
    Author: Some Developer
    Date:   Wed Jan 14 09:26:45 2015 +0100

        [PRJ-3185] The app shows ...

    commit abc135fc1fccf74dac8882d70b1ddd8a4750f078
    Author: Some Developer
    Date:   Tue Jan 13 14:18:46 2015 +0100

        [PRJ-3185] The app shows the rating ...

    [git-p4: depot-paths = "//depot-mobile/project/": change = 1794239]

Note that in a more recent version of git p4 the meta-data to associate a git commit with a P4 change-list is stored in a commit note and not in the commit message. The TomTom team didn't love the change because it made it slightly more work to check the change-list numbers when needed.
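If you check change-list numbers often, a small helper can pull them out of a commit message. This is only a sketch: extract_p4_change is a hypothetical name, and it assumes the annotation lives in the commit message in the format shown in the sample above.

```shell
# Hypothetical helper: reads a commit message on stdin and prints the
# change-list number from its "[git-p4: ... change = N]" line, if any.
extract_p4_change() {
  sed -n 's/.*\[git-p4: .*change = \([0-9][0-9]*\)\].*/\1/p' | tail -n 1
}

# Shortened copy of the sample message shown above
sample='[PRJ-3185] The app shows ...

[git-p4: depot-paths = "//depot-mobile/project/": change = 1794239]'

change=$(printf '%s\n' "$sample" | extract_p4_change)
echo "$change"   # prints 1794239
```

In a real repository the message would come from git itself, for example: git log -1 --format=%B p4-integ | extract_p4_change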

Moving changes from git to Perforce

After the fast sync operation above has been completed, you are now ready to push changes from git to Perforce.

The first step is to rebase p4-integ with changes coming from remotes/p4/master:

git checkout p4-integ
git p4 rebase

After this, all new changes from Perforce should be on p4-integ so we can update master:

  • Merge the latest work from develop into master:
git checkout master
git merge develop
  • Make sure you have latest tags locally:
git fetch --tags
  • Use a temporary cleanup branch in case you need to remove commits that are already in P4 (look for the P4 tag in the commit message). If no commits need to be skipped, an automatic rebase will linearize the history:
git checkout -b cleanup #branching off from master
git rebase -s recursive -X theirs tag/last-p4-integ
  • If some commits do need to be skipped, use an interactive rebase instead:
git rebase -i tag/last-p4-integ
  • Use cherry-pick to pick the new commits and put them on p4-integ branch. We do it this way because we make no assumption that the git branches master and develop can be kept as proper ancestors of the p4-integ branch. In fact at TomTom this is not the case anymore.
git checkout p4-integ
git cherry-pick tag/last-p4-integ..cleanup
  • Submit to P4 and sync p4-integ:
git p4 submit
git p4 sync --branch=refs/remotes/p4/master
git reset --hard refs/remotes/p4/master
  • Delete temporary rebase branch:
git branch -D cleanup
  • Remove pointer to latest integration point (tag) locally and on the remote:
git tag -d tag/last-p4-integ
git push origin :refs/tags/tag/last-p4-integ
  • Update the tag last-p4-integ to point to the new integration point in P4:
git checkout develop
git tag -a tag/last-p4-integ -m "tag pointer to last develop commit integrated with p4"
git push origin master
git push origin tag/last-p4-integ
git push origin p4-integ

Run tests on P4 code-base to verify the integration didn't introduce problems.
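The cherry-pick step can be rehearsed safely before doing it for real. The sketch below builds a throwaway repository where p4-integ has diverged from master, then picks everything after the tag onto it. All file names, commit messages and the demo identity are invented for illustration, and no Perforce commands are involved.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Last commit that was integrated in the previous round
echo base > base.txt
git add base.txt
git commit -q -m "base"
git tag -a tag/last-p4-integ -m "last integrated commit"

# p4-integ carries a commit that master never sees, so master is
# not an ancestor of it
git checkout -q -b p4-integ
echo "p4 only" > p4-only.txt
git add p4-only.txt
git commit -q -m "change that exists only on the P4 side"

# New git-side work since the tag
git checkout -q master
echo one > one.txt
git add one.txt
git commit -q -m "feature one"
echo two > two.txt
git add two.txt
git commit -q -m "feature two"

# Same shape as the steps above: cleanup branch, then cherry-pick the range
git checkout -q -b cleanup
pending=$(git rev-list --count tag/last-p4-integ..cleanup)
git checkout -q p4-integ
git cherry-pick tag/last-p4-integ..cleanup > /dev/null

echo "commits picked onto p4-integ: $pending"
git log --oneline p4-integ
```

Counting the range with git rev-list before picking is a cheap way to confirm you are about to move the number of commits you expect.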

Moving changes from Perforce to git

This should be done after the git->P4 push has already been completed. Once the tests pass successfully on P4, we can move changes from P4 to git with the following:

git checkout p4-integ
git p4 sync --branch=refs/remotes/p4/master
git p4 rebase
  • The following is a small trick to perform a robust "theirs" merge strategy, squashing the incoming changes down to a single commit. So here it goes:
git checkout -b p4mergebranch #branching off from p4-integ
git merge -s ours master ## ignoring all changes from master
git checkout master
git merge p4mergebranch --squash
git commit -m "Type your integration message"

Note that if you don't specify an integration message, git will write a very long commit message containing all commits (you can always perform a git commit --amend to fix the final commit message before proceeding).

  • Delete the temporary merge branch:
git branch -D p4mergebranch
  • Once we are done with the above, merge changes to develop:
git checkout develop
git merge master
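To convince yourself that the "theirs" trick really makes master take the P4 content as a single commit, you can replay it in a scratch repository. This is a sketch under stated assumptions: app.txt, the commit messages and the demo identity are made up, and the P4 side is simulated with an ordinary git commit.

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo "v1" > app.txt
git add app.txt
git commit -q -m "common base"
git branch p4-integ

# master and p4-integ change the same file in conflicting ways
echo "git side" > app.txt
git commit -q -am "git-side change"
git checkout -q p4-integ
echo "p4 side" > app.txt
git commit -q -am "change imported from P4"

# The trick: merge master with "ours" on a temporary branch, which keeps
# the p4-integ content, then squash that result onto master
git checkout -q -b p4mergebranch
git merge -q -s ours master -m "keep p4-integ content, record master as merged"
git checkout -q master
git merge --squash p4mergebranch > /dev/null
git commit -q -m "Integrate P4 changes"
git branch -q -D p4mergebranch

cat app.txt
git log --oneline master
```

Because the "ours" merge already lists master as a parent, the squash on master cannot conflict; it simply stages the difference between master and the P4 content.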

Develop may have moved on since we picked changes from it, so this merge may first need to reconcile those newer commits. It's important, however, to move the last-p4-integ tag to the right commit, and in particular not to the merge commit on develop. The safest way to do this is to tag the current state of master:

  • Remove old tag locally and on the remote:
git tag -d tag/last-p4-integ
git push origin :refs/tags/tag/last-p4-integ
  • Create tag at new position:
git checkout master
git tag -a tag/last-p4-integ -m "tag pointer to last develop commit integrated with p4"
  • Now push master, develop, p4-integ and tag/last-p4-integ to origin:
git push origin master
git push origin develop
git push origin p4-integ
git push origin tag/last-p4-integ
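Before pushing, it can be worth checking that the tag really sits on the tip of master and not on a later commit on develop. The helper below is a sketch: tag_points_at is a hypothetical name, and the scratch repository only mimics the shape of the branches.

```shell
set -e

# Hypothetical helper: succeeds when the tag and the branch resolve to
# the same commit (annotated tags are peeled with ^{commit}).
tag_points_at() {
  [ "$(git rev-parse "$1^{commit}")" = "$(git rev-parse "$2^{commit}")" ]
}

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo a > a.txt
git add a.txt
git commit -q -m "squashed P4 integration"
git tag -a tag/last-p4-integ -m "tag pointer to last develop commit integrated with p4"

# develop moves ahead after the integration
git checkout -q -b develop
echo b > b.txt
git add b.txt
git commit -q -m "later work on develop"

if tag_points_at tag/last-p4-integ master; then on_master=yes; else on_master=no; fi
if tag_points_at tag/last-p4-integ develop; then on_develop=yes; else on_develop=no; fi
echo "tag on master tip: $on_master, on develop tip: $on_develop"
```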

Conclusions

So that's how you sync between two active development teams using Git and Perforce. The process above evolved over time at TomTom and now has been running without major problems for quite some time. It works, but it's a fair amount of overhead to maintain. If you have the option, we recommend migrating completely to Git.

In any case, if you follow a different approach to maintaining a two-way sync, I'd be very curious to hear about it in the comments below. Or send a tweet at @durdn or @atlassiandev.

Thanks again to Andrea Carlevato, Adolfo Bulfoni and Krzysztof Karczewski.