Commit Indexer Plugin Module

Introduction

Bitbucket Data Center has a lightweight indexing process that occurs every time new commits are pushed to its repositories. Commit indexer plugin modules let you extend the indexing pipeline and store additional indexed data. This allows your plugin to, for example, watch commit messages or other commit metadata for particular content or efficiently store additional information about commits for later retrieval.

Example commit indexers might include:

An indexer that looks for a particular substring or pattern in a commit message, such as an issue key or build tracking number (this is how Bitbucket Data Center's bundled Jira key indexing is implemented).
An indexer that records commit statistics to be used later for generating reports or charts.
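
As an illustration of the first kind of indexer, the sketch below pulls issue-key-like tokens out of a commit message. It is not the bundled Jira key indexer: the `IssueKeyExtractor` class and the key pattern (an uppercase project prefix, a hyphen, then digits) are assumptions made for this example.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the Bitbucket API.
final class IssueKeyExtractor {

    // Assumed key shape: uppercase prefix of two or more characters, a hyphen, digits (e.g. "ABC-123").
    private static final Pattern ISSUE_KEY = Pattern.compile("\\b[A-Z][A-Z0-9]+-\\d+\\b");

    private IssueKeyExtractor() {
    }

    // Returns the distinct keys found in the message, in order of first appearance.
    static Set<String> extract(String message) {
        Set<String> keys = new LinkedHashSet<>();
        if (message == null) {
            return keys;
        }
        Matcher matcher = ISSUE_KEY.matcher(message);
        while (matcher.find()) {
            keys.add(matcher.group());
        }
        return keys;
    }
}
```

An indexer could call `IssueKeyExtractor.extract(commit.getMessage())` from `onCommitAdded` and store each key as a property. The pattern is precompiled and the scan is a single in-memory pass over the message, which keeps the per-commit cost low.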

Commit indexers should be fast. The indexing process is applied to every commit pushed to Bitbucket Data Center, so long-running or computationally expensive indexing operations can have a serious impact on server performance. Where possible, indexers should avoid making database queries or calling service methods which might fork git processes. If those types of operations are required for the indexer to work, look for ways to do that processing in batches to reduce the overhead from repeatedly accessing the database or spawning new processes.

Commit indexers may be invoked concurrently, so they should avoid storing local state. Instead, each indexing run supplies an IndexingContext, a simple key/value store that holds such state in a way that is associated with that specific run. State stored in the IndexingContext is not added to the CommitIndex. To add new properties to the index, you must call CommitIndex.addProperty(String, String, String). Key/value pairs stored in the IndexingContext are discarded when the indexing run completes.
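
The split between per-run state and indexed properties can be sketched as follows. To keep the example runnable without the Bitbucket libraries, `RunContext` and `PropertyIndex` are hypothetical stand-ins for IndexingContext and CommitIndex, and their `get`/`put`/`getProperty` methods are assumptions rather than the real signatures; only the three-string `addProperty` shape comes from the text above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for IndexingContext: a key/value store scoped to one indexing run.
final class RunContext {
    private final Map<String, Object> values = new HashMap<>();

    @SuppressWarnings("unchecked")
    <T> T get(String key) {
        return (T) values.get(key);
    }

    void put(String key, Object value) {
        values.put(key, value);
    }
}

// Hypothetical stand-in for CommitIndex: stores (commit ID, key, value) properties.
final class PropertyIndex {
    private final Map<String, String> properties = new ConcurrentHashMap<>();

    void addProperty(String commitId, String key, String value) {
        properties.put(commitId + "/" + key, value);
    }

    String getProperty(String commitId, String key) {
        return properties.get(commitId + "/" + key);
    }
}

// The indexer holds no mutable fields, so concurrent runs cannot interfere with each other.
final class CountingIndexer {
    private static final String COUNT_KEY = "example.commitCount"; // hypothetical key name

    private final PropertyIndex index;

    CountingIndexer(PropertyIndex index) {
        this.index = index;
    }

    void onBeforeIndexing(RunContext context) {
        context.put(COUNT_KEY, 0);
    }

    void onCommitAdded(String commitId, String authorEmail, RunContext context) {
        // Scratch state for this run lives in the context, not in a field.
        Integer seen = context.get(COUNT_KEY);
        context.put(COUNT_KEY, seen + 1);

        // Anything that must outlive the run is written to the index explicitly.
        if (authorEmail != null && authorEmail.endsWith("@atlassian.com")) {
            index.addProperty(commitId, "byAtlassian", "true");
        }
    }

    int commitsSeen(RunContext context) {
        Integer seen = context.get(COUNT_KEY);
        return seen;
    }
}
```

Two runs that share one `CountingIndexer` but use separate contexts keep separate counts, and the count never reaches the index because nothing passes it to `addProperty`.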

See the CommitIndexer documentation for more details.

Dos and Don'ts

Don't perform expensive operations for every commit. Doing so slows down other indexers. If you must perform expensive operations, consider offloading them to background threads. In this context, any operation involving I/O is considered expensive.
Do handle seeing the same commit twice. Write your indexers so that they can handle receiving the same commit multiple times, both in the same repository and across different repositories.
Do consider memory usage. Your indexer may be called 100,000 times in a single indexing run, so avoid holding large amounts of data in memory.
Do consider whether bulk processing is more efficient than processing each commit individually. For example, consider recording added/removed hashes and then either processing them in bulk in the onAfterIndexing callback or scheduling additional bulk processing from that callback.
Do consider permissions. The indexer receives all commits for all repositories. What you do with them is up to you, but when you present the information in the UI, keep in mind that the user may not have permission to see the commit in any repository.
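
Several of these points (batching, bounded memory, and tolerating repeated commits) can be combined in one pattern, sketched below under stated assumptions. `BatchContext` is a hypothetical stand-in for IndexingContext, `BatchSink` is a hypothetical interface representing the expensive bulk operation, and the batch size is an arbitrary choice; none of these names come from the Bitbucket API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for IndexingContext (one instance per indexing run).
final class BatchContext {
    private final Map<String, Object> values = new HashMap<>();

    @SuppressWarnings("unchecked")
    <T> T get(String key) {
        return (T) values.get(key);
    }

    void put(String key, Object value) {
        values.put(key, value);
    }
}

// Hypothetical sink for the expensive work, e.g. one database write per batch.
interface BatchSink {
    void process(List<String> commitIds);
}

final class BatchingIndexer {
    private static final String PENDING_KEY = "example.pendingCommits"; // hypothetical key name

    private final BatchSink sink;
    private final int batchSize; // caps how many hashes are held in memory at once

    BatchingIndexer(BatchSink sink, int batchSize) {
        this.sink = sink;
        this.batchSize = batchSize;
    }

    void onBeforeIndexing(BatchContext context) {
        context.put(PENDING_KEY, new LinkedHashSet<String>());
    }

    void onCommitAdded(String commitId, BatchContext context) {
        Set<String> pending = context.get(PENDING_KEY);
        // Using a Set makes a repeated commit within the current batch a no-op.
        pending.add(commitId);
        if (pending.size() >= batchSize) {
            flush(pending);
        }
    }

    void onAfterIndexing(BatchContext context) {
        Set<String> pending = context.get(PENDING_KEY);
        flush(pending); // process whatever is left at the end of the run
    }

    private void flush(Set<String> pending) {
        if (pending.isEmpty()) {
            return;
        }
        sink.process(new ArrayList<>(pending));
        pending.clear();
    }
}
```

The set only removes duplicates within a batch. A commit seen again after a flush, in a later run, or in another repository reaches the sink a second time, so the bulk operation itself still needs to be idempotent.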

Configuration

The root element for the Commit Indexer plugin module is <commit-indexer/>. It allows the following attributes for configuration:

Attributes

Name	Required	Description	Default
key	Yes	The identifier of the plugin module. This key must be unique within the plugin where it is defined.	N/A
class	Yes	The fully qualified Java class name of the indexer. This class must implement `CommitIndexer`.	N/A

Example

Here is an example atlassian-plugin.xml file containing a single commit indexer:

<atlassian-plugin name="My Commit Indexer" key="example.plugin.indexer" plugins-version="2">
    <plugin-info>
        <description>My Commit Indexer Test</description>
        <vendor name="My Company" url="http://www.mycompany.com"/>
        <version>1.0</version>
    </plugin-info>

    <commit-indexer key="myCommitIndexer" class="com.mycompany.example.plugin.MyCommitIndexer"/>
</atlassian-plugin>

And here's an example CommitIndexer implementation that records whether a commit's author has an Atlassian email address:

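import com.atlassian.bitbucket.commit.Commit;
import com.atlassian.bitbucket.idx.CommitIndex;
import com.atlassian.bitbucket.idx.CommitIndexer;
import com.atlassian.bitbucket.idx.IndexingContext;
import com.atlassian.bitbucket.repository.Repository;

import javax.annotation.Nonnull;

public class MyCommitIndexer implements CommitIndexer {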

    private final CommitIndex index;

    public MyCommitIndexer(CommitIndex index) {
        this.index = index;
    }

    @Nonnull
    @Override
    public String getId() {
        // must be unique across *all* indexers
        return "atlassian-author-indexer";
    }

    @Override
    public boolean isEnabledForRepository(@Nonnull Repository repository) {
        // allows conditionally indexing repositories
        return true;
    }

    @Override
    public void onAfterIndexing(@Nonnull IndexingContext context) {
        // no-op, no tear down required
    }

    @Override
    public void onBeforeIndexing(@Nonnull IndexingContext context) {
        // no-op, no setup required
    }

    @Override
    public void onCommitAdded(@Nonnull Commit commit, @Nonnull IndexingContext context) {
        // set the 'byAtlassian' property if the author's email address ends in '@atlassian.com'
        String email = commit.getAuthor().getEmailAddress();
        if (email != null && email.endsWith("@atlassian.com")) {
            index.addProperty(commit.getId(), "byAtlassian", "true");
        }
    }

    @Override
    public void onCommitRemoved(@Nonnull Commit commit, @Nonnull IndexingContext context) {
        // no-op, no clean up required - indexed properties are generally cleaned up automatically
    }
}

Usually the methods on CommitIndex and CommitService are sufficient for dealing with properties. However, the Commit Property Config Plugin Module can be used to automatically decorate commits returned by Bitbucket Data Center's Java API and REST API with properties indexed by your plugin.