Last updated: Mar 27, 2020

Data Center App Performance Toolkit User Guide For Bitbucket

To use the Data Center App Performance Toolkit, you'll need to first clone its repo.

git clone git@github.com:atlassian/dc-app-performance-toolkit.git

Follow installation instructions described in the dc-app-performance-toolkit/README.md file.
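
For reference, on Linux or macOS a typical installation looks like the following (a sketch only; the README in the repository is the authoritative source, and the exact Python version and dependencies may differ):

cd dc-app-performance-toolkit
# Create and activate an isolated Python environment for the toolkit (assumes Python 3 is installed)
python3 -m venv venv && source venv/bin/activate
# Install the toolkit dependencies pinned in the repository
pip install -r requirements.txt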

If you need performance testing results at a production level, follow instructions in this chapter to set up Bitbucket Data Center with the corresponding dataset.

For spiking, testing, or development, a local Bitbucket instance works well. In that case, you can skip this chapter and proceed with Testing scenarios, though you may need to adjust the scripts for your local dataset.

Setting up Bitbucket Data Center

We recommend that you use the AWS Quick Start for Bitbucket Data Center to deploy a Bitbucket Data Center testing environment. This Quick Start will allow you to deploy Bitbucket Data Center with a new Atlassian Standard Infrastructure (ASI) or into an existing one.

The ASI is a Virtual Private Cloud (VPC) that contains the subnets, NAT gateways, security groups, bastion hosts, and other infrastructure components required by all Atlassian applications. The Quick Start then deploys Bitbucket into this VPC. Deploying Bitbucket with a new ASI takes around 50 minutes; with an existing one, around 30 minutes.

Using the AWS Quick Start for Bitbucket

If you are a new user, perform an end-to-end deployment. This involves deploying Bitbucket into a new ASI.

If you have already deployed the ASI separately by using the ASI Quick Start or by deploying another Atlassian product (Jira, Bitbucket, or Confluence Data Center), deploy Bitbucket into your existing ASI.

You are responsible for the cost of the AWS services used while running this Quick Start reference deployment. There is no additional price for using this Quick Start. For more information, go to aws.amazon.com/pricing.

To reduce costs, we recommend keeping your deployment up and running only during performance runs.

AWS cost estimation

The AWS Simple Monthly Calculator provides an estimate of usage charges for AWS services based on the information you provide. Monthly charges will be based on your actual usage of AWS services and may vary from the Calculator's estimates.

*The prices below are approximate and may vary depending on factors such as region, instance type, and database deployment type.

Stack | Estimated hourly cost ($)
One Node Bitbucket DC | 1 - 1.3
Two Nodes Bitbucket DC | 1.5 - 1.8
Four Nodes Bitbucket DC | 2.1 - 2.5

Quick Start parameters

All important parameters are listed and described in this section. For all remaining parameters, we recommend using the Quick Start defaults.

Bitbucket setup

Parameter | Recommended Value
Version | 6.10.0

The Data Center App Performance Toolkit officially supports Bitbucket 6.10.0.

Cluster nodes

Parameter | Recommended Value
Bitbucket cluster node instance type | c5.2xlarge
Maximum number of cluster nodes | 1
Minimum number of cluster nodes | 1

We recommend c5.2xlarge to strike a balance between cost and the hardware we see in the field for our enterprise customers. More information can be found in our public recommendations.

The Data Center App Performance Toolkit framework is also set up for the concurrency we expect on this instance size. As such, underprovisioning will likely show a larger performance impact than expected.

File server

Parameter | Recommended Value
File server instance type | m4.xlarge
Home directory size | 1000

Database

Parameter | Recommended Value
Database instance class | db.m4.large
RDS Provisioned IOPS | 1000
Master password | Password1!
Enable RDS Multi-AZ deployment | true
Bitbucket database password | Password1!
Database storage | 100

The Master (admin) password will be used later when restoring the SQL database dataset. If the password is not set to the default value, you'll need to change the BITBUCKET_DB_PASS value manually in the database restore script (see Preloading your Bitbucket deployment with an enterprise-scale dataset below).

Elasticsearch

Parameter | Recommended Value
Elasticsearch instance type | m4.xlarge.elasticsearch
Elasticsearch disk-space per node (GB) | 1000

Networking (for new ASI)

Parameter | Recommended Value
Trusted IP range | 0.0.0.0/0 (for public access) or your own trusted IP range
Availability Zones | Select two availability zones in your region
Permitted IP range | 0.0.0.0/0 (for public access) or your own trusted IP range
Make instance internet facing | true
Key Name | The EC2 Key Pair to allow SSH access. See Amazon EC2 Key Pairs for more info.

Networking (for existing ASI)

Parameter | Recommended Value
Make instance internet facing | true
Permitted IP range | 0.0.0.0/0 (for public access) or your own trusted IP range
Key Name | The EC2 Key Pair to allow SSH access. See Amazon EC2 Key Pairs for more info.

Running the setup wizard

After successfully deploying Bitbucket Data Center in AWS, you'll need to configure it:

  1. In the AWS console, go to Services > CloudFormation > Stack > Stack details > Select your stack.
  2. On the Outputs tab, copy the value of the LoadBalancerURL key.
  3. Open LoadBalancerURL in your browser. This will take you to the Bitbucket setup wizard.
  4. On the Bitbucket setup page, populate the following fields:
    • Application title: any name for your Bitbucket Data Center deployment
    • Base URL: your stack's Elastic LoadBalancer URL
    • License key: select the new evaluation license or existing license checkbox. Click Next.
  5. On the Administrator account setup page, populate the following fields:
    • Username: admin (recommended)
    • Full name: any full name of the admin user
    • Email address: email address of the admin user
    • Password: admin (recommended)
    • Confirm Password: admin (recommended). Click Go to Bitbucket.

After Preloading your Bitbucket deployment with an enterprise-scale dataset, the admin user will have admin/admin credentials.

Preloading your Bitbucket deployment with an enterprise-scale dataset

Data dimensions and values for an enterprise-scale dataset are listed and described in the following table.

Data dimensions | Value for an enterprise-scale dataset
Projects | ~25 000
Repositories | ~52 000
Users | ~25 000
Pull Requests | ~1 000 000
Total files number | ~750 000

All the datasets use the standard admin/admin credentials.

Pre-loading the dataset is a three-step process:

  1. Importing the main dataset. To help you out, we provide an enterprise-scale dataset you can import via the populate_db.sh script.
  2. Restoring attachments. We also provide attachments, which you can pre-load via the upload_attachments.sh script.
  3. Starting Bitbucket Server. Once the dataset is restored, start Bitbucket and update its Base URL.

The following subsections explain each step in greater detail.

Importing the main dataset

You can load this dataset directly into the database (via a populate_db.sh script).

Loading the dataset via populate_db.sh script (~2 hours)

We recommend doing this via the CLI.

To populate the database with SQL:

  1. In the AWS console, go to Services > EC2 > Instances.
  2. On the Description tab, do the following:
    • Copy the Public IP of the Bastion instance.
    • Copy the Private IP of the Bitbucket node instance.
    • Copy the Private IP of the Bitbucket NFS Server instance.
  3. Using SSH, connect to the Bitbucket node via the Bastion instance:

    For Windows, use PuTTY to connect to the Bitbucket node over SSH. For Linux or macOS:

    ssh-add path_to_your_private_key_pem
    export BASTION_IP=bastion_instance_public_ip
    export NODE_IP=node_private_ip
    export SSH_OPTS='-o ServerAliveInterval=60 -o ServerAliveCountMax=30'
    ssh ${SSH_OPTS} -o "proxycommand ssh -W %h:%p ${SSH_OPTS} ec2-user@${BASTION_IP}" ec2-user@${NODE_IP}

    For more information, go to Connecting your nodes over SSH.
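
    On OpenSSH 7.3 or newer, the same connection can be made with the -J (jump host) flag instead of proxycommand (a sketch, using the variables exported above):

    # Hop through the bastion to the node in a single command
    ssh ${SSH_OPTS} -J ec2-user@${BASTION_IP} ec2-user@${NODE_IP}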

  4. Stop Bitbucket Server:

    sudo systemctl stop bitbucket
  5. In a new terminal session, connect to the Bitbucket NFS Server over SSH:

    For Windows, use PuTTY to connect to the Bitbucket NFS Server over SSH. For Linux or macOS:

    ssh-add path_to_your_private_key_pem
    export BASTION_IP=bastion_instance_public_ip
    export NFS_SERVER_IP=nfs_server_private_ip
    export SSH_OPTS='-o ServerAliveInterval=60 -o ServerAliveCountMax=30'
    ssh ${SSH_OPTS} -o "proxycommand ssh -W %h:%p ${SSH_OPTS} ec2-user@${BASTION_IP}" ec2-user@${NFS_SERVER_IP}

    For more information, go to Connecting your nodes over SSH.

  6. Download the populate_db.sh script and make it executable:

    wget https://raw.githubusercontent.com/atlassian/dc-app-performance-toolkit/master/app/util/bitbucket/populate_db.sh && chmod +x populate_db.sh
  7. Review the following Variables section of the script:

    INSTALL_PSQL_CMD="amazon-linux-extras install -y postgresql10"
    DB_CONFIG="/media/atl/bitbucket/shared/bitbucket.properties"
    
    # Depending on BITBUCKET installation directory
    BITBUCKET_CURRENT_DIR="/opt/atlassian/bitbucket/current/"
    BITBUCKET_VERSION_FILE="/media/atl/bitbucket/shared/bitbucket.version"
    
    # DB admin user name, password and DB name
    BITBUCKET_DB_NAME="bitbucket"
    BITBUCKET_DB_USER="postgres"
    BITBUCKET_DB_PASS="Password1!"
    
    # Datasets AWS bucket and db dump name
    DATASETS_AWS_BUCKET="https://centaurus-datasets.s3.amazonaws.com/bitbucket"
    DATASETS_SIZE="large"
  8. Run the script:

    ./populate_db.sh | tee -a populate_db.log

Do not close or interrupt the session. It will take about an hour to restore the SQL database. When the restore is finished, the admin user will have admin/admin credentials.

In case of a failure, check the Variables section and re-run the script.
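
If you used a non-default master password in the Quick Start, one way to update the downloaded script before running it is an in-place substitution (a sketch; your_actual_password is a hypothetical placeholder for the value you chose):

# Point the restore script at the real DB admin password (placeholder value shown)
sed -i 's/BITBUCKET_DB_PASS="Password1!"/BITBUCKET_DB_PASS="your_actual_password"/' populate_db.sh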

Restoring attachments (~2 hours)

After Importing the main dataset, you'll now have to pre-load an enterprise-scale set of attachments.

  1. Using SSH, connect to the Bitbucket NFS Server via the Bastion instance:

    For Windows, use PuTTY to connect to the Bitbucket NFS Server over SSH. For Linux or macOS:

    ssh-add path_to_your_private_key_pem
    export BASTION_IP=bastion_instance_public_ip
    export NFS_SERVER_IP=nfs_server_private_ip
    export SSH_OPTS='-o ServerAliveInterval=60 -o ServerAliveCountMax=30'
    ssh ${SSH_OPTS} -o "proxycommand ssh -W %h:%p ${SSH_OPTS} ec2-user@${BASTION_IP}" ec2-user@${NFS_SERVER_IP}

    For more information, go to Connecting your nodes over SSH.

  2. Download the upload_attachments.sh script and make it executable:

    wget https://raw.githubusercontent.com/atlassian/dc-app-performance-toolkit/master/app/util/bitbucket/upload_attachments.sh && chmod +x upload_attachments.sh
  3. Review the following Variables section of the script:

    DATASETS_AWS_BUCKET="https://centaurus-datasets.s3.amazonaws.com/bitbucket"
    ATTACHMENTS_TAR="attachments.tar.gz"
    DATASETS_SIZE="large"
    ATTACHMENTS_TAR_URL="${DATASETS_AWS_BUCKET}/${BITBUCKET_VERSION}/${DATASETS_SIZE}/${ATTACHMENTS_TAR}"
    NFS_DIR="/media/atl/bitbucket/shared"
    ATTACHMENT_DIR_DATA="data"
  4. Run the script:

    ./upload_attachments.sh | tee -a upload_attachments.log

Do not close or interrupt the session. It will take about two hours to upload attachments.
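
Because the upload runs for around two hours, you may want to protect it from SSH disconnects by running it in the background instead (a sketch):

# Keep the upload running even if the SSH session drops
nohup ./upload_attachments.sh > upload_attachments.log 2>&1 &
# Follow the progress
tail -f upload_attachments.log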

Start Bitbucket Server

  1. Using SSH, connect to the Bitbucket node via the Bastion instance:

    For Windows, use PuTTY to connect to the Bitbucket node over SSH. For Linux or macOS:

    ssh-add path_to_your_private_key_pem
    export BASTION_IP=bastion_instance_public_ip
    export NODE_IP=node_private_ip
    export SSH_OPTS='-o ServerAliveInterval=60 -o ServerAliveCountMax=30'
    ssh ${SSH_OPTS} -o "proxycommand ssh -W %h:%p ${SSH_OPTS} ec2-user@${BASTION_IP}" ec2-user@${NODE_IP}

    For more information, go to Connecting your nodes over SSH.

  2. Start Bitbucket Server:

    sudo systemctl start bitbucket
  3. Wait 10-15 minutes until Bitbucket Server has started (a status-check sketch follows this list).

  4. Open a browser and navigate to LoadBalancerURL.
  5. Log in as the admin user.
  6. Go to cog icon > Server settings, set Base URL to the LoadBalancerURL value, and click Save.
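
Rather than waiting a fixed 10-15 minutes in step 3, you can poll Bitbucket's status endpoint until the instance reports RUNNING (a sketch; assumes the default /status endpoint is reachable through the load balancer):

# Poll every 60 seconds until Bitbucket reports a RUNNING state
until curl -s "http://<LoadBalancerURL>/status" | grep -q RUNNING; do sleep 60; done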

Elasticsearch Index

If your app does not use Bitbucket search functionality, just skip this section.

Otherwise, if your app depends on Bitbucket search functionality, you need to wait until the Elasticsearch index is finished. The bitbucket-project and bitbucket-repository indexes usually take about 10 hours on the configuration recommended in this guide; the bitbucket-search index (search by repository content) could take up to a couple of days.

To check status of indexing:

  1. Open LoadBalancerURL in your browser.
  2. Log in as the admin user.
  3. Navigate to LoadBalancerURL/rest/indexing/latest/status page.
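
The same status page can be queried from the command line (a sketch; assumes the default admin/admin credentials):

# Check Elasticsearch indexing progress via the REST endpoint
curl -s -u admin:admin "http://<LoadBalancerURL>/rest/indexing/latest/status"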

In case of any difficulties with index generation, contact us for support in the community Slack #data-center-app-performance-toolkit channel.

Testing scenarios

Using the Data Center App Performance Toolkit for performance and scale testing of your Data Center app involves two test scenarios:

  • Performance regression
  • Scalability testing

Each scenario involves multiple test runs. The following subsections explain both in greater detail.

Scenario 1: Performance regression

This scenario helps identify basic performance issues without the need to spin up a multi-node Bitbucket DC, and verifies that the app does not have any performance impact when it is not exercised.

Run 1 (~1 hour)

To receive performance baseline results without an app installed:

  1. On the computer where you cloned the Data Center App Performance Toolkit, navigate to the dc-app-performance-toolkit/app folder.
  2. Open the bitbucket.yml file and fill in the following variables:
    • application_hostname: your_dc_bitbucket_instance_hostname without protocol
    • application_protocol: HTTP or HTTPS
    • application_port: for HTTP - 80, for HTTPS - 443, or your instance-specific port. Self-signed certificates are not supported.
    • admin_login: admin user username
    • admin_password: admin user password
    • concurrency: number of concurrent users for JMeter scenario - we recommend you use the defaults to generate full-scale results.
    • test_duration: duration of the performance run - we recommend you use the defaults to generate full-scale results.
    • ramp-up: amount of time it will take JMeter to add all test users to test execution - we recommend you use the defaults to generate full-scale results.
  3. Run bzt.

    bzt bitbucket.yml
  4. View the following main results of the run in the dc-app-performance-toolkit/app/results/bitbucket/YY-MM-DD-hh-mm-ss folder:

    • results.csv: aggregated .csv file with all actions and timings
    • bzt.log: logs of the Taurus tool execution
    • jmeter.*: logs of the JMeter tool execution
    • pytest.*: logs of Pytest-Selenium execution

When the execution has successfully completed, the INFO: Artifacts dir: line with the full path to the results directory will be displayed in the console output. Save this full path; later you will insert it under runName: "without app" for report generation.
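
If you lose the console output, the most recent results directory can be recovered from the filesystem (run from the dc-app-performance-toolkit/app folder):

# Print the newest results directory for the Bitbucket runs
ls -td results/bitbucket/*/ | head -1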

Run 2 (~1 hour)

To receive performance results with an app installed:

  1. Install the app you want to test.
  2. Run bzt.

    bzt bitbucket.yml

When the execution has successfully completed, the INFO: Artifacts dir: line with the full path to the results directory will be displayed in the console output. Save this full path; later you will insert it under runName: "with app" for report generation.

Generating a performance regression report

To generate a performance regression report:

  1. Navigate to the dc-app-performance-toolkit/app/reports_generation folder.
  2. Edit the performance_profile.yml file:
    • Under runName: "without app", in the fullPath key, insert the full path to results directory of Run 1.
    • Under runName: "with app", in the fullPath key, insert the full path to results directory of Run 2.
  3. Run the following command:

    python csv_chart_generator.py performance_profile.yml
  4. In the dc-app-performance-toolkit/app/results/reports/YY-MM-DD-hh-mm-ss folder, view the .csv file (with consolidated scenario results) and the .png file.

Analyzing report

Once completed, you will be able to review the action timings with and without your app to see its impact on the performance of the instance. If you see a significant impact (>10%) on any action timing, we recommend taking a look into the app implementation to understand the root cause of this delta.

Scenario 2: Scalability testing

The purpose of scalability testing is to reflect the impact on the customer experience when operating across multiple nodes. For this, you have to run scale testing on your app.

For many apps and extensions to Atlassian products, there should not be a significant performance difference between operating on a single node or across many nodes in a Bitbucket DC deployment. To demonstrate the performance impact of operating your app at scale, we recommend testing your Bitbucket DC app in a cluster.

Extending the base action

Extension scripts, which extend the base Selenium (bitbucket-ui.py) scripts, are located in a separate folder (dc-app-performance-toolkit/extension/bitbucket). You can modify these scripts to include your app-specific actions.

Modifying Selenium

You can extend Selenium scripts to measure the end-to-end browser timings.

We use Pytest to drive Selenium tests. The bitbucket-ui.py executor script is located in the app/selenium_ui/ folder. This file contains all browser actions, defined by the test_ functions. These actions are executed one by one during the testing.

In the bitbucket-ui.py script, view the following block of code:

# def test_1_selenium_custom_action(webdriver, datasets, screen_shots):
#     custom_action(webdriver, datasets)

This is a placeholder to add an extension action. The custom action can be moved to a different line, depending on the required workflow, as long as it is between the login (test_0_selenium_a_login) and logout (test_2_selenium_z_log_out) actions.

To implement the custom_action function, modify the extension_ui.py file in the extension/bitbucket/ directory. The following is an example of the custom_action function, where Selenium navigates to a URL and waits until an element is visible:

def custom_action(webdriver, datasets):
    @print_timing
    def measure(webdriver, interaction):
        @print_timing
        def measure(webdriver, interaction):
            webdriver.get(f'{APPLICATION_URL}/plugins/servlet/some-app/reporter')
            WebDriverWait(webdriver, timeout).until(EC.visibility_of_element_located((By.ID, 'plugin-element')))
        # Inner call: records the timing of this specific sub-action
        measure(webdriver, 'selenium_app_custom_action:view_report')
    # Outer call: records the timing of the custom action as a whole
    measure(webdriver, 'selenium_app_custom_action')

To view more examples, see the modules.py file in the selenium_ui/bitbucket directory.

Running tests with your modification

To ensure that the tests run in parallel without errors, run your extension scripts together with the base scripts as a sanity check.

Run 3 (~1 hour)

To receive scalability benchmark results for one-node Bitbucket DC with app-specific actions, run bzt:

bzt bitbucket.yml

When the execution has successfully completed, the INFO: Artifacts dir: line with the full path to the results directory will be displayed. Save this full path; later you will insert it under runName: "Node 1" for report generation.

Run 4 (~1 hour)

To receive scalability benchmark results for two-node Bitbucket DC with app-specific actions:

  1. In the AWS console, go to CloudFormation > Stack details > Select your stack.
  2. On the Update tab, select Use current template, and then click Next.
  3. Enter 2 in the Maximum number of cluster nodes and the Minimum number of cluster nodes fields.
  4. Click Next > Next > Update stack and wait until the stack is updated.
  5. Run bzt.

    bzt bitbucket.yml

When the execution has successfully completed, the INFO: Artifacts dir: line with the full path to the results directory will be displayed in the console output. Save this full path; later you will insert it under runName: "Node 2" for report generation.

Run 5 (~1 hour)

To receive scalability benchmark results for four-node Bitbucket DC with app-specific actions:

  1. Scale your Bitbucket Data Center deployment to 4 nodes the same way as in Run 4.
  2. Check that the Elasticsearch index has synchronized to the new nodes (you can use the indexing status page described in Elasticsearch Index above).
  3. Run bzt.

    bzt bitbucket.yml

When the execution has successfully completed, the INFO: Artifacts dir: line with the full path to the results directory will be displayed in the console output. Save this full path; later you will insert it under runName: "Node 4" for report generation.

Generating a report for scalability scenario

To generate a scalability report:

  1. Navigate to the dc-app-performance-toolkit/app/reports_generation folder.
  2. Edit the scale_profile.yml file:
    • For runName: "Node 1", in the fullPath key, insert the full path to results directory of Run 3.
    • For runName: "Node 2", in the fullPath key, insert the full path to results directory of Run 4.
    • For runName: "Node 4", in the fullPath key, insert the full path to results directory of Run 5.
  3. Run the following command:

    python csv_chart_generator.py scale_profile.yml
  4. In the dc-app-performance-toolkit/app/results/reports/YY-MM-DD-hh-mm-ss folder, view the .csv file (with consolidated scenario results) and the .png file.

Analyzing report

Once completed, you will be able to review action timings on Bitbucket Data Center with different numbers of nodes. If you see a significant variation in any action timings between configurations, we recommend taking a look into the app implementation to understand the root cause of this delta.

After completing all your tests, delete your Bitbucket Data Center stacks.
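
Stacks can be deleted from the CloudFormation console, or with the AWS CLI (a sketch; substitute your actual stack name):

# Tear down the deployment so it stops incurring AWS charges
aws cloudformation delete-stack --stack-name <your-stack-name>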

Support

In case of technical questions, issues or problems with DC Apps Performance Toolkit, contact us for support in the community Slack #data-center-app-performance-toolkit channel.