Last updated Aug 13, 2024

Integration Roles and Responsibilities

Document control

Current version: 2023/08/10

Date | Change log
Aug 10, 2023 | Further clarification on handling of UGC/PII in source data
Aug 8, 2023 | Clarify read-through requirements in FedRAMP
Aug 4, 2023 | Minor changes regarding how to backfill and the read-through multi-region requirement
Aug 3, 2023 | Add a section about transformer ownership for external ingestion data providers

Purpose

In the original CP/CS path, data for TCS is sourced from various CP providers and centrally managed by CS, with clear, well-defined expectations. DROID allows teams to feed their data into TCS without integrating with CP when their data should not or cannot be part of CS records. Successful integration with DROID requires collaboration between the Context team and requesting teams, not just in building the integration but also in maintaining it. The key to that success is a clear, well-defined set of expectations for all teams involved. To scale DROID, our goal is to inspire requesting teams to own their integrations, so that the Context team's BAU load (and team size) does not have to grow linearly with the number of use cases.

Roles

Role | Overview
DROID owner | Context team. We own all the services (TS, TCS), tooling (TCS sidecar) and infrastructure that form the DROID solution.
DROID data provider | A team within Atlassian that owns some focused data and wants to distribute it to other teams within the company, where consumption has strict requirements for high availability, low latency and very large scale.
DROID data consumer | A team within Atlassian that needs to consume some data in DROID.

Responsibilities

DROID owner

  1. Run and maintain DROID the service (TCS) and supporting tool (TCS sidecar)
  2. Work with data providers to onboard their data into TCS
  3. Work with data consumers so they can access data from TCS
    • Make sure they use TCS sidecars to access TCS
    • Make sure they’re aware of best practices
  4. Publish and maintain SLOs to ensure that both data providers and data consumers can entrust DROID with:
    • TCS availability
    • TCS /entity latency
    • TCS sidecar /entity latency (open question: this is partly controlled by sidecar owners)
    • End-to-end latency, measured from a source data change to that change being visible in the sidecars (open question: is this also partly, or even more, controlled by sidecar owners?)

DROID data provider

Common

  1. Work with Context team to onboard their data into TCS
    • Have a full-time owner to represent the provider during onboarding
    • Fill out a DROID questionnaire to the best of their ability
    • Follow best practices and recommendations from DROID owner
  2. Ensure UGC or PII is not published to TCS in plain text. More detail.
  3. Have up-to-date contact information (on-call roster, Slack channel, service owner) in Microscope/Compass. Maintain a 24/7 on-call roster and a Slack channel for non-urgent support during business hours. Assist DROID owners with incident response. Resolve any problems causing customer impact or alert noise within reasonable timeframes.
  4. Manage access for DROID consumers of their data. Inform DROID owners when adding consumers.

Provider using external ingestion path

  1. Work with DROID owner to agree upon StreamHub event schema, AVI and authentication/authorisation arrangements.
  2. Develop and maintain transformers (if needed) to denormalise their source data into formats most suitable for their consumers.
  3. Keep ingestion rate to a mutually agreed level. Inform DROID owner in advance if increased rate is required.
  4. Have reasonable monitoring for their StreamHub events publishing
  5. Have reasonable integration/contract testing to ensure their events are always successfully ingested by DROID
  6. Run and maintain an endpoint to bootstrap data for backfilling
    • Work with DROID owner on bootstrapping contracts for the endpoint
    • The bootstrapping endpoint must be able to sustain a sufficient throughput in order to ensure that all data can be restored into DROID in a reasonable timeframe in case of catastrophic accidents.
  7. Perform backfilling of their own data when necessary, either by using a self-serve tool provided by the DROID owner or by raising a backfilling request in the DROID help channel here. Note (Aug 4, 2023): this self-serve functionality is yet to be built. Please raise a backfilling request when needed.
  8. When the number of transformation/ingestion failures is greater than zero but less than or equal to the threshold, the disturbed person on the roster will be asked by the DROID owner to investigate and resolve the issue.
  9. When the number of transformation/ingestion failures exceeds the threshold, an incident will be raised, and the on-call person on the roster will be paged by the DROID owner to investigate and resolve the issue.
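
Items 8 and 9 above describe a two-level escalation rule based on the failure count. A minimal sketch of that rule follows, assuming a hypothetical `classify_failures` helper and `Escalation` type. Neither name is part of DROID, and the actual threshold value is not stated in this document.

```python
from enum import Enum


class Escalation(Enum):
    NONE = "none"                # no failures, nothing to do
    INVESTIGATE = "investigate"  # disturbed person is asked to investigate
    INCIDENT = "incident"        # incident is raised and the on-call person is paged


def classify_failures(failure_count: int, threshold: int) -> Escalation:
    """Map a transformation/ingestion failure count to an escalation level.

    Hypothetical helper for illustration only; the threshold is whatever
    value the provider and the DROID owner use in practice.
    """
    if failure_count < 0 or threshold < 0:
        raise ValueError("failure_count and threshold must be non-negative")
    if failure_count == 0:
        return Escalation.NONE
    if failure_count <= threshold:
        # Greater than zero but less than or equal to the threshold (item 8).
        return Escalation.INVESTIGATE
    # Exceeds the threshold (item 9).
    return Escalation.INCIDENT
```

Note that a count exactly equal to the threshold falls under item 8 (investigate), not item 9 (incident).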

Provider using read-through path

  1. Implement HTTP endpoints following the integration guide here: DROID - Read-through Backing Store.
  2. Inform #help-tcs of any significant changes over time, including changes to the contract, forecast growth and usage patterns.
  3. If cache invalidation is required, publish events to StreamHub (event schema)
  4. Maintain a Tier-1 (or better) service-level, with deployments in at least 2 regions
  5. Define SLOs for reliability and latency (in conversation with TCS owners) and configure monitoring for the endpoints called by DROID
  6. Define a triage process/runbook, outlining possible failure modes and points of contact.
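
As one way to picture item 5 above, the sketch below checks a batch of request samples against a reliability target and a latency target. The `Sample` type, the `meets_slo` function and the target values in the usage note are all hypothetical. Actual targets are agreed with TCS owners, and real monitoring would normally be configured in a metrics platform rather than in application code.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Sample:
    """One observed call to a read-through endpoint (hypothetical shape)."""
    latency_ms: float
    ok: bool


def percentile(values: Sequence[float], pct: int) -> float:
    """Return the nearest-rank percentile of a non-empty sequence."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_slo(samples: Sequence[Sample],
              min_success_ratio: float,
              max_p99_latency_ms: float) -> bool:
    """Check both the reliability target and the p99 latency target."""
    if not samples:
        raise ValueError("samples must not be empty")
    success_ratio = sum(1 for s in samples if s.ok) / len(samples)
    p99 = percentile([s.latency_ms for s in samples], 99)
    return success_ratio >= min_success_ratio and p99 <= max_p99_latency_ms
```

For example, `meets_slo(samples, min_success_ratio=0.95, max_p99_latency_ms=50)` returns `True` only when at least 95% of the sampled calls succeeded and the 99th-percentile latency is at most 50 ms. Both numbers are illustrative.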

DROID data consumer

  1. Work with Context team to get data from TCS
    • Integrate with TCS sidecar
    • Follow best practices provided by DROID owner
  2. Have reasonable monitoring for queries made against TCS sidecars.
  3. Avoid making keyspace-walking queries against DROID.
  4. Inform DROID owner of significant increases in throughput of queries
