Last updated Feb 12, 2026

Forge LLMs API

Forge LLMs lets your Forge app call Atlassian‑hosted large language models (LLMs) to add secure AI features without leaving the Atlassian platform. Apps using this API are badged as Runs on Atlassian, indicating they leverage Atlassian’s security, compliance, and scalability. The API provides optimized, governed access to supported models so you can focus on creating innovative AI experiences while Atlassian handles model integration and infrastructure.

Manifest reference for LLM module

See the LLM module reference for details on the llm module for your manifest.yml.

Important:

The app retains its Runs on Atlassian eligibility after the module is added.

Versioning

The llm module is required to enable Forge LLMs. When you add the llm module to an app's manifest.yml, it triggers a major version upgrade and requires administrators of existing installations to review and approve the update.

EAP limitations

During the EAP, you are blocked from deploying your app to the production and staging environments. You cannot distribute your app or list it on the Atlassian Marketplace.

Tutorials and example apps

Node.js SDK

The @forge/llm SDK gives you a lightweight, purpose-built client for invoking Atlassian-hosted LLMs directly from Forge runtime functions.

Use chat() for structured multi-turn exchanges, and stream() to receive LLM responses incrementally as smaller chunks. Provide tool definitions so the model can call typed functions, and inspect the returned usage data to guide adaptive behavior.

For runnable examples (tool wiring, retries, error handling), see the Forge LLMs tutorials and example apps section above.

Method signature

Refer to the request and response schemas for the parameter and return types.

list() => Promise<ModelListResponse>
chat(prompt: Prompt) => Promise<LlmResponse>
stream(prompt: Prompt) => Promise<StreamResponse>

Example usage

Using chat

import { chat } from '@forge/llm';
try {
  const response = await chat({
    model: 'claude-3-7-sonnet-20250219',
    messages: [
      {
        role: 'user', content: 'Write a short poem about Forge LLMs.'
      }
    ],
  });

  console.log("#### LLM response:", JSON.stringify(response));
} catch (err) {
  console.error('#### LLM request failed:', { error: err.context?.responseText });
  throw err;
}

Using stream

import { stream } from '@forge/llm';
try {
  const response = await stream({
    model: 'claude-3-7-sonnet-20250219',
    messages: [
      {
        role: 'user', content: 'Write a short poem about Forge LLMs.'
      }
    ],
  });

  for await (const chunk of response) {
    console.log("#### LLM response:", JSON.stringify(chunk));
  }

  response.close();

} catch (err) {
  console.error('#### LLM request failed:', { error: err.context?.responseText });
  throw err;
}

Module validation

The SDK requires the llm module to be defined in your manifest.yml. If the SDK is used without declaring this module, linting will fail with an error like:

Error: LLM package is used but 'llm' module is not defined in the manifest

Linting can apply this fix to your manifest automatically. After the fix is applied, manifest.yml will include an entry like the following:

modules:
  llm:
    - key: llm-app
      model:
        - claude

Please refer to the LLM module reference for details on how to define the module.

Request and response schemas

Request

interface Prompt {
  model: string;
  messages: {
    role: "system" | "user" | "assistant" | "tool";
    content: string | { type: 'text'; text: string; }[];
  }[];
  max_completion_tokens?: number;
  temperature?: number;
  top_p?: number;
  tools?: {
    type: "function";
    function: {
      name: string;
      description: string;
      parameters: object;
    };
  }[];
  tool_choice?: "auto" | "none" | "required" | { type: "function"; function: { name: string } };
}

Response

interface LlmResponse {
  choices: {
    finish_reason: string;
    index?: number;
    message: {
      content: string | { type: "text"; text: string; }[];
      role: "assistant";
      tool_calls?: {
        id: string;
        type: "function";
        index: number;
        function: { name: string; arguments: object; };
      }[];
    };
  }[];
  usage?: { input_token?: number; output_token?: number; total_token?: number; };
}

interface StreamResponse extends AsyncIterable<LlmResponse> {
  close(): Promise<void> | undefined;
}

interface ModelListResponse {
  models: {
    model: string;
    status: "active" | "deprecated";
  }[];
}
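
To show how the tools, tool_choice, and tool_calls fields fit together, here is a minimal sketch of a tool-calling round trip. The get_issue_status tool, its parameters, and the issue key are illustrative assumptions, not part of the documented API; how tool results are returned to the model is only indicated in a comment.

import { chat } from '@forge/llm';

// Hypothetical tool definition; the name, description, and parameters are examples only.
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_issue_status',
      description: 'Look up the status of an issue by its key.',
      parameters: {
        type: 'object',
        properties: { issueKey: { type: 'string' } },
        required: ['issueKey'],
      },
    },
  },
];

const response = await chat({
  model: 'claude-sonnet-4-5-20250929',
  messages: [{ role: 'user', content: 'What is the status of FORGE-123?' }],
  tools,
  tool_choice: 'auto',
});

// When the model decides to call a tool, the call appears on the assistant message.
const toolCalls = response.choices[0]?.message?.tool_calls ?? [];
for (const call of toolCalls) {
  console.log('Model requested tool:', call.function.name, call.function.arguments);
  // Run your own implementation of the tool here, then send the result back to the
  // model in a follow-up chat() call as an additional message.
}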

Important validation rules

The following request validation rules apply to specific models:

Rule: When adjusting sampling parameters, modify either temperature or top_p. Do not modify both at the same time.
Applies to: claude-haiku-4-5-20251001, claude-sonnet-4-5-20250929
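
For example, a request to one of these models can set temperature and leave top_p unset. This is a sketch, not a complete function:

import { chat } from '@forge/llm';

// Set temperature only; top_p is left at its default so the request passes validation.
const response = await chat({
  model: 'claude-haiku-4-5-20251001',
  messages: [{ role: 'user', content: 'Summarize this page in one sentence.' }],
  temperature: 0.2,
});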

Model selection

We plan to launch with support for three Claude variants: Sonnet, Opus, and Haiku. You choose the model per request, allowing you to balance latency, capability, and cost for each use case.

Supported models

You can use the list method from the SDK to dynamically fetch the list of supported models and their respective status.

Model ID                      Variant   Family   Status   EOL
claude-haiku-4-5-20251001     Haiku     Claude   ACTIVE   -
claude-sonnet-4-20250514      Sonnet    Claude   ACTIVE   -
claude-sonnet-4-5-20250929    Sonnet    Claude   ACTIVE   -
claude-opus-4-1-20250805      Opus      Claude   ACTIVE   -
claude-opus-4-5-20251101      Opus      Claude   ACTIVE   -
claude-opus-4-6               Opus      Claude   ACTIVE   -

As AI models evolve quickly, check regularly for deprecated status and associated EOL dates so your apps do not break.
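
As a sketch, you could fetch the model list when your function runs and warn when the model your app prefers is no longer active. The preferredModel value is an assumption for illustration, and list is assumed to be imported from @forge/llm in the same way as chat and stream.

import { list } from '@forge/llm';

// Hypothetical default model for your app.
const preferredModel = 'claude-sonnet-4-5-20250929';

const { models } = await list();
const entry = models.find((m) => m.model === preferredModel);

if (!entry || entry.status === 'deprecated') {
  console.warn(`${preferredModel} is deprecated or unavailable; switch to an active model.`);
}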

Claude - Opus

  • Most capable (best for complex, deep reasoning tasks)
  • Slowest (higher latency due to depth)
  • Highest cost

Claude - Sonnet

  • Balanced capability
  • Moderate speed
  • Moderate cost

Claude - Haiku

  • Fast and efficient (best for lightweight or high‑volume tasks)
  • Lowest cost

AI models evolve quickly, so specific versions may change before launch. Initially only text input/output is supported; multimodal support may be considered later.

Admin experience

Administrators will be informed when an app uses Forge LLMs, both on the Marketplace listing and during installation. Adding Forge LLMs, or a new model family, to an existing app triggers a major version upgrade that requires admin approval.

Usage tracking

The Forge LLM API reports usage data per request (the number of input and output tokens consumed) in the API response.

During the EAP, usage tracking is limited to the data provided in the API response.
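
For example, here is a sketch of reading the usage data from a chat() response, using the optional field names from the LlmResponse schema above:

import { chat } from '@forge/llm';

const response = await chat({
  model: 'claude-haiku-4-5-20251001',
  messages: [{ role: 'user', content: 'Say hello.' }],
});

// usage and its fields are optional, so guard before reading the token counts.
const { input_token, output_token, total_token } = response.usage ?? {};
console.log('Tokens consumed by this request:', { input_token, output_token, total_token });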

Pricing

LLMs will become a paid Forge feature soon. Usage (token input/output volume) will appear in the developer console under usage and costs. Specific pricing will be published before preview.

Responsible AI

Requests to Forge LLMs undergo the same moderation checks as Atlassian first‑party AI and Rovo features. High‑risk messages (per the Acceptable Use Policy) are blocked.

Handling streaming errors

Streaming responses from LLMs increase the risk of delivering incomplete output to the user, particularly in cases of interruptions such as timeouts or network failures.

Instead of resubmitting the original prompt, a better way to recover is to prompt the LLM with prior context. For example, if chunked text responses have been accumulated into a variable storedOutput, you can use the following user prompt to recover:

`You were interrupted in your previous attempt.
Your original instruction was "${originalUserPrompt}".
Continue from the following interrupted output: ${storedOutput}`
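
A minimal sketch of this pattern, assuming storedOutput holds the text accumulated from the interrupted stream and originalUserPrompt holds the prompt that produced it:

import { chat } from '@forge/llm';

// storedOutput and originalUserPrompt are assumed to have been captured earlier.
const recoveryResponse = await chat({
  model: 'claude-sonnet-4-5-20250929',
  messages: [
    {
      role: 'user',
      content: `You were interrupted in your previous attempt.
Your original instruction was "${originalUserPrompt}".
Continue from the following interrupted output: ${storedOutput}`,
    },
  ],
});

console.log('#### Recovered response:', JSON.stringify(recoveryResponse));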

We’ve observed that Claude Sonnet 4.5 typically responds in one of two ways:

  • It successfully resumes text generation by continuing from the previously interrupted output.
  • It explains that the previous output already fulfilled the original instruction, and why.

Detecting when to retry

Platform interruptions can cause streaming to end before a complete response is delivered. In other words, not only can the LLM's text output be cut off prematurely, but the client can also fail to receive the finalizing stream messages. One way to detect an incomplete response, and therefore decide whether to retry, is to check whether a choice object with a finish_reason property was ever received before the stream ended:

import { stream } from '@forge/llm';

let isStreamComplete = false;
const checkIfFinishReasonExists = (chunk) =>
  !!chunk.choices.find(({ finish_reason }) => finish_reason !== undefined);

// Declare response outside the try block so it is still in scope in finally.
let response;
try {
  response = await stream(myPrompt);

  for await (const chunk of response) {
    if (checkIfFinishReasonExists(chunk)) {
      isStreamComplete = true;
    }
  }
} catch (e) {
  // Exceptions are not thrown for streams that finish with incomplete responses.
} finally {
  await response?.close();
}

console.log(`Is the stream complete? ${isStreamComplete}`);
