Start the Server

# Run the engine + web UI
blastai serve

# Run only the engine
blastai serve engine

# Run only the web UI
blastai serve web

# Run the CLI
blastai serve cli

You can then use any OpenAI API client:

from openai import OpenAI

client = OpenAI(
    api_key="not-needed",
    base_url="http://127.0.0.1:8000"
)

Chat Completions API

1. Basic Usage

response = client.chat.completions.create(
    model="not-needed",
    messages=[
        {"role": "user", "content": "Find the 10 heaviest gorillas"}
    ]
)
print(response.choices[0].message.content)

2. Streaming

Enable streaming to receive real-time updates:

response = client.chat.completions.create(
    model="not-needed",
    messages=[
        {"role": "user", "content": "Find the 10 heaviest gorillas"}
    ],
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

The streaming response includes the following (a small filtering sketch follows the list):

  1. An initial role message (role: "assistant")
  2. Chunks whose delta.content is one of:
    • A thought (the content contains a space)
    • A screenshot (the content contains no spaces)
    • The final result
  3. A final chunk with finish_reason: "stop"
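For example, a minimal sketch (reusing the client created above) that applies the space check to print only thoughts and the final result while dropping screenshot chunks:

stream = client.chat.completions.create(
    model="not-needed",
    messages=[
        {"role": "user", "content": "Find the 10 heaviest gorillas"}
    ],
    stream=True
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    # Thoughts and the final result contain spaces; screenshots do not
    if delta and " " in delta:
        print(delta, end="", flush=True)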

3. Conversation

BLAST lets you run multi-turn conversations. The engine’s prefix caching ensures that already-computed browser actions are not repeated unless needed.

response = client.chat.completions.create(
    model="not-needed",
    messages=[
        {"role": "user", "content": "Go to python.org"},
        {"role": "assistant", "content": "I've navigated to python.org"},
        {"role": "user", "content": "Click on Documentation"}
    ]
)
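In practice, the assistant turn is usually taken from a previous call rather than written by hand. A sketch of the same conversation built from the first response:

first = client.chat.completions.create(
    model="not-needed",
    messages=[{"role": "user", "content": "Go to python.org"}]
)

# Feed the assistant's reply back as conversation history
followup = client.chat.completions.create(
    model="not-needed",
    messages=[
        {"role": "user", "content": "Go to python.org"},
        {"role": "assistant", "content": first.choices[0].message.content},
        {"role": "user", "content": "Click on Documentation"}
    ]
)
print(followup.choices[0].message.content)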

4. Caching

By default, BLAST caches both results and the LLM-generated plans used to produce them. Plan caching helps with queries whose result changes over time but whose steps for retrieving the new value stay the same.

Control caching behavior with cache_control options:

response = client.chat.completions.create(
    model="not-needed",
    messages=[
        {
            "role": "user",
            "content": "Find the 10 heaviest gorillas",
            "cache_control": "no-cache,no-store"
        }
    ]
)

Available cache control options:

  • no-cache - Skip results cache lookup
  • no-store - Don’t store in results cache
  • no-cache-plan - Skip plan cache lookup
  • no-store-plan - Don’t store plan in cache
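Options are combined as a comma-separated string, as in the example above. For instance, a sketch that forces BLAST to re-derive the plan without persisting the new one (assuming the plan options combine the same way):

response = client.chat.completions.create(
    model="not-needed",
    messages=[
        {
            "role": "user",
            "content": "Find the 10 heaviest gorillas",
            # Skip the plan cache and don't store the newly generated plan
            "cache_control": "no-cache-plan,no-store-plan"
        }
    ]
)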

Responses API

1. Basic Usage

response = client.responses.create(
    model="not-needed",
    input="Find the 10 heaviest gorillas"
)
print(response.output[0].content[0].text)

2. Streaming

Enable streaming to receive detailed event updates:

stream = client.responses.create(
    model="not-needed",
    input="Find the 10 heaviest gorillas",
    stream=True
)

for event in stream:
    if event.type == "response.output_text.delta":
        # Skip screenshots (no spaces in delta)
        if ' ' in event.delta:
            print(event.delta, end="", flush=True)

BLAST emits a sequence of events during streaming (a small handler sketch follows the list):

  1. response.created - Initial event when the response is created
  2. response.in_progress - Task processing has started
  3. response.output_text.delta - Each streaming event’s delta is either:
    • A thought (the content contains a space)
    • A screenshot (the content contains no spaces)
  4. response.output_text.done - An output text is complete
  5. response.completed - The response is finished and all events have been sent
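For example, a sketch of a loop that reacts to more than one event type (only the fields shown above are used; the completion marker printed at the end is purely illustrative):

stream = client.responses.create(
    model="not-needed",
    input="Find the 10 heaviest gorillas",
    stream=True
)

for event in stream:
    if event.type == "response.output_text.delta":
        # Thoughts contain spaces; screenshots do not
        if " " in event.delta:
            print(event.delta, end="", flush=True)
    elif event.type == "response.completed":
        print("\n[done]")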

3. Conversation

The Responses API supports stateful conversations using previous response IDs:

# First response
response1 = client.responses.create(
    model="not-needed",
    input="Go to python.org"
)

# Follow-up using previous response ID
response2 = client.responses.create(
    model="not-needed",
    input="Click on Documentation",
    previous_response_id=response1.id
)

4. Caching

The same cache_control options apply to the Responses API:

response = client.responses.create(
    model="not-needed",
    input="Find the 10 heaviest gorillas",
    cache_control="no-cache,no-store"
)

Next Steps