Documentation Index
Fetch the complete documentation index at: https://docs.blastproject.org/llms.txt
Use this file to discover all available pages before exploring further.
Start the Server
```bash
# Run the engine + web UI
blastai serve

# Run only the engine
blastai serve engine

# Run only the web UI
blastai serve web

# Run the CLI
blastai serve cli
```
You can then use any OpenAI API client:
```python
from openai import OpenAI

client = OpenAI(
    api_key="not-needed",
    base_url="http://127.0.0.1:8000"
)
```
Chat Completions API
1. Basic Usage
```python
response = client.chat.completions.create(
    model="not-needed",
    messages=[
        {"role": "user", "content": "Find the 10 heaviest gorillas"}
    ]
)
print(response.choices[0].message.content)
```
2. Streaming
Enable streaming to receive real-time updates:
```python
response = client.chat.completions.create(
    model="not-needed",
    messages=[
        {"role": "user", "content": "Find the 10 heaviest gorillas"}
    ],
    stream=True
)
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
The streaming response includes:
- An initial role message (`role: "assistant"`)
- Chunks whose `delta.content` contains either:
  - A thought (the string contains `" "`)
  - A screenshot (no spaces in the content)
- The final result
- A final chunk with `finish_reason: "stop"`
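The thought-vs-screenshot distinction above can be applied with a small helper. This is a sketch using the space heuristic described in the list; `classify_delta` and `collect_thoughts` are illustrative names, not part of BLAST:

```python
def classify_delta(content: str) -> str:
    """Classify a streaming delta: thoughts are natural-language
    strings (contain a space), screenshots are encoded blobs
    (no spaces)."""
    return "thought" if " " in content else "screenshot"

def collect_thoughts(deltas):
    """Keep only the human-readable thoughts from a list of
    delta.content strings, dropping screenshots."""
    return [d for d in deltas if classify_delta(d) == "thought"]
```

This lets a consumer log readable progress while routing screenshot payloads elsewhere (or discarding them).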
3. Conversation
BLAST lets you run multi-turn conversations. The engine’s prefix caching ensures that already-computed browser actions are not repeated unless needed.
```python
response = client.chat.completions.create(
    model="not-needed",
    messages=[
        {"role": "user", "content": "Go to python.org"},
        {"role": "assistant", "content": "I've navigated to python.org"},
        {"role": "user", "content": "Click on Documentation"}
    ]
)
```
4. Caching
By default, BLAST caches both results and the LLM-generated plans (the steps used to produce those results). Plan caching is useful for queries whose results change over time while the steps to retrieve the latest value stay the same.
Control caching behavior with cache_control options:
```python
response = client.chat.completions.create(
    model="not-needed",
    messages=[
        {
            "role": "user",
            "content": "Find the 10 heaviest gorillas",
            "cache_control": "no-cache,no-store"
        }
    ]
)
```
Available cache control options:
- `no-cache` - Skip results cache lookup
- `no-store` - Don't store in results cache
- `no-cache-plan` - Skip plan cache lookup
- `no-store-plan` - Don't store plan in cache
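Multiple options combine into one comma-separated string. A tiny helper (hypothetical, not part of BLAST) makes the valid combinations explicit:

```python
def cache_control(*options):
    """Join cache control options into the comma-separated string
    expected by the cache_control field, rejecting typos.
    Valid options: no-cache, no-store, no-cache-plan, no-store-plan."""
    valid = {"no-cache", "no-store", "no-cache-plan", "no-store-plan"}
    unknown = set(options) - valid
    if unknown:
        raise ValueError(f"unknown cache options: {sorted(unknown)}")
    return ",".join(options)
```

For example, `cache_control("no-cache-plan", "no-store-plan")` bypasses only the plan cache while still reusing cached results.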
Responses API
1. Basic Usage
```python
response = client.responses.create(
    model="not-needed",
    input="Find the 10 heaviest gorillas"
)
print(response.output[0].content[0].text)
```
2. Streaming
Enable streaming to receive detailed event updates:
```python
stream = client.responses.create(
    model="not-needed",
    input="Find the 10 heaviest gorillas",
    stream=True
)
for event in stream:
    if event.type == "response.output_text.delta":
        # Print only thoughts; skip screenshots (no spaces in delta)
        if ' ' in event.delta:
            print(event.delta, end="", flush=True)
```
BLAST emits a sequence of events during streaming:
- `response.created` - Initial event when the response is created
- `response.in_progress` - Task processing has started
- `response.output_text.delta` - Each delta is either:
  - A thought (the content contains `" "`)
  - A screenshot (no spaces in the content)
- `response.output_text.done` - A streamed output text is complete
- `response.completed` - All events have been sent
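The event sequence above can be consumed with a simple dispatcher. As a sketch, events are modeled here as plain `(type, delta)` tuples rather than the client's typed event objects:

```python
def replay_events(events):
    """Walk a recorded list of (type, delta) event tuples, returning
    the concatenated thought text (screenshots, which contain no
    spaces, are ignored) and whether the response completed."""
    thoughts = []
    completed = False
    for etype, delta in events:
        if etype == "response.output_text.delta" and " " in delta:
            thoughts.append(delta)
        elif etype == "response.completed":
            completed = True
    return "".join(thoughts), completed
```

The same dispatch structure works inside the streaming loop shown earlier, with `event.type` and `event.delta` in place of the tuple fields.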
3. Conversation
Support for stateful conversations using previous response IDs:
```python
# First response
response1 = client.responses.create(
    model="not-needed",
    input="Go to python.org"
)

# Follow-up using previous response ID
response2 = client.responses.create(
    model="not-needed",
    input="Click on Documentation",
    previous_response_id=response1.id
)
```
4. Caching
```python
response = client.responses.create(
    model="not-needed",
    input="Find the 10 heaviest gorillas",
    cache_control="no-cache,no-store"
)
```
Next Steps