Start the Server
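No command survives in this section, so the following is a sketch based on the BLAST quickstart; the package and CLI name (`blastai`) and the default port (8000) are assumptions to verify against your installation:

```shell
# Install the package and launch the local OpenAI-compatible server
# (package/CLI name assumed from the BLAST quickstart; listens on port 8000 by default)
pip install blastai
blastai serve
```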
Chat Completions API
1. Basic Usage
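The original request example did not survive here, so below is a minimal stdlib-only sketch. It assumes an OpenAI-compatible BLAST server at `http://127.0.0.1:8000`; the endpoint path and the `model` value are placeholders (an OpenAI SDK client pointed at the same base URL works equally well):

```python
import json
from urllib import request

BASE_URL = "http://127.0.0.1:8000"  # assumed local default; adjust as needed

def build_chat_request(task: str, model: str = "not-needed") -> dict:
    """Standard Chat Completions payload; the user message is the browser task."""
    return {"model": model, "messages": [{"role": "user", "content": task}]}

def run_task(task: str) -> str:
    """POST the task and return the assistant's final answer (requires a running server)."""
    req = request.Request(
        f"{BASE_URL}/chat/completions",  # some deployments prefix this with /v1
        data=json.dumps(build_chat_request(task)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Example (needs a running server):
# print(run_task("Find the current weather in San Francisco"))
```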
2. Streaming
Enable streaming to receive real-time updates:
- Initial role message (`role: "assistant"`)
- Each chunk's `delta.content` contains either:
  - Thought (if the string contains `" "`)
  - Screenshot (no spaces in content)
- Final result
- Final chunk with `finish_reason: "stop"`
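The thought-vs-screenshot rule above can be applied per chunk; a small illustrative helper (not part of the API):

```python
def classify_delta(content: str) -> str:
    """Classify a streamed delta.content per the rule above: a string
    containing a space is a thought; an unbroken token (e.g. a screenshot
    URL or base64 blob) is a screenshot."""
    return "thought" if " " in content else "screenshot"

def collect(deltas):
    """Accumulate streamed deltas into separate thought/screenshot buckets."""
    buckets = {"thought": [], "screenshot": []}
    for d in deltas:
        buckets[classify_delta(d)].append(d)
    return buckets
```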
3. Conversation
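One way to sketch a multi-turn exchange is to resend the accumulated message history on each turn (stdlib only; the server URL, endpoint path, and `model` value are assumptions):

```python
import json
from urllib import request

BASE_URL = "http://127.0.0.1:8000"  # assumed local default

def append_user(history: list, msg: str) -> list:
    """Pure helper: returns a new history with the user turn appended."""
    return history + [{"role": "user", "content": msg}]

def next_turn(history: list, user_msg: str) -> list:
    """Send the full history plus a new user turn; return the updated history.

    Replaying earlier turns is cheap on the server side thanks to prefix
    caching. Requires a running server."""
    history = append_user(history, user_msg)
    req = request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps({"model": "not-needed", "messages": history}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        reply = json.load(resp)["choices"][0]["message"]
    return history + [{"role": reply["role"], "content": reply["content"]}]

# history = next_turn([], "Open news.ycombinator.com and list the top story")
# history = next_turn(history, "Now open its comment thread")
```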
BLAST lets you run multi-turn conversations. The engine's prefix caching ensures that already-computed browser actions are not repeated unless needed.

4. Caching
By default, BLAST caches both results and the LLM-generated plans used to produce them (useful for queries whose results change while the steps to reach the new result value do not). Control caching behavior with `cache_control` options:
- `no-cache` - Skip results cache lookup
- `no-store` - Don't store in results cache
- `no-cache-plan` - Skip plan cache lookup
- `no-store-plan` - Don't store plan in cache
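How these options are attached to a request depends on the server version; purely as an illustration, assuming they travel as a comma-separated `cache_control` field in the request body (field placement is a guess; consult your BLAST version for the exact mechanism):

```python
def with_cache_control(payload: dict, *options: str) -> dict:
    """Attach cache_control options to a request payload.

    The `cache_control` field placement here is illustrative only; only
    the option names themselves come from the documentation above."""
    allowed = {"no-cache", "no-store", "no-cache-plan", "no-store-plan"}
    bad = set(options) - allowed
    if bad:
        raise ValueError(f"unknown cache_control option(s): {sorted(bad)}")
    return {**payload, "cache_control": ",".join(options)}

# Skip the results cache for this request and don't store its result:
# payload = with_cache_control(payload, "no-cache", "no-store")
```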
Responses API
1. Basic Usage
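As with the Chat Completions section, the request example was lost here; the sketch below follows the OpenAI Responses API shape (the task goes in `input`), with the server URL, endpoint path, and response-parsing path as assumptions:

```python
import json
from urllib import request

BASE_URL = "http://127.0.0.1:8000"  # assumed local default

def build_response_request(task: str, model: str = "not-needed") -> dict:
    """Responses API payload: the task goes in `input` rather than `messages`."""
    return {"model": model, "input": task}

def create_response(task: str) -> str:
    """POST to the Responses endpoint and return the output text (requires a running server)."""
    req = request.Request(
        f"{BASE_URL}/responses",  # some deployments prefix this with /v1
        data=json.dumps(build_response_request(task)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    # The Responses API nests text under output items; exact shape may vary by server.
    return body["output"][0]["content"][0]["text"]

# print(create_response("Summarize the front page of example.com"))
```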
2. Streaming
Enable streaming to receive detailed event updates:
- `response.created` - Initial event when the response is created
- `response.in_progress` - Task processing has started
- `response.output_text.delta` - Each streaming event's delta is either:
  - Thought (if the content contains `" "`)
  - Screenshot (no spaces in content)
- `response.output_text.done` - An event is complete
- `response.completed` - Indicates all events have been sent
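The event sequence above can be folded into a small dispatcher; this helper is illustrative (the event `type` strings come from the docs, the state layout is an assumption):

```python
def handle_event(event: dict, state: dict) -> dict:
    """Fold one streamed event into accumulated state, per the sequence above."""
    kind = event.get("type")
    if kind == "response.created":
        state["id"] = event.get("response", {}).get("id")
    elif kind == "response.output_text.delta":
        delta = event.get("delta", "")
        # A delta containing a space is a thought; otherwise a screenshot.
        bucket = "thoughts" if " " in delta else "screenshots"
        state.setdefault(bucket, []).append(delta)
    elif kind == "response.completed":
        state["done"] = True
    return state
```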
3. Conversation
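A stateful follow-up can be sketched by passing the prior response's ID on the next request; the `previous_response_id` field name follows the OpenAI Responses API convention, and how a given BLAST version handles it should be verified:

```python
def build_followup(task: str, previous_response_id: str,
                   model: str = "not-needed") -> dict:
    """Chain a turn onto an earlier response instead of resending history."""
    return {
        "model": model,
        "input": task,
        "previous_response_id": previous_response_id,
    }

# Sketch of a two-turn flow (first response object assumed to expose .id):
# first = create_response("Open the docs page")
# follow = build_followup("Now summarize the Caching section", first_id)
```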
Support for stateful conversations using previous response IDs.

4. Caching
Next Steps
- Learn about the Engine API for direct access
- Understand Concurrency and Parallelism
- Configure Settings and Constraints