Why is this needed?
BLAST automatically maintains a prefix cache, similar to most LLM serving engines. The difference is
that a browser-augmented LLM's prefix cache must be aware of the underlying browser resources required
to reuse a cached entry and continue execution.
Caching Options
"" # Cache everything (default)
"no-cache" # Skip results cache lookup
"no-store" # Don't store results in cache
"no-cache-plan" # Skip plan cache lookup
"no-store-plan" # Don't store plan in cache
Options can be combined:
"no-cache,no-store" # Skip all caching
"no-cache-plan,no-store-plan" # Skip plan cache but use result cache
"no-store,no-store-plan" # Skip storing but check cache
Using Cache Control
1. OpenAI-Compatible API
Using /chat/completions:
from openai import OpenAI
client = OpenAI(
api_key="not-needed",
base_url="http://127.0.0.1:8000"
)
# With cache control
response = client.chat.completions.create(
model="not-needed",
messages=[{
"role": "user",
"content": "Search Python docs",
"cache_control": "no-cache,no-store" # Skip all caching
}]
)
# Default caching
response = client.chat.completions.create(
model="not-needed",
messages=[{
"role": "user",
"content": "Search Python docs"
}]
)
Using /responses:
# With cache control
response = client.responses.create(
model="not-needed",
input="Search Python docs",
cache_control="no-cache-plan" # Skip plan cache
)
# Clear specific response cache
client.responses.delete("resp_<task_id>")
# Clear by task description
client.responses.delete("Search Python docs")
2. Engine API
from blastai import Engine
async with Engine() as engine:
# With cache control
result = await engine.run(
"Search Python docs",
cache_control="no-cache,no-store"
)
# Multiple tasks with different cache settings
results = await engine.run([
"Search Python docs", # Uses default caching
"Click Documentation" # Uses default caching
], cache_control=["no-cache", ""]) # Different settings per task
Cache Persistence
Enable cache persistence in settings:
# config.yaml
settings:
persist_cache: true # Keep cache between engine sessions
When persistence is enabled:
- Results are stored in <appdata>/cache/results/
- Plans are stored in <appdata>/cache/plans/
- Cache survives between engine restarts
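To check what has been persisted, you can list the cache directories directly. This sketch assumes the Linux default location used in the manual-clearing commands below (~/.local/share/blast); adjust the path for your platform:
from pathlib import Path

# Linux default appdata location; macOS/Windows use different base directories.
cache_root = Path.home() / ".local" / "share" / "blast" / "cache"

for sub in ("results", "plans"):
    entries = list((cache_root / sub).glob("*")) if (cache_root / sub).is_dir() else []
    print(f"{sub}: {len(entries)} cached entries")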
Clearing Cache
1. Through API
# Clear by response ID
client.responses.delete("resp_<task_id>")
client.responses.delete("chatcmpl-<task_id>")
# Clear by task description
client.responses.delete("Search Python docs")
2. Through Engine
Clear all caches:
engine = await Engine.create()
await engine.cache_manager.clear()
3. Manually
Remove cache directories:
# Remove all cache
rm -rf ~/.local/share/blast/cache/*
# Remove specific caches
rm -rf ~/.local/share/blast/cache/results/* # Clear results
rm -rf ~/.local/share/blast/cache/plans/* # Clear plans
Next Steps