Caching
Avoid re-computing
Why is this needed?
BLAST automatically maintains a prefix cache, as most LLM serving engines do. The difference is that a prefix cache for browser-augmented LLMs must also track the underlying browser resources needed to reuse a cached entry and continue execution from it.
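To make the idea concrete, here is a minimal sketch of a prefix cache in which a hit requires both a matching task prefix and a browser resource that is still alive. It is an illustration only, not BLAST's implementation: `BrowserResource`, `CacheEntry`, and `BrowserAwarePrefixCache` are hypothetical names, and a task is modeled simply as a list of step strings.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class BrowserResource:
    """Hypothetical stand-in for a live browser tab or session."""
    resource_id: str
    alive: bool = True


@dataclass
class CacheEntry:
    """Cached result plus the browser resource needed to continue from it."""
    result: str
    resource: BrowserResource


class BrowserAwarePrefixCache:
    """Toy prefix cache: a prefix is reusable only while its browser resource is alive."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, ...], CacheEntry] = {}

    def store(self, steps: List[str], result: str, resource: BrowserResource) -> None:
        # Key each entry by the exact sequence of steps that produced it.
        self._entries[tuple(steps)] = CacheEntry(result, resource)

    def longest_reusable_prefix(self, steps: List[str]) -> Tuple[int, Optional[CacheEntry]]:
        # Try the longest prefix first and fall back to shorter ones.
        for end in range(len(steps), 0, -1):
            entry = self._entries.get(tuple(steps[:end]))
            if entry is not None and entry.resource.alive:
                return end, entry
        return 0, None


cache = BrowserAwarePrefixCache()
tab = BrowserResource("tab-1")
cache.store(["open docs.python.org"], "page loaded", tab)
cache.store(["open docs.python.org", "search asyncio"], "results listed", tab)

task = ["open docs.python.org", "search asyncio", "open first result"]
reused_steps, entry = cache.longest_reusable_prefix(task)  # two steps can be skipped

tab.alive = False  # the tab was closed, so the cached prefix can no longer be continued
reused_after_close, _ = cache.longest_reusable_prefix(task)  # nothing is reusable
```

In a text-only serving engine the second lookup would still hit, because the cached prefix is pure data. Here the entry is useless without the tab it refers to, which is why the cache has to know about browser state.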
Caching Options
Options can be combined:
Using Cache Control
1. OpenAI-Compatible API
Using /chat/completions:
Using /responses:
2. Engine API
Cache Persistence
Enable cache persistence in settings:
When persistence is enabled:
- Results are stored in <appdata>/cache/results/
- Plans are stored in <appdata>/cache/plans/
- Cache survives between engine restarts
Clearing Cache
1. Through API
2. Through Engine
Clear all caches:
3. Manually
Remove cache directories:
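As one way to script the manual cleanup, the sketch below deletes the two persisted cache directories named under Cache Persistence. The helper `clear_cache_dirs` is hypothetical and not part of BLAST, and the application data directory is passed in explicitly because its location depends on your installation.

```python
import shutil
from pathlib import Path
from typing import List


def clear_cache_dirs(appdata: Path) -> List[Path]:
    """Remove <appdata>/cache/results and <appdata>/cache/plans if present.

    Returns the directories that were actually removed.
    """
    removed: List[Path] = []
    for name in ("results", "plans"):
        target = Path(appdata) / "cache" / name
        if target.is_dir():
            shutil.rmtree(target)  # deletes the directory and everything in it
            removed.append(target)
    return removed
```

Call it with the path to your own application data directory. It is presumably safest to do this while the engine is stopped, so that nothing is reading or writing the cache during deletion.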
Next Steps
- Configure Settings
- Learn about Parallelism
- Understand Constraints for resource management