Why is this needed?
BLAST automatically maintains a prefix cache, as most LLM serving engines do. The difference is that the prefix cache of a browser-augmented LLM must also be aware of the underlying browser resources required to reuse a cached prefix and continue execution.
Caching Options
Using Cache Control
1. OpenAI-Compatible API
Using /chat/completions:
/responses:
2. Engine API
Cache Persistence
Enable cache persistence in settings:
- Results are stored in <appdata>/cache/results/
- Plans are stored in <appdata>/cache/plans/
- The cache persists across engine restarts
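To check what has been persisted, the two directories listed above can be inspected directly. The following is a minimal sketch, not part of BLAST: `cache_usage` is a hypothetical helper, and it assumes you pass in the application-data directory yourself, since the location of <appdata> depends on your platform and settings.

```python
from pathlib import Path

# Subdirectories of <appdata>/cache/ named in the list above.
CACHE_SUBDIRS = ("results", "plans")


def cache_usage(appdata: Path) -> dict[str, int]:
    """Return the total size in bytes of each cache subdirectory.

    `appdata` is supplied by the caller; this sketch does not guess
    where BLAST keeps its application data.
    """
    usage = {}
    for name in CACHE_SUBDIRS:
        directory = Path(appdata) / "cache" / name
        if directory.is_dir():
            # Sum the sizes of all regular files, including nested ones.
            usage[name] = sum(
                f.stat().st_size for f in directory.rglob("*") if f.is_file()
            )
        else:
            # A missing directory means nothing has been persisted there.
            usage[name] = 0
    return usage
```

A call such as `cache_usage(Path("/path/to/appdata"))` returns a dictionary with one byte count for `results` and one for `plans`.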
Clearing Cache
1. Through API
2. Through Engine
Clear all caches:
3. Manually
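Manual clearing amounts to deleting the results and plans directories described under Cache Persistence. The following is a minimal sketch under stated assumptions: `clear_cache_dirs` is a hypothetical helper rather than a BLAST function, and the caller must supply the application-data directory, because the real location of <appdata> is not assumed here.

```python
import shutil
from pathlib import Path


def clear_cache_dirs(appdata: Path) -> list[Path]:
    """Delete <appdata>/cache/results/ and <appdata>/cache/plans/.

    Returns the directories that were actually removed. `appdata` is
    supplied by the caller and is not guessed by this sketch.
    """
    removed = []
    for name in ("results", "plans"):
        directory = Path(appdata) / "cache" / name
        # Skip directories that do not exist, so repeated calls are safe.
        if directory.is_dir():
            shutil.rmtree(directory)
            removed.append(directory)
    return removed
```

Calling the helper a second time is harmless: directories that are already gone are skipped and an empty list is returned.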
Remove cache directories:
Next Steps
- Configure Settings
- Learn about Parallelism
- Understand Constraints for resource management