# Search & Chatbot
The Blu docs site has two complementary lookup mechanisms.
## Local search (`@easyops-cn/docusaurus-search-local`)
A fully client-side, hashed search index built at site build time. It makes no network calls and works offline. Configuration lives in `docusaurus.config.ts` under `themes`.

The search bar appears in the navbar only after `npm run build`. In dev (`npm start`) the index isn't built, which is expected.
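For orientation, a `themes` entry for this plugin typically looks like the sketch below. The specific options are illustrative; the values actually set in this repo's `docusaurus.config.ts` may differ.

```typescript
// docusaurus.config.ts (excerpt) — illustrative options only
themes: [
  [
    "@easyops-cn/docusaurus-search-local",
    {
      hashed: true,      // content-hashed index filenames, safe for long-term caching
      language: ["en"],  // language(s) to build the index for
      indexBlog: false,  // docs-only site (assumed)
      highlightSearchTermsOnTargetPage: true,
    },
  ],
],
```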
## AI chatbot (Claude API)
A floating chat widget grounded in the full docs corpus. Implemented as:
| Layer | File | Purpose |
|---|---|---|
| UI | src/components/ChatWidget/index.tsx | Floating button + chat panel, streaming response display |
| Theme integration | src/theme/Root.tsx | Mounts ChatWidget on every page |
| Backend | server/index.ts | Express + @anthropic-ai/sdk, streams via SSE |
| Reverse proxy | scripts/blu-docs.panville.com-chat.conf | Apache /api/ → localhost:3941/api/ |
| Service | scripts/blu-docs-chat.service | systemd unit running tsx index.ts |
### How the system prompt is built
On startup, the chat server reads every `.md` file under `docs/`, extracts the frontmatter `title`, and concatenates everything into a single document context wrapped in `<documentation>...</documentation>` tags. This context becomes part of the system prompt sent with every request.

Because the docs context is identical across requests, it is marked with `cache_control: { type: "ephemeral" }` so the Anthropic prompt cache covers it: repeated questions hit the cache and are answered faster and more cheaply.
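The assembly step can be sketched as follows. The names (`buildSystemBlocks`, `Doc`, the instruction text) are illustrative, not the actual identifiers in `server/index.ts`; the `cache_control` shape matches the Anthropic SDK's prompt-caching format.

```typescript
// Sketch of the docs-context assembly described above.
type Doc = { title: string; body: string };
type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

function buildSystemBlocks(docs: Doc[]): SystemBlock[] {
  // Concatenate every doc (title + body) into one big context string.
  const context = docs
    .map((d) => `# ${d.title}\n\n${d.body}`)
    .join("\n\n---\n\n");
  return [
    { type: "text", text: "You answer questions about the Blu docs site." },
    {
      type: "text",
      text: `<documentation>\n${context}\n</documentation>`,
      // Identical across requests, so mark it cacheable for the prompt cache.
      cache_control: { type: "ephemeral" },
    },
  ];
}
```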
### Streaming
The server uses server-sent events (`text/event-stream`). Each `content_block_delta` from the SDK is forwarded as `data: {"text":"..."}\n\n`. The widget reads the response body as a stream and accumulates the text into the most recent assistant message.
### Reloading docs after a deploy
`scripts/deploy.sh` rsyncs the latest `docs/` to the chat-server host and restarts the systemd unit. On restart, `loadDocs()` re-reads the directory and rebuilds the system prompt, so no manual cache-busting is needed.
### Local development

```bash
cd server
npm install
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
npm run dev
# server listens on http://localhost:3941
```
The widget detects `window.location.hostname === "localhost"` and calls `http://localhost:3941/api/chat` directly. In production it uses the same-origin `/api/chat`, which Apache reverse-proxies to the local server.
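The endpoint selection amounts to a one-liner, sketched here as a pure function (`chatEndpoint` is an illustrative name; the widget's actual code in `src/components/ChatWidget/index.tsx` may differ in detail):

```typescript
// Pick the chat API base depending on where the page is served from.
function chatEndpoint(hostname: string): string {
  return hostname === "localhost"
    ? "http://localhost:3941/api/chat" // dev: hit the Express server directly
    : "/api/chat";                     // prod: same origin, Apache proxies to :3941
}
```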
### Cost considerations
Each request sends the full docs context. With Anthropic prompt caching on the system prompt, only the first request after a server restart pays the full input cost; subsequent requests pay the cached-read rate (roughly 10× cheaper). Restart frequency therefore directly affects cost, so avoid restarting the chat-server unit unnecessarily.
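A back-of-envelope comparison makes the restart cost concrete. The per-token rates below are placeholder assumptions (check Anthropic's current pricing page); only the roughly 10:1 ratio matters for the argument.

```typescript
// Assumed rates, in dollars per million input tokens — placeholders, not real pricing.
const INPUT_PER_MTOK = 3.0;      // full input rate (first request after restart)
const CACHE_READ_PER_MTOK = 0.3; // cached-read rate (~10x cheaper)

// Input-token cost of one request carrying the full docs context.
function requestCost(docTokens: number, cached: boolean): number {
  const rate = cached ? CACHE_READ_PER_MTOK : INPUT_PER_MTOK;
  return (docTokens / 1_000_000) * rate;
}
```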
To monitor token usage: every streamed response ends with a `done` event that includes `usage` (`input_tokens`, `output_tokens`, `cache_read_input_tokens`, `cache_creation_input_tokens`). These values are not currently logged; wiring them into a metrics endpoint would be a small follow-up.
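A minimal sketch of that follow-up: parse the `usage` object out of the final `done` event's payload and log it. The field names follow the usage shape described above; the function name and event framing are assumptions.

```typescript
// Illustrative usage-logging hook for the `done` event payload.
type Usage = {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
};

function logUsage(doneEventData: string): Usage {
  const usage = JSON.parse(doneEventData).usage as Usage;
  console.log(
    `tokens in=${usage.input_tokens} out=${usage.output_tokens} ` +
      `cache_read=${usage.cache_read_input_tokens ?? 0}`
  );
  return usage;
}
```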