Search & Chatbot

The Blu docs site has two complementary lookup mechanisms.

Local search (@easyops-cn/docusaurus-search-local)

A fully client-side, hashed search index built at site-build time. No network calls; works offline. Configured in docusaurus.config.ts under themes.
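The `themes` entry might look like the following sketch. The options shown are the plugin's documented ones, not necessarily this site's exact configuration:

```typescript
// docusaurus.config.ts (excerpt) — a minimal sketch; the real config for this
// site may enable different options.
const config = {
  themes: [
    [
      "@easyops-cn/docusaurus-search-local",
      {
        hashed: true, // hash index filenames so they are long-term cacheable
        indexDocs: true,
      },
    ],
  ],
};

export default config;
```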

The search bar appears in the navbar after npm run build. In dev (npm start) the index isn't built — that's expected.

AI chatbot (Claude API)

A floating chat widget grounded in the full docs corpus. Implemented as:

| Layer | File | Purpose |
| --- | --- | --- |
| UI | src/components/ChatWidget/index.tsx | Floating button + chat panel, streaming response display |
| Theme integration | src/theme/Root.tsx | Mounts ChatWidget on every page |
| Backend | server/index.ts | Express + @anthropic-ai/sdk, streams via SSE |
| Reverse proxy | scripts/blu-docs.panville.com-chat.conf | Apache proxies /api/ to localhost:3941/api/ |
| Service | scripts/blu-docs-chat.service | systemd unit running tsx index.ts |

How the system prompt is built

On startup, the chat server reads every .md file under docs/, extracts each file's frontmatter title, and concatenates them into a single document context wrapped in <documentation>...</documentation> tags. This context becomes part of the system prompt sent with every request.

Because the docs context is identical across requests, it is marked with cache_control: { type: "ephemeral" } so the Anthropic prompt cache covers it: repeated questions hit the cache and are answered faster and more cheaply.
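The request shape might look like the sketch below. The cache_control field and system-block structure follow the Anthropic API; the model name, prompt wording, and buildChatRequest helper are placeholders, not the server's actual code:

```typescript
// A system content block, typed locally here; shapes mirror the Anthropic SDK.
type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

// Build streaming request params with the docs context in a cacheable system
// block. Everything up to and including the ephemeral block is covered by the
// prompt cache; only the user turn varies between requests.
function buildChatRequest(docsContext: string, userMessage: string) {
  const system: SystemBlock[] = [
    { type: "text", text: "You are the Blu docs assistant." },
    { type: "text", text: docsContext, cache_control: { type: "ephemeral" } },
  ];
  return {
    model: "claude-sonnet-4-5", // placeholder; use whatever model the server configures
    max_tokens: 1024,
    stream: true,
    system,
    messages: [{ role: "user", content: userMessage }],
  };
}
```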

Streaming

The server uses SSE (text/event-stream). Each content_block_delta from the SDK is forwarded as data: {"text":"..."}\n\n. The widget reads the response body as a stream and accumulates text into the most recent assistant message.
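Both halves of that flow can be sketched with two small pure functions. The frame format matches the description above; the accumulate helper is a simplification that assumes each chunk holds whole frames, which real network chunking does not guarantee:

```typescript
// Server side: each content_block_delta's text becomes one SSE frame,
// `data: {"text":"..."}\n\n`.
function sseFrame(text: string): string {
  return `data: ${JSON.stringify({ text })}\n\n`;
}

// Client side: fold incoming frames into the latest assistant message.
// (Simplified: a production reader must buffer partial frames across chunks.)
function accumulate(current: string, chunk: string): string {
  let text = current;
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = JSON.parse(line.slice("data: ".length));
    if (typeof payload.text === "string") text += payload.text;
  }
  return text;
}
```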

Reloading docs after a deploy

scripts/deploy.sh rsyncs the latest docs/ to the chat-server host and restarts the systemd unit. On restart, loadDocs() re-reads the directory and rebuilds the system prompt, so no manual cache-bust is needed.

Local development

cd server
npm install
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
npm run dev
# http://localhost:3941

The widget detects window.location.hostname === "localhost" and points to http://localhost:3941/api/chat directly. In production it uses same-origin /api/chat, which Apache reverse-proxies.
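That selection logic amounts to a one-liner; a sketch (the function name is illustrative, not the widget's actual code):

```typescript
// Dev (localhost) talks to the chat server directly on port 3941; production
// uses a same-origin path that Apache reverse-proxies to the server.
function chatEndpoint(hostname: string): string {
  return hostname === "localhost"
    ? "http://localhost:3941/api/chat"
    : "/api/chat";
}
```

In the widget this would be called with window.location.hostname at fetch time.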

Cost considerations

Each request sends the full docs context. With Anthropic prompt caching on the system prompt, only the first request after a server restart pays the full input cost; subsequent requests pay the cached-read rate (~10× cheaper). Restart frequency therefore directly affects cost — avoid restarting the chat-server unit unnecessarily.

To monitor token usage: every streamed response ends with a done event including usage (input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokens). These are not currently logged — wiring them into a metrics endpoint would be a small follow-up.
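A minimal sketch of that follow-up, assuming the usage fields listed above; the recordUsage helper and running-totals approach are hypothetical, not existing server code:

```typescript
// Usage payload attached to the final `done` event. Field names come from the
// Anthropic API; the cache_* fields may be absent on some responses.
type Usage = {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
};

// Fold one request's usage into running totals that a metrics endpoint
// could expose.
function recordUsage(totals: Usage, u: Usage): Usage {
  return {
    input_tokens: totals.input_tokens + u.input_tokens,
    output_tokens: totals.output_tokens + u.output_tokens,
    cache_read_input_tokens:
      (totals.cache_read_input_tokens ?? 0) + (u.cache_read_input_tokens ?? 0),
    cache_creation_input_tokens:
      (totals.cache_creation_input_tokens ?? 0) + (u.cache_creation_input_tokens ?? 0),
  };
}
```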