Search & Chatbot

The Blu docs site has two complementary lookup mechanisms.

Local search (@easyops-cn/docusaurus-search-local)

A fully client-side, hashed search index built at site-build time. No network calls; works offline. Configured in docusaurus.config.ts under themes.
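The `themes` entry might look like the following sketch. The options shown are the plugin's documented ones, not necessarily this site's exact configuration:

```typescript
// docusaurus.config.ts (excerpt) — a minimal sketch; the real config for this
// site may enable different options.
const config = {
  themes: [
    [
      "@easyops-cn/docusaurus-search-local",
      {
        hashed: true, // hash index filenames so they are long-term cacheable
        indexDocs: true,
      },
    ],
  ],
};

export default config;
```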

The search bar appears in the navbar after npm run build. In dev (npm start) the index isn't built — that's expected.

AI chatbot (Claude API)

A floating chat widget grounded in the full docs corpus. Implemented as:

| Layer | File | Purpose |
| --- | --- | --- |
| UI | src/components/ChatWidget/index.tsx | Floating button + chat panel, streaming response display |
| Theme integration | src/theme/Root.tsx | Mounts ChatWidget on every page |
| Backend | server/index.ts | Express + @anthropic-ai/sdk, streams via SSE |
| Reverse proxy | scripts/blu-docs.panville.com-chat.conf | Apache proxies /api/ to localhost:3941/api/ |
| Service | scripts/blu-docs-chat.service | systemd unit running tsx index.ts |

How the system prompt is built

On startup, the chat server reads every .md file under docs/, extracts each file's frontmatter title, and concatenates them into a single document context wrapped in <documentation>...</documentation> tags. This context becomes part of the system prompt sent with every request.

Because the docs context is identical across requests, it is marked with cache_control: { type: "ephemeral" } so the Anthropic prompt cache covers it: repeated questions hit the cache and are answered faster and more cheaply.
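The request shape might look like the sketch below. The cache_control field and system-block structure follow the Anthropic API; the model name, prompt wording, and buildChatRequest helper are placeholders, not the server's actual code:

```typescript
// A system content block, typed locally here; shapes mirror the Anthropic SDK.
type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

// Build streaming request params with the docs context in a cacheable system
// block. Everything up to and including the ephemeral block is covered by the
// prompt cache; only the user turn varies between requests.
function buildChatRequest(docsContext: string, userMessage: string) {
  const system: SystemBlock[] = [
    { type: "text", text: "You are the Blu docs assistant." },
    { type: "text", text: docsContext, cache_control: { type: "ephemeral" } },
  ];
  return {
    model: "claude-sonnet-4-5", // placeholder; use whatever model the server configures
    max_tokens: 1024,
    stream: true,
    system,
    messages: [{ role: "user", content: userMessage }],
  };
}
```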

Streaming

The server uses SSE (text/event-stream). Each content_block_delta from the SDK is forwarded as data: {"text":"..."}\n\n. The widget reads the response body as a stream and accumulates text into the most recent assistant message.
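Both halves of that flow can be sketched with two small pure functions. The frame format matches the description above; the accumulate helper is a simplification that assumes each chunk holds whole frames, which real network chunking does not guarantee:

```typescript
// Server side: each content_block_delta's text becomes one SSE frame,
// `data: {"text":"..."}\n\n`.
function sseFrame(text: string): string {
  return `data: ${JSON.stringify({ text })}\n\n`;
}

// Client side: fold incoming frames into the latest assistant message.
// (Simplified: a production reader must buffer partial frames across chunks.)
function accumulate(current: string, chunk: string): string {
  let text = current;
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = JSON.parse(line.slice("data: ".length));
    if (typeof payload.text === "string") text += payload.text;
  }
  return text;
}
```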

Reloading docs after a deploy

scripts/deploy.sh rsyncs the latest docs/ to the chat-server host and restarts the systemd unit. On restart, loadDocs() re-reads the directory and rebuilds the system prompt, so no manual cache-bust is needed.

Local development

cd server
npm install
echo "ANTHROPIC_API_KEY=sk-ant-..." > .env
npm run dev
# http://localhost:3941

The widget detects window.location.hostname === "localhost" and points to http://localhost:3941/api/chat directly. In production it uses same-origin /api/chat, which Apache reverse-proxies.
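That selection logic amounts to a one-liner; a sketch (the function name is illustrative, not the widget's actual code):

```typescript
// Dev (localhost) talks to the chat server directly on port 3941; production
// uses a same-origin path that Apache reverse-proxies to the server.
function chatEndpoint(hostname: string): string {
  return hostname === "localhost"
    ? "http://localhost:3941/api/chat"
    : "/api/chat";
}
```

In the widget this would be called with window.location.hostname at fetch time.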

Cost considerations

Each request sends the full docs context. With Anthropic prompt caching on the system prompt, only the first request after a server restart pays the full input cost; subsequent requests pay the cached-read rate (~10× cheaper). Restart frequency therefore directly affects cost — avoid restarting the chat-server unit unnecessarily.

To monitor token usage: every streamed response ends with a done event including usage (input_tokens, output_tokens, cache_read_input_tokens, cache_creation_input_tokens). These are not currently logged — wiring them into a metrics endpoint would be a small follow-up.
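A minimal sketch of that follow-up, assuming the usage fields listed above; the recordUsage helper and running-totals approach are hypothetical, not existing server code:

```typescript
// Usage payload attached to the final `done` event. Field names come from the
// Anthropic API; the cache_* fields may be absent on some responses.
type Usage = {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
};

// Fold one request's usage into running totals that a metrics endpoint
// could expose.
function recordUsage(totals: Usage, u: Usage): Usage {
  return {
    input_tokens: totals.input_tokens + u.input_tokens,
    output_tokens: totals.output_tokens + u.output_tokens,
    cache_read_input_tokens:
      (totals.cache_read_input_tokens ?? 0) + (u.cache_read_input_tokens ?? 0),
    cache_creation_input_tokens:
      (totals.cache_creation_input_tokens ?? 0) + (u.cache_creation_input_tokens ?? 0),
  };
}
```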