Drop-in AI chat for any website.
One <script> tag. Browser-side RAG. BYOK LLM. Source-citable answers. The AI chat widget that doesn't charge you per query.
Why another chat widget?
10 KB widget
Tiny bundle. Transformers.js + the embedding model load lazily from a CDN only when a visitor opens the chat — never on page load.
No per-query SaaS bill
Inkeep and Kapa.ai charge $50–$500/mo for what amounts to an LLM proxy. You pay only for LLM tokens, only when someone chats.
BYOK LLM
The widget calls YOUR backend, never an LLM provider directly. Your key, your rate limits, your prompts. OpenRouter, Anthropic, OpenAI — all fine.
Browser-side embedding
Query embedding (all-MiniLM-L6-v2, q8) runs on the visitor's device. No embedding-API bill, no vector DB to host, no PII leaving the browser.
Source-cited answers
Every reply links back to the pages it drew from. Visitors verify with one click.
Themed, accessible
Light / dark / auto, primary colour, position, custom strings — all via data attributes. Namespaced CSS, no iframe.
Install in 3 steps
From zero to a working AI chat on your site in a few minutes.
Build your content index
Write a siteask.config.json pointing at your markdown directory or URLs, then run the CLI to generate a static siteask-index.json.
{
"out": "./public/siteask-index.json",
"sources": [
{ "kind": "markdown", "dir": "./content/blog", "urlPrefix": "/blog" },
{ "kind": "urls", "urls": ["https://your-site.com/about"] }
]
}Deploy the tiny backend
A reference implementation for Next.js, Vercel Edge, Cloudflare Workers, and Express ships in the repo. Set OPENROUTER_API_KEY and you're done.
POST /api/siteask
{ query, contexts: [...] }
→ text/plain (streamed)Drop the script tag
One script tag with data-index-url and data-api-url — the widget auto-mounts on page load.
<script src="https://cdn.jsdelivr.net/npm/@siteask/widget@latest/dist/widget.js" data-index-url="/siteask-index.json" data-api-url="/api/siteask" data-title="Ask the docs" data-suggestions="What is X?|How do I install Y?|Pricing" async ></script>
Bring the backend you already use
Reference implementations for every common stack. ~100 LOC each — copy, set a key, ship.
See it in action — on this very site
The floating chat button in the bottom-right of this page is the SiteAsk pattern hand-implemented for the portfolio. Same architecture: browser-side embedding via Transformers.js, retrieval over a pre-built index, an LLM call against a rate-limited backend route. Click it, ask anything about Amit's work, watch the source-cited answer stream in.
The dogfooded backend the public widget would call lives at /api/siteask and follows the exact public contract.
Roadmap
- ✓Widget bundle (IIFE + ESM) with auto-init from script attributes and
SiteAsk.init()programmatic API. - ✓Node CLI to generate
siteask-index.jsonfrom markdown directories or URLs. - ✓Reference backends for Next.js (Node), Next.js (Edge), Cloudflare Workers, and Express.
- ○
npm publishto@siteask/widgetso the install becomes a single script tag instead of clone-and-build. - ○Optional WebGPU acceleration (~5× faster embedding on supported devices).
- ○Streaming retrieval — rerank as the visitor types.
- ○Conversation memory across turns.
- ○Adapter for Cloudflare Vectorize / Pinecone for site owners who want server-side indices.