Shipped, not promised

Changelog

What shipped, and when. Most entries link to the live tool or page; a few are infrastructure (eval coverage, publish prep) that don't have their own URL but still count.

May 2026
@siteask/react — React companion for the SiteAsk widget
release
- Thin React wrapper around @siteask/widget. Reactive props (theme switches without reload), full TypeScript types, SSR-safe.
- Drop <SiteAsk indexUrl="..." apiUrl="..." /> into your tree, or use the useSiteAsk(...) hook for programmatic control.
- Lazy-loads @siteask/widget from jsDelivr — no React-runtime overhead in the widget itself.
- Eighth OSS package in this monorepo. Examples for Next.js App Router and Vite.
May 2026
Workflow Replay — flame-graph timeline at /workflow/replay
feature
- Visualise any workflow JSON trace as a Gantt-style timeline — parallel batches share an x-axis range so fan-out is literally visible.
- Three metric cards show wall-clock total vs sum of per-step latencies vs the effective speedup multiplier.
- Click any bar to drill into that step's output, assertion results, and error message.
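The three metric cards come down to simple arithmetic over step timings. A minimal sketch, assuming a hypothetical `StepTiming` shape with start and end offsets in milliseconds (the real trace format is not described in this entry):

```typescript
// Hypothetical timing record; field names are assumptions for illustration.
interface StepTiming {
  id: string;
  startMs: number; // offset from the start of the run
  endMs: number;
}

interface ReplayMetrics {
  wallClockMs: number;  // first start to last end
  sumOfStepsMs: number; // what a fully sequential run would have cost
  speedup: number;      // sumOfStepsMs / wallClockMs
}

function replayMetrics(steps: StepTiming[]): ReplayMetrics {
  if (steps.length === 0) {
    return { wallClockMs: 0, sumOfStepsMs: 0, speedup: 1 };
  }
  const firstStart = Math.min(...steps.map((s) => s.startMs));
  const lastEnd = Math.max(...steps.map((s) => s.endMs));
  const wallClockMs = lastEnd - firstStart;
  const sumOfStepsMs = steps.reduce((acc, s) => acc + (s.endMs - s.startMs), 0);
  // Guard against a zero-length run to avoid dividing by zero.
  const speedup = wallClockMs > 0 ? sumOfStepsMs / wallClockMs : 1;
  return { wallClockMs, sumOfStepsMs, speedup };
}
```

With two steps overlapping in the middle of a four-step run, the sum of latencies exceeds the wall-clock total, and the ratio between them is the speedup multiplier.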
May 2026
/api/workflow/run — workflows as REST APIs
feature
- POST a workflow JSON, get a typed trace back. Same step semantics as the browser runner and workflow-cli.
- Rate-limited, SSRF-guarded, 20-step cap, 600-token output cap per step.
- A new "Run via API" button on /workflow invokes the endpoint inline and surfaces the curl equivalent for external callers.
May 2026
Workflow cost estimator with model picker
feature
- Per-step token-count + USD estimate, switchable between Gemini Flash Lite / Flash / Pro, Claude Haiku 4.5 / Sonnet 4.6, GPT-5 Mini / GPT-5.
- Toolbar pill shows the typical-to-max cost range for the whole workflow before you click Run.
- Flags runIf-conditional steps as may-not-fire and excludes them from the headline total.
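A rough sketch of how a typical-to-max range with excluded conditional steps could be computed. Everything here is an assumption for illustration: the `StepEstimate` and `ModelPrice` shapes, the field names, and the idea that "typical" and "max" differ only in output-token count. Prices are passed in by the caller and are not real model pricing.

```typescript
// USD per million tokens. Callers supply the numbers; none are hard-coded here.
interface ModelPrice {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Hypothetical per-step estimate; not the estimator's actual data model.
interface StepEstimate {
  id: string;
  inputTokens: number;
  typicalOutputTokens: number;
  maxOutputTokens: number;
  hasRunIf: boolean; // conditional steps may not fire
}

interface WorkflowEstimate {
  typicalUsd: number;
  maxUsd: number;
  mayNotFire: string[]; // ids left out of the headline total
}

function stepCost(inputTokens: number, outputTokens: number, price: ModelPrice): number {
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1e6;
}

function estimateWorkflow(steps: StepEstimate[], price: ModelPrice): WorkflowEstimate {
  const result: WorkflowEstimate = { typicalUsd: 0, maxUsd: 0, mayNotFire: [] };
  for (const step of steps) {
    if (step.hasRunIf) {
      // Flag the step instead of counting it toward the headline range.
      result.mayNotFire.push(step.id);
      continue;
    }
    result.typicalUsd += stepCost(step.inputTokens, step.typicalOutputTokens, price);
    result.maxUsd += stepCost(step.inputTokens, step.maxOutputTokens, price);
  }
  return result;
}
```

Switching models in this sketch is just a matter of calling `estimateWorkflow` again with a different `ModelPrice`.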
May 2026
llm-devtools — Chrome DevTools for LLM API calls
release
- Local HTTP proxy + browser inspector that captures every LLM call your app makes. Multi-provider (OpenAI, Anthropic, OpenRouter, Gemini, Groq, Together).
- Zero setup — `npx llm-devtools start`, set one env var, see live captures with full request/response, per-role token counts, and estimated cost.
- Streaming responses are special-cased — chunks pipe to your app immediately AND get captured for replay.
- Built to fill the gap between heavyweight production observability (Langfuse, Arize) and bare-bones SDK logging.
May 2026
commit-changelog-cli — git log → polished changelog
release
- Two modes: a free heuristic that groups Conventional Commits, and `--ai` that hands the commits to an LLM to rewrite each entry as a user-facing one-liner.
- Reads `git log` directly or accepts a piped log from stdin. Writes to stdout or a file.
- Slot into release CI: `--since=<previous tag> --until=<new tag> --ai --out=NOTES.md`.
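The heuristic mode can be pictured as a parse-and-bucket pass over commit subjects. This is a sketch of that idea, not the CLI's actual code: the regex, the `ParsedCommit` shape, and the `other` bucket for non-conforming subjects are all assumptions.

```typescript
interface ParsedCommit {
  type: string;         // feat, fix, docs, ...
  scope: string | null; // text inside the optional parentheses
  breaking: boolean;    // "!" before the colon
  subject: string;
}

// type, optional (scope), optional "!", then ": subject"
const CONVENTIONAL = /^(\w+)(?:\(([^)]+)\))?(!)?:\s+(.+)$/;

function parseCommit(line: string): ParsedCommit | null {
  const m = CONVENTIONAL.exec(line.trim());
  if (m === null) return null;
  return {
    type: (m[1] ?? "").toLowerCase(),
    scope: m[2] ?? null,
    breaking: m[3] === "!",
    subject: m[4] ?? "",
  };
}

// Group subjects by commit type; anything that does not parse goes to "other".
function groupCommits(lines: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const line of lines) {
    if (line.trim() === "") continue;
    const parsed = parseCommit(line);
    const key = parsed !== null ? parsed.type : "other";
    const entry = parsed !== null ? parsed.subject : line.trim();
    const bucket = groups.get(key) ?? [];
    bucket.push(entry);
    groups.set(key, bucket);
  }
  return groups;
}
```

Feeding it one subject per line (for example from `git log --format=%s`) yields one bucket per commit type, ready to render under headings.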
May 2026
workflow-cli — same workflows, on the command line
release
- New OSS package that runs the exact JSON shape the browser builder produces — design in /workflow, ship the JSON to CI.
- Supports every step type and modifier (llm/fetch/assert · parallel · runIf · contains/regex/equals/llm-judge).
- Non-zero exit on assertion failure so it drops into any GitHub Action / Jenkins / GitLab CI.
May 2026
Workflow Builder: self-improving + trace export + AI generator
feature
- Describe a workflow in plain English; an LLM emits the JSON, validated against the schema, and loads into the builder.
- After a run, "Suggest improvements" sends the workflow + actual run trace back to an LLM that proposes a revised JSON; a diff summary previews the changes before applying.
- Share a customised workflow as a base64url ?w= link; export a run as a markdown post-mortem (prompts + outputs + latencies + assertion results); auto-rendered Mermaid flow diagram updates with every edit.
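A share link of this kind can be built by serialising the workflow and encoding it with the URL-safe base64 alphabet. The sketch below assumes the payload is plain JSON (the entry only says "base64url") and uses hypothetical helper names; it relies on the `TextEncoder`, `TextDecoder`, `btoa`, and `atob` globals available in browsers and recent Node versions.

```typescript
// Encode a UTF-8 string with the URL-safe base64 alphabet, without padding.
function toBase64Url(text: string): string {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary).replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function fromBase64Url(encoded: string): string {
  let base64 = encoded.replace(/-/g, "+").replace(/_/g, "/");
  while (base64.length % 4 !== 0) base64 += "="; // restore stripped padding
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new TextDecoder().decode(bytes);
}

// Hypothetical helpers: the JSON payload and the base URL are assumptions.
function buildShareLink(baseUrl: string, workflow: unknown): string {
  return baseUrl + "?w=" + toBase64Url(JSON.stringify(workflow));
}

function readShareParam(w: string): unknown {
  return JSON.parse(fromBase64Url(w));
}
```

Because the URL-safe alphabet avoids `+`, `/`, and `=`, the value can sit in a query string without percent-encoding.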
May 2026
Workflow Builder v0.4 — parallel step execution
feature
- Any step can declare `parallel: true` — consecutive parallel-flagged steps run as a single Promise.allSettled batch.
- New preset: three reviewers (correctness · security · style) fan out concurrently, then a synthesiser sees all three outputs.
- UI shows wall-clock vs would-be-sequential time so the speedup is visible.
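The batching rule above can be sketched in two parts: group consecutive `parallel: true` steps, then run each group through `Promise.allSettled`. The `Step` and `StepResult` shapes and the `run` callback are assumptions for illustration; only the `parallel` flag and the use of `Promise.allSettled` come from the entry.

```typescript
interface Step {
  id: string;
  parallel?: boolean;
}

interface StepResult {
  id: string;
  ok: boolean;
  output?: string;
  error?: string;
}

// Consecutive parallel-flagged steps share a batch; every other step runs alone.
function toBatches(steps: Step[]): Step[][] {
  const batches: Step[][] = [];
  for (const step of steps) {
    const last = batches[batches.length - 1];
    if (step.parallel === true && last !== undefined && last.every((s) => s.parallel === true)) {
      last.push(step);
    } else {
      batches.push([step]);
    }
  }
  return batches;
}

async function runWorkflow(
  steps: Step[],
  run: (step: Step) => Promise<string>
): Promise<StepResult[]> {
  const results: StepResult[] = [];
  for (const batch of toBatches(steps)) {
    // allSettled waits for every step, so one failure does not hide the others.
    const settled = await Promise.allSettled(batch.map((step) => run(step)));
    batch.forEach((step, i) => {
      const outcome = settled[i];
      if (outcome === undefined) return;
      results.push(
        outcome.status === "fulfilled"
          ? { id: step.id, ok: true, output: outcome.value }
          : { id: step.id, ok: false, error: String(outcome.reason) }
      );
    });
  }
  return results;
}
```

Using `allSettled` rather than `Promise.all` fits the three-reviewer preset: if one reviewer fails, the other two outputs are still recorded.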
May 2026
Prompt A/B Tester — new AI Lab
feature
- Test two system prompts side by side against the same input. Both stream in parallel; differing tokens are highlighted.
- Optional LLM-as-judge declares a winner with a one-sentence reason.
May 2026
/api-to-mcp now ships a downloadable .zip
feature
- Goes from "paste an OpenAPI URL" to "unzip && npm install && run" in two clicks.
- Bundles src/index.ts + package.json (with SDK pinned) + tsconfig + README with Claude Desktop snippet + .gitignore.
- Backed by lib/build-zip.ts — a dependency-free stored-mode ZIP encoder with a real CRC-32 table.
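For readers unfamiliar with the checksum, here is the standard table-driven CRC-32 that the ZIP format uses (reflected polynomial 0xEDB88320). It is a generic sketch of the technique, not a copy of lib/build-zip.ts, and the function names are illustrative.

```typescript
// Precompute the CRC of every possible byte value once.
function makeCrcTable(): Uint32Array {
  const table = new Uint32Array(256);
  for (let n = 0; n < 256; n++) {
    let c = n;
    for (let k = 0; k < 8; k++) {
      c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
    }
    table[n] = c >>> 0;
  }
  return table;
}

const CRC_TABLE = makeCrcTable();

// CRC-32 of a byte array, returned as an unsigned 32-bit integer.
function crc32(bytes: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < bytes.length; i++) {
    crc = CRC_TABLE[(crc ^ bytes[i]) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Standard check value: the ASCII bytes of "123456789" give 0xcbf43926.
const check = crc32(new Uint8Array([49, 50, 51, 52, 53, 54, 55, 56, 57]));
console.log(check.toString(16)); // prints "cbf43926"
```

In a stored-mode (uncompressed) archive the file bytes are written as-is, so this checksum is what lets an unzip tool verify each entry.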
May 2026
PR Reviewer eval coverage — 4/4 dimensions
infra
- Added security, style, and tests suites alongside the existing correctness suite.
- Each suite tests both detection (does it catch real bugs?) and lane discipline (does it refuse to comment on out-of-scope issues?).
- Every reviewer dimension in components/pr-reviewer.tsx now has a regression net.
May 2026
Workflow Builder v0.3 — runIf conditional steps
feature
- Any step can declare a `runIf` condition referencing a prior output — false condition → skipped.
- Two mutually-exclusive conditions in sequence form an if/else branch without needing a DAG.
- New preset: classify an email then route to apology / answer / thanks based on the label.
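The if/else-without-a-DAG idea can be illustrated with a linear pass that skips any step whose condition is false. The condition syntax is not shown in this entry, so the `RunIf` shape below (`step` plus `equals`, compared case-insensitively) is purely hypothetical.

```typescript
// Hypothetical condition shape: "run only if <step>'s output equals <equals>".
interface RunIf {
  step: string;
  equals: string;
}

interface Step {
  id: string;
  runIf?: RunIf;
}

function shouldRun(step: Step, outputs: Record<string, string>): boolean {
  if (step.runIf === undefined) return true;
  const prior = outputs[step.runIf.step];
  // A missing output (for example from a skipped step) counts as false.
  if (prior === undefined) return false;
  return prior.trim().toLowerCase() === step.runIf.equals.trim().toLowerCase();
}

// Walk the steps in order and return the ids that would fire.
// `execute` stands in for whatever actually runs a step.
function planRun(steps: Step[], execute: (step: Step) => string): string[] {
  const outputs: Record<string, string> = {};
  const fired: string[] = [];
  for (const step of steps) {
    if (!shouldRun(step, outputs)) continue; // skipped, no output recorded
    outputs[step.id] = execute(step);
    fired.push(step.id);
  }
  return fired;
}
```

Because the conditions on the routing steps are mutually exclusive, at most one of them fires and the list behaves like a branch even though it is still a flat sequence.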
May 2026
/toolkit landing page + live embedded demo
feature
- A single narrative page tying every AI tool into one end-to-end story (design → evaluate → review → integrate → embed).
- Embedded a real 2-step workflow that runs in-browser so visitors see the chaining principle working, not just described.
May 2026
Four OSS packages — npm-publish ready
infra
- ai-eval, create-mcp-server, @amitshrivastava/portfolio-mcp, @siteask/widget now have repository, bugs, homepage, prepack, and publishConfig fields.
- @siteask/widget tsup config moved to tsup.config.ts so outExtension is a function (newer tsup API).
- Root PUBLISH.md documents the end-to-end runbook.
May 2026
Workflow Builder v0.2 — fetch + assert step types
feature
- Added `fetch` step (server proxy with SSRF guards, 8s timeout, 50KB cap, per-IP rate limit).
- Added `assert` step with contains / not-contains / regex / equals / llm-judge assertions.
- Added "Export as ai-eval YAML" so a designed workflow can be regression-tested in CI.
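The deterministic assertion kinds are easy to picture as a small discriminated union. The sketch below assumes hypothetical field names (`type`, `value`, `pattern`) and leaves out `llm-judge`, which needs a model call.

```typescript
// Hypothetical assertion shapes; only the kind names come from the changelog.
type Assertion =
  | { type: "contains"; value: string }
  | { type: "not-contains"; value: string }
  | { type: "regex"; pattern: string }
  | { type: "equals"; value: string };

function checkAssertion(output: string, assertion: Assertion): boolean {
  switch (assertion.type) {
    case "contains":
      return output.includes(assertion.value);
    case "not-contains":
      return !output.includes(assertion.value);
    case "regex":
      try {
        return new RegExp(assertion.pattern).test(output);
      } catch {
        return false; // an invalid pattern is treated as a failed assertion
      }
    case "equals":
      return output.trim() === assertion.value.trim();
  }
}

// A step passes only when every assertion attached to it passes.
function checkAll(output: string, assertions: Assertion[]): boolean {
  return assertions.every((a) => checkAssertion(output, a));
}
```

A boolean result per assertion is also what makes a non-zero exit code in CI straightforward: any `false` fails the run.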
May 2026
Self-dogfooded eval suites
infra
- Started running ai-eval against this portfolio's own system prompts — Ask Amit grounding, PR Reviewer correctness, Commit Message Generator format.
- The same harness that ships as OSS now catches its own author's regressions.
April 2026
Workflow Builder v0.1 — initial release
release
- Three curated presets (Research → Summarise → Tweet · Code → Review → Refactor · Email → Classify → Reply).
- Per-step streaming, in-place editing of prompts and inputs, {{stepId.output}} substitution.
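The `{{stepId.output}}` substitution can be sketched as a single regex replace over the prompt. The function name, the allowed characters in a step id, and the choice to leave unknown references untouched are assumptions, not documented behaviour.

```typescript
// Replace every {{stepId.output}} with the recorded output of that step.
function substitute(template: string, outputs: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w-]+)\.output\s*\}\}/g, (match: string, stepId: string) => {
    // Only own keys count, so ids like "constructor" are not resolved by accident.
    if (Object.prototype.hasOwnProperty.call(outputs, stepId)) {
      return outputs[stepId];
    }
    return match; // unknown step: keep the placeholder visible
  });
}

const prompt = substitute("Summarise this: {{research.output}}", {
  research: "LLM evals catch regressions.",
});
console.log(prompt); // prints "Summarise this: LLM evals catch regressions."
```

Leaving an unresolved placeholder in the text, rather than replacing it with an empty string, makes a misspelled step id easy to spot in the rendered prompt.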
April 2026
ai-eval — open-source LLM eval harness
release
- CLI + GitHub Action + web viewer. YAML test cases, five assertion types, OpenRouter-backed.
- Non-zero exit on failure for CI gating.
April 2026
Multi-Agent PR Reviewer
release
- Four specialist reviewers (correctness · security · style · tests) run in parallel on any public PR.
- A lead reviewer synthesises a structured verdict (LGTM / LGTM_WITH_NITS / NEEDS_CHANGES / BLOCK).
April 2026
AI Commit Message Generator
release
- Paste a git diff, get a Conventional Commits message with the right type, scope, subject, and a body explaining the WHY.
March 2026
create-mcp-server — scaffolder CLI
release
- `npm create mcp-server@latest` walks you through naming and template selection, then writes a complete TypeScript MCP server.
- Two templates: blank (one example tool) and markdown-rag (exposes ./content/ as MCP resources with a keyword search tool).
March 2026
portfolio-mcp — real working MCP server
release
- Exposes this portfolio's blog posts to Claude Desktop, Cursor, and Cline via stdio MCP.
- Companion to the MCP Inspector demo at /ai-labs/mcp-inspector.
March 2026
SiteAsk — drop-in AI chat widget
release
- Browser-side RAG with all-MiniLM-L6-v2 embeddings. BYOK LLM. Source-citable answers. One <script> tag.
- No vector DB, no per-chat SaaS fees.
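Without a vector database, retrieval can be a brute-force similarity scan over an in-memory array. The sketch below assumes cosine similarity and a hypothetical `Chunk` shape; the entry does not say how the widget ranks chunks. Real all-MiniLM-L6-v2 vectors have 384 dimensions, and the toy vectors in the usage note are only for readability.

```typescript
// Hypothetical index entry: a text chunk plus its precomputed embedding.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

function cosine(a: number[], b: number[]): number {
  const n = Math.min(a.length, b.length);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query and keep the k best.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosine(query, chunk.embedding) }))
    .sort((p, q) => q.score - p.score)
    .slice(0, k)
    .map((scored) => scored.chunk);
}
```

For example, with a query of `[1, 0]` and chunks embedded as `[1, 0]`, `[0, 1]`, and `[0.7, 0.7]`, `topK(..., 2)` returns the first and third chunks, which would then be passed to the LLM as cited sources.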
March 2026
AI Labs — 15 browser-runnable demos
release
- Multi-Agent Orchestrator · Tool-Use Simulator · Agent Memory · Reasoning Visualizer · RAG Playground · Hallucination Detector · …
- All demos run client-side (or through the rate-limited /api/demo/generate proxy).

The whole story is on /toolkit

This changelog is one timeline; the toolkit page arranges the same pieces by role — design, evaluate, review, integrate, embed.

Open the toolkit
