A working multi-step LLM workflow tool. Three curated presets ship in v0.1 (Research → Summarise → Tweet · Code → Review → Refactor · Email → Classify → Reply), but every prompt and input is editable in place. Each step streams independently, and each output flows into the next step via {{stepId.output}} templating. It sits alongside the existing AI ops toolkit: ai-eval evaluates single calls, PR Reviewer is a fixed 4+1 workflow, and this is the user-defined version. Lives at /workflow.
Working tool with three useful real-world chains out of the box
Built on the existing /api/demo/generate route — no new server plumbing required
Natural sibling of ai-eval (evaluate single steps) and PR Reviewer (fixed workflow)
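As a rough illustration of the {{stepId.output}} templating, here is a minimal TypeScript sketch. The names `Step`, `renderTemplate`, and `runChain` are hypothetical, the `generate` callback is a stand-in for whatever calls the model, and streaming is left out; none of this is taken from the project's source.

```typescript
// Hypothetical shapes for illustration only; not the project's actual types.
interface Step {
  id: string;
  prompt: string; // may contain {{otherStepId.output}} placeholders
}

type Outputs = Record<string, string>;

// Replace each {{stepId.output}} placeholder with that step's output.
// Placeholders for unknown step ids are left in place so a typo stays visible.
function renderTemplate(prompt: string, outputs: Outputs): string {
  return prompt.replace(
    /\{\{\s*([\w-]+)\.output\s*\}\}/g,
    (match: string, stepId: string): string => {
      const value = outputs[stepId];
      return value !== undefined ? value : match;
    }
  );
}

// Run the steps in order. Each prompt is rendered against the outputs
// collected so far, then passed to the caller-supplied generate function.
async function runChain(
  steps: Step[],
  generate: (prompt: string) => Promise<string>
): Promise<Outputs> {
  const outputs: Outputs = {};
  for (const step of steps) {
    outputs[step.id] = await generate(renderTemplate(step.prompt, outputs));
  }
  return outputs;
}
```

In this sketch a step can reference any earlier step, not only the one directly before it, because all collected outputs are passed to `renderTemplate`. Whether the real tool allows that is an assumption.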
CLI + web viewer that runs your prompts against test cases with contains / regex / equals / llm-judge assertions, produces a JSON report, and fails the build on regressions in CI.
Paste any public GitHub PR: four specialised AI agents review it in parallel (correctness, security, style, tests), and a lead reviewer synthesises a severity-graded verdict.