Note: This project is under active development. APIs, configuration fields, and workflow behaviour may change between commits.
A multi-agent system that produces high-quality software specifications through adversarial review. Specialised AI agents collaborate and compete — discovering requirements, drafting specs, reviewing through multiple lenses, revising, judging convergence, and decomposing into task graphs — while human gates ensure alignment at critical decision points.
The system supports dual-provider execution (Claude + Codex in parallel) across discovery, drafting, and review phases, with intelligent merging of outputs. A separate code review workflow provides automated code auditing with fix-review loops. A code documentation workflow auto-generates and maintains code documentation.
%%{init: {"theme": "neutral", "flowchart": {"defaultRenderer": "elk"}}}%%
flowchart TD
SRC([Source Documents])
DISC[DISCOVERY<br/>Extract actors · scope · constraints · requirements<br/><i>dual-provider: Claude + Codex</i>]
HG1[/HUMAN GATE 1<br/>Confirm / correct requirements\]
DRAFT[DRAFTING<br/>Produce spec + holdout test dataset<br/><i>dual-provider: Claude + Codex</i>]
HG2[/HUMAN GATE 2<br/>Resolve ambiguity warnings\]
subgraph ADVLOOP["Adversarial Review Loop · 2–5 rounds"]
direction TB
REV[REVIEWING<br/>4 parallel reviewer agents<br/>8 lenses across 4 groups + optional Codex]
REVIS[REVISING<br/>Address findings<br/>Judge block feedback on prior BLOCK]
JUDG[JUDGING<br/>Convergence check · anti-gaming pre-checks]
end
HGF[/HUMAN GATE FINAL<br/>Only if critical findings remain\]
FIN[FINALIZED]
TASK[TASKIFY<br/>Decompose spec into structured task graph<br/>validation + retry with schema/DAG checks]
subgraph TASKLOOP["Task Review Loop · up to 3 rounds"]
direction TB
TR[TASK REVIEW<br/>Dual-provider task graph quality review]
TRV[TASK REVISION<br/>Address task findings]
end
THG[/TASK HUMAN GATE<br/>Approve / correct / re-decompose\]
TAPPR[TASKS APPROVED]
COMP([COMPLETE])
SRC --> DISC --> HG1 --> DRAFT --> HG2 --> REV
REV --> REVIS --> JUDG
JUDG -- REVISE --> REV
JUDG -- BLOCK --> REVIS
JUDG -- PASS --> HGF --> FIN --> TASK --> TR --> TRV --> TR
TRV --> THG
THG -- approve --> TAPPR --> COMP
THG -- re-decompose --> TASK
classDef agent fill:#ffffff,stroke:#000000,color:#000000
classDef gate fill:#e8e8e8,stroke:#000000,color:#000000,stroke-dasharray:5 3
classDef terminal fill:#1a1a1a,stroke:#1a1a1a,color:#ffffff
class DISC,DRAFT,REV,REVIS,JUDG,FIN,TASK,TR,TRV,TAPPR agent
class HG1,HG2,HGF,THG gate
class SRC,COMP terminal
The dashboard displays this as a visual pipeline stepper showing all stages, with completed stages in green, the current stage pulsing, and future stages grayed out.
When rewinding to the discovery phase, the system detects existing artefacts and offers three choices:
- Skip to gate — jump directly to HUMAN_GATE_1 with the existing merged output
- Replay merge — re-run the merge step from existing per-provider outputs without re-dispatching agents
- Restart fresh — re-run discovery agents from scratch
A separate workflow for automated code auditing:
%%{init: {"theme": "neutral", "flowchart": {"defaultRenderer": "elk"}}}%%
flowchart TD
CP([Code Path])
CRINIT[CR_INIT]
CRHGS[/CR_HUMAN_GATE_SCOPE<br/>Confirm review scope\]
subgraph CRLOOP["Fix-Review Loop · configurable rounds"]
direction TB
CRREV[CR_REVIEWING<br/>Dual-provider code review]
CRFIX[CR_FIXING<br/>Automated fix application]
CRHGF[/CR_HUMAN_GATE_FIXES<br/>Human approval of fixes\]
end
CRDONE([CR_COMPLETE / CR_ESCALATED])
CP --> CRINIT --> CRHGS --> CRREV --> CRFIX --> CRHGF
CRHGF -- continue --> CRREV
CRHGF -- done --> CRDONE
classDef agent fill:#ffffff,stroke:#000000,color:#000000
classDef gate fill:#e8e8e8,stroke:#000000,color:#000000,stroke-dasharray:5 3
classDef terminal fill:#1a1a1a,stroke:#1a1a1a,color:#ffffff
class CRINIT,CRREV,CRFIX agent
class CRHGS,CRHGF gate
class CP,CRDONE terminal
A workflow for auto-generating and maintaining code documentation:
%%{init: {"theme": "neutral", "flowchart": {"defaultRenderer": "elk"}}}%%
flowchart TD
CP([Code Path])
CDINIT[CD_INIT]
CDDISC[CD_DISCOVERY<br/>Inventory modules · entry points · existing docs<br/><i>dual-provider: Claude + Codex</i>]
CDHGS[/CD_HUMAN_GATE_SCOPE<br/>Confirm / adjust scope\]
CDDRAFT[CD_DRAFTING<br/>Generate documentation + architecture diagrams<br/><i>dual-provider: Claude + Codex</i>]
CDSAN[CD_SANITISING<br/>Secret scan · redact before human review]
CDHGD[/CD_HUMAN_GATE_DRAFT<br/>Approve / redraft\]
subgraph CDLOOP["Review Loop · 1–3 rounds"]
direction TB
CDREV[CD_REVIEWING<br/>4 parallel reviewer groups · 7 lenses]
CDREVIS[CD_REVISING<br/>Address findings]
CDJUDG[CD_JUDGING<br/>Convergence check]
end
CDHGF[/CD_HUMAN_GATE_FINAL<br/>Only if unresolved CRITICAL/MAJOR remain\]
CDWRITE[CD_WRITING<br/>Write docs to disk · create manifest]
CDDONE([CD_COMPLETE / CD_ESCALATED])
CP --> CDINIT --> CDDISC --> CDHGS
CDHGS -- confirm --> CDDRAFT
CDHGS -- correct --> CDDISC
CDDRAFT --> CDSAN --> CDHGD
CDSAN -- secrets found --> CDDRAFT
CDHGD -- approve --> CDREV
CDHGD -- redraft --> CDDRAFT
CDREV --> CDREVIS --> CDJUDG
CDJUDG -- REVISE --> CDREV
CDJUDG -- PASS --> CDWRITE
CDJUDG -- unresolved findings --> CDHGF
CDHGF -- accept --> CDWRITE
CDHGF -- review again --> CDREV
CDWRITE --> CDDONE
classDef agent fill:#ffffff,stroke:#000000,color:#000000
classDef gate fill:#e8e8e8,stroke:#000000,color:#000000,stroke-dasharray:5 3
classDef terminal fill:#1a1a1a,stroke:#1a1a1a,color:#ffffff
class CDINIT,CDDISC,CDDRAFT,CDSAN,CDREV,CDREVIS,CDJUDG,CDWRITE agent
class CDHGS,CDHGD,CDHGF gate
class CP,CDDONE terminal
Supports full and incremental modes. Incremental mode reads the .codedoc-manifest.json from the previous run and only re-processes changed modules. The writer creates/updates this manifest on completion, enabling efficient subsequent runs.
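Incremental mode comes down to comparing the previous manifest against the current state of the code. The sketch below illustrates that comparison under a loud assumption: the real schema of .codedoc-manifest.json is not described here, so the `Manifest` type (module path mapped to a content hash) and the helper names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// Manifest is a HYPOTHETICAL shape: module path -> content hash.
// The real .codedoc-manifest.json schema may differ.
type Manifest map[string]string

// hashContent returns the hex-encoded SHA-256 of a module's content.
func hashContent(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// changedModules returns, in sorted order, every module whose hash differs
// from the previous manifest. Modules absent from the manifest count as changed.
func changedModules(prev Manifest, current map[string][]byte) []string {
	var changed []string
	for path, content := range current {
		if prev[path] != hashContent(content) {
			changed = append(changed, path)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	prev := Manifest{
		"internal/api":     hashContent([]byte("v1")),
		"internal/codedoc": hashContent([]byte("v1")),
	}
	current := map[string][]byte{
		"internal/api":        []byte("v1"), // unchanged
		"internal/codedoc":    []byte("v2"), // modified
		"internal/codereview": []byte("v1"), // new module
	}
	fmt.Println(changedModules(prev, current))
}
```

Only the modules returned by `changedModules` would need re-processing; everything else can reuse the documentation from the previous run.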
| Agent | Role | Lenses |
|---|---|---|
| Discovery | Extracts requirements from source documents | -- |
| Discovery Merge | Intelligently merges dual-provider discovery outputs | -- |
| Drafter | Produces specification and holdout test data | -- |
| Drafter Combine | Merges dual-provider drafter outputs | -- |
| Reviewer (Clarity) | Ambiguity, Incompleteness | AMB, INC |
| Reviewer (Consistency) | Consistency, Feasibility | CON, FEA |
| Reviewer (Security) | Security, Operability | SEC, OPS |
| Reviewer (Correctness) | Correctness, Complexity | COR, CPX |
| Reviser | Addresses findings from reviewers | -- |
| Judge | Evaluates convergence, renders PASS/REVISE/BLOCK verdict | -- |
| Taskify | Decomposes finalized spec into structured task graph | -- |
| Task Reviewer | Reviews task graph for quality and completeness | -- |
| Task Reviser | Addresses task review findings | -- |
| Codedoc Discovery | Inventories modules, entry points, dependencies, existing docs | -- |
| Codedoc Discovery Merge | Merges dual-provider codedoc discovery outputs | -- |
| Codedoc Drafter | Generates documentation and architecture diagrams | -- |
| Codedoc Drafter Combine | Merges dual-provider codedoc drafter outputs | -- |
| Codedoc Reviewer (Accuracy) | Accuracy, Currency | ACC, CUR |
| Codedoc Reviewer (Completeness) | Completeness, Clarity | CMP, CLA |
| Codedoc Reviewer (Architecture) | Architecture, Structure | ARC, STR |
| Codedoc Reviewer (Audit) | Audit, Consistency, Secrets | AUD, CON, SEC |
| Codedoc Reviser | Addresses findings from codedoc reviewers | -- |
| Codedoc Judge | Evaluates convergence, renders PASS/REVISE/BLOCK verdict | -- |
| Codedoc Writer | Writes approved documentation to disk, creates manifest | -- |
All JSON-producing agents use a validation+retry loop via outvalid: agents are instructed to draft JSON output, run bin/outvalid --schema workflow-templates/<workflow>/<agent>-output.schema.json --input <draft> --writeTo <dest>, read the numbered errors, fix the draft, and retry. If the agent cannot produce a valid document within max_retries attempts, validation errors are fed back into the orchestrator prompt and the agent is re-dispatched. Schema files for all agent roles live under workflow-templates/.
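The validate, fix, retry cycle described above can be sketched as a small loop. This is an illustration only: the `Validator` and `Fixer` function types are hypothetical stand-ins for bin/outvalid and for the agent repairing its own draft, and the sketch assumes max_retries counts repair attempts after the first validation.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Validator returns numbered schema errors; an empty slice means the draft is valid.
// HYPOTHETICAL stand-in for invoking bin/outvalid.
type Validator func(draft string) []string

// Fixer returns a repaired draft given the previous draft and its errors.
// HYPOTHETICAL stand-in for the agent fixing its own output.
type Fixer func(draft string, errs []string) string

// ErrRetriesExhausted signals that the caller should re-dispatch the agent.
var ErrRetriesExhausted = errors.New("validation retries exhausted")

// validateWithRetry validates the draft and allows up to maxRetries repairs.
// On exhaustion it returns the last errors so they can be fed into a new prompt.
func validateWithRetry(draft string, validate Validator, fix Fixer, maxRetries int) (string, []string, error) {
	errs := validate(draft)
	for attempt := 0; len(errs) > 0 && attempt < maxRetries; attempt++ {
		draft = fix(draft, errs)
		errs = validate(draft)
	}
	if len(errs) > 0 {
		return draft, errs, ErrRetriesExhausted
	}
	return draft, nil, nil
}

func main() {
	// Toy validator: the draft must contain a "verdict" key.
	validate := func(d string) []string {
		if strings.Contains(d, `"verdict"`) {
			return nil
		}
		return []string{`1: missing required property "verdict"`}
	}
	// Toy fixer that always produces a valid document.
	fix := func(d string, errs []string) string { return `{"verdict":"PASS"}` }

	out, errs, err := validateWithRetry(`{}`, validate, fix, 2)
	fmt.Println(out, errs, err)
}
```

Returning the final error list alongside `ErrRetriesExhausted` keeps the re-dispatch decision with the caller, which matches the description of errors being fed back into the orchestrator prompt.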
| Verdict | Meaning | What happens |
|---|---|---|
| PASS | All findings adequately addressed | Proceeds to FINALIZED |
| REVISE | Minor issues remain | Returns to REVIEWING |
| BLOCK | Reviser under-delivered; critical findings not addressed | Returns to REVISING with the judge's full rationale as feedback |
A BLOCK does not escalate. The reviser receives the judge's output file and must address every flagged finding before the next judging round.
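The routing in the verdict table can be expressed as a small switch. The sketch below is illustrative: the `Transition` type and `route` function are hypothetical names, and the conditional final human gate after PASS is left out for brevity.

```go
package main

import "fmt"

// Verdict is the judge's decision for a round.
type Verdict string

const (
	Pass   Verdict = "PASS"
	Revise Verdict = "REVISE"
	Block  Verdict = "BLOCK"
)

// Transition is a HYPOTHETICAL description of what follows a verdict.
type Transition struct {
	NextState       string
	ForwardJudgment bool // true when the judge's output file is handed to the reviser
}

// route maps a verdict to the next workflow state, following the verdict table.
func route(v Verdict) (Transition, error) {
	switch v {
	case Pass:
		return Transition{NextState: "FINALIZED"}, nil
	case Revise:
		return Transition{NextState: "REVIEWING"}, nil
	case Block:
		// BLOCK does not escalate: the reviser retries with the judge's rationale.
		return Transition{NextState: "REVISING", ForwardJudgment: true}, nil
	default:
		return Transition{}, fmt.Errorf("unknown verdict %q", v)
	}
}

func main() {
	for _, v := range []Verdict{Pass, Revise, Block} {
		tr, _ := route(v)
		fmt.Printf("%s -> %s (forward judge output: %t)\n", v, tr.NextState, tr.ForwardJudgment)
	}
}
```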
The judge's PASS verdict is subject to deterministic anti-gaming checks:
- All CRITICAL findings must be closed or dismissed
- Revision change logs must reference every CRITICAL and MAJOR finding
- Minimum round count must be met
- Authority limits per round: max 2 severity downgrades, max 3 dismissals
- Cumulative escalation: total downgrades + dismissals > 5 triggers escalation
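Because these checks are deterministic, they reduce to plain comparisons. The following sketch shows one possible shape, with the per-round and cumulative limits taken from the list above. The `PassInput` struct and its field names are hypothetical, and what the system does with a rejected PASS is not covered here.

```go
package main

import "fmt"

// PassInput is a HYPOTHETICAL summary of the state a PASS verdict is checked against.
type PassInput struct {
	OpenCriticals       int      // CRITICAL findings neither closed nor dismissed
	UnreferencedIDs     []string // CRITICAL/MAJOR finding IDs missing from the change log
	Round               int
	MinRounds           int
	DowngradesThisRound int
	DismissalsThisRound int
	CumulativeOverrides int // total downgrades + dismissals across all rounds
}

const (
	maxDowngradesPerRound  = 2
	maxDismissalsPerRound  = 3
	maxCumulativeOverrides = 5
)

// checkPass returns every violated rule and whether the cumulative limit is exceeded.
func checkPass(in PassInput) (violations []string, escalate bool) {
	if in.OpenCriticals > 0 {
		violations = append(violations, fmt.Sprintf("%d CRITICAL finding(s) still open", in.OpenCriticals))
	}
	if len(in.UnreferencedIDs) > 0 {
		violations = append(violations, fmt.Sprintf("change log omits findings %v", in.UnreferencedIDs))
	}
	if in.Round < in.MinRounds {
		violations = append(violations, fmt.Sprintf("round %d is below the minimum of %d", in.Round, in.MinRounds))
	}
	if in.DowngradesThisRound > maxDowngradesPerRound {
		violations = append(violations, "too many severity downgrades this round")
	}
	if in.DismissalsThisRound > maxDismissalsPerRound {
		violations = append(violations, "too many dismissals this round")
	}
	return violations, in.CumulativeOverrides > maxCumulativeOverrides
}

func main() {
	violations, escalate := checkPass(PassInput{
		OpenCriticals: 1, Round: 1, MinRounds: 2, DowngradesThisRound: 3, CumulativeOverrides: 6,
	})
	fmt.Println(violations, escalate)
}
```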
The workflow halts automatically when any limit is exceeded:
- Max rounds -- round count exceeds configured maximum (default: 5)
- Max findings -- cumulative finding count exceeds threshold (default: 60)
- Staleness -- CRITICAL/MAJOR findings stuck for N consecutive rounds (default: 2)
- Wall clock -- elapsed time exceeds budget (default: 60 minutes)
- Cost -- cumulative API cost exceeds budget (default: $50)
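The halting conditions above amount to a handful of threshold comparisons. The sketch below uses the documented defaults; the `Limits` and `Snapshot` types and the breaker labels are hypothetical, and it assumes staleness trips once the stuck-round count reaches the threshold.

```go
package main

import (
	"fmt"
	"time"
)

// Limits mirrors the documented defaults; the struct itself is HYPOTHETICAL.
type Limits struct {
	MaxRounds          int
	MaxTotalFindings   int
	StalenessThreshold int
	MaxWallClock       time.Duration
	MaxCostUSD         float64
}

// DefaultLimits returns the default values listed above.
func DefaultLimits() Limits {
	return Limits{
		MaxRounds:          5,
		MaxTotalFindings:   60,
		StalenessThreshold: 2,
		MaxWallClock:       60 * time.Minute,
		MaxCostUSD:         50.0,
	}
}

// Snapshot is a HYPOTHETICAL view of a running workflow.
type Snapshot struct {
	Round         int
	TotalFindings int
	StaleRounds   int // consecutive rounds with CRITICAL/MAJOR findings stuck
	Elapsed       time.Duration
	CostUSD       float64
}

// trippedBreakers lists every limit the snapshot exceeds; empty means keep running.
func trippedBreakers(l Limits, s Snapshot) []string {
	var tripped []string
	if s.Round > l.MaxRounds {
		tripped = append(tripped, "max_rounds")
	}
	if s.TotalFindings > l.MaxTotalFindings {
		tripped = append(tripped, "max_findings")
	}
	if s.StaleRounds >= l.StalenessThreshold {
		tripped = append(tripped, "staleness")
	}
	if s.Elapsed > l.MaxWallClock {
		tripped = append(tripped, "wall_clock")
	}
	if s.CostUSD > l.MaxCostUSD {
		tripped = append(tripped, "cost")
	}
	return tripped
}

func main() {
	s := Snapshot{Round: 3, TotalFindings: 61, Elapsed: 75 * time.Minute, CostUSD: 12.5}
	fmt.Println(trippedBreakers(DefaultLimits(), s))
}
```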
The only manual prerequisite is Claude CLI:
| Dependency | Required | Install |
|---|---|---|
| Claude CLI | Yes — runs all AI agents | claude.ai/install.sh |
| Codex CLI | No — enables dual-provider mode | github.com/openai/codex |
# Verify Claude is installed and authenticated
claude --version
claude auth login # if not already done
The installer handles everything else: server binary, bd, taskval, jq, check-jsonschema, and skills.
curl -fsSL https://raw.githubusercontent.com/nixlim/spec_system/main/install.sh | bash
The script:
- Downloads a pre-built binary (no Go required), or builds from source if Go is available
- Installs bd (Beads issue tracking) and taskval (task graph validation)
- Installs jq and check-jsonschema (required by outvalid for agent output validation)
- Copies the bundled plan-spec, grill-spec, and outvalid skills to ~/.claude/skills/
- Writes a default config.yaml and creates the workspace directory
# Options
./install.sh --help # All flags
./install.sh --skip-beads # Skip bd installation
./install.sh --skip-taskval # Skip taskval installation
./install.sh --skip-outvalid-deps # Skip jq + check-jsonschema installation
./install.sh --dir ~/bin # Custom binary location
./install.sh --dry-run # Preview without making changes
If bd or taskval are not on your PATH, those features are silently disabled and the workflow continues without them.
Note: outvalid (bin/outvalid) is a bash script in the repo. Add bin/ to your PATH or invoke it as ./bin/outvalid.
go build -o specworkflow ./cmd/specworkflow
./specworkflow --config config.yaml --workspace ./workspace
Open http://localhost:8080 for the dashboard.
| Flag | Default | Description |
|---|---|---|
| --port | 8080 | HTTP listen port |
| --workspace | ./workspace | Directory for spec files, uploads, and metrics |
| --config | (none) | Path to YAML configuration file |
| --otel-port | 4317 | gRPC OTLP receiver port for Claude Code telemetry (0 to disable) |
The system requires two Claude skill directories containing the templates that govern spec structure and review criteria:
plan-spec must contain:
- spec-template.md — Specification format and section structure
- bdd-template.md — BDD scenario format
- test-dataset-template.md — Test dataset format
grill-spec must contain:
- review-constitution.md — Review lenses and scoring criteria
- report-template.md — Report format for the judge
The server searches for skills in these locations in order:
- Path given in config.yaml under skill_paths
- ~/.claude/skills/
- ~/.codex/skills/
- .claude/skills/ (relative to working directory)
- .agents/skills/
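The search order can be pictured as a first-match scan over candidate directories. The sketch below is an assumption-laden illustration, not the server's code: the function names are hypothetical, and it assumes each skill lives in a subdirectory named after the skill inside every search root.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// candidateDirs lists where a skill might live, in the documented search order.
// An explicitly configured path (from skill_paths) always comes first.
func candidateDirs(configured, home, skill string) []string {
	var dirs []string
	if configured != "" {
		dirs = append(dirs, configured)
	}
	return append(dirs,
		filepath.Join(home, ".claude", "skills", skill),
		filepath.Join(home, ".codex", "skills", skill),
		filepath.Join(".claude", "skills", skill), // relative to working directory
		filepath.Join(".agents", "skills", skill),
	)
}

// firstExisting returns the first candidate for which exists reports true.
func firstExisting(candidates []string, exists func(string) bool) (string, bool) {
	for _, dir := range candidates {
		if exists(dir) {
			return dir, true
		}
	}
	return "", false
}

// dirExists reports whether path is an existing directory.
func dirExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.IsDir()
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		home = "" // fall back to relative lookups only
	}
	for _, skill := range []string{"plan-spec", "grill-spec"} {
		if dir, ok := firstExisting(candidateDirs("", home, skill), dirExists); ok {
			fmt.Printf("%s found at %s\n", skill, dir)
		} else {
			fmt.Printf("%s not found\n", skill)
		}
	}
}
```

Passing the existence check as a function keeps the ordering logic testable without touching the file system.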
Create a config.yaml in your working directory. All fields are optional — sensible defaults apply.
# ─────────────────────────────────────────────
# Skill paths (required if not auto-discovered)
# ─────────────────────────────────────────────
skill_paths:
  plan_spec: "/path/to/.claude/skills/plan-spec"
  grill_spec: "/path/to/.claude/skills/grill-spec"
# ─────────────────────────────────────────────
# Review loop limits
# ─────────────────────────────────────────────
max_rounds: 5 # Maximum review/revise/judge iterations (default: 5)
min_rounds: 2 # Minimum iterations required before PASS is accepted (default: 2)
max_total_findings: 60 # Cumulative findings before escalation (default: 60)
staleness_threshold: 2 # Consecutive rounds with no CRIT/MAJ progress before halt (default: 2)
# ─────────────────────────────────────────────
# Budget limits
# ─────────────────────────────────────────────
max_wall_clock_minutes: 60 # Elapsed time budget per workflow (default: 60)
max_cost_usd: 50.0 # API cost budget per workflow (default: 50.0)
# ─────────────────────────────────────────────
# Human gate configuration
# ─────────────────────────────────────────────
max_gate_corrections: 3 # Max correction rounds at Gate 1 (post-discovery) (default: 3)
max_gate2_redrafts: 1 # Max redraft rounds at Gate 2 (post-draft) (default: 1)
# ─────────────────────────────────────────────
# Agent reliability
# ─────────────────────────────────────────────
max_retries: 2 # Retry attempts per agent on validation failure (default: 2)
# ─────────────────────────────────────────────
# Agent timeouts (seconds)
# ─────────────────────────────────────────────
agent_timeout_seconds: 300 # Discovery, drafting, taskify agents (default: 300)
reviewer_timeout_seconds: 300 # Reviewer agents — Claude and Codex (default: 300)
holdout_timeout_seconds: 300 # Holdout generation agents (default: 300)
# ─────────────────────────────────────────────
# Model selection
# ─────────────────────────────────────────────
# Empty string means the Claude CLI picks its default model.
# Set per-role to use different models for different tasks.
claude_models:
  default: "" # Fallback for any role not explicitly set
  reviewer: "" # All 4 reviewer agents
  holdout: "" # Holdout generation agents
  reviser: "" # Reviser agent
  judge: "" # Judge agent
  discovery: "" # Discovery agent
  drafter: "" # Drafter agent
  taskify: "" # Taskify agent
  task_reviewer: "" # Task reviewer agent
  task_reviser: "" # Task reviser agent
# ─────────────────────────────────────────────
# Dual-provider (Codex CLI) — requires codex on PATH
# ─────────────────────────────────────────────
enable_codex_reviewers: true # Parallel Codex reviewers + holdout (default: true)
enable_codex_discovery: false # Parallel Codex discovery agent (default: false)
enable_codex_drafting: false # Parallel Codex drafting agent (default: false)
codex_model: "gpt-5.4" # Model ID passed to the Codex CLI (default: "gpt-5.4")
# ─────────────────────────────────────────────
# Task decomposition
# ─────────────────────────────────────────────
taskify_max_retries: 3 # Max validation+retry attempts for task graph (default: 3)
task_review_max_rounds: 3 # Max task review/revision rounds (default: 3)
# ─────────────────────────────────────────────
# Beads issue tracking — requires bd on PATH
# ─────────────────────────────────────────────
# If bd is not installed, these settings have no effect.
beads_gate_poll_interval: 5s # How often to poll gate task status in Beads (default: 5s)
beads_gate_timeout: 24h # Advisory warning threshold for gates left open (default: 24h)
# ─────────────────────────────────────────────
# Code review workflow
# ─────────────────────────────────────────────
code_review:
  max_rounds: 3 # Fix-review iterations (default: 3)
  max_cost_usd: 50.0 # Cost budget (default: 50.0)
  max_wall_clock_minutes: 120 # Time budget (default: 120)
  fixer_timeout_seconds: 600 # Fixer agent timeout (default: 600)
  commit_mode: branch_per_round # "branch_per_round" or "direct_commit"
  staleness_threshold: 2
  max_retries: 2
  reviewer_timeout_seconds: 300
  claude_models:
    default: ""
    reviewer: ""
    fixer: ""
# ─────────────────────────────────────────────
# Code documentation workflow
# ─────────────────────────────────────────────
codedoc:
  max_rounds: 3 # Review/revise iterations (default: 3)
  min_rounds: 1 # Minimum rounds before human gate allowed (default: 1)
  max_cost_usd: 50.0 # Cost budget (default: 50.0)
  max_wall_clock_minutes: 90 # Time budget (default: 90)
  max_gate_corrections: 3 # Max scope gate correction rounds (default: 3)
  max_gate_draft_redrafts: 2 # Max redraft rounds at draft gate (default: 2)
  staleness_threshold: 2 # Consecutive unchanged rounds before escalation (default: 2)
  agent_timeout_seconds: 600 # General agent timeout (default: 600)
  discovery_timeout_seconds: 1200 # Discovery timeout — larger codebases (default: 1200)
  reviewer_timeout_seconds: 300 # Reviewer agent timeout (default: 300)
  max_retries: 2 # Retry attempts per agent on failure (default: 2)
  default_mode: full # "full" or "incremental" (default: full)
  docs_output_dir: docs # Output directory for generated docs (default: docs)
  backup_before_write: true # Create .bak files before overwriting (default: true)
  drift_warning_threshold: 0.20 # Fraction of changed files that triggers a drift warning (default: 0.20)
  enable_codex_codedoc_discovery: false # Dual-provider discovery (default: false)
  enable_codex_codedoc_drafting: false # Dual-provider drafting (default: false)
  enable_codex_reviewers: true # Dual-provider reviewers (default: true)
  claude_models:
    default: ""
    discovery: ""
    drafter: ""
    reviewer: ""
    judge: ""
Minimal configuration:
skill_paths:
  plan_spec: "~/.claude/skills/plan-spec"
  grill_spec: "~/.claude/skills/grill-spec"
Everything else uses defaults. This gets you a single-provider Claude-only workflow with sensible round limits and budgets.
The web dashboard provides real-time visibility into workflow execution. Multiple workflows can run concurrently, each tracked independently.
- Controls — Active workflow list, start new workflows (spec, code review, codedoc), upload source documents, assign documents to workflows, manage workspace
- Running Agents — Live table of all agent subprocesses (Feature, Role, PID, Start Time, Status); real-time updates via WebSocket; Kill button sends SIGTERM → SIGKILL
- Spec — View and diff spec versions as they evolve through rounds
- Issues — Track findings with severity/status/lens filtering; shows round raised and round closed for each finding
- Convergence — Monitor review/revision convergence metrics and round history
- Workspace Files — Browse all files in a workflow's workspace directory; download individual files or view raw content
- Messages — Filtered workflow log (OTEL, Orchestrator, Claude Runner, Agent Events, State Transitions)
A persistent top panel shows aggregate metrics updated in real-time via WebSocket:
- Pipeline stepper — visual chain of all workflow stages with progress indication
- Feature name, round number, workflow state badge, workflow type badge (SPEC/CR)
- Cost (from OTEL telemetry), elapsed wall clock time
- Token usage (input, output, cache read), API call count, agent cost
- Activity feed of individual tool and API events
- Source document list per workflow
Gate panels appear when the workflow requires human input:
- Gate 1 — Review discovery output, answer open questions, provide corrections
- Gate 2 — Resolve ambiguity warnings (accept/answer/defer per warning)
- Gate Final — Approve or reject when critical findings persist
- Task Gate — Review task graph, approve or request re-decomposition
Workflows can be rewound to any previous stage from the UI. Setting the state field in workflow-state.json directly is also honoured — the system respects an explicit non-terminal state rather than re-deriving it from artefacts on disk.
Individual phases can be replayed without re-dispatching agents:
- Discovery merge — re-run the intelligent merge from existing per-provider outputs
- Drafting combine — re-run the combine agent from existing per-provider drafts
- Review merge — re-run findings dedup from existing reviewer outputs
- Task review merge — re-run task findings dedup from existing task reviewer outputs
When bd (Beads) is installed and a .beads/ workspace exists in the working directory, the system automatically enables issue tracking integration.
| Item | Beads Artefact | Content |
|---|---|---|
| Each workflow run | Epic issue | Feature name, run ID, start time |
| Each reviewer finding | Child issue (type: finding) | ID, severity, lens, affected section, round, agent |
| Human gate points | Task issue (gate proxy) | Gate name, feature, accept/reject instructions |
| Review round | Molecule | Steps: reviewing → revising → judging |
| State snapshots | KV store | Current state, round, run ID, step IDs |
When the workflow reaches a human gate (Gate 1, Gate 2, Gate Final, Task Gate), it creates a Beads task issue. The orchestrator polls bd show <id> every beads_gate_poll_interval (default 5s). Close the task with reason ACCEPT: <comment> or REJECT: <comment> to advance the workflow.
This mirrors the gate panel in the dashboard — both mechanisms work and either one unblocks the workflow.
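Interpreting the close reason is a matter of checking its prefix. The sketch below shows one way to do that. The function name is hypothetical, and it assumes the prefix is matched case-sensitively and that anything else is treated as an error; the orchestrator's actual parsing may be more lenient.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCloseReason splits a Beads close reason of the form
// "ACCEPT: <comment>" or "REJECT: <comment>" into a decision and a comment.
func parseCloseReason(reason string) (accepted bool, comment string, err error) {
	reason = strings.TrimSpace(reason)
	switch {
	case strings.HasPrefix(reason, "ACCEPT:"):
		return true, strings.TrimSpace(strings.TrimPrefix(reason, "ACCEPT:")), nil
	case strings.HasPrefix(reason, "REJECT:"):
		return false, strings.TrimSpace(strings.TrimPrefix(reason, "REJECT:")), nil
	default:
		return false, "", fmt.Errorf("close reason %q starts with neither ACCEPT: nor REJECT:", reason)
	}
}

func main() {
	for _, reason := range []string{
		"ACCEPT: requirements look right",
		"REJECT: scope is too broad",
		"done",
	} {
		accepted, comment, err := parseCloseReason(reason)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("accepted=%t comment=%q\n", accepted, comment)
	}
}
```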
If the server restarts mid-workflow, the orchestrator reads the run epic's children from Beads to rebuild the in-memory issue tracker. All finding statuses are restored from Beads state.
If bd is not on your PATH, Beads integration is silently disabled. The message [orchestrator] Beads integration disabled: bd not found appears in the server log. The workflow runs identically without it.
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/workflow/start | Start new workflow |
| POST | /api/workflow/cancel | Cancel running workflow |
| GET | /api/workflow/status | Poll workflow status |
| POST | /api/workflow/resume | Resume from ESCALATED/ERROR state |
| POST | /api/workflow/rewind | Rewind to target state and round |
| POST | /api/workflow/replay | Replay a specific phase |
| POST | /api/workflow/finalize | Force transition to FINALIZED |
| POST | /api/workflow/reset | Delete feature directory |
| POST | /api/workflow/restart | Stop, delete, and restart |
| POST | /api/workflow/retry | Clear stale state file |
| GET | /api/workflow/agents | List active agents |
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/codereview/start | Start new code review |
| GET | /api/codereview/{feature}/status | Poll code review status |
| POST | /api/codereview/{feature}/gate | Submit gate decision |
| POST | /api/codereview/{feature}/cancel | Cancel running code review |
| POST | /api/codereview/{feature}/resume | Resume from ERROR state |
| POST | /api/codereview/{feature}/reset | Delete code review feature directory |
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/codedoc/start | Start new codedoc workflow |
| GET | /api/codedoc/{feature}/status | Poll codedoc status |
| POST | /api/codedoc/{feature}/gate | Submit gate decision |
| POST | /api/codedoc/{feature}/cancel | Cancel running codedoc workflow |
| POST | /api/codedoc/{feature}/resume | Resume from ERROR state |
| POST | /api/codedoc/{feature}/reset | Delete codedoc feature directory |
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/upload | Upload documents to global library |
| GET | /api/uploads | List uploaded files |
| POST | /api/workflow/{feature}/source-docs | Assign documents to a workflow |
| GET | /api/workflow/{feature}/source-docs | List assigned documents |
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/tasks/{id}/approve | Approve gate (with corrections/resolutions) |
| POST | /api/tasks/{id}/reject | Reject gate (cancel workflow) |
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/workspace/features | List all features with metadata |
| GET | /api/workspace/features/{name}/state | Feature workflow state |
| GET | /api/workspace/features/{name}/files/{f} | Specific feature file |
| GET | /api/spec/* | Spec versions, diffs, issues, convergence |
| GET | /api/metrics | Persisted OTEL telemetry |
| GET | /api/messages | Workflow log messages |
| GET | /api/logs/server | Server log ring buffer |
| GET | /ws | WebSocket event stream |
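As a rough illustration of calling the API from Go, the sketch below polls GET /api/workflow/status and decodes the JSON reply into a generic map. The response schema is not documented here, so the `state` and `round` fields served by the stub are hypothetical, as are the helper names; the stub server exists only to make the example run without a live instance.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// fetchStatus calls GET /api/workflow/status and decodes the body generically,
// because the real response schema is not specified in this document.
func fetchStatus(client *http.Client, baseURL string) (map[string]any, error) {
	resp, err := client.Get(baseURL + "/api/workflow/status")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var body map[string]any
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return nil, fmt.Errorf("decode status: %w", err)
	}
	return body, nil
}

// newStubServer stands in for the real server; its payload is HYPOTHETICAL.
func newStubServer() *httptest.Server {
	mux := http.NewServeMux()
	mux.HandleFunc("/api/workflow/status", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"state":"REVIEWING","round":2}`)
	})
	return httptest.NewServer(mux)
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	status, err := fetchStatus(srv.Client(), srv.URL)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(status["state"], status["round"])
}
```

Against a running instance, `baseURL` would be http://localhost:8080 unless --port is changed.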
cmd/specworkflow/main.go          CLI entry point, HTTP routing
internal/api/
  workflow_handler.go             HTTP handlers, WorkflowManager
  codereview_handlers.go          Code review HTTP handlers
  codedoc_handlers.go             Code documentation HTTP handlers
  otel_receiver.go                OTLP gRPC receiver for Claude telemetry
  metrics_store.go                SQLite persistence for telemetry
  websocket.go                    WebSocket hub and broadcasting
  spec_endpoints.go               Spec/issue/convergence REST endpoints
internal/specworkflow/
  orchestrator.go                 Main workflow loop and state coordination
  orchestrator_discovery.go       Discovery phase + Gate 1
  orchestrator_drafting.go        Drafting phase + Gate 2
  orchestrator_review.go          Review dispatch + revision + judging
  orchestrator_beads.go           Beads integration: epics, findings, gates, molecules
  orchestrator_taskify.go         Task graph decomposition + validation loop
  orchestrator_task_review.go     Task review/revision loop
  statemachine.go                 State machine with guarded transitions
  claude_runner.go                Claude CLI subprocess execution
  codex_runner.go                 Codex CLI subprocess execution
  beads_client.go                 Beads CLI client (BeadsClientInterface + BeadsClient)
  beads_client_mock.go            Mock Beads client for tests
  issues.go                       Issue tracker with lifecycle transitions + ExportLiveState
  prompts.go                      Prompt construction for all agents
  convergence.go                  Anti-gaming pre-checks and convergence
  breakers.go                     Circuit breaker evaluation
  config.go                       Configuration parsing and validation
  types.go                        Core type definitions and workflow states
  recovery.go                     Agent failure detection and retry
bin/
  outvalid                        JSON schema validator for agent output (requires check-jsonschema, jq)
workflow-templates/               JSON Schema files for all agent output types
  specworkflow/                   Spec workflow agent schemas
  codedoc/                        Code documentation agent schemas
  codereview/                     Code review agent schemas
static/
  index.html                      Dashboard HTML
  app.js                          Dashboard JavaScript (SPA)
  style.css                       Dashboard styles
workspace/
metrics.db SQLite telemetry database
source-docs/ Uploaded reference documents
specs/{feature}/
source-docs/ Per-workflow document copies
workflow-state.json Persisted workflow state (edit to rewind)
workflow-log.jsonl Structured workflow log
# Discovery phase
discovery-output.json
discovery-output-claude-v{N}.json
discovery-output-codex-v{N}.json
discovery-output-merged-v{N}.json
# Drafting phase
spec-v0.md Initial draft
spec-v{N}.md Revised spec per round
{feature}-holdouts.md
# Review/revise/judge loop
review-{a,b,c,d}-round-{N}.json
merged-findings-round-{N}.json Frozen findings snapshot (all statuses = open)
issue-tracker-round-{N}.json Live tracker state (accurate statuses) — used by judge
revision-round-{N}.json
judge-round-{N}.json
# Finalized output
spec-final.md
.tasks/
{feature}.task.json Structured task graph
Rewinding manually: Edit workflow-state.json and change "state" to the desired active state (e.g. "REVISING"). The server respects this on resume and will not override it with artefact-based inference.
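The manual edit can also be scripted. The sketch below rewrites only the "state" field and leaves every other field untouched by keeping it as raw JSON. Only the "state" key is taken from this document; the "round" field in the sample and the helper names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// setState returns a copy of a workflow-state.json document with "state" replaced.
// All other fields are preserved verbatim as raw JSON.
func setState(raw []byte, state string) ([]byte, error) {
	var doc map[string]json.RawMessage
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, fmt.Errorf("parse state file: %w", err)
	}
	if doc == nil {
		doc = map[string]json.RawMessage{}
	}
	encoded, err := json.Marshal(state)
	if err != nil {
		return nil, err
	}
	doc["state"] = encoded
	return json.MarshalIndent(doc, "", "  ")
}

// getState reads the "state" field back out of a state document.
func getState(raw []byte) (string, error) {
	var doc struct {
		State string `json:"state"`
	}
	if err := json.Unmarshal(raw, &doc); err != nil {
		return "", err
	}
	return doc.State, nil
}

func main() {
	// Sample content; "round" is a HYPOTHETICAL field used to show preservation.
	original := []byte(`{"state":"JUDGING","round":2}`)

	updated, err := setState(original, "REVISING")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(updated))
}
```

In practice the bytes would be read from and written back to workspace/specs/{feature}/workflow-state.json while the workflow is not running, then picked up on resume.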
OTEL telemetry from Claude Code is persisted to workspace/metrics.db:
- Aggregate token/cost counters per feature (upserted on every OTEL update)
- Individual tool invocations and API calls with duration and cost
- 90-day retention with automatic cleanup on startup
go test ./...
Test coverage includes: state machine, orchestrator, convergence, circuit breakers, issue lifecycle, agent output validation, prompt construction, persistence, recovery, resume, rewind, replay, security, configuration, JSON validation+retry, Beads client and integration, discovery resume, code review state machine, and HTTP/WebSocket handlers.
- internal/specworkflow/ — Core spec workflow engine (pure Go, no HTTP dependencies)
- internal/codereview/ — Code review workflow engine
- internal/codedoc/ — Code documentation workflow engine
- internal/api/ — HTTP/WebSocket/gRPC layer
- cmd/specworkflow/ — CLI entry point
- static/ — Dashboard frontend (vanilla JS, no build step)
spec_version, issue_update, convergence_update, gate_request, gate_response, circuit_breaker, agent_error, state_transition, agent_dispatch, agent_complete, workflow_status, agent_metrics, agent_tool_event, agent_api_event
MIT License — see LICENSE for the full text.
