Mister Smith is a multi-agent orchestration operating system built in Rust. It combines supervised agent execution, NATS and JetStream messaging, PostgreSQL-backed workflow state, MCP integration, durable sessions, and operator-facing CLI, HTTP, and desktop surfaces into one runtime-focused system.
The current repo is centered on building a real agent runtime, not just a prompt wrapper. On
main, Mister Smith already ships the runtime substrate, autonomy inspection surfaces, same-agent
session continuity, bounded live runtime proof on the openai_chatgpt / gpt-5.4 path, and a
local macOS operator console for managing the stack and inspecting runs.
Most agent frameworks stop at orchestration helpers around model calls. Mister Smith is trying to be the operating layer behind long-running, inspectable, failure-tolerant agent work:
- Supervised execution rather than optimistic retries
- Durable workflow and session state rather than best-effort memory
- Operator-visible autonomy rather than hidden internal heuristics
- Explicit runtime truth and proof boundaries rather than vague "agent succeeded" summaries
- Strong execution boundaries across tools, transports, and external capabilities
This repo should be read as a runtime system with real operator surfaces, not as a development workflow around Linear, Symphony, or other external tools.
- Chat-first CLI shell. Running `mister-smith` with no subcommand opens the retained-session shell, and active sessions stay inside one chat-first loop with `resume`, `sessions`, `conversation`, `autonomy`, and provider auth helpers alongside it.
- Durable same-agent conversations. Sessions retain `session_id`, coordinator continuity, turn history, and follow-up control surfaces across CLI and HTTP.
- Operator control plane. Runtime submission, task inspection, autonomy inspection, and live event streaming are exposed through CLI, HTTP, and the local desktop operator console.
- Runtime-backed orchestration substrate. Supervised planner and executor lifecycles, ToolBus execution boundaries, NATS/JetStream transport, and PostgreSQL persistence are all part of the landed runtime path.
- MCP in the product boundary. The repo ships `mister-smith-mcp` as a real workspace crate for tool exposure, compatibility, and external capability mediation.
- Recent landed packet surfaces. Packets `022` through `026` are landed on `main` for durable workflow ownership, runtime-truth and run-trace projection, agent-boundary hardening, deterministic step policy projection, and first bounded coordinator-runtime delegation. Packets `030` and `031` are also landed on `main` for the session-first and chat-first CLI shell path.
The current repo-wide truth lives in docs/current-state.md. At a high level, the repo currently has:
- the Rust workspace substrate through Phase 10
- one-shot runtime execution through `mister-smith run` and `POST /api/v1/tasks`
- autonomy inspection through `mister-smith autonomy list`, `mister-smith autonomy status`, and related HTTP views
- durable session handling through CLI flows and `POST /api/v1/sessions`
- a bounded live runtime-proof baseline on `openai_chatgpt` / `gpt-5.4`
- a local macOS Tauri operator console under `apps/operator-console/`
The newest landed packet authorities are:
- `specs/023-runtime-truth-and-run-trace/`
- `specs/024-agent-boundary-security-hardening/`
- `specs/025-step-level-intelligence-v2/`
- `specs/026-first-real-coordinator-subagent-runtime/`
- `specs/030-session-first-cli-shell/`
- `specs/031-chat-first-cli-loop/`
Remaining unpromoted packet material currently sits under specs/027-* through specs/029-*.
Those packet artifacts are still draft or pre-spec and are not yet part of the default runtime
story.
- CONTRIBUTING.md for setup, validation, and pull request expectations
- CODE_OF_CONDUCT.md for community standards
- SECURITY.md for private vulnerability reporting
- SUPPORT.md for where to ask questions and when to file issues
- CLI home shell for starting, resuming, and browsing retained sessions
- direct conversation commands for create, continue, inspect, and end
- autonomy inspection for workflow state, proof wording, and operator-facing status
- HTTP API plus WebSocket event feed for external operators and local tooling
- local Tauri operator console for stack bootstrap, run inspection, and session/task actions
- Erlang-inspired supervision trees with restart strategy support
- coordinator, planner, executor, verifier, and runtime evidence projection seams
- durable workflow lifecycle, event-history, and effect-boundary ownership
- deterministic step-policy projection and coordinator-owned delegation surfaces
- explicit runtime-truth, run-trace, and proof-boundary views
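The supervision bullet above mentions Erlang-inspired restart strategies. As a rough illustration of the general idea only, the sketch below tracks restart intensity: a child may be restarted at most `max_restarts` times inside a sliding time window, after which the failure is escalated. The names `RestartPolicy`, `Decision`, and `on_failure` are hypothetical and are not taken from the `mister-smith-supervision` crate, whose real API is not shown in this document.

```rust
use std::time::{Duration, Instant};

/// What a supervisor decides after a child failure (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Restart,
    Escalate,
}

/// Hypothetical restart-intensity tracker: at most `max_restarts`
/// restarts are allowed inside a sliding `window`.
pub struct RestartPolicy {
    max_restarts: usize,
    window: Duration,
    recent: Vec<Instant>,
}

impl RestartPolicy {
    pub fn new(max_restarts: usize, window: Duration) -> Self {
        Self { max_restarts, window, recent: Vec::new() }
    }

    /// Record a failure observed at `now` and decide what to do next.
    pub fn on_failure(&mut self, now: Instant) -> Decision {
        // Forget restarts that have aged out of the window.
        let window = self.window;
        self.recent.retain(|t| now.duration_since(*t) < window);

        if self.recent.len() < self.max_restarts {
            // Still under budget: count this restart and allow it.
            self.recent.push(now);
            Decision::Restart
        } else {
            // Budget exhausted: hand the failure to the parent supervisor.
            Decision::Escalate
        }
    }
}
```

Passing the timestamp in explicitly, rather than reading the clock inside `on_failure`, keeps the policy deterministic and easy to test.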
| Surface | Technology | Role |
|---|---|---|
| NATS + JetStream | async-nats | agent transport, event distribution, queues, and KV-backed runtime state |
| HTTP | Axum | task, session, autonomy, health, metrics, and websocket surfaces |
| gRPC | Tonic | typed service boundaries |
| MCP | rmcp | capability exposure, compatibility, and external tool mediation |
| ToolBus | repo-native execution boundary | bounded tool execution and capability control |
- JWT auth plus policy-aware authorization seams
- quarantine and external-capability enforcement surfaces
- structured runtime proof and delegated-work provenance
- Prometheus metrics, OpenTelemetry traces, Grafana dashboards, and alert rules
- health probes at `/health/live` and `/health/ready`
- Rust 1.88.0 or later
- Docker and Docker Compose (for PostgreSQL and NATS)
- An LLM provider API key (or use the mock provider for testing)
git clone https://github.com/MattMagg/MisterSmith.git
cd MisterSmith
cargo build --workspace

docker compose -f deploy/docker-compose.yml up -d postgres nats

This starts PostgreSQL on port 5432 and NATS (with JetStream) on port 4222.
Create a config.toml (or use environment variables):
[runtime]
worker_threads = 4
[transport]
nats_url = "nats://localhost:4222"
http_port = 8080
[monitoring]
health_check_interval = "30s"
log_level = "info"
[llm]
provider_kind = "openai_chatgpt"
model_id = "gpt-5.4"

Environment variable overrides:

- `MISTER_SMITH_LOG_LEVEL`: log level (trace, debug, info, warn, error)
- `MISTER_SMITH_NATS_URL`: NATS connection URL
- `MISTER_SMITH_DATABASE_URL`: PostgreSQL connection string
- `ANTHROPIC_API_KEY`: Anthropic provider credentials
- `OPENAI_API_KEY`: OpenAI provider credentials
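One plausible reading of "overrides" is a three-level precedence: an environment variable wins over the `config.toml` value, which wins over a built-in default. The sketch below illustrates that ordering only. The helper names `resolve_setting` and `nats_url` are hypothetical, treating an empty variable as unset is an assumption, and the fallback URL is simply the value from the example config above rather than a documented default.

```rust
/// Resolve one setting: environment override first, then the config
/// file value, then a built-in default. (Hypothetical helper.)
pub fn resolve_setting(
    env_value: Option<&str>,
    file_value: Option<&str>,
    default: &str,
) -> String {
    env_value
        // Assumption: an empty environment variable counts as "not set".
        .filter(|v| !v.is_empty())
        .or(file_value)
        .unwrap_or(default)
        .to_string()
}

/// Example wiring for the NATS URL. The fallback mirrors the example
/// config in this README; it is not a documented default.
pub fn nats_url(file_value: Option<&str>) -> String {
    let env = std::env::var("MISTER_SMITH_NATS_URL").ok();
    resolve_setting(env.as_deref(), file_value, "nats://localhost:4222")
}
```

Keeping the precedence logic in a pure function separates it from the process environment, so it can be checked without mutating global state.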
# Start the runtime
mister-smith run --config config.toml
# Or with defaults
mister-smith run
# Open the session-first CLI shell
mister-smith

# Open the home shell
mister-smith
# Resume the last retained session
mister-smith resume --last
# Browse retained sessions
mister-smith sessions list

Multi-turn sessions maintain a stable coordinator across turns:
# Start a session
mister-smith conversation start --message "Help me debug this memory leak"
# Continue with follow-up turns
mister-smith conversation continue \
--session-id <session_id> \
--message "Here are the heap dumps from the last 3 runs"
# Inspect the full session history
mister-smith conversation inspect --session-id <session_id>
# End the session
mister-smith conversation end --session-id <session_id>

# Via HTTP API
curl -X POST http://localhost:8080/api/v1/tasks \
-H "Content-Type: application/json" \
-d '{"description": "Analyze the error logs from the last hour", "priority": "high"}'
# Check task status
curl http://localhost:8080/api/v1/tasks/{task_id}

See what the operator-facing autonomy control plane is doing:
# List all active workflows
mister-smith autonomy list
# Detailed status for one workflow
mister-smith autonomy status --workflow-id <workflow_id>

# ChatGPT browser-based login
mister-smith auth openai-chatgpt login
mister-smith auth openai-chatgpt status
# Claude subscription status
mister-smith auth claude status

| Method | Endpoint | Description |
|---|---|---|
| `GET` | `/health/live` | Liveness probe (always 200) |
| `GET` | `/health/ready` | Readiness probe (503 during startup) |
| `GET` | `/metrics` | Prometheus metrics |
| `POST` | `/api/v1/tasks` | Submit a new task |
| `GET` | `/api/v1/tasks/{id}` | Get task status and results |
| `GET` | `/api/v1/agents` | List registered agents |
| `POST` | `/api/v1/sessions` | Start a durable session |
| `POST` | `/api/v1/sessions/{id}/turns` | Add a turn to a session |
| `GET` | `/api/v1/sessions/{id}` | Inspect session state |
| `POST` | `/api/v1/sessions/{id}/end` | End a session |
| `GET` | `/api/v1/events/ws` | WebSocket event stream |
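For callers that prefer code over `curl`, the following is a minimal, standard-library-only sketch of how the `POST /api/v1/tasks` request from the earlier example could be assembled. Only the `description` and `priority` fields come from this README; the function names are hypothetical, the JSON escaping is deliberately minimal, and the response schema is not documented here, so none is assumed.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the JSON body used by the `curl` example in this README.
pub fn task_body(description: &str, priority: &str) -> String {
    format!(
        "{{\"description\": \"{}\", \"priority\": \"{}\"}}",
        json_escape(description),
        json_escape(priority)
    )
}

/// Build a raw HTTP/1.1 request for task submission (hypothetical helper).
pub fn submit_task_request(host: &str, description: &str, priority: &str) -> String {
    let body = task_body(description, priority);
    format!(
        "POST /api/v1/tasks HTTP/1.1\r\nHost: {}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        host,
        body.len(), // Content-Length is measured in bytes
        body
    )
}
```

The returned string could be written to a `std::net::TcpStream` connected to the runtime's HTTP port. A real client would more likely use an HTTP library and a JSON serializer; this version only makes the wire format explicit.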
The multi-stage Dockerfile produces a distroless image under 100MB:
docker build -f deploy/Dockerfile -t mister-smith .

docker compose -f deploy/docker-compose.yml up -d

This starts PostgreSQL, NATS, the Mister Smith runtime, OpenTelemetry collector, and Grafana.
Manifests for Deployment, Service, ConfigMap, and Secrets are in deploy/kubernetes/.
A local macOS desktop app (Tauri + React) for visual operation:
- Boots the local PostgreSQL + NATS stack automatically
- Launches the Mister Smith runtime as a managed sidecar
- Real-time workflow timeline via WebSocket
- Task submission and session management
- NATS monitor integration
- Session and run-detail inspection for the current local stack
cd apps/operator-console
npm install && npm run tauri dev

                        ┌──────────────────────────────┐
                        │    CLI / HTTP API / gRPC     │
                        └──────────────┬───────────────┘
                                       │
                        ┌──────────────▼───────────────┐
                        │      Agent Orchestrator      │
                        │  (dynamic team composition)  │
                        └──────────────┬───────────────┘
                                       │
              ┌────────────────────────┼────────────────────────┐
              │                        │                        │
    ┌─────────▼──────────┐   ┌─────────▼──────────┐   ┌─────────▼──────────┐
    │ Supervision        │   │ LLM Router         │   │ Tool Bus           │
    │ (fault tolerance)  │   │ (cascade routing)  │   │ (function calling) │
    └─────────┬──────────┘   └─────────┬──────────┘   └─────────┬──────────┘
              │                        │                        │
    ┌─────────▼────────────────────────▼────────────────────────▼──────────┐
    │                         NATS Messaging Layer                         │
    │                 (pub/sub, JetStream, request-reply)                  │
    └─────────┬─────────────────────────────────────────────────┬──────────┘
              │                                                 │
    ┌─────────▼─────────┐                             ┌─────────▼─────────┐
    │    PostgreSQL     │                             │   JetStream KV    │
    │  (authoritative)  │                             │      (cache)      │
    └───────────────────┘                             └───────────────────┘
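The diagram labels PostgreSQL as authoritative and JetStream KV as a cache. One common way to honor that split is a read-through cache with invalidate-on-write, sketched below with two in-memory maps standing in for the real stores. This is an assumption about the pattern, not a description of how Mister Smith actually synchronizes the two layers, and `StateStore` with its methods is a hypothetical type.

```rust
use std::collections::HashMap;

/// Hypothetical two-tier store: `authoritative` stands in for PostgreSQL,
/// `cache` stands in for JetStream KV.
pub struct StateStore {
    authoritative: HashMap<String, String>,
    cache: HashMap<String, String>,
}

impl StateStore {
    pub fn new() -> Self {
        Self { authoritative: HashMap::new(), cache: HashMap::new() }
    }

    /// Writes land in the authoritative store; the cached copy is
    /// dropped so it can never serve a stale value.
    pub fn put(&mut self, key: &str, value: &str) {
        self.authoritative.insert(key.to_string(), value.to_string());
        self.cache.remove(key);
    }

    /// Reads try the cache first, then fall back to the authoritative
    /// store and repopulate the cache on a hit.
    pub fn get(&mut self, key: &str) -> Option<String> {
        if let Some(v) = self.cache.get(key) {
            return Some(v.clone());
        }
        let v = self.authoritative.get(key)?.clone();
        self.cache.insert(key.to_string(), v.clone());
        Some(v)
    }

    /// Simulate losing the cache tier entirely.
    pub fn drop_cache(&mut self) {
        self.cache.clear();
    }

    /// Report whether a key is currently served from the cache.
    pub fn is_cached(&self, key: &str) -> bool {
        self.cache.contains_key(key)
    }
}
```

The property worth preserving under this design is that `drop_cache` never changes what `get` returns: losing the cache costs latency, not state.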
20 crates organized by architectural layer:
| Layer | Crates |
|---|---|
| Foundation | mister-smith-core, mister-smith-config |
| Runtime | mister-smith-runtime, mister-smith-monitoring, mister-smith-events, mister-smith-async, mister-smith-resources |
| Actor System | mister-smith-actor, mister-smith-supervision |
| Transport | mister-smith-transport, mister-smith-nats, mister-smith-http, mister-smith-grpc, mister-smith-mcp |
| Security | mister-smith-security |
| Persistence | mister-smith-persistence |
| LLM | mister-smith-llm |
| Agents | mister-smith-agents |
| Application | mister-smith-app |
| Testing | mister-smith-integration-tests |
- Rust runtime and workspace architecture
- Tokio async execution
- NATS + JetStream messaging and bounded state projection
- Axum HTTP and websocket operator surfaces
- Tonic gRPC boundaries
- rmcp MCP support
- PostgreSQL + sqlx durable state
- OpenTelemetry, tracing, Prometheus, Grafana observability
# Build everything
cargo build --workspace
# Test a specific crate
cargo test -p mister-smith-agents
# Lint
cargo clippy --workspace -- -D warnings

This repository uses dual licensing under MIT or Apache-2.0.