feat(#51): AI-powered model recommendations for cost optimization #73
Conversation
…API endpoints
- modelClassifier.ts: session complexity classifier (simple/moderate/complex) based on message length, tool usage patterns, and feature analysis
- modelRecommendations.ts: maps complexity to cheaper model alternatives with cost savings estimates
- routes.ts: GET /api/recommendations/:sessionId and /api/recommendations/summary

Implements backend for AI-powered model recommendations (issue #51)
…ge cases
- 23 tests covering classifier and recommendation mapper
- Edge cases: zero-cost sessions, empty models, no messages, unknown models
- Savings rounding to 4 decimal places
- Bulk recommendations filter completed sessions, respect limits
- Added vitest as test framework
- All tests passing, build clean
Prevents Express from matching 'summary' as a sessionId param.
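The route-ordering point can be sketched framework-free (a minimal model of Express's first-match semantics; the patterns and route names below are illustrative, not code from the PR):

```typescript
// Minimal model of first-match routing: the first registered pattern that
// matches the path wins, which is why the literal /summary route must be
// registered before the /:sessionId parameter route.
type Route = { pattern: RegExp; name: string };

const routes: Route[] = [
  { pattern: /^\/recommendations\/summary$/, name: "summary" },   // literal first
  { pattern: /^\/recommendations\/[^/]+$/, name: "sessionById" }, // param second
];

function matchRoute(path: string): string | undefined {
  return routes.find((r) => r.pattern.test(path))?.name;
}
```

With the registration order reversed, a request to `/recommendations/summary` would bind `sessionId = "summary"`, which is exactly the bug this commit avoids.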
Pull request overview
Adds backend support for AI-assisted “cheaper model” recommendations by classifying session complexity and estimating potential cost savings, exposed via new recommendations API endpoints.
Changes:
- Introduces heuristic session complexity classification (simple|moderate|complex).
- Adds model recommendation + savings estimation logic for single sessions and recent-session summaries.
- Exposes new API routes for per-session and summary recommendations, and adds vitest unit tests.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| backend/src/routes.ts | Adds /api/recommendations/summary and /api/recommendations/:sessionId endpoints. |
| backend/src/modelRecommendations.ts | Implements complexity→model mapping and savings estimation (single + bulk). |
| backend/src/modelClassifier.ts | Implements heuristic session complexity classifier (detail + summary). |
| backend/src/tests/modelRecommendations.test.ts | Adds unit tests for recommendation mapping and savings edge cases. |
| backend/src/tests/modelClassifier.test.ts | Adds unit tests for complexity classification heuristics. |
| backend/package.json | Adds vitest scripts and dev dependency. |
| backend/package-lock.json | Locks vitest and transitive dependencies. |
Files not reviewed (1)
- backend/package-lock.json: Language not supported
```ts
// Calculate potential savings
const currentCost = session.costUsd;
const estimatedNewCost = estimateSessionCost(session.tokenCount, recommendation.model);
const savings = Math.max(0, currentCost - estimatedNewCost);
const savingsPercentage = currentCost > 0 ? (savings / currentCost) * 100 : 0;
```
Savings are computed even when the recommendation doesn’t actually change the model (or changes only the label, e.g. anthropic/claude-haiku-3 → claude-haiku). Because estimateSessionCost is only a rough estimate, this can report non-zero “savings” for a no-op recommendation. Consider short-circuiting: if the recommended tier/normalized model is the same as the current tier, set savings to 0 (and percentage to 0) instead of estimating.
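One possible shape for that short-circuit, as a hedged sketch — `normalizeModelId` and `computeSavings` are hypothetical helpers, not functions from this PR:

```typescript
// Hypothetical normalization: drop a provider prefix
// ("anthropic/claude-haiku-3" -> "claude-haiku-3"), then a trailing integer
// version ("claude-haiku-3" -> "claude-haiku"), so label-only changes compare equal.
function normalizeModelId(model: string): string {
  const base = model.includes("/") ? model.slice(model.lastIndexOf("/") + 1) : model;
  return base.replace(/-\d+$/, "").toLowerCase();
}

// Sketch of the guard: treat a same-tier recommendation as a no-op
// instead of trusting the rough cost estimate.
function computeSavings(
  currentModel: string,
  recommendedModel: string,
  currentCost: number,
  estimatedNewCost: number
): { savings: number; savingsPercentage: number } {
  if (normalizeModelId(currentModel) === normalizeModelId(recommendedModel)) {
    return { savings: 0, savingsPercentage: 0 };
  }
  const savings = Math.max(0, currentCost - estimatedNewCost);
  const savingsPercentage = currentCost > 0 ? (savings / currentCost) * 100 : 0;
  return { savings, savingsPercentage };
}
```

The normalization rule here is only a rough heuristic; a real implementation would want an explicit tier table rather than string munging.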
```ts
const complexity = classifySessionSummary(session);
const recommendation = mapComplexityToModel(complexity, session.model);

const currentCost = session.costUsd;
const estimatedNewCost = estimateSessionCost(session.tokenCount, recommendation.model);
const savings = Math.max(0, currentCost - estimatedNewCost);
const savingsPercentage = currentCost > 0 ? (savings / currentCost) * 100 : 0;
```
In bulk mode, savings are estimated even when mapComplexityToModel effectively recommends the same model tier (e.g. already on Haiku / GPT-3.5, or default “no change”). Since estimateSessionCost is approximate, this can produce misleading positive savings for a no-op. Consider detecting “no real change” (e.g. comparing normalized model ids) and forcing savings/percentage to 0 in those cases.
```ts
router.get("/recommendations/summary", async (req: Request, res: Response) => {
  try {
    const profile = req.query.profile as string | undefined;
    const limit = parseInt(req.query.limit as string) || 20;
```
limit is parsed without a radix and without clamping/validation. Elsewhere in this file (e.g. the earlier /sessions listing) limit is parsed with base 10 and clamped to a safe range. Consider matching that pattern here (e.g. parseInt(..., 10) + min/max bounds) to avoid negative/NaN/overly large limits affecting behavior/perf.
Suggested change:

```diff
-    const limit = parseInt(req.query.limit as string) || 20;
+    const rawLimit = parseInt(req.query.limit as string, 10);
+    const limit =
+      !Number.isNaN(rawLimit) ? Math.max(1, Math.min(rawLimit, 100)) : 20;
```
```ts
import { SessionDetail, SessionSummary } from "./sessions.js";
import { classifySession, classifySessionSummary, SessionComplexity } from "./modelClassifier.js";
```
This relative import uses a .js extension ("./sessions.js"), but the source file is sessions.ts and the project is compiled as module: "commonjs" without NodeNext-style module resolution. This is likely to fail type-check/build with "Cannot find module". Use the same extensionless style used across the backend (e.g. "./sessions").
Suggested change:

```diff
-import { SessionDetail, SessionSummary } from "./sessions.js";
-import { classifySession, classifySessionSummary, SessionComplexity } from "./modelClassifier.js";
+import { SessionDetail, SessionSummary } from "./sessions";
+import { classifySession, classifySessionSummary, SessionComplexity } from "./modelClassifier";
```
```diff
@@ -0,0 +1,148 @@
+import { SessionDetail, SessionSummary } from "./sessions.js";
```
This relative import uses a .js extension ("./sessions.js"), but the codebase generally uses extensionless relative imports and the backend tsconfig is module: "commonjs" (not NodeNext). To avoid module resolution/build issues, import "./sessions" instead.
Suggested change:

```diff
-import { SessionDetail, SessionSummary } from "./sessions.js";
+import { SessionDetail, SessionSummary } from "./sessions";
```
```ts
    return {
      sessionId: session.id,
      title: session.title,
      recommendation: {
        currentModel: session.model,
        recommendedModel: recommendation.model,
        complexity,
        confidence: 0.6, // Lower confidence without full detail
        reasons: [recommendation.reason],
        potentialSavings: {
          costUsd: savings,
          percentage: savingsPercentage,
        },
      },
    };
  })
  .filter((r) => r.recommendation.potentialSavings.costUsd > 0.001); // Filter out negligible savings

const totalSavings = recommendations.reduce((sum, r) => sum + r.recommendation.potentialSavings.costUsd, 0);

return {
  totalSessions: recentSessions.length,
  potentialTotalSavings: totalSavings,
  recommendations,
```
Bulk recommendations return potentialSavings.costUsd and percentage without rounding, while the per-session endpoint rounds (4dp for cost / 1dp for percentage) and the PR description calls out rounding as an edge case. Consider rounding here as well (and rounding potentialTotalSavings) to keep the API consistent and avoid long floating-point decimals in the UI.
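A minimal sketch of what consistent rounding could look like — `roundTo` is a hypothetical helper, and the 4dp/1dp precisions mirror the per-session convention described above:

```typescript
// Round to a fixed number of decimal places to avoid long floating-point
// tails in the JSON payload (hypothetical helper, not from this PR).
function roundTo(value: number, decimals: number): number {
  const factor = 10 ** decimals;
  return Math.round(value * factor) / factor;
}

// Applied to the bulk payload fields before returning:
const costUsd = roundTo(0.01234567, 4);  // cost to 4 decimal places
const percentage = roundTo(37.846, 1);   // percentage to 1 decimal place
```

The same helper could round `potentialTotalSavings` so the summary and per-session endpoints agree.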
- Add BulkRecommendationSummary, SessionRecommendation types
- Add API functions: getRecommendationSummary, getSessionRecommendation
- Create RecommendationsTab component with summary card + session list
- Wire tab into dashboard page with fetch effect
- Responsive desktop/mobile layouts, complexity badges, savings display
- Shows total potential savings, avg per session, individual recommendations
- SessionClient: stack stats on mobile, full-width messages, responsive timeline dots
- ProjectClient: hide dividers on mobile, stack stats, truncate model names, responsive timeline
- Both: sm: breakpoints, break-words for long text, min-h touch targets on buttons
- Pattern matches dashboard tabs (mobile cards, desktop layouts)
Frontend review ✅ Reviewed the UI integration:
Ready for merge from the frontend side.
…ndary thresholds, GPT/Gemini paths, savings precision
- Classifier: boundary thresholds (100/300 char, 0.3/1.0 tool ratio, 20/21 msg count, 30/31 min duration)
- Classifier: case-sensitive tool names (Read/Write/Edit), null content handling, confidence caps
- Recommendations: GPT-4 → gpt-3.5/gpt-4-turbo paths, Gemini pro → flash paths
- Recommendations: cheap model upgrade hints for complex tasks (haiku, gpt-3.5)
- Bulk: negligible savings filter, mixed model families, idle sessions, confidence level
- Savings: non-negative guarantee, percentage validation for known cost reductions

Coverage: 23 → 88 tests (4 test files)
GalDayan left a comment
🎨 Frontend Review — PR #73 (Model Recommendations)
Reviewed all 6 frontend files. Overall: solid work, merge-ready. A few notes:
✅ What's Good
- RecommendationsTab.tsx — Clean component structure. Skeleton loading states are well done (matching existing patterns). Empty state handles both zero-sessions and all-optimal cases. Nice touch.
- Responsive layouts — The dual desktop/mobile layout in RecommendationsTab is good UX. The responsive fixes in ProjectClient and SessionClient (flex-col on mobile, hidden dividers, break-words) are exactly the kind of polish we need.
- Types — `types.ts` additions are well-typed. `SessionComplexity` as a union type is clean. `BulkRecommendationSummary` shape matches the API contract.
- API layer — `api.ts` functions follow existing patterns (fetchJson, URLSearchParams). Good.
- Tab integration — The `useEffect` with cancellation token for recommendations fetch is correct. Only fetches when tab is active — no wasted calls.
🟡 Minor Observations (non-blocking)
- `formatModel()` could live in a shared util — It's useful outside RecommendationsTab (e.g., session detail views). Not urgent, but worth extracting later.
- Confidence dot — The inline `style={{ backgroundColor: ... }}` with ternary chains works but could be a `complexityConfig`-style lookup for consistency. Fine for now.
- `overflow-x-auto` on tab bar — Smart addition for the 5th tab. Good catch preventing horizontal overflow on narrow viewports.
- `getSessionRecommendation()` in api.ts — Exported but not used anywhere in this PR. Presumably for the per-session view later? No issue, just noting it.
- Average percentage calculation — Line ~140: `Math.round()` on the reduce result — the division by `recommendations.length` is outside `Math.round()`. Works correctly, just dense. A variable would improve readability.
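That readability suggestion could look like the following sketch (the `Rec` type is a trimmed stand-in for this PR's recommendation entries; the function name is hypothetical):

```typescript
// Trimmed stand-in for the recommendation entry shape used in this PR.
type Rec = { recommendation: { potentialSavings: { percentage: number } } };

// Name the intermediate sum instead of nesting the reduce inside Math.round().
function averageSavingsPercentage(recommendations: Rec[]): number {
  if (recommendations.length === 0) return 0;
  const totalPercentage = recommendations.reduce(
    (sum, r) => sum + r.recommendation.potentialSavings.percentage,
    0
  );
  return Math.round(totalPercentage / recommendations.length);
}
```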
🟢 Verdict
Ship it. Clean component, good responsive handling, proper TypeScript, follows existing patterns. The responsive fixes in ProjectClient/SessionClient are a nice bonus — those pages needed the mobile love.
— Anas 🎨
Summary
Implements the backend for AI-powered model recommendations (closes #51).
What's included
3-layer architecture:

- Classifier (`modelClassifier.ts`) — Analyzes session complexity (simple/moderate/complex) using heuristics
- Mapper (`modelRecommendations.ts`) — Maps complexity to cheaper model alternatives
- API Endpoints (`routes.ts`):
  - `GET /api/recommendations/:sessionId` — Per-session recommendation
  - `GET /api/recommendations/summary` — Bulk savings across all agents (the money view)

Edge cases handled
- Route ordering: `/summary` registered before the param route `/:sessionId`

Tests
- All passing; build clean under `tsc --noEmit`

What's next
cc @GalDayan