Add VIBE√ safety check to /cq:reflect (#240) #270
base: main
Changes from all commits
c476eb9
3723214
88b5ff3
3a40a79
45df56c
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -63,30 +63,72 @@ For each candidate, assign: | |||||
|
|
||||||
| If the session contained no events meeting the above criteria, skip Steps 3–5 and follow the "no candidates" instruction in Step 6. | ||||||
|
|
||||||
| ### Step 2.5 — Run the VIBE√ safety check on each candidate | ||||||
|
|
||||||
| Apply the VIBE√ safety check as defined in the cq skill against every candidate from Step 2. Classify each finding as clean, soft-concern, or hard-finding; for hard findings, generate the sanitized rewrite. Record the classification per candidate — Steps 3 and 6 use these results for presentation and the final summary. | ||||||
|
|
||||||
| `/cq:reflect` never drops candidates automatically; the user owns the final decision about what to submit. | ||||||
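As an illustration of the per-candidate record Step 2.5 asks for, the sketch below shows one way the classification could be held and tallied. The `Tier`, `Candidate`, and `opener_counts` names are hypothetical and not part of the cq skill; the count keys mirror the `{N_total}`, `{N_hard}`, `{N_soft}`, and `{N_clean}` placeholders used in the Step 3 opener.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    """The three classifications named in Step 2.5."""
    CLEAN = "clean"
    SOFT = "soft-concern"
    HARD = "hard-finding"


@dataclass
class Candidate:
    """Hypothetical record for one candidate; field names are assumptions."""
    summary: str
    detail: str
    action: str
    tier: Tier = Tier.CLEAN
    concern: Optional[str] = None            # one-line concern, if any
    sanitized_detail: Optional[str] = None   # set only for hard findings
    sanitized_action: Optional[str] = None   # set only for hard findings


def opener_counts(candidates):
    """Tally candidates per tier. Nothing is filtered out here:
    the user decides what to submit."""
    tally = Counter(c.tier for c in candidates)
    return {
        "N_total": len(candidates),
        "N_hard": tally[Tier.HARD],
        "N_soft": tally[Tier.SOFT],
        "N_clean": tally[Tier.CLEAN],
    }
```

Keeping the original and sanitized fields side by side on the same record is one way to let later steps show both versions without re-running the check.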
|
|
||||||
| ### Step 3 — Present candidates to the user | ||||||
|
|
||||||
| Open with: | ||||||
|
|
||||||
| ``` | ||||||
| I identified {N} potential learning candidates from this session worth sharing with the commons. | ||||||
| I identified {N_total} potential learning candidates from this session. | ||||||
| {N_hard} have hard concerns and are shown with both the original and a sanitized rewrite — pick which (if either) to store. | ||||||
| {N_soft} have soft concerns flagged with ⚠️ for your awareness. | ||||||
| {N_clean} passed the VIBE√ check cleanly. | ||||||
| ``` | ||||||
|
|
||||||
| Present each candidate as a numbered entry. Use one of three templates depending on what Step 2.5 produced. | ||||||
|
|
||||||
| **Clean candidate:** | ||||||
|
|
||||||
| ``` | ||||||
| {N}. {summary} | ||||||
| Domains: {domain tags} | ||||||
| Relevance: {estimated_relevance} | ||||||
| --- | ||||||
| {detail} | ||||||
| Action: {action} | ||||||
| ``` | ||||||
|
|
||||||
| Present each candidate as a numbered entry: | ||||||
| **Soft-concern candidate** (add the `⚠️` line above the divider): | ||||||
|
|
||||||
| ``` | ||||||
| {N}. {summary} | ||||||
| Domains: {domain tags} | ||||||
| Relevance: {estimated_relevance} | ||||||
| ⚠️ {one-line concern} | ||||||
| --- | ||||||
| {detail} | ||||||
| Action: {action} | ||||||
| ``` | ||||||
|
|
||||||
| **Hard-finding candidate** (show both versions side by side, with the concern annotated): | ||||||
|
|
||||||
| ``` | ||||||
| {N}. {summary} | ||||||
| Domains: {domain tags} | ||||||
| Relevance: {estimated_relevance} | ||||||
| ⚠️ Hard concern: {one-line concern} | ||||||
| --- | ||||||
| Original: | ||||||
| {original detail} | ||||||
| Action: {original action} | ||||||
| Sanitized: | ||||||
| {rewritten detail} | ||||||
| Action: {rewritten action} | ||||||
|
|
||||||
| ``` | ||||||
|
|
||||||
| If the sanitized rewrite is not coherent (per the Step 2.5 fallback), substitute the Sanitized block with: `Sanitized: (no sanitized version possible — original would not generalize once stripped)`. | ||||||
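The three templates above, plus the no-rewrite fallback, could be rendered roughly as follows. This is a sketch under assumptions: `render_candidate` and its parameters are hypothetical, tier names are passed as plain strings, domain tags are joined with commas, and indentation inside the templates is not reproduced.

```python
# Fallback line quoted from the instruction above.
NO_REWRITE_NOTE = (
    "Sanitized: (no sanitized version possible — original would not "
    "generalize once stripped)"
)


def render_candidate(n, summary, domains, relevance, detail, action,
                     tier="clean", concern=None,
                     sanitized_detail=None, sanitized_action=None):
    """Return the text block for one numbered candidate."""
    lines = [
        f"{n}. {summary}",
        f"Domains: {', '.join(domains)}",
        f"Relevance: {relevance}",
    ]
    # The concern line sits above the divider for soft and hard tiers.
    if tier == "soft-concern":
        lines.append(f"⚠️ {concern}")
    elif tier == "hard-finding":
        lines.append(f"⚠️ Hard concern: {concern}")
    lines.append("---")
    if tier == "hard-finding":
        lines += ["Original:", detail, f"Action: {action}"]
        if sanitized_detail is None or sanitized_action is None:
            # No coherent rewrite: still show the original, never drop it.
            lines.append(NO_REWRITE_NOTE)
        else:
            lines += ["Sanitized:", sanitized_detail,
                      f"Action: {sanitized_action}"]
    else:
        lines += [detail, f"Action: {action}"]
    return "\n".join(lines)
```

Treating a missing sanitized field as "no coherent rewrite" is a design choice of this sketch; it keeps the hard-finding entry visible to the user in every case.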
|
|
||||||
| After listing all candidates, ask: | ||||||
|
|
||||||
| ``` | ||||||
| Reply with a number to approve, "skip {N}" to discard, or "edit {N}" to revise. | ||||||
| You can also reply "all" to approve everything, or "none" to discard everything. | ||||||
| For candidates with both an Original and a Sanitized version shown, use "{N} original" or "{N} sanitized" to choose which to store. | ||||||
| You can also reply "all" to approve everything (sanitized version where applicable), or "none" to discard everything. | ||||||
Suggested change:
| You can also reply "all" to approve everything (sanitized version where applicable), or "none" to discard everything. | |
| You can also reply "all" to approve everything using the sanitized version where one is shown; if a candidate says `Sanitized: (no sanitized version possible — original would not generalize once stripped)`, "all" skips that candidate, and storing its original requires an explicit "{N} original". Reply "none" to discard everything. |
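To make the reply rules concrete, here is a minimal sketch of how a reply could be resolved under the wording suggested above, where "all" stores sanitized versions and skips candidates that have no coherent rewrite. The `resolve_reply` function, the `kinds` mapping, and the decision labels are all hypothetical. Rejecting a bare number for a candidate that shows two versions is this sketch's assumption; the prompt text does not say how that case is handled.

```python
import re


def resolve_reply(reply, kinds):
    """Map a user reply to per-candidate decisions.

    kinds maps candidate number -> "plain" (one version shown),
    "rewritten" (original plus sanitized), or "no_rewrite"
    (hard finding with no coherent sanitized version).
    """
    text = reply.strip().lower()
    if text == "none":
        return {n: "skip" for n in kinds}
    if text == "all":
        defaults = {
            "plain": "store",
            "rewritten": "store_sanitized",
            "no_rewrite": "skip",  # original needs an explicit "{N} original"
        }
        return {n: defaults[kind] for n, kind in kinds.items()}

    m = re.fullmatch(r"(skip|edit)\s+(\d+)", text)
    if m and int(m.group(2)) in kinds:
        return {int(m.group(2)): m.group(1)}

    m = re.fullmatch(r"(\d+)(?:\s+(original|sanitized))?", text)
    if m and int(m.group(1)) in kinds:
        n, version = int(m.group(1)), m.group(2)
        kind = kinds[n]
        if version is None and kind == "plain":
            return {n: "store"}
        if version == "original" and kind != "plain":
            return {n: "store_original"}
        if version == "sanitized" and kind == "rewritten":
            return {n: "store_sanitized"}

    # Anything else is ambiguous; the caller should ask the user again.
    raise ValueError(f"cannot interpret reply: {reply!r}")
```

Raising on ambiguous input, rather than guessing, keeps the decision with the user, in line with the rule that candidates are never dropped or stored silently.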
Copilot AI, Apr 20, 2026:
This edge-case text says the user can "keep the original locally", but /cq:reflect’s flow only offers approve/skip/edit and then calls propose. Consider rewording to match the actual actions available here (e.g., "store the original" vs "skip") so users aren’t told about an option the command doesn’t provide.
Suggested change:
| - **No coherent sanitized rewrite possible** — Present the original with the empty-rewrite note from Step 2.5. The user can still choose to keep the original locally or skip; do not silently drop the candidate. | |
| - **No coherent sanitized rewrite possible** — Present the original with the empty-rewrite note from Step 2.5. The user can still choose to store the original as-is or skip; do not silently drop the candidate. |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -136,6 +136,39 @@ Provide all three insight fields: | |||||
| - **detail** — Fuller explanation with enough context to understand the issue. Include a timestamp and source where possible. | ||||||
| - **action** — Concrete instruction on what to do about it. Prefer principle + verification method over exact values. | ||||||
|
|
||||||
| #### VIBE√ safety check | ||||||
|
|
||||||
| Before calling `propose`, evaluate every candidate against four safety dimensions. This applies to all propose calls — those triggered by `/cq:reflect` and direct proposes made while working on a task. | ||||||
|
|
||||||
| - **V — Vulnerabilities**: Does the candidate contain or reveal credentials, API keys, tokens, internal hostnames, IP addresses, file paths that disclose user identity, or any other secret? Does the action it recommends introduce a security risk if applied blindly (e.g. disabling auth checks, weakening TLS, executing untrusted input)? | ||||||
| - **I — Impact**: If another agent applied this candidate verbatim in an unrelated codebase, what is the worst plausible outcome? Could it cause data loss, production incidents, or cascading failures? | ||||||
| - **B — Biases**: Is the framing tied to a specific person, team, vendor, or commercial product in a way that isn't load-bearing for the lesson? Does it present one tool/approach as universally correct when the evidence supports only a narrow context? | ||||||
| - **E — Edge cases**: Was the lesson learned from a single observation, or has it been validated across multiple cases? Are there obvious conditions (OS, version, scale, concurrency) under which it would not hold and that the candidate fails to acknowledge? | ||||||
|
|
||||||
| Classify each finding into one of two tiers. Candidates are never dropped automatically — the user owns the final decision. | ||||||
|
|
||||||
| **Hard findings** — produce a sanitized rewrite before calling `propose`: | ||||||
|
|
||||||
| - Literal credentials, API keys, access tokens, private keys, or session cookies. | ||||||
| - Personally identifying information: real names, email addresses, phone numbers, government IDs, physical addresses. | ||||||
| - Internal-only identifiers that uniquely fingerprint a private system: non-public hostnames, internal service names, customer IDs, ticket numbers from private trackers. | ||||||
| - Recommendations whose primary effect is to weaken security (disable auth, skip signature verification, suppress sandboxing) without a clearly scoped, defensive justification. | ||||||
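Some of the hard-finding categories above lend themselves to a mechanical pre-screen. The sketch below is illustrative only: the skill describes a judgment-based check, and nothing in the diff says it uses regular expressions. The `HARD_PATTERNS` table and `hard_findings` helper are hypothetical, the three patterns are examples rather than a complete list, and a match means "look closer", not "confirmed secret".

```python
import re

# Illustrative patterns only; real secrets come in many more shapes.
HARD_PATTERNS = {
    "private key": re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----"),
    "AWS-style access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "email address": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def hard_findings(*fields):
    """Return labels of patterns found in any of the given text fields.

    Pass every field bound for propose (summary, detail, action) so that
    no single field is left unchecked.
    """
    found = []
    for label, pattern in HARD_PATTERNS.items():
        if any(pattern.search(field) for field in fields):
            found.append(label)
    return found
```

A pattern screen like this cannot judge the fourth category (recommendations that weaken security), and it will miss internal hostnames or ticket numbers, so it can only complement the agent's own review.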
|
|
||||||
| Generate a single sanitized rewrite that removes or generalizes the violating content while preserving the underlying lesson. If no coherent lesson survives sanitization, flag the candidate as having no coherent rewrite — the user can still choose to keep the original or skip. | ||||||
Suggested change:
| Generate a single sanitized rewrite that removes or generalizes the violating content while preserving the underlying lesson. If no coherent lesson survives sanitization, flag the candidate as having no coherent rewrite — the user can still choose to keep the original or skip. | |
| Generate a single sanitized rewrite that removes or generalizes the violating content while preserving the underlying lesson. Apply that sanitization to every `propose` field that could carry the hard-finding content — at minimum `summary`, `detail`, and `action` — so no unchanged field leaks the original sensitive or unsafe content. If no coherent lesson survives sanitization, or if any required field cannot be rewritten coherently without retaining the violation, flag the candidate as having no coherent rewrite — the user can still choose to keep the original or skip. |
Placeholder naming is inconsistent: the Step 3 opener uses {N_total} while the final summary later uses {total}. Using one consistent placeholder name throughout will reduce prompt ambiguity for the agent when following these templates.