mozilla-ai · dni138 · Apr 13, 2026 · Apr 20, 2026
diff --git a/plugins/cq/commands/reflect.md b/plugins/cq/commands/reflect.md
@@ -63,30 +63,72 @@ For each candidate, assign:
 
 If the session contained no events meeting the above criteria, skip Steps 3–5 and follow the "no candidates" instruction in Step 6.
 
+### Step 2.5 — Run the VIBE√ safety check on each candidate
+
+Apply the VIBE√ safety check as defined in the cq skill against every candidate from Step 2. Classify each candidate as clean, soft-concern, or hard-finding; for hard findings, generate the sanitized rewrite. Record the classification per candidate, because Steps 3 and 6 use these results for presentation and the final summary.
+
+`/cq:reflect` never drops candidates automatically; the user owns the final decision about what to submit.
+
 ### Step 3 — Present candidates to the user
 
 Open with:
 
 ```
-I identified {N} potential learning candidates from this session worth sharing with the commons.
+I identified {total} potential learning candidates from this session.
+{N_hard} have hard concerns and are shown with both the original and a sanitized rewrite — pick which (if either) to store.
+{N_soft} have soft concerns flagged with ⚠️ for your awareness.
+{N_clean} passed the VIBE√ check cleanly.
+```
+
+Present each candidate as a numbered entry. Use one of three templates depending on what Step 2.5 produced.
+
+**Clean candidate:**
+
+```
+{N}. {summary}
+   Domains: {domain tags}
+   Relevance: {estimated_relevance}
+   ---
+   {detail}
+   Action: {action}
 ```
 
-Present each candidate as a numbered entry:
+**Soft-concern candidate** (add the `⚠️` line above the divider):
 
 ```
 {N}. {summary}
    Domains: {domain tags}
    Relevance: {estimated_relevance}
+   ⚠️ {one-line concern}
    ---
    {detail}
    Action: {action}
 ```
 
+**Hard-finding candidate** (show both versions side by side, with the concern annotated):
+
+```
+{N}. {summary}
+   Domains: {domain tags}
+   Relevance: {estimated_relevance}
+   ⚠️ Hard concern: {one-line concern}
+   ---
+   Original:
+     {original detail}
+     Action: {original action}
+   Sanitized:
+     {rewritten detail}
+     Action: {rewritten action}
+```
+
+If the sanitized rewrite is not coherent (per the Step 2.5 fallback), substitute the Sanitized block with: `Sanitized: (no sanitized version possible — original would not generalize once stripped)`.
+
 After listing all candidates, ask:
 
 ```
 Reply with a number to approve, "skip {N}" to discard, or "edit {N}" to revise.
-You can also reply "all" to approve everything, or "none" to discard everything.
+For candidates with both an Original and a Sanitized version shown, use "{N} original" or "{N} sanitized" to choose which to store.
+You can also reply "all" to approve everything using the sanitized version where one is shown; if a candidate says `Sanitized: (no sanitized version possible — original would not generalize once stripped)`, "all" skips that candidate, and storing its original requires an explicit "{N} original". Reply "none" to discard everything.
 ```
 
 ### Step 4 — Handle edits
@@ -123,13 +165,24 @@ Stored: {id} — "{summary}"
 ## Session Reflect Complete
 
 {approved} of {total} candidates proposed to cq.
-{skipped} skipped.
+{skipped} skipped by user.
+
+VIBE√ findings this session:
+- Hard concerns (candidates {numbers}): {one-line concern per candidate}
+- Soft concerns (candidates {numbers}): {one-line concern per candidate}
 
 IDs stored this session:
-- {id}: "{summary}"
+- {id}: "{summary}" [{clean | soft | sanitized | original}]
 - ...
 ```
 
+The bracketed annotation on each stored ID records the VIBE√ provenance of what was stored:
+
+- `clean` — no VIBE√ findings; stored as identified.
+- `soft` — soft concern present; stored as-is after the user weighed the flag.
+- `sanitized` — hard finding; the user picked the sanitized rewrite.
+- `original` — hard finding; the user explicitly picked the unmodified version.
+
 If no candidates were identified, display:
 
 ```
@@ -141,3 +194,4 @@ No shareable learnings identified in this session. Sessions with debugging, work
 - **Empty session** — If the session contained only routine tasks, say so and stop after Step 2.
 - **All candidates skipped** — Display the summary with 0 proposed.
 - **`propose` error** — Report the error inline for that candidate and continue with the next one. Do not abort.
+- **No coherent sanitized rewrite possible** — Present the original with the empty-rewrite note from Step 2.5. The user can still choose to store the original as-is or skip; do not silently drop the candidate.
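The reply rules in Step 3 and the provenance labels in Step 6 of this file can be sketched as a small resolver. This is only an illustrative sketch: `Candidate`, `resolve_all`, and `resolve_choice` are hypothetical names, not part of cq, and treating a bare number on a hard-finding candidate as "ask again" is an assumption the prompt text does not spell out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    """Hypothetical record of one candidate after Step 2.5."""
    number: int
    tier: str                    # "clean", "soft", or "hard"
    has_sanitized: bool = False  # only meaningful when tier == "hard"


def resolve_all(candidates: List[Candidate]) -> Dict[int, str]:
    """Map candidate number to provenance label for an "all" reply."""
    stored: Dict[int, str] = {}
    for c in candidates:
        if c.tier == "hard":
            # "all" stores the sanitized version where one is shown and
            # skips hard findings that have no coherent rewrite.
            if c.has_sanitized:
                stored[c.number] = "sanitized"
        else:
            # Clean and soft candidates are stored as-is.
            stored[c.number] = c.tier
    return stored


def resolve_choice(c: Candidate, version: Optional[str] = None) -> Optional[str]:
    """Label for a reply such as "3", "3 original", or "3 sanitized".

    Returns None when the reply does not identify a storable version.
    """
    if c.tier != "hard":
        return c.tier if version is None else None
    if version == "original":
        return "original"
    if version == "sanitized" and c.has_sanitized:
        return "sanitized"
    # Assumption: a bare number on a hard finding is ambiguous, so ask again.
    return None
```

With four candidates (clean, soft, hard with a rewrite, hard without one), `resolve_all` stores the first three, and the fourth is stored only through an explicit `resolve_choice(candidate, "original")`.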
diff --git a/plugins/cq/skills/cq/SKILL.md b/plugins/cq/skills/cq/SKILL.md
@@ -136,6 +136,39 @@ Provide all three insight fields:
 - **detail** — Fuller explanation with enough context to understand the issue. Include a timestamp and source where possible.
 - **action** — Concrete instruction on what to do about it. Prefer principle + verification method over exact values.
 
+#### VIBE√ safety check
+
+Before calling `propose`, evaluate every candidate against four safety dimensions. This applies to all propose calls — those triggered by `/cq:reflect` and direct proposes made while working on a task.
+
+- **V — Vulnerabilities**: Does the candidate contain or reveal credentials, API keys, tokens, internal hostnames, IP addresses, file paths that disclose user identity, or any other secret? Does the action it recommends introduce a security risk if applied blindly (e.g. disabling auth checks, weakening TLS, executing untrusted input)?
+- **I — Impact**: If another agent applied this candidate verbatim in an unrelated codebase, what is the worst plausible outcome? Could it cause data loss, production incidents, or cascading failures?
+- **B — Biases**: Is the framing tied to a specific person, team, vendor, or commercial product in a way that isn't load-bearing for the lesson? Does it present one tool/approach as universally correct when the evidence supports only a narrow context?
+- **E — Edge cases**: Was the lesson learned from a single observation, or has it been validated across multiple cases? Are there obvious conditions (OS, version, scale, concurrency) under which it would not hold and that the candidate fails to acknowledge?
+
+Classify each finding into one of two tiers. Candidates are never dropped automatically — the user owns the final decision.
+
+**Hard findings** — produce a sanitized rewrite before calling `propose`:
+
+- Literal credentials, API keys, access tokens, private keys, or session cookies.
+- Personally identifying information: real names, email addresses, phone numbers, government IDs, physical addresses.
+- Internal-only identifiers that uniquely fingerprint a private system: non-public hostnames, internal service names, customer IDs, ticket numbers from private trackers.
+- Recommendations whose primary effect is to weaken security (disable auth, skip signature verification, suppress sandboxing) without a clearly scoped, defensive justification.
+
+Generate a single sanitized rewrite that removes or generalizes the violating content while preserving the underlying lesson. Apply that sanitization to every `propose` field that could carry the hard-finding content — at minimum `summary`, `detail`, and `action` — so no unchanged field leaks the original sensitive or unsafe content. If no coherent lesson survives sanitization, or if any required field cannot be rewritten coherently without retaining the violation, flag the candidate as having no coherent rewrite — the user can still choose to keep the original or skip.
+
+**Soft concerns** — proceed with the candidate, flag the concern to the user before calling `propose`:
+
+- Framing that overgeneralizes from a single observation.
+- Vendor- or product-specific advice presented as universal.
+- Missing acknowledgement of an edge case the session itself surfaced.
+- Wording that could read as biased toward a specific team, person, or commercial product.
+- Impact that the agent cannot fully predict (e.g. action mutates shared state).
+
+#### Applying VIBE√
+
+- **Direct `propose` calls** (outside `/cq:reflect`) — run the check on the single candidate. If a hard finding exists, present both the original and the sanitized rewrite to the user and let them pick (or skip). If only a soft concern exists, present the concern for awareness before proceeding.
+- **Batch proposals via `/cq:reflect`** — see the `/cq:reflect` command for the batch presentation UX (three templates, provenance annotation). The underlying V/I/B/E classification rules are the same.
+
 ### Confirming Knowledge (`confirm`)
 
 Call `confirm` when a knowledge unit retrieved from a query proved correct during your session. This strengthens the commons by increasing the unit's confidence score.
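The two-tier classification and the field-level sanitization fallback described in the VIBE√ section of this file can be sketched as follows. This is a rough sketch under stated assumptions: `Finding`, `classify`, and `complete_rewrite` are hypothetical names, and representing a field that could not be rewritten coherently as a missing or empty value is an illustrative choice, not something the skill defines.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

HARD, SOFT = "hard", "soft"
REQUIRED_FIELDS = ("summary", "detail", "action")


@dataclass
class Finding:
    """Hypothetical record of one VIBE√ finding on a candidate."""
    dimension: str  # "V", "I", "B", or "E"
    tier: str       # HARD or SOFT
    concern: str    # one-line concern shown to the user


def classify(findings: List[Finding]) -> str:
    """Roll per-finding tiers up into one classification per candidate."""
    if any(f.tier == HARD for f in findings):
        return "hard-finding"
    if findings:
        return "soft-concern"
    return "clean"


def complete_rewrite(rewritten: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Return the sanitized candidate, or None if no coherent rewrite exists.

    Every required propose field must have a sanitized value, so that no
    unchanged field carries the original content into the stored unit.
    """
    result: Dict[str, str] = {}
    for name in REQUIRED_FIELDS:
        value = rewritten.get(name, "").strip()
        if not value:
            # Assumption: an empty field means it could not be rewritten.
            return None
        result[name] = value
    return result
```

A `None` result corresponds to the "no coherent rewrite" flag: the candidate is still presented to the user, who can choose the original or skip.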

diff --git a/sdk/go/prompts/SKILL.md b/sdk/go/prompts/SKILL.md
@@ -136,6 +136,39 @@ Provide all three insight fields:
 - **detail** — Fuller explanation with enough context to understand the issue. Include a timestamp and source where possible.
 - **action** — Concrete instruction on what to do about it. Prefer principle + verification method over exact values.
 
+#### VIBE√ safety check
+
+Before calling `propose`, evaluate every candidate against four safety dimensions. This applies to all propose calls — those triggered by `/cq:reflect` and direct proposes made while working on a task.
+
+- **V — Vulnerabilities**: Does the candidate contain or reveal credentials, API keys, tokens, internal hostnames, IP addresses, file paths that disclose user identity, or any other secret? Does the action it recommends introduce a security risk if applied blindly (e.g. disabling auth checks, weakening TLS, executing untrusted input)?
+- **I — Impact**: If another agent applied this candidate verbatim in an unrelated codebase, what is the worst plausible outcome? Could it cause data loss, production incidents, or cascading failures?
+- **B — Biases**: Is the framing tied to a specific person, team, vendor, or commercial product in a way that isn't load-bearing for the lesson? Does it present one tool/approach as universally correct when the evidence supports only a narrow context?
+- **E — Edge cases**: Was the lesson learned from a single observation, or has it been validated across multiple cases? Are there obvious conditions (OS, version, scale, concurrency) under which it would not hold and that the candidate fails to acknowledge?
+
+Classify each finding into one of two tiers. Candidates are never dropped automatically — the user owns the final decision.
+
+**Hard findings** — produce a sanitized rewrite before calling `propose`:
+
+- Literal credentials, API keys, access tokens, private keys, or session cookies.
+- Personally identifying information: real names, email addresses, phone numbers, government IDs, physical addresses.
+- Internal-only identifiers that uniquely fingerprint a private system: non-public hostnames, internal service names, customer IDs, ticket numbers from private trackers.
+- Recommendations whose primary effect is to weaken security (disable auth, skip signature verification, suppress sandboxing) without a clearly scoped, defensive justification.
+
+Generate a single sanitized rewrite that removes or generalizes the violating content while preserving the underlying lesson. If no coherent lesson survives sanitization, flag the candidate as having no coherent rewrite — the user can still choose to keep the original or skip.
+
+**Soft concerns** — proceed with the candidate, flag the concern to the user before calling `propose`:
+
+- Framing that overgeneralizes from a single observation.
+- Vendor- or product-specific advice presented as universal.
+- Missing acknowledgement of an edge case the session itself surfaced.
+- Wording that could read as biased toward a specific team, person, or commercial product.
+- Impact that the agent cannot fully predict (e.g. action mutates shared state).
+
+#### Applying VIBE√
+
+- **Direct `propose` calls** (outside `/cq:reflect`) — run the check on the single candidate. If a hard finding exists, present both the original and the sanitized rewrite to the user and let them pick (or skip). If only a soft concern exists, present the concern for awareness before proceeding.
+- **Batch proposals via `/cq:reflect`** — see the `/cq:reflect` command for the batch presentation UX (three templates, provenance annotation). The underlying V/I/B/E classification rules are the same.
+
 ### Confirming Knowledge (`confirm`)
 
 Call `confirm` when a knowledge unit retrieved from a query proved correct during your session. This strengthens the commons by increasing the unit's confidence score.

diff --git a/sdk/go/prompts/reflect.md b/sdk/go/prompts/reflect.md
@@ -63,30 +63,72 @@ For each candidate, assign:
 
 If the session contained no events meeting the above criteria, skip Steps 3–5 and follow the "no candidates" instruction in Step 6.
 
+### Step 2.5 — Run the VIBE√ safety check on each candidate
+
+Apply the VIBE√ safety check as defined in the cq skill against every candidate from Step 2. Classify each candidate as clean, soft-concern, or hard-finding; for hard findings, generate the sanitized rewrite. Record the classification per candidate, because Steps 3 and 6 use these results for presentation and the final summary.
+
+`/cq:reflect` never drops candidates automatically; the user owns the final decision about what to submit.
+
 ### Step 3 — Present candidates to the user
 
 Open with:
 
 ```
-I identified {N} potential learning candidates from this session worth sharing with the commons.
+I identified {N_total} potential learning candidates from this session.
+{N_hard} have hard concerns and are shown with both the original and a sanitized rewrite — pick which (if either) to store.
+{N_soft} have soft concerns flagged with ⚠️ for your awareness.
+{N_clean} passed the VIBE√ check cleanly.
+```
+
+Present each candidate as a numbered entry. Use one of three templates depending on what Step 2.5 produced.
+
+**Clean candidate:**
+
+```
+{N}. {summary}
+   Domains: {domain tags}
+   Relevance: {estimated_relevance}
+   ---
+   {detail}
+   Action: {action}
 ```
 
-Present each candidate as a numbered entry:
+**Soft-concern candidate** (add the `⚠️` line above the divider):
 
 ```
 {N}. {summary}
    Domains: {domain tags}
    Relevance: {estimated_relevance}
+   ⚠️ {one-line concern}
    ---
    {detail}
    Action: {action}
 ```
 
+**Hard-finding candidate** (show both versions side by side, with the concern annotated):
+
+```
+{N}. {summary}
+   Domains: {domain tags}
+   Relevance: {estimated_relevance}
+   ⚠️ Hard concern: {one-line concern}
+   ---
+   Original:
+     {original detail}
+     Action: {original action}
+   Sanitized:
+     {rewritten detail}
+     Action: {rewritten action}
+```
+
+If the sanitized rewrite is not coherent (per the Step 2.5 fallback), substitute the Sanitized block with: `Sanitized: (no sanitized version possible — original would not generalize once stripped)`.
+
 After listing all candidates, ask:
 
 ```
 Reply with a number to approve, "skip {N}" to discard, or "edit {N}" to revise.
-You can also reply "all" to approve everything, or "none" to discard everything.
+For candidates with both an Original and a Sanitized version shown, use "{N} original" or "{N} sanitized" to choose which to store.
+You can also reply "all" to approve everything (sanitized version where applicable), or "none" to discard everything.
 ```
 
 ### Step 4 — Handle edits
@@ -123,13 +165,24 @@ Stored: {id} — "{summary}"
 ## Session Reflect Complete
 
 {approved} of {total} candidates proposed to cq.
-{skipped} skipped.
+{skipped} skipped by user.
+
+VIBE√ findings this session:
+- Hard concerns (candidates {numbers}): {one-line concern per candidate}
+- Soft concerns (candidates {numbers}): {one-line concern per candidate}
 
 IDs stored this session:
-- {id}: "{summary}"
+- {id}: "{summary}" [{clean | soft | sanitized | original}]
 - ...
 ```
 
+The bracketed annotation on each stored ID records the VIBE√ provenance of what was stored:
+
+- `clean` — no VIBE√ findings; stored as identified.
+- `soft` — soft concern present; stored as-is after the user weighed the flag.
+- `sanitized` — hard finding; the user picked the sanitized rewrite.
+- `original` — hard finding; the user explicitly picked the unmodified version.
+
 If no candidates were identified, display:
 
 ```
@@ -141,3 +194,4 @@ No shareable learnings identified in this session. Sessions with debugging, work
 - **Empty session** — If the session contained only routine tasks, say so and stop after Step 2.
 - **All candidates skipped** — Display the summary with 0 proposed.
 - **`propose` error** — Report the error inline for that candidate and continue with the next one. Do not abort.
+- **No coherent sanitized rewrite possible** — Present the original with the empty-rewrite note from Step 2.5. The user can still choose to keep the original locally or skip; do not silently drop the candidate.
diff --git a/sdk/python/src/cq/prompts/SKILL.md b/sdk/python/src/cq/prompts/SKILL.md
@@ -136,6 +136,39 @@ Provide all three insight fields:
 - **detail** — Fuller explanation with enough context to understand the issue. Include a timestamp and source where possible.
 - **action** — Concrete instruction on what to do about it. Prefer principle + verification method over exact values.
 
+#### VIBE√ safety check
+
+Before calling `propose`, evaluate every candidate against four safety dimensions. This applies to all propose calls — those triggered by `/cq:reflect` and direct proposes made while working on a task.
+
+- **V — Vulnerabilities**: Does the candidate contain or reveal credentials, API keys, tokens, internal hostnames, IP addresses, file paths that disclose user identity, or any other secret? Does the action it recommends introduce a security risk if applied blindly (e.g. disabling auth checks, weakening TLS, executing untrusted input)?
+- **I — Impact**: If another agent applied this candidate verbatim in an unrelated codebase, what is the worst plausible outcome? Could it cause data loss, production incidents, or cascading failures?
+- **B — Biases**: Is the framing tied to a specific person, team, vendor, or commercial product in a way that isn't load-bearing for the lesson? Does it present one tool/approach as universally correct when the evidence supports only a narrow context?
+- **E — Edge cases**: Was the lesson learned from a single observation, or has it been validated across multiple cases? Are there obvious conditions (OS, version, scale, concurrency) under which it would not hold and that the candidate fails to acknowledge?
+
+Classify each finding into one of two tiers. Candidates are never dropped automatically — the user owns the final decision.
+
+**Hard findings** — produce a sanitized rewrite before calling `propose`:
+
+- Literal credentials, API keys, access tokens, private keys, or session cookies.
+- Personally identifying information: real names, email addresses, phone numbers, government IDs, physical addresses.
+- Internal-only identifiers that uniquely fingerprint a private system: non-public hostnames, internal service names, customer IDs, ticket numbers from private trackers.
+- Recommendations whose primary effect is to weaken security (disable auth, skip signature verification, suppress sandboxing) without a clearly scoped, defensive justification.
+
+Generate a single sanitized rewrite that removes or generalizes the violating content while preserving the underlying lesson. If no coherent lesson survives sanitization, flag the candidate as having no coherent rewrite — the user can still choose to keep the original or skip.
+
+**Soft concerns** — proceed with the candidate, flag the concern to the user before calling `propose`:
+
+- Framing that overgeneralizes from a single observation.
+- Vendor- or product-specific advice presented as universal.
+- Missing acknowledgement of an edge case the session itself surfaced.
+- Wording that could read as biased toward a specific team, person, or commercial product.
+- Impact that the agent cannot fully predict (e.g. action mutates shared state).
+
+#### Applying VIBE√
+
+- **Direct `propose` calls** (outside `/cq:reflect`) — run the check on the single candidate. If a hard finding exists, present both the original and the sanitized rewrite to the user and let them pick (or skip). If only a soft concern exists, present the concern for awareness before proceeding.
+- **Batch proposals via `/cq:reflect`** — see the `/cq:reflect` command for the batch presentation UX (three templates, provenance annotation). The underlying V/I/B/E classification rules are the same.
+
 ### Confirming Knowledge (`confirm`)
 
 Call `confirm` when a knowledge unit retrieved from a query proved correct during your session. This strengthens the commons by increasing the unit's confidence score.