docs(open-source): add telemetry docs link to OSS telemetry FAQ #2842

Open
marcklingen wants to merge 1 commit into main from codex/add-telemetry-documentation-link-to-faq-answer

Member

@marcklingen marcklingen commented Apr 22, 2026

Motivation

  • Make the Open Source FAQ clearer about what telemetry the OSS build sends and provide an explicit link and instructions for disabling it.

Description

  • Update content/handbook/chapters/open-source.mdx to rephrase the OSS telemetry answer, mention the TELEMETRY_ENABLED=false opt-out, and add a link to the telemetry docs at /self-hosting/security/telemetry.

Testing

  • Ran pnpm exec prettier --check content/handbook/chapters/open-source.mdx and the check passed.

Codex Task

Disclaimer: Experimental PR review

Greptile Summary

This PR updates the OSS FAQ answer about telemetry to be more precise: it replaces "aggregated usage analytics" with "aggregated usage telemetry", rephrases the purpose, adds the explicit opt-out env var TELEMETRY_ENABLED=false, and links to the telemetry docs page. The linked page exists in the repo, so the internal link is valid.

Confidence Score: 5/5

Safe to merge — purely a documentation clarification with a valid internal link

Single-line prose change in a docs file; no code logic affected; linked telemetry page is confirmed to exist in the repo; no P0/P1 findings

No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| content/handbook/chapters/open-source.mdx | One-line documentation update: clarifies telemetry wording, adds TELEMETRY_ENABLED=false opt-out, and links to /self-hosting/security/telemetry (which exists in the repo) |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[OSS Self-hosted Langfuse] -->|default| B[Sends aggregated usage telemetry - no personal or trace data]
    A -->|TELEMETRY_ENABLED=false| C[Telemetry disabled]
    B --> D[Telemetry docs page]
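
The branching in the flowchart can be sketched in code. This is a minimal illustration, not Langfuse's actual implementation: only the variable name `TELEMETRY_ENABLED` and the opt-out value `false` come from this PR, while the helper name `isTelemetryEnabled` and the parsing rules (trimming, case-insensitivity) are assumptions.

```typescript
// Hypothetical helper illustrating the opt-out semantics described in the FAQ.
// Only TELEMETRY_ENABLED and the value "false" come from the PR; the function
// name and the exact parsing rules are assumptions made for this sketch.
type Env = Record<string, string | undefined>;

function isTelemetryEnabled(env: Env): boolean {
  const raw = env["TELEMETRY_ENABLED"];
  // Unset variable: the default applies and telemetry is sent.
  if (raw === undefined) return true;
  // Assumption: only the literal "false" (trimmed, case-insensitive) opts out.
  return raw.trim().toLowerCase() !== "false";
}

console.log(isTelemetryEnabled({})); // true (default branch)
console.log(isTelemetryEnabled({ TELEMETRY_ENABLED: "false" })); // false (disabled branch)
```

How values other than `false` are treated is not stated in the PR, so the telemetry docs remain the reference for the real behavior.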

Reviews (1): Last reviewed commit: "docs(open-source): add telemetry docs li..."

@vercel

vercel Bot commented Apr 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| langfuse-docs | Ready | Preview, Comment | Apr 22, 2026 6:08am |


@dosubot dosubot Bot added the size:XS label (This PR changes 0-9 lines, ignoring generated files.) Apr 22, 2026
@github-actions

@claude review

@dosubot dosubot Bot added the documentation label (Improvements or additions to documentation) Apr 22, 2026
@marcklingen marcklingen enabled auto-merge April 22, 2026 06:06
@dosubot dosubot Bot added the auto-merge label (This PR is set to be merged) Apr 22, 2026
<summary>Does the OSS build phone home? Can I disable telemetry?</summary>

- By default, Langfuse reports **aggregated usage analytics** (no personal data) so we know which features matter. **No traces, prompts, or customer data leave your cluster.** You can opt out of this.
+ By default, Langfuse OSS sends **aggregated usage telemetry** (no personal data) so we can improve the product. **No traces, prompts, or customer data leave your cluster.** You can disable it via `TELEMETRY_ENABLED=false` (details in the [telemetry docs](/self-hosting/security/telemetry)).

🟡 The updated FAQ text keeps the claim '(no personal data)' while now linking directly to the telemetry docs, which document that userDomains — 'Up to 30 email domains from the local users table with user counts' — are sent with each telemetry event. This creates a visible inconsistency for any reader who clicks through the new link. Consider narrowing the claim to 'no raw personal data' or 'no individual user data', or adding a brief qualifier acknowledging that aggregate email-domain counts are included.

Extended reasoning...

What the bug is and how it manifests

The FAQ answer at line 109 of content/handbook/chapters/open-source.mdx states that Langfuse OSS sends "aggregated usage telemetry" with the parenthetical qualifier "(no personal data)". This PR adds a direct link to /self-hosting/security/telemetry. A reader who clicks that link will immediately see, in the very first row of the telemetry data table, that userDomains — 'Up to 30 email domains from the local users table with user counts' — is collected per telemetry event. The contradiction between "no personal data" and sending email domains (even aggregated) is surfaced directly by this PR.

The specific code path that triggers it

The single-line change adds (details in the [telemetry docs](/self-hosting/security/telemetry)) to the FAQ answer. The telemetry docs table lists userDomains as the very first field, making it impossible to miss. The inconsistency predates this PR but was not directly exposed; after this change, every FAQ reader who follows the link will encounter it immediately.

Why existing language doesn't prevent it

The telemetry docs themselves note: 'Raw email addresses are not sent as they are considered PII', which signals deliberate intent to distinguish email addresses (not sent) from email domains (sent). One verifier refuted the bug on this basis — arguing both pages consistently classify domains as non-PII and addresses as PII. That reasoning has merit: the authors made a deliberate engineering choice, and strictly under GDPR, email domains (e.g. 'company.com') identify organizations rather than natural persons. However, the FAQ's blanket "no personal data" gives no hint of this nuance. The qualifier exists in the telemetry docs, not in the FAQ, so a reader who only reads the FAQ gets an oversimplified picture, and a reader who follows the new link sees the docs' own hedging about PII right next to the userDomains entry, making the FAQ claim feel imprecise.
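
To make the address-versus-domain distinction concrete, here is a rough sketch of what "up to 30 email domains with user counts" could look like. It is illustrative only and not taken from the Langfuse codebase: the cap of 30 is from the quoted docs, while the function name `summarizeUserDomains`, the sort order, and the output shape are assumptions.

```typescript
// Hypothetical sketch of an aggregate like userDomains.
// The cap of 30 comes from the quoted telemetry docs; the function name,
// ordering, and output shape are assumptions made for illustration.
interface DomainCount {
  domain: string;
  count: number;
}

function summarizeUserDomains(emails: string[], limit = 30): DomainCount[] {
  const counts = new Map<string, number>();
  for (const email of emails) {
    const at = email.lastIndexOf("@");
    // Skip malformed addresses that have no domain part.
    if (at < 0 || at === email.length - 1) continue;
    // Keep only the domain; the local part (the individual user) is dropped.
    const domain = email.slice(at + 1).toLowerCase();
    counts.set(domain, (counts.get(domain) ?? 0) + 1);
  }
  return Array.from(counts, ([domain, count]) => ({ domain, count }))
    .sort((a, b) => b.count - a.count || a.domain.localeCompare(b.domain))
    .slice(0, limit);
}

console.log(summarizeUserDomains(["ana@example.com", "bo@example.com", "cy@tiny.example"]));
// [ { domain: "example.com", count: 2 }, { domain: "tiny.example", count: 1 } ]
```

The single-user `tiny.example` entry shows why a domain count can be quasi-identifying for a very small organization even though no raw address is included.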

Impact

This is a documentation accuracy / precision issue, not a data-leakage bug. No personal data is lost or leaked. The impact is reputational and trust-related: self-hosting users evaluating Langfuse's privacy posture may notice the mismatch and lose confidence, or may file issues / open threads. For organizations with strict interpretations of privacy law, email domains for small companies can be quasi-identifying.

How to fix it

The minimal fix is a wording adjustment in the FAQ, e.g.:

  • "(no raw personal data)" — aligns with the telemetry docs' own PII framing
  • "(no individual user data)" — accurately describes what is omitted
  • "(no traces or personal data; aggregate email-domain counts are included — see telemetry docs)" — most transparent option

Step-by-step proof

  1. User reads the FAQ: 'By default, Langfuse OSS sends aggregated usage telemetry (no personal data)…'
  2. User clicks the new link to /self-hosting/security/telemetry.
  3. User sees the first row of the telemetry table: Field userDomains, Description 'Up to 30 email domains from the local users table with user counts'.
  4. User also sees the note: 'Raw email addresses are not sent as they are considered PII'.
  5. User asks: 'If email domains are not personal data, why does the page feel the need to say raw addresses are PII?' The FAQ's blanket "no personal data" does not prepare them for this nuance, creating a trust gap.

