[Draft] inworld tts auto mode by ianbbqzy · Pull Request #1008 · livekit/agents-js

ianbbqzy · 2026-01-29T22:14:22Z

auto_mode will be added to the config params in a separate PR, once the word tokenizer and user-controlled manual flushes are supported. For now, auto_mode is always enabled and should improve the quality and naturalness of agent responses.

Description

Changes Made

Pre-Review Checklist

Build passes: All builds (lint, typecheck, tests) pass locally
AI-generated code reviewed: Removed unnecessary comments and ensured code quality
Changes explained: All changes are properly documented and justified above
Scope appropriate: All changes relate to the PR title, or explanations provided for why they're included
Video demo: A small video demo showing that the changes work as expected and do not break any existing functionality, using Agent Playground (if applicable)

Testing

Automated tests added/updated (if applicable)
All tests pass
Make sure both restaurant_agent.ts and realtime_agent.ts work properly (for major changes)

Additional Notes

Note to reviewers: Please ensure the pre-review checklist is completed before starting your review.

Summary by CodeRabbit

Improvements
- Enhanced text-to-speech with automatic streaming for more responsive audio synthesis.
- Improved timing and alignment of words/characters in streamed audio for smoother, monotonic playback.
- Better handling of stream completion to ensure continuous, correctly-timed audio output.

changeset-bot · 2026-01-29T22:14:26Z

⚠️ No Changeset found

Latest commit: 2baf4a1

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


CLAassistant · 2026-01-29T22:14:30Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

Ian Lee seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.

coderabbitai · 2026-01-29T22:14:41Z

📝 Walkthrough

Added timestamp-cumulative handling and flush semantics to TTS synthesis stream; introduced flushCompleted?: boolean on InworldResult and autoMode?: boolean on CreateContextConfig; context creation now forces autoMode: true. Adjusted alignment timestamp offsets and generation end tracking in the stream implementation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| TTS implementation & types `plugins/inworld/src/tts.ts` | Added `autoMode?: boolean` to `CreateContextConfig` and `flushCompleted?: boolean` to `InworldResult`. Implemented cumulative timestamp tracking (`#cumulativeTime`, `#generationEndTime`) in `SynthesizeStream`, applied cumulative offsets to word/char alignments, reset cumulative time on `flushCompleted`, and always set `autoMode: true` in context creation. Added explanatory comments about monotonic timestamps and autoMode rationale. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop through timestamps, neat and spry,
Offsets stacked so words don't lie,
A tiny flag — autoMode true,
Flushes tidy, streaming through,
Carrots sync up — audio by-by-by 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description contains only a brief note about auto_mode without filling out the required template sections like 'Description', 'Changes Made', 'Testing', and most checklist items remain unchecked. | Complete the PR description by providing a clear description of changes, listing specific modifications made (interface additions, timestamp tracking logic), and documenting testing approach and checklist completion status. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'inworld tts auto mode' is directly related to the main change in the changeset, which adds auto mode functionality to the Inworld TTS system for enhancing response quality. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52d155b and 2baf4a1.

📒 Files selected for processing (1)

plugins/inworld/src/tts.ts

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

plugins/inworld/src/tts.ts

**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

plugins/inworld/src/tts.ts

**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

plugins/inworld/src/tts.ts

🧠 Learnings (3)

📓 Common learnings

Learnt from: cshape
Repo: livekit/agents-js PR: 1008
File: plugins/inworld/src/tts.ts:639-641
Timestamp: 2026-02-02T23:20:23.828Z
Learning: The `autoMode` field in Inworld's WebSocket TTS API `create_context` configuration is a forward-compatible feature that will be officially released by Inworld. It is safe to include this field in the configuration as Inworld's API will silently ignore unsupported fields until the feature is available.

📚 Learning: 2026-02-02T23:20:17.980Z

Learnt from: cshape
Repo: livekit/agents-js PR: 1008
File: plugins/inworld/src/tts.ts:639-641
Timestamp: 2026-02-02T23:20:17.980Z
Learning: Include the autoMode field in the Inworld WebSocket TTS API create_context configuration in plugins/inworld/src/tts.ts as a forward-compatible option. Since the API will silently ignore unsupported fields until the feature is released, adding autoMode now is safe and prepares for future usage. Ensure you don’t rely on autoMode for current behavior and consider adding a comment indicating it's forward-compatible. If possible, add a test to verify that existing behavior remains unchanged when autoMode is not yet recognized by the API.

Applied to files:

plugins/inworld/src/tts.ts
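
A minimal sketch of the forward-compatible pattern this learning describes, assuming the server ignores fields it does not recognize. Only `CreateContextConfig` and `autoMode` come from the PR; `voiceId`, `modelId`, and `buildCreateContext` are hypothetical placeholders, not names taken from `tts.ts`.

```typescript
// Hypothetical shape: only `autoMode` is taken from the PR walkthrough.
// `voiceId` and `modelId` are placeholder fields for illustration.
interface CreateContextConfig {
  voiceId: string;
  modelId: string;
  // Forward-compatible flag: assumed to be ignored by the API until released.
  autoMode?: boolean;
}

// Hypothetical helper: copies the caller's config and always adds autoMode.
function buildCreateContext(
  base: Omit<CreateContextConfig, 'autoMode'>,
): CreateContextConfig {
  return { ...base, autoMode: true };
}
```

Because the flag is added on top of an untouched copy of the base config, a test can strip `autoMode` from the result and check that everything else matches what was sent before the change.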

📚 Learning: 2026-01-16T14:33:39.551Z

Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to examples/src/test_*.ts : For plugin component debugging (STT, TTS, LLM), create test example files prefixed with `test_` under the examples directory and run with `pnpm build && node ./examples/src/test_my_plugin.ts`

Applied to files:

plugins/inworld/src/tts.ts

🔇 Additional comments (5)

plugins/inworld/src/tts.ts (5)

77-77: LGTM!

The optional interface additions for autoMode and flushCompleted properly extend the existing types without breaking backward compatibility.

Also applies to: 106-106

481-486: LGTM!

The cumulative timestamp tracking fields are well-documented. The comment clearly explains the monotonic timestamp invariant and why this offset mechanism is needed when the server resets timestamps after each generation.

515-519: LGTM!

The flushCompleted handler correctly captures the generation end time as the new cumulative offset, ensuring subsequent generation timestamps continue monotonically from where the previous generation ended.

525-535: LGTM!

The cumulative offset is correctly applied to word alignment timestamps. Using Math.max to update #generationEndTime properly handles potential out-of-order timestamp arrivals within a generation.

547-557: LGTM!

Character alignment timestamp handling mirrors the word alignment logic correctly. Both contribute to tracking #generationEndTime, which is appropriate when both alignment types are present.

Optional: The word and character alignment processing blocks share similar structure. Consider extracting a helper if this pattern expands further.
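
The offset mechanism described in the comments above can be sketched roughly as follows. This is an illustration, not the code from `tts.ts`: the class name `CumulativeTimeline` and its `shift` and `onFlushCompleted` methods are hypothetical, and the two fields mirror the `#cumulativeTime` and `#generationEndTime` names from the walkthrough (written as `private` fields here). It assumes the server restarts alignment timestamps at zero for each generation.

```typescript
// Hypothetical helper illustrating monotonic timestamps across generations.
class CumulativeTimeline {
  // Offset added to every timestamp of the current generation.
  private cumulativeTime = 0;
  // Latest shifted end time seen so far.
  private generationEndTime = 0;

  // Shift one batch of alignment timestamps (seconds) by the current offset.
  shift(startTimes: number[], endTimes: number[]) {
    const offset = this.cumulativeTime;
    const shiftedStarts = startTimes.map((t) => t + offset);
    const shiftedEnds = endTimes.map((t) => t + offset);
    for (const end of shiftedEnds) {
      // Math.max tolerates batches that arrive out of order.
      this.generationEndTime = Math.max(this.generationEndTime, end);
    }
    return { startTimes: shiftedStarts, endTimes: shiftedEnds };
  }

  // On flushCompleted, the next generation continues where this one ended.
  onFlushCompleted(): void {
    this.cumulativeTime = this.generationEndTime;
  }
}

const timeline = new CumulativeTimeline();
const first = timeline.shift([0, 0.5], [0.5, 1.0]);
timeline.onFlushCompleted();
// The server restarts at 0, but the shifted values continue from 1.0.
const second = timeline.shift([0, 0.25], [0.25, 0.75]);
```

Word and character alignments could both be passed through the same `shift` call, which is one way to address the optional helper-extraction note above.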


coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@plugins/inworld/src/tts.ts`:
- Around line 639-641: Remove the unsupported autoMode field and its comment
from the create_context/create (or create) message builder in the TTS WebSocket
code: find the autoMode: true property (and the preceding comment referencing
auto_mode) in the code that constructs the Inworld "create" context/message
(e.g., inside the function building the create_context payload) and delete both
the property and the misleading comment; if you believe auto-mode must be
enabled, instead add a TODO or a verification step to call Inworld support or
adjust the implementation to implement sentence-tokenizer-driven flush behavior
locally rather than relying on a non-existent API flag.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 5c02ff2 and 52d155b.

📒 Files selected for processing (1)

plugins/inworld/src/tts.ts


🔇 Additional comments (1)

plugins/inworld/src/tts.ts (1)

68-77: LGTM: optional autoMode in CreateContextConfig is a clean, non-breaking extension.


plugins/inworld/src/tts.ts

inworld tts ws auto mode

52d155b

coderabbitai bot reviewed Jan 29, 2026


fix timestamps cumulation within a context

2baf4a1

ianbbqzy wants to merge 2 commits into livekit:main from ianbbqzy:ian/inworld-auto-mode


3 participants
