Optimize speech preview (getByteTimeDomainData + UI update frequency) #126

@ysdede

Context

  • Speech preview uses AnalyserNode.getByteTimeDomainData + per-frame conversion to Float32Array for waveform bars.
  • UI updates occur at ~60fps via onVisualizationUpdate and the Waveform rAF loop.

Evidence

  • src/lib/audio/AudioEngine.ts: getBarLevels() converts Uint8Array to Float32Array on every call.
  • src/App.tsx: onVisualizationUpdate calls appStore.setBarLevels(audioEngine.getBarLevels()) every tick.
  • src/components/Waveform.tsx draws every rAF, even when not recording.

Impact

  • Continuous preview work costs CPU even when transcription is idle, and can be a top CPU driver during capture.

Actions

  • Throttle preview updates to at most 20–30fps (separate from the inference tick).
  • Skip getByteTimeDomainData when isRecording is false or widget is hidden.
  • Consider downsampling the analyser buffer (e.g., 256 → 64) or reusing a single Float32Array without per-frame allocations.
  • Optionally compute preview in AudioWorklet and send only min/max to UI.
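The first three actions could be combined roughly as follows. This is a minimal sketch, not the project's code: createFrameGate, fillBarLevels, createPreviewSampler and the ByteTimeDomainSource interface are hypothetical names, the 64-bar / 30fps defaults are just the figures suggested above, and isActive stands in for whatever "isRecording and widget visible" check the app actually has. An AnalyserNode should satisfy ByteTimeDomainSource structurally.

```typescript
// Hypothetical helpers for illustration; none of these names come from the repo.

/** Minimal shape of an AnalyserNode, so the sketch does not need DOM typings. */
interface ByteTimeDomainSource {
  readonly fftSize: number;
  getByteTimeDomainData(array: Uint8Array): void;
}

/** Returns a gate that answers true at most `maxFps` times per second. */
function createFrameGate(maxFps: number): (nowMs: number) => boolean {
  const minIntervalMs = 1000 / maxFps;
  let lastMs = -Infinity;
  return (nowMs: number): boolean => {
    if (nowMs - lastMs < minIntervalMs) return false;
    lastMs = nowMs;
    return true;
  };
}

/**
 * Downsamples unsigned 8-bit time-domain samples (128 = silence) to one peak
 * level per bar in [0, 1], writing into a caller-owned buffer (no allocation).
 */
function fillBarLevels(bytes: Uint8Array, out: Float32Array): Float32Array {
  const step = bytes.length / out.length;
  for (let b = 0; b < out.length; b++) {
    const start = Math.floor(b * step);
    const end = Math.max(start + 1, Math.floor((b + 1) * step));
    let peak = 0;
    for (let i = start; i < end && i < bytes.length; i++) {
      const v = Math.abs(bytes[i] - 128) / 128;
      if (v > peak) peak = v;
    }
    out[b] = peak;
  }
  return out;
}

/**
 * Builds a sampler that returns fresh bar levels, or null when the preview is
 * inactive or the frame is throttled. Both buffers are allocated once.
 */
function createPreviewSampler(
  source: ByteTimeDomainSource,
  isActive: () => boolean, // assumption: e.g. isRecording && widget visible
  barCount = 64,
  maxFps = 30,
): (nowMs: number) => Float32Array | null {
  const bytes = new Uint8Array(source.fftSize);
  const levels = new Float32Array(barCount);
  const gate = createFrameGate(maxFps);
  return (nowMs: number): Float32Array | null => {
    // Skip the analyser read entirely when idle or throttled.
    if (!isActive() || !gate(nowMs)) return null;
    source.getByteTimeDomainData(bytes);
    return fillBarLevels(bytes, levels);
  };
}
```

In a rAF callback, the sampler would be called with the rAF timestamp and the store updated only when the result is non-null. Two caveats: because the same Float32Array is returned each time, a store that detects changes by reference identity would need a copy or an explicit change signal; and when driven by rAF, the effective rate snaps to a divisor of the display refresh rate, so a 30fps cap may land nearer 20fps on a 60Hz display.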

Acceptance

  • Preview CPU drops measurably with recording active but model idle.
  • Trace shows reduced main-thread activity during capture when only preview is running.

Metadata

Assignees

No one assigned

    Labels

    enhancement (New feature or request), performance (Performance optimization and profiling), priority:high (Important next work, impacts stability/performance/usability)
