
fix: bundle mlx-whisper in macOS app so Apple Silicon uses the fast backend #148

Draft
isair wants to merge 2 commits into develop from claude/crazy-elgamal



@isair
Owner

@isair isair commented Apr 6, 2026

Summary

  • Adds collect_submodules('mlx') and collect_submodules('mlx_whisper') to the PyInstaller spec for macOS builds, so all MLX native extensions and submodules are bundled
  • Adds collect_data_files('mlx_whisper') for any asset files the package ships
  • Adds 'mlx_whisper' as a static hidden import entry
  • Collection is guarded by sys.platform == 'darwin' with a try/except fallback (no-op on non-macOS or when mlx isn't installed)

Previously, the bundled macOS .app silently fell back to faster-whisper because PyInstaller couldn't detect the conditionally-imported mlx_whisper / mlx packages.
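
A minimal sketch of the guarded collection described in the summary. The helper name `collect_mlx` and its `platform` parameter are hypothetical and exist only to make the guard easy to exercise; the actual spec may inline this logic. `collect_submodules` and `collect_data_files` are the real PyInstaller hook utilities.

```python
import sys


def collect_mlx(platform=sys.platform):
    """Return (hiddenimports, datas) for the MLX Whisper backend.

    Hypothetical helper: off macOS, or if collection fails, only the
    static 'mlx_whisper' entry is returned.
    """
    hiddenimports = ["mlx_whisper"]  # static hidden import entry
    datas = []
    if platform != "darwin":
        return hiddenimports, datas  # no-op on non-macOS
    try:
        from PyInstaller.utils.hooks import collect_data_files, collect_submodules

        hiddenimports += collect_submodules("mlx_whisper")
        datas += collect_data_files("mlx_whisper")
    except Exception:
        # PyInstaller or mlx_whisper unavailable: keep the static entry only.
        pass
    return hiddenimports, datas
```

In a spec file, the two returned lists would presumably be merged into the `hiddenimports` and `datas` arguments of `Analysis`.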

Closes #122

Test plan

  • Added tests/test_pyinstaller_spec.py with 8 tests verifying MLX bundling configuration
  • Build the macOS arm64 app and confirm import mlx_whisper succeeds inside the bundle
  • Verify the listener selects the mlx backend on Apple Silicon instead of falling back to faster-whisper
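
For the last two checks, something like the following could be run inside the bundle. This is only a sketch: `pick_backend` and `mlx_available` are hypothetical names, and the real selection logic in the listener may differ.

```python
import importlib.util
import platform
import sys


def mlx_available() -> bool:
    """True if the mlx_whisper package can be located without importing it."""
    return importlib.util.find_spec("mlx_whisper") is not None


def pick_backend(system=sys.platform, machine=platform.machine(), has_mlx=None) -> str:
    """Hypothetical selector: prefer MLX on Apple Silicon, else faster-whisper."""
    if has_mlx is None:
        has_mlx = mlx_available()
    if system == "darwin" and machine == "arm64" and has_mlx:
        return "mlx"
    return "faster-whisper"
```

If `pick_backend()` returns `"faster-whisper"` on an arm64 Mac, the bundle is most likely still missing `mlx_whisper` or one of its dependencies.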

🤖 Generated with Claude Code

@isair isair force-pushed the claude/crazy-elgamal branch from 0cd3444 to fcda650 Compare April 7, 2026 12:28
@isair isair marked this pull request as draft April 7, 2026 12:38
@isair isair force-pushed the claude/crazy-elgamal branch 2 times, most recently from 629c9d3 to 53680a7 Compare April 8, 2026 06:30
isair and others added 2 commits April 10, 2026 19:29
…ing (#169) (#171)

The existing `is_rate_limited` check matched "429"/"rate limit" in `str(e)`,
which works for the typical `HfHubHTTPError` format but silently misses edge
cases where the error message omits the HTTP status code (e.g. quota-exceeded
variants with no "429" in the text).

Add a secondary check: if the exception carries a `.response` attribute with
`status_code == 429`, treat it as rate-limited regardless of the message text.
This mirrors how `tts.py` already checks `http_err.response.status_code`.

Applies to both the faster-whisper and MLX Whisper backends in listener.py.
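
A rough sketch of the combined check, assuming the exception may or may not carry a `.response` object. The function signature here is illustrative and not necessarily how `is_rate_limited` is written in listener.py.

```python
def is_rate_limited(exc: Exception) -> bool:
    """Detect rate limiting from the message text or the HTTP status code."""
    text = str(exc).lower()
    # Primary check: the usual message format mentions 429 or "rate limit".
    if "429" in text or "rate limit" in text:
        return True
    # Secondary check: fall back to the status code on an attached response.
    response = getattr(exc, "response", None)
    return getattr(response, "status_code", None) == 429
```

Using `getattr` with a default keeps the check safe for exceptions that have no `.response` attribute at all.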

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…ackend

PyInstaller couldn't detect mlx_whisper and mlx because they're
conditionally imported only on Apple Silicon.  This adds the necessary
hidden imports, submodule collection, and data files so the bundled
.app no longer silently falls back to the slower faster-whisper engine.

Key changes to jarvis_desktop.spec:
- collect_submodules for mlx_whisper, tiktoken, and numba (runtime deps)
- List specific mlx submodules (mlx, mlx.core, mlx._reprlib_fix,
  mlx.nn, mlx.utils) instead of collect_submodules('mlx') which causes
  nanobind to double-register native types and abort
- Filter out mlx_whisper.torch_whisper (needs PyTorch, which is excluded)
- Keep scipy on macOS (remove from excludes) since mlx_whisper needs it
- Bundle mlx.metallib (128 MB Metal shader library) required by mlx.core
- Collect data files for mlx_whisper and tiktoken
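
The explicit submodule list and the torch_whisper filter could be sketched as below. The helper names are hypothetical, and searching the package directory for `mlx.metallib` is an assumption about where the file lives, not something stated above.

```python
from pathlib import Path

# Explicit list from this commit; collect_submodules('mlx') is avoided
# because it makes nanobind double-register native types and abort.
MLX_HIDDEN_IMPORTS = ["mlx", "mlx.core", "mlx._reprlib_fix", "mlx.nn", "mlx.utils"]


def drop_torch_whisper(modules):
    """Remove mlx_whisper.torch_whisper and anything nested under it."""
    prefix = "mlx_whisper.torch_whisper"
    return [m for m in modules if m != prefix and not m.startswith(prefix + ".")]


def find_metallib(package_dir):
    """Return (source, dest_dir) pairs for each mlx.metallib under package_dir.

    dest_dir is relative to the package's parent, so the file keeps its
    place inside the package when bundled.
    """
    root = Path(package_dir)
    return [
        (str(p), str(p.parent.relative_to(root.parent)))
        for p in sorted(root.rglob("mlx.metallib"))
    ]
```

The pairs from `find_metallib` have the `(source, destination directory)` shape that PyInstaller expects for entries in a spec's `datas` or `binaries` list.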

Closes #122

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>