Jarvis

A 100% private AI voice assistant that lives on your computer (works offline). Talk naturally as if Jarvis is a third person in the room — say its name anywhere in your sentence and get conversational, context-aware responses. It remembers everything, always knows the current location and time, can search the web, read your screen, control Chrome, track nutrition, and much more with support for unlimited MCPs and tools without context rot. Sensitive info is automatically redacted before anything is saved to disk.

🔒 100% local processing. No subscriptions. No data harvesting. Automatic redaction of sensitive info. Free offline dictation included.

Support Jarvis

Why Jarvis?

🔒 Your data stays yours - 100% local AI processing. No cloud, no subscriptions, no data harvesting. Automatic redaction of sensitive info. This is non-negotiable.

🗣️ A third person in the room - Unlike voice assistants that only respond to rigid commands, Jarvis understands conversations. It maintains a short temporary rolling context of what's being discussed, so when you ask "Jarvis, what do you think?" it knows exactly what you're talking about. Have it chime into discussions with friends, help debug code while you talk through problems, or weigh in on decisions.

🧠 Never forgets - Unlimited memory across conversations. Adapts tone naturally to the topic. Learns your preferences over time.

🎙️ Free dictation - Hold a hotkey, speak, release — your words appear in any app as text. Like WisprFlow, but free, offline, and private. No subscription, no cloud transcription.

🔌 Extensible - MCP integration connects Jarvis to thousands of tools: smart home, GitHub, Slack, databases, and more. Smart tool selection means adding more tools won't slow things down.

📊 Transparent progress - We track what works (and what doesn't) with automated evals. See current accuracy →

🚧 Known limitations: Jarvis is under active development. Primary development happens on macOS, so Windows/Linux support may lag behind. We're building in the open; issues and contributions are welcome!

Voice-only for now—no text chat interface yet (#35)
No mobile apps (#17)
"Stop" commands during speech sometimes get filtered as echo (#24)
Dictation is not available on macOS 26+ (Tahoe) due to a pynput incompatibility (#172)

See it in action (example conversations)

Chiming into conversations (the magic moment):

👤 Alice: I wonder what the weather will be like tomorrow
👤 Bob: Yeah, we should check before planning the picnic
👤 Alice: What do you think, Jarvis?
  📝 Heard: "What do you think Jarvis?"
  🧠 Intent (wake word): directed → "what do you think about the weather for the picnic"

✨ Working on it: what do you think about the weather for the picnic
  🧰 Tool: getWeather…
  💬 Generating response...

🤖 Jarvis
Tomorrow looks great for a picnic! Sunny with highs around 22°C...

Jarvis understood the entire conversation and gave a contextual answer — no need to repeat the question.
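
The short rolling context behind this example could be kept in a small time-bounded buffer. The sketch below only illustrates the idea and is not Jarvis's actual implementation; the class name `RollingContext`, the 120-second window, and the 20-utterance cap are all assumptions.

```python
import time
from collections import deque


class RollingContext:
    """Keeps recent utterances and drops those older than max_age_s seconds."""

    def __init__(self, max_age_s=120.0, max_items=20):
        self.max_age_s = max_age_s
        self._items = deque(maxlen=max_items)  # (timestamp, text) pairs

    def add(self, text, now=None):
        self._items.append((time.monotonic() if now is None else now, text))

    def recent(self, now=None):
        """Return the utterances still inside the time window, oldest first."""
        now = time.monotonic() if now is None else now
        while self._items and now - self._items[0][0] > self.max_age_s:
            self._items.popleft()
        return [text for _, text in self._items]


ctx = RollingContext(max_age_s=120.0)
ctx.add("I wonder what the weather will be like tomorrow", now=0.0)
ctx.add("Yeah, we should check before planning the picnic", now=5.0)
ctx.add("What do you think Jarvis?", now=9.0)
print(ctx.recent(now=10.0))   # all three utterances are still in the window
print(ctx.recent(now=126.0))  # only the last utterance is still in the window
```

Explicit `now` values are passed here only to make the example deterministic; in normal use the monotonic clock would be used.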

Browser automation via MCP:

  📝 Heard: "Open YouTube Jarvis."
  🧠 Intent (wake word): directed → "open YouTube"

✨ Working on it: open YouTube
  💬 Generating response...

🤖 Jarvis
I have opened YouTube for you.

Health-aware advice:

  📝 Heard: "Should I order pizza tonight considering my health goals Jarvis?"
  🧠 Intent (wake word): directed → "should I order pizza tonight considering my health goals"

✨ Working on it: should I order pizza tonight considering my health goals
  🧰 Tool: fetchMeals…
  💬 Generating response...

🤖 Jarvis
Looking at your week — you mentioned wanting to lose 5kg by June...
Today you've had about 860 calories so far. Given your 1,800 calorie target,
you've got room for pizza! But maybe consider thin crust to save calories.

Personalised news search:

  📝 Heard: "What are some news from today that might interest me Jarvis?"
  🧠 Intent (wake word): directed → "what are some news from today that might interest me"

✨ Working on it: what are some news from today that might interest me
  🧰 Tool: searchMemory…
  🧰 Tool: webSearch…
  💬 Generating response...

🤖 Jarvis
Here's a quick snapshot of today's headlines...

Natural wake word placement (say "Jarvis" anywhere):

  📝 Heard: "Give me a random topic to discuss Jarvis."
  🧠 Intent (wake word): directed → "give me a random topic to discuss"

✨ Working on it: give me a random topic to discuss
  💬 Generating response...

🤖 Jarvis
How about the Fermi Paradox? Given the vast number of stars...

Echo detection (Jarvis ignores its own speech):

🤖 Jarvis
I have opened YouTube for you.

👂 Listening for follow-up (3s)...
  📝 Heard: "I have opened YouTube for you."
  🔇 Heard (echo): "i have opened youtube for you."
💤 Returning to wake word mode
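
In the log above, the transcript heard during the follow-up window matches what Jarvis had just spoken. Jarvis uses an LLM intent judge for this decision; the sketch below shows only a plain text-similarity check of the kind a fallback might use. The function names and the 0.85 threshold are assumptions, not taken from the project.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and strip punctuation so transcripts compare cleanly."""
    return re.sub(r"[^a-z0-9 ]+", "", text.lower()).strip()


def is_echo(heard, last_spoken, threshold=0.85):
    """True when the transcript closely matches what was just spoken aloud."""
    a, b = normalize(heard), normalize(last_spoken)
    if not a or not b:
        return False
    return SequenceMatcher(None, a, b).ratio() >= threshold


print(is_echo("i have opened youtube for you.", "I have opened YouTube for you."))  # True
print(is_echo("Open Spotify instead", "I have opened YouTube for you."))            # False
```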

Quick Install

1. Install Prerequisites

Platform	Requirement
All	Ollama
Windows	Visual C++ Redistributable (most PCs already have this)

2. Download Jarvis

Get the latest from GitHub Releases:

Platform	Download	Run
Windows	`Jarvis-Windows-x64.zip`	Extract → Run `Jarvis.exe`
macOS	`Jarvis-macOS-arm64.zip`	Extract → Move to Applications → Right-click → Open
Linux	`Jarvis-Linux-x64.tar.gz`	`tar -xzf` → Run `./Jarvis/Jarvis`

Jarvis starts listening automatically — just say "Jarvis" and talk!

Features

Conversational Awareness - Understands ongoing discussions. Ask "Jarvis, what do you think?" and it knows what you're talking about. Works naturally in multi-person conversations.
Unlimited Memory - Never forgets. Searches across all your conversation history. Memory Viewer GUI included.
Adaptive Tone - Automatically surgical for code, pragmatic for business, encouraging for wellbeing — no manual mode switching
Smart Tool Selection - Embedding-based relevance filtering picks only the tools needed per query — add unlimited MCP tools without performance degradation
Built-in Tools - Screenshot OCR, web search (with auto-fetch), weather, file access, nutrition tracking, location awareness
Natural Voice - Say "Jarvis" anywhere in your sentence, interrupt with "stop", follow up without repeating the wake word
Dictation Mode - Free, offline alternative to WisprFlow — hold a hotkey, speak, release to paste text into any app
MCP Integration - Connect to thousands of external tools (Home Assistant, GitHub, Slack, etc.)
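
As a rough illustration of embedding-based tool selection, the sketch below ranks tools by cosine similarity to the query and keeps only the closest ones. It assumes the embeddings already exist; the three-dimensional vectors, the `select_tools` helper, and the `top_k` and `min_score` defaults are made up for the example and do not reflect Jarvis's real code.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_tools(query_vec, tool_vecs, top_k=2, min_score=0.3):
    """Return up to top_k tool names whose embedding is close to the query."""
    scored = [(cosine(query_vec, vec), name) for name, vec in tool_vecs.items()]
    scored = [(score, name) for score, name in scored if score >= min_score]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Toy "embeddings" with axes (weather, food, web); real ones come from a model.
tools = {
    "getWeather": [0.9, 0.0, 0.1],
    "fetchMeals": [0.0, 1.0, 0.0],
    "webSearch": [0.3, 0.1, 0.9],
}
print(select_tools([1.0, 0.0, 0.2], tools))  # ['getWeather', 'webSearch']
```

Because only the selected tools are passed to the chat model, the prompt size stays roughly constant as more tools are registered.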

System Requirements

Hardware	VRAM	Model
Most users	8GB+	`gemma4:e2b` (default)
Better quality	16GB+	`gemma4:e4b`
High-end	24GB+	`gpt-oss:20b`

Note: VRAM requirements include the intent judge model (gemma4:e2b), which is always loaded alongside the chat model for voice intent classification. The default chat model is the same gemma4:e2b, so the default setup needs no extra VRAM for the judge.

The setup wizard will guide you through model selection and installation on first launch.

Configuration

Most users won't need to change anything. Open ⚙️ Settings from the tray menu to configure Jarvis through a graphical interface — no JSON editing required. Settings are saved to ~/.config/jarvis/config.json.

Speech Recognition (Whisper)

Language Modes

Multilingual (default, 99 languages): "whisper_model": "medium"
English Only (slightly better English accuracy): "whisper_model": "medium.en"

Model Sizes

Model	English	Multilingual	Download	VRAM	Speed
Tiny	`tiny.en`	`tiny`	~75 MB	~1 GB	~10x
Base	`base.en`	`base`	~140 MB	~1 GB	~7x
Small	`small.en`	`small`	~465 MB	~2 GB	~4x
Medium	`medium.en`	`medium`	~1.5 GB	~5 GB	~2x
Large V3 Turbo	-	`large-v3-turbo`	~1.5 GB	~6 GB	~8x

Speed is relative to the original large model. Source

GPU Acceleration (Windows)

If you have an NVIDIA GPU, Jarvis can use CUDA for much faster speech recognition. The Windows installer offers an optional CUDA download during setup. For development:

pip install nvidia-cublas-cu12 nvidia-cudnn-cu12

CUDA is detected automatically — no configuration needed.

Voice Interface (Advanced)

LLM Intent Judge - Jarvis uses gemma4:e2b for intelligent voice intent classification (echo detection, query extraction, stop commands). This model is automatically installed alongside your chosen chat model during setup. The intent judge cannot be disabled but gracefully falls back to simpler text matching if Ollama is unavailable.
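
The fallback text matching is not described in detail here. One possible approach, shown purely as an assumption, is to look for the wake word anywhere in the transcript and treat the remainder as the query; `extract_query` is a hypothetical helper, not a Jarvis function.

```python
import re

WAKE_WORD = "jarvis"  # assumed wake word, taken from the examples in this README


def extract_query(transcript, wake_word=WAKE_WORD):
    """Return the transcript without the wake word, or None if it is absent."""
    pattern = re.compile(
        r"[,\s]*\b" + re.escape(wake_word) + r"\b[,\s]*", re.IGNORECASE
    )
    if not pattern.search(transcript):
        return None  # not directed at the assistant
    query = pattern.sub(" ", transcript).strip(" .,!?")
    return query or None


print(extract_query("Open YouTube Jarvis."))        # Open YouTube
print(extract_query("Jarvis, what do you think?"))  # what do you think
print(extract_query("Let's plan the picnic"))       # None
```

Unlike the LLM judge, this cannot resolve "what do you think" against earlier conversation, which is why it is only suitable as a fallback.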

Dictation Mode — Free WisprFlow Alternative

Hold a hotkey to record speech, release to paste the transcription into any app. Works everywhere — your editor, browser, chat, terminal. Completely local, completely free.

Platform	Default hotkey
Windows	Ctrl + Win
macOS	Ctrl + Option
Linux	Ctrl + Alt

🔒 100% offline — your speech never leaves your machine (unlike cloud dictation services)
🧠 Shared Whisper model — uses the same speech recognition as voice input, no extra memory
⚡ Zero latency startup — no server round-trip, transcription starts the moment you release
📋 Universal paste — works in any app that accepts Ctrl+V / Cmd+V
🔇 Non-intrusive — main voice listener pauses automatically during dictation
✋ Hands-free mode — double-tap the hotkey to keep recording without holding; press again or hit Escape to stop
🧹 Filler word removal — optional LLM-powered cleanup removes "um", "uh", "like", "you know" while preserving meaning
📖 Custom dictionary — define "wrong -> right" replacements for jargon, names, and technical terms
📜 History window — browse, copy, or delete past dictations from the system tray
🎛️ Easy setup — configure dictation during the setup wizard or anytime in Settings (hotkey dropdown, filler removal toggle, custom dictionary editor)

Customise the hotkey in Settings or config.json:

{
  "dictation_hotkey": "ctrl+alt",
  "dictation_filler_removal": true,
  "dictation_custom_dictionary": [
    "jarvis -> Jarvis",
    "pytorch -> PyTorch"
  ]
}

Note: macOS requires Accessibility permissions for the global hotkey. Linux requires X11 (limited Wayland support).
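
To make the "wrong -> right" dictionary format concrete, here is a minimal sketch of how such entries might be parsed and applied to a transcription. Whole-word, case-insensitive matching is an assumption, and `parse_rules` and `apply_rules` are hypothetical names.

```python
import re


def parse_rules(entries):
    """Turn "wrong -> right" strings into (pattern, replacement) pairs."""
    rules = []
    for entry in entries:
        wrong, sep, right = entry.partition("->")
        if not sep or not wrong.strip():
            continue  # skip malformed entries
        pattern = re.compile(r"\b" + re.escape(wrong.strip()) + r"\b", re.IGNORECASE)
        rules.append((pattern, right.strip()))
    return rules


def apply_rules(text, rules):
    """Apply each replacement in order to the transcribed text."""
    for pattern, replacement in rules:
        # A function replacement keeps backslashes in the target literal.
        text = pattern.sub(lambda _m, r=replacement: r, text)
    return text


rules = parse_rules(["jarvis -> Jarvis", "pytorch -> PyTorch"])
print(apply_rules("ask jarvis to install pytorch", rules))  # ask Jarvis to install PyTorch
```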

Text-to-Speech

Piper TTS (default) - Neural TTS that auto-downloads on first use (~60MB):

Works out of the box - no setup required
High-quality British English male voice (en_GB-alan-medium)
Fast local synthesis with exact duration tracking

To use different Piper voices, download from HuggingFace and set:

{
  "tts_piper_model_path": "~/.local/share/jarvis/models/piper/en_GB-alan-medium.onnx"
}

Chatterbox - AI voice with emotion control (requires running from source):

{ "tts_engine": "chatterbox" }

Voice cloning with Chatterbox - add a 3-10 second .wav sample:

{
  "tts_engine": "chatterbox",
  "tts_chatterbox_audio_prompt": "/path/to/voice.wav"
}
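
Before pointing `tts_chatterbox_audio_prompt` at a file, it can help to confirm the sample falls in the 3-10 second range. The sketch below uses only Python's standard `wave` module and assumes an uncompressed PCM .wav file; both helper names are hypothetical.

```python
import wave


def wav_duration_seconds(path):
    """Return the length of a PCM .wav file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_valid_voice_sample(path, min_s=3.0, max_s=10.0):
    """Check the 3-10 second range suggested for the audio prompt."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

For example, `is_valid_voice_sample("/path/to/voice.wav")` returns `True` for a 5-second recording and `False` for a 30-second one.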

Location Detection

Jarvis can provide location-aware responses (weather, local time, etc.) using a local GeoLite2 database — no cloud geolocation services are used.

IP detection chain (in order of preference):

Manual IP — configure location_ip_address in settings
UPnP — queries your local router (no traffic leaves LAN)
Socket heuristic — determines which interface routes externally (no data sent)
OpenDNS DNS query — single myip.opendns.com lookup to 208.67.222.222 (only external query)
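
The chain above is a try-in-order fallback. The sketch below shows that pattern with two of the four steps: a manual value and the socket heuristic (the UPnP and OpenDNS steps are omitted). The function names are hypothetical, and the target address is a documentation-only IP chosen for illustration.

```python
import socket


def local_outbound_ip():
    """Socket heuristic: ask the OS which local address routes externally.

    connect() on a UDP socket only selects a route; no packet is sent.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect(("192.0.2.1", 9))  # TEST-NET-1 documentation address
        return s.getsockname()[0]


def first_available(detectors):
    """Try each (name, function) in order; return the first non-empty result."""
    for name, detect in detectors:
        try:
            ip = detect()
        except OSError:
            continue  # this method is unavailable, fall through to the next
        if ip:
            return name, ip
    return None, None


manual_ip = ""  # stands in for location_ip_address being unset
chain = [
    ("manual", lambda: manual_ip),
    ("socket", local_outbound_ip),
]
print(first_available(chain))
```

Note that the socket heuristic yields the address of the local interface, which on most home networks is a private address; that is why later steps in the chain exist.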

If your ISP uses carrier-grade NAT (CGNAT), Jarvis automatically resolves your true public IP via the same OpenDNS DNS query. This can be disabled:

{
  "location_cgnat_resolve_public_ip": false
}
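
Carrier-grade NAT addresses come from the shared range 100.64.0.0/10 (RFC 6598), so a detected address in that range cannot be geolocated directly. How Jarvis detects CGNAT internally is not stated here; the sketch below is one straightforward way to do it with Python's standard `ipaddress` module, using hypothetical function names.

```python
import ipaddress

CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")  # shared address space, RFC 6598


def is_cgnat(ip):
    """True when ip falls inside the carrier-grade NAT range."""
    return ipaddress.ip_address(ip) in CGNAT_RANGE


def needs_public_ip_lookup(ip, resolve_enabled=True):
    """Decide whether the extra OpenDNS lookup would be worthwhile."""
    return resolve_enabled and is_cgnat(ip)


print(is_cgnat("100.72.13.5"))   # True: inside 100.64.0.0/10
print(is_cgnat("203.0.113.7"))   # False: outside the CGNAT range
print(needs_public_ip_lookup("100.72.13.5", resolve_enabled=False))  # False
```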

Setup: Register for a free MaxMind GeoLite2 account, download the City database (MMDB format), and save it to ~/.local/share/jarvis/geoip/GeoLite2-City.mmdb. The setup wizard will guide you through this.

MCP Tool Integration

Connect Jarvis to external tools via MCP servers:

{
  "mcps": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "your-token" }
    }
  }
}

Popular integrations:

Home Assistant - Voice control for smart home
Google Workspace - Gmail, Calendar, Drive, Docs
GitHub - Issues, PRs, workflows
Notion - Knowledge management
Slack/Discord - Team communication
Databases - MySQL, PostgreSQL, MongoDB
Composio - 500+ apps in one integration

See full MCP setup guide below.

MCP Integrations

Home Assistant - Smart home voice control

1. Add the MCP Server integration in Home Assistant (Settings → Devices & services)
2. Expose the entities you want to control (Settings → Voice assistants → Exposed entities)
3. Create a Long-lived Access Token (Profile → Security → Create token)
4. Install the proxy: uv tool install git+https://github.com/sparfenyuk/mcp-proxy
5. Add to config:

{
  "mcps": {
    "home_assistant": {
      "command": "mcp-proxy",
      "args": ["http://localhost:8123/mcp_server/sse"],
      "env": { "API_ACCESS_TOKEN": "YOUR_TOKEN" }
    }
  }
}

"Jarvis, turn on the living room lights" / "set bedroom to 72°" / "run good night scene"

Google Workspace - Gmail, Calendar, Drive, Docs, Sheets

{
  "mcps": {
    "google_workspace": {
      "command": "npx",
      "args": ["-y", "google-workspace-mcp"],
      "env": {
        "GOOGLE_CLIENT_ID": "your-client-id",
        "GOOGLE_CLIENT_SECRET": "your-client-secret"
      }
    }
  }
}

Setup: taylorwilsdon/google_workspace_mcp

GitHub - Repos, issues, PRs, workflows

{
  "mcps": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "your-token" }
    }
  }
}

Notion, Slack, Discord, Databases

Notion:

{ "mcps": { "notion": { "command": "npx", "args": ["-y", "@makenotion/mcp-server-notion"], "env": { "NOTION_API_KEY": "your-token" } } } }

Slack:

{ "mcps": { "slack": { "command": "npx", "args": ["-y", "slack-mcp-server"], "env": { "SLACK_BOT_TOKEN": "xoxb-...", "SLACK_USER_TOKEN": "xoxp-..." } } } }

Discord:

{ "mcps": { "discord": { "command": "npx", "args": ["-y", "discord-mcp-server"], "env": { "DISCORD_BOT_TOKEN": "your-token" } } } }

Databases: bytebase/dbhub (SQL), mongodb-mcp-server (MongoDB)

Composio - 500+ apps in one integration

{
  "mcps": {
    "composio": {
      "command": "npx",
      "args": ["-y", "@composiohq/rube"],
      "env": { "COMPOSIO_API_KEY": "your-key" }
    }
  }
}

Get API key at composio.dev
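
Each snippet in this section is shown with its own top-level "mcps" object. To use several integrations at once, their entries presumably need to sit side by side under a single "mcps" key. The sketch below merges snippets that way; `merge_mcp_snippets` is a hypothetical helper, not part of Jarvis.

```python
import json


def merge_mcp_snippets(*snippets):
    """Combine several {"mcps": {...}} snippets into one config fragment."""
    merged = {}
    for snippet in snippets:
        for name, server in snippet.get("mcps", {}).items():
            if name in merged:
                raise ValueError(f"duplicate MCP server name: {name}")
            merged[name] = server
    return {"mcps": merged}


github = {"mcps": {"github": {
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-github"],
    "env": {"GITHUB_TOKEN": "your-token"},
}}}
composio = {"mcps": {"composio": {
    "command": "npx",
    "args": ["-y", "@composiohq/rube"],
    "env": {"COMPOSIO_API_KEY": "your-key"},
}}}

print(json.dumps(merge_mcp_snippets(github, composio), indent=2))
```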

Troubleshooting

Common issues

Jarvis doesn't hear me - Check microphone permissions, and speak clearly with "Jarvis" somewhere in the sentence

Responses are slow - Ensure you have enough VRAM (8GB+ for default model; see System Requirements for other models)

Windows: App won't start - Extract full zip first, check Windows Defender

macOS: "App can't be opened" - Right-click → Open, or System Settings → Privacy & Security → Allow

Linux: No tray icon - sudo apt install libayatana-appindicator3-1

For Developers

Running from source

git clone https://github.com/isair/jarvis.git
cd jarvis

# macOS
bash scripts/run_macos.sh

# Windows (with Micromamba)
pwsh -ExecutionPolicy Bypass -File scripts\run_windows.ps1

# Linux
bash scripts/run_linux.sh

Running from source enables Chatterbox TTS (AI voice with emotion/cloning). Piper TTS works in both bundled and source modes.

Privacy hardening (stay 100% offline)

{
  "web_search_enabled": false,
  "mcps": {},
  "location_auto_detect": false,
  "location_cgnat_resolve_public_ip": false,
  "location_enabled": false
}

Verify: sudo lsof -i -n -P | grep -i jarvis (should show only 127.0.0.1 connections to Ollama; -i makes the match case-insensitive, since the bundled binary is named Jarvis)
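
As a complement to the network check, the config file itself can be compared against the hardened values. This is a minimal sketch, assuming the config path given in the Configuration section; `unhardened_keys` and `check_config_file` are hypothetical helpers. Keys missing from the file are reported too, because their defaults are not documented here.

```python
import json
from pathlib import Path

# Values copied from the privacy-hardening snippet above.
HARDENED = {
    "web_search_enabled": False,
    "mcps": {},
    "location_auto_detect": False,
    "location_cgnat_resolve_public_ip": False,
    "location_enabled": False,
}


def unhardened_keys(config):
    """Return the keys whose values differ from the hardened settings."""
    return sorted(k for k, v in HARDENED.items() if config.get(k) != v)


def check_config_file(path=None):
    """Load config.json (default location assumed) and list unhardened keys."""
    if path is None:
        path = Path.home() / ".config" / "jarvis" / "config.json"
    return unhardened_keys(json.loads(Path(path).read_text(encoding="utf-8")))


print(unhardened_keys({"web_search_enabled": True, "mcps": {}}))
```

An empty list from `check_config_file()` means every setting in the snippet is already applied.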

Privacy & Storage

100% offline - No cloud services required
Auto-redaction - Emails, tokens, passwords automatically removed
Local storage - Everything in ~/.local/share/jarvis
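
To give a feel for what auto-redaction involves, the sketch below masks emails, a few token shapes, and spoken or typed passwords with regular expressions. The patterns are illustrative assumptions only; they are not Jarvis's actual rules and would miss many real secrets.

```python
import re

# Illustrative patterns only; real redaction needs a much broader rule set.
PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
     "[REDACTED_EMAIL]"),
    (re.compile(r"\b(?:ghp|xoxb|xoxp)[-_][A-Za-z0-9-]{10,}"),
     "[REDACTED_TOKEN]"),
    (re.compile(r"\b(password|passwd)(\s*(?:is|[:=])\s*)\S+", re.IGNORECASE),
     r"\1\2[REDACTED_PASSWORD]"),
]


def redact(text):
    """Replace anything matching a known sensitive pattern before saving."""
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


print(redact("Mail alice@example.com, my password is hunter2"))
# Mail [REDACTED_EMAIL], my password is [REDACTED_PASSWORD]
```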

License

Personal use: Free forever
Commercial use: Contact us

Support

Report issues · Discussions · Sponsor
