7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,13 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## Unreleased

### Changed

- **BREAKING**: Renamed the `[model]` section in `config.toml` to `[inference]`. The section still contains a single field, `ollama_url`, but the name now reflects what it actually configures (the inference daemon endpoint, not a model). There is no backward-compatibility shim: if you had a custom `[model]` section, rename it to `[inference]` after upgrading, as sketched below.
- Active model selection is now strictly Option-typed end to end. Ollama's `/api/tags` is the single source of truth: when nothing is installed and nothing is persisted, Thuki refuses to dispatch requests and surfaces a "Pick a model" prompt instead of falling back to a hardcoded slug. The previous `DEFAULT_MODEL_NAME` constant has been removed.
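For anyone upgrading, the rename is a one-line edit in `config.toml`. A minimal before/after sketch (the URL shown is just the shipped default):

```toml
# Before the upgrade:
[model]
ollama_url = "http://127.0.0.1:11434"

# After the upgrade: same field, new section name. A leftover [model]
# section is silently ignored (serde drops unknown sections), so without
# the rename the endpoint quietly falls back to the default URL.
[inference]
ollama_url = "http://127.0.0.1:11434"
```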

## [0.6.1](https://github.com/quiet-node/thuki/compare/v0.6.0...v0.6.1) (2026-04-14)


7 changes: 5 additions & 2 deletions docs/configurations.md
@@ -26,7 +26,7 @@ open ~/Library/Application\ Support/com.quietnode.thuki/config.toml
### Example

```toml
[model]
[inference]
# Where Thuki finds your local Ollama server. The active model itself is
# selected from the in-app picker (which lists whatever is installed in
# Ollama via /api/tags) and is stored in Thuki's local database, not here.
@@ -80,10 +80,12 @@ Every domain below is shown as a single table that lists **all** constants Thuki

## Reference

### `[model]`
### `[inference]`

Where to find your local Ollama server. The active model itself is **not** a TOML setting: Thuki discovers installed models live from Ollama's `/api/tags` endpoint, lets you pick one from the in-app model picker, and stores that selection in its local SQLite database (`app_config` table). Storing the active slug in TOML would duplicate ground truth from Ollama and break the moment you remove a model with `ollama rm`, so it lives next to the conversation history instead.

When no model is installed and no choice has been persisted, Thuki refuses to dispatch a chat request and surfaces a "Pick a model" prompt in the input area. Pull a model with `ollama pull <slug>` and select it from the picker chip in the top-right of the overlay.
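To make the persistence model concrete, here is a minimal sketch of that lookup, assuming rusqlite and a key/value `app_config` table with a hypothetical `active_model` key (the real column and key names are internal to Thuki and may differ):

```rust
use rusqlite::{Connection, OptionalExtension};

/// Returns the persisted model slug, or `None` when the user has never
/// picked a model (or deleted their last one). `None` is a legitimate
/// state, not an error: the caller must show the "Pick a model" prompt
/// instead of falling back to a hardcoded slug.
fn load_active_model(db: &Connection) -> rusqlite::Result<Option<String>> {
    db.query_row(
        "SELECT value FROM app_config WHERE key = 'active_model'",
        [],
        |row| row.get(0),
    )
    .optional() // QueryReturnedNoRows becomes Ok(None)
}
```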

| Constant | Default | Tunable? | Why not tunable | Bounds | Description |
| :----------- | :------------------------- | :------- | :-------------- | :------------ | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `ollama_url` | `"http://127.0.0.1:11434"` | Yes | — | non-empty URL | The web address where Thuki finds your local Ollama server. The default works if you run Ollama on this machine with its standard port. Change this only if you moved Ollama to a different port or another machine. |
@@ -98,6 +100,7 @@ The table below also lists the baked-in safety limits that govern Thuki's commun
| `DEFAULT_OLLAMA_SHOW_REQUEST_TIMEOUT_SECS` | `5 s` | No | Protocol cap on a hung daemon to keep the UI responsive. Same rationale as the tags timeout above. | — | How long Thuki waits for Ollama's `/api/show` endpoint to respond before giving up. Used when fetching capability flags (vision, thinking) for each installed model. |
| `MAX_OLLAMA_TAGS_BODY_BYTES` | `4 MiB` | No | Defense-in-depth bound on attacker-controlled response body. A misbehaving or compromised Ollama could otherwise stream an unbounded payload and exhaust memory. | — | The largest `/api/tags` response body Thuki will accept. 4 MiB fits thousands of model entries; anything larger is rejected immediately and the request returns an error. |
| `MAX_OLLAMA_SHOW_BODY_BYTES` | `4 MiB` | No | Defense-in-depth bound on attacker-controlled response body. Same rationale as `MAX_OLLAMA_TAGS_BODY_BYTES`. | — | The largest `/api/show` response body Thuki will accept. Full Modelfiles and parameters can be sizable, but 4 MiB is well above any real model; larger responses are rejected. |
| `MAX_MODEL_SLUG_LEN` | `256 B` | No | Defense-in-depth bound on adversarial input. Real Ollama slugs are a handful of characters; capping the length stops malformed values long before any network or DB work. | — | The longest model slug Thuki will accept from `set_active_model`. Anything longer is rejected immediately by `validate_model_slug`. |
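Body-size caps like these are only meaningful if the response is read incrementally instead of trusting `Content-Length`. A sketch of the pattern, assuming reqwest with the `stream` feature enabled (the function name is illustrative, not Thuki's actual helper):

```rust
use futures_util::StreamExt;

/// Accumulates a response body, failing as soon as it would exceed
/// `max_bytes` rather than buffering an unbounded payload in memory.
async fn read_body_capped(
    resp: reqwest::Response,
    max_bytes: usize,
) -> Result<Vec<u8>, String> {
    let mut body = Vec::new();
    let mut stream = resp.bytes_stream();
    while let Some(chunk) = stream.next().await {
        let chunk = chunk.map_err(|e| e.to_string())?;
        if body.len() + chunk.len() > max_bytes {
            return Err(format!("response body exceeds {max_bytes} bytes"));
        }
        body.extend_from_slice(&chunk);
    }
    Ok(body)
}
```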

### `[prompt]`

54 changes: 53 additions & 1 deletion src-tauri/src/commands.rs
@@ -17,10 +17,26 @@ pub enum OllamaErrorKind {
NotRunning,
/// The requested model has not been pulled yet (HTTP 404).
ModelNotFound,
/// No active model has been selected. The user must pick a model from
/// the in-app picker before any chat request can be issued. Distinct
/// from `ModelNotFound`, which fires when the daemon answered 404 for
/// a slug we did try to use.
NoModelSelected,
/// Any other unexpected error.
Other,
}

/// Builds the structured error returned when `ActiveModelState` holds `None`
/// at the time `ask_ollama` is invoked. Pulled out as a free function so the
/// exact title + body wording lives in one place and the branch is testable
/// without a full Tauri runtime.
pub fn no_model_selected_error() -> OllamaError {
OllamaError {
kind: OllamaErrorKind::NoModelSelected,
message: "No model selected\nPick a model in the picker.".to_string(),
}
}

/// Structured error emitted over the streaming channel.
/// Rust owns all user-facing copy; the frontend only uses `kind` for styling.
#[derive(Clone, Serialize, Debug)]
@@ -351,12 +367,24 @@ pub async fn ask_ollama(
config: State<'_, AppConfig>,
active_model: State<'_, crate::models::ActiveModelState>,
) -> Result<(), String> {
let endpoint = format!("{}/api/chat", config.model.ollama_url.trim_end_matches('/'));
let endpoint = format!(
"{}/api/chat",
config.inference.ollama_url.trim_end_matches('/')
);
// Snapshot the active model slug; drop the guard before any `.await`.
let model_name = {
let guard = active_model.0.lock().map_err(|e| e.to_string())?;
guard.clone()
};
let Some(model_name) = model_name else {
// Defense in depth: the onboarding gate already refuses to open the
// overlay without a selected model, so this branch only fires if the
// user removed their last installed model with `ollama rm` between
// launches and the picker hasn't been opened yet. Surface a typed
// error so the frontend can route the user to the picker.
let _ = on_event.send(StreamChunk::Error(no_model_selected_error()));
return Ok(());
};
let cancel_token = CancellationToken::new();
generation.set_token(cancel_token.clone());

@@ -1266,6 +1294,30 @@ mod tests {
assert!(err.message.contains("401"));
}

#[test]
fn no_model_selected_error_uses_typed_kind_and_actionable_message() {
// The frontend keys off `kind` to route to the picker; the message
// is rendered verbatim. Both are part of the IPC contract: lock
// them down so accidental wording drift does not silently break
// the recovery path.
let err = no_model_selected_error();
assert_eq!(err.kind, OllamaErrorKind::NoModelSelected);
assert!(
err.message.contains("Pick a model"),
"message should steer the user to the picker, got: {}",
err.message,
);
}

#[test]
fn ollama_error_kind_no_model_selected_serializes_as_pascal_case() {
// Wire format check: NoModelSelected must serialize verbatim in
// PascalCase so the React side can match on a stable string in the
// OllamaError discriminator.
let v = serde_json::to_value(OllamaErrorKind::NoModelSelected).unwrap();
assert_eq!(v, serde_json::Value::String("NoModelSelected".to_string()));
}
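Those two tests pin the discriminator and the message separately; a third test in the same spirit (a sketch, not part of this diff) could lock the full JSON payload the frontend receives, composing nothing beyond the pieces already asserted above:

```rust
#[test]
fn no_model_selected_error_full_wire_shape() {
    let v = serde_json::to_value(no_model_selected_error()).unwrap();
    assert_eq!(
        v,
        serde_json::json!({
            "kind": "NoModelSelected",
            "message": "No model selected\nPick a model in the picker.",
        })
    );
}
```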

#[tokio::test]
async fn connection_refused_emits_not_running_error() {
let client = reqwest::Client::new();
9 changes: 5 additions & 4 deletions src-tauri/src/config/defaults.rs
@@ -5,10 +5,6 @@
//! Changing a default here propagates to a fresh first-run config file and to
//! any field a user has left unset or left empty in their existing file.

/// Default active model name, used when no config file exists yet and when a
/// user's `[model] available` list is empty after whitespace resolution.
pub const DEFAULT_MODEL_NAME: &str = "gemma4:e2b";

/// Default Ollama HTTP endpoint (loopback, standard port).
pub const DEFAULT_OLLAMA_URL: &str = "http://127.0.0.1:11434";

@@ -132,3 +128,8 @@ pub const MAX_OLLAMA_TAGS_BODY_BYTES: usize = 4 * 1024 * 1024;
/// Modelfile and parameters can be sizable, but 4 MiB is comfortably above
/// any real model and bounds attacker-controlled inputs.
pub const MAX_OLLAMA_SHOW_BODY_BYTES: usize = 4 * 1024 * 1024;

/// Maximum accepted byte length for a model slug passed to `set_active_model`.
/// Real Ollama slugs are a handful of characters; 256 is generous while still
/// capping adversarial inputs long before any network or database work.
pub const MAX_MODEL_SLUG_LEN: usize = 256;
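`validate_model_slug` itself lives outside this hunk; a minimal sketch of the shape such a gate can take (the character policy and error strings here are assumptions, not the real implementation):

```rust
/// Cheap pre-flight check run before any network or database work.
/// Mirrors the bound above: non-empty, at most `MAX_MODEL_SLUG_LEN`
/// bytes, and free of whitespace and control characters.
pub fn validate_model_slug(slug: &str) -> Result<(), String> {
    if slug.is_empty() {
        return Err("model slug must not be empty".to_string());
    }
    if slug.len() > MAX_MODEL_SLUG_LEN {
        return Err(format!("model slug exceeds {MAX_MODEL_SLUG_LEN} bytes"));
    }
    if slug.chars().any(|c| c.is_whitespace() || c.is_control()) {
        return Err("model slug contains whitespace or control characters".to_string());
    }
    Ok(())
}
```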
6 changes: 3 additions & 3 deletions src-tauri/src/config/loader.rs
@@ -109,11 +109,11 @@ fn rename_corrupt(path: &Path) {
/// and composes the system prompt appendix into `prompt.resolved_system`.
/// After this runs, every `AppConfig` field holds a usable value.
pub(crate) fn resolve(config: &mut AppConfig) {
// Model section: only the Ollama endpoint is configurable here. The
// Inference section: only the Ollama endpoint is configurable here. The
// active model is runtime UI state owned by SQLite app_config, see
// crate::models::ActiveModelState.
if config.model.ollama_url.trim().is_empty() {
config.model.ollama_url = DEFAULT_OLLAMA_URL.to_string();
if config.inference.ollama_url.trim().is_empty() {
config.inference.ollama_url = DEFAULT_OLLAMA_URL.to_string();
}

// Prompt section: empty base -> built-in. Compose resolved_system.
2 changes: 1 addition & 1 deletion src-tauri/src/config/mod.rs
@@ -25,7 +25,7 @@ pub mod writer;

pub use error::ConfigError;
pub use loader::load_from_path;
pub use schema::{AppConfig, ModelSection, PromptSection, QuoteSection, WindowSection};
pub use schema::{AppConfig, InferenceSection, PromptSection, QuoteSection, WindowSection};
pub use writer::atomic_write;

/// File name of the user config file inside the OS config dir.
8 changes: 4 additions & 4 deletions src-tauri/src/config/schema.rs
@@ -22,7 +22,7 @@ use super::defaults::{
DEFAULT_SEARCH_TIMEOUT_S, DEFAULT_SEARXNG_MAX_RESULTS, DEFAULT_SEARXNG_URL, DEFAULT_TOP_K_URLS,
};

/// Static, user-tunable model configuration.
/// Static, user-tunable inference daemon configuration.
///
/// The active model selection is NOT stored here. Active-model state is
/// runtime UI state owned by [`crate::models::ActiveModelState`] and
@@ -34,12 +34,12 @@ use super::defaults::{
/// endpoint URL.
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
#[serde(default)]
pub struct ModelSection {
pub struct InferenceSection {
/// HTTP base URL of the local Ollama instance.
pub ollama_url: String,
}

impl Default for ModelSection {
impl Default for InferenceSection {
fn default() -> Self {
Self {
ollama_url: DEFAULT_OLLAMA_URL.to_string(),
@@ -176,7 +176,7 @@ impl Default for SearchSection {
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Default)]
#[serde(default)]
pub struct AppConfig {
pub model: ModelSection,
pub inference: InferenceSection,
pub prompt: PromptSection,
pub window: WindowSection,
pub quote: QuoteSection,
46 changes: 27 additions & 19 deletions src-tauri/src/config/tests.rs
@@ -24,7 +24,7 @@ use super::defaults::{
use super::error::ConfigError;
use super::loader::{compose_system_prompt, load_from_path};
use super::schema::{
AppConfig, ModelSection, PromptSection, QuoteSection, SearchSection, WindowSection,
AppConfig, InferenceSection, PromptSection, QuoteSection, SearchSection, WindowSection,
};
use super::writer::atomic_write;

@@ -47,7 +47,7 @@
// Guard rail: a change to a default in defaults.rs must flow through to
// AppConfig::default(). If this test fails, someone changed one but not both.
let c = AppConfig::default();
assert_eq!(c.model.ollama_url, DEFAULT_OLLAMA_URL);
assert_eq!(c.inference.ollama_url, DEFAULT_OLLAMA_URL);
assert_eq!(c.prompt.system, "");
assert_eq!(c.prompt.resolved_system, "");
assert_eq!(c.window.overlay_width, DEFAULT_OVERLAY_WIDTH);
@@ -85,7 +85,7 @@ fn defaults_prompt_base_is_nonempty() {

#[test]
fn section_defaults_are_sensible() {
let m = ModelSection::default();
let m = InferenceSection::default();
assert_eq!(m.ollama_url, DEFAULT_OLLAMA_URL);

let p = PromptSection::default();
@@ -105,7 +105,7 @@ fn app_config_serde_round_trip_matches_defaults() {
let parsed: AppConfig = toml::from_str(&toml_str).expect("deserialize");
// prompt.resolved_system is marked #[serde(skip)] so it does not round-trip
// through the file. Compare everything else.
assert_eq!(parsed.model, original.model);
assert_eq!(parsed.inference, original.inference);
assert_eq!(parsed.prompt.system, original.prompt.system);
assert_eq!(parsed.window, original.window);
assert_eq!(parsed.quote, original.quote);
@@ -115,11 +115,11 @@
fn app_config_partial_file_fills_missing_fields_with_defaults() {
// Only declare one field; serde(default) fills the rest.
let partial = r#"
[model]
[inference]
ollama_url = "http://localhost:9999"
"#;
let parsed: AppConfig = toml::from_str(partial).expect("partial file parses");
assert_eq!(parsed.model.ollama_url, "http://localhost:9999");
assert_eq!(parsed.inference.ollama_url, "http://localhost:9999");
assert_eq!(parsed.window.overlay_width, DEFAULT_OVERLAY_WIDTH);
assert_eq!(
parsed.quote.max_display_lines,
@@ -164,7 +164,7 @@ fn load_missing_file_seeds_defaults_and_returns_them() {
let config = load_from_path(&path).expect("seed on first run");

assert!(path.exists(), "file should be seeded");
assert_eq!(config.model.ollama_url, DEFAULT_OLLAMA_URL);
assert_eq!(config.inference.ollama_url, DEFAULT_OLLAMA_URL);
// Resolved system prompt composed from default base plus appendix.
assert!(config
.prompt
@@ -183,7 +183,7 @@ fn load_missing_file_in_missing_parent_dir_creates_dir() {
let path = config_path_in(&nested);
let config = load_from_path(&path).expect("creates parent dir and seeds");
assert!(path.exists());
assert_eq!(config.model.ollama_url, DEFAULT_OLLAMA_URL);
assert_eq!(config.inference.ollama_url, DEFAULT_OLLAMA_URL);
}

#[test]
@@ -209,14 +209,14 @@
std::fs::write(
&path,
r#"
[model]
[inference]
ollama_url = "http://localhost:99999"
"#,
)
.unwrap();

let config = load_from_path(&path).unwrap();
assert_eq!(config.model.ollama_url, "http://localhost:99999");
assert_eq!(config.inference.ollama_url, "http://localhost:99999");
}

#[test]
@@ -248,7 +248,7 @@ fn load_corrupt_file_is_renamed_and_reseeded() {
std::fs::write(&path, "this is = definitely not [ valid toml").unwrap();

let config = load_from_path(&path).expect("recover from corrupt file");
assert_eq!(config.model.ollama_url, DEFAULT_OLLAMA_URL);
assert_eq!(config.inference.ollama_url, DEFAULT_OLLAMA_URL);

// Original file renamed with .corrupt- prefix.
let renamed_exists = std::fs::read_dir(&dir)
@@ -274,7 +274,7 @@ fn load_unreadable_file_returns_in_memory_defaults() {

let dir = fresh_temp_dir();
let path = config_path_in(&dir);
std::fs::write(&path, "[model]\nollama_url = \"http://127.0.0.1:11434\"\n").unwrap();
std::fs::write(
&path,
"[inference]\nollama_url = \"http://127.0.0.1:11434\"\n",
)
.unwrap();
std::fs::set_permissions(&path, std::fs::Permissions::from_mode(0o000)).unwrap();

// If the current user is root, the permission bits are ignored and this
@@ -286,7 +290,7 @@
}

let config = load_from_path(&path).expect("fallback to in-memory defaults");
assert_eq!(config.model.ollama_url, DEFAULT_OLLAMA_URL);
assert_eq!(config.inference.ollama_url, DEFAULT_OLLAMA_URL);
// Restore so cleanup works.
let _ = std::fs::set_permissions(&path, std::fs::Permissions::from_mode(0o644));
}
@@ -295,22 +299,22 @@

#[test]
fn resolve_unknown_model_field_is_ignored() {
// Older config files seeded a `[model] available = [...]` list. After
// Older config files seeded a `[inference] available = [...]` list. After
// removing that field from the schema, serde must silently drop it
// rather than refusing to parse the file.
let dir = fresh_temp_dir();
let path = config_path_in(&dir);
std::fs::write(
&path,
r#"
[model]
[inference]
available = ["legacy:model", "another:model"]
ollama_url = "http://localhost:11434"
"#,
)
.unwrap();
let config = load_from_path(&path).unwrap();
assert_eq!(config.model.ollama_url, "http://localhost:11434");
assert_eq!(config.inference.ollama_url, "http://localhost:11434");
}

#[test]
@@ -320,13 +324,13 @@
std::fs::write(
&path,
r#"
[model]
[inference]
ollama_url = " "
"#,
)
.unwrap();
let config = load_from_path(&path).unwrap();
assert_eq!(config.model.ollama_url, DEFAULT_OLLAMA_URL);
assert_eq!(config.inference.ollama_url, DEFAULT_OLLAMA_URL);
}

#[test]
@@ -807,7 +811,7 @@
fn toml_without_search_section_deserializes_to_defaults() {
let dir = fresh_temp_dir();
let path = config_path_in(&dir);
std::fs::write(&path, "[model]\nollama_url = \"http://127.0.0.1:11434\"\n").unwrap();
std::fs::write(
&path,
"[inference]\nollama_url = \"http://127.0.0.1:11434\"\n",
)
.unwrap();
let loaded = load_from_path(&path).unwrap();
assert_eq!(
loaded.search.searxng_url, DEFAULT_SEARXNG_URL,