Multi Model Conversation v2
The tool is functionally good, but there are a few avoidable performance issues in the hot path that become noticeable with long conversations, multiple participants, manager mode, and tool use.
This issue is only about low-risk performance / quality fixes that should not change the tool's behavior.
Main problems
- Full transcript is rebuilt and re-sent on every streamed chunk
In `pipe()` / `_stream_and_accumulate()`, the tool keeps appending to `total_emitted` and calls `emit_replace(total_emitted)` repeatedly for:
- participant titles
- streamed content chunks
- reasoning updates
- tool execution details
- fallback content
This means each new chunk re-sends the entire accumulated conversation, so per-chunk cost grows with transcript size (total bytes sent over a run grow roughly quadratically). On long runs this becomes unnecessarily expensive in UI rendering, CPU, and memory.
Why this matters:
- latency grows over time
- browser/UI updates get heavier every turn
- manager mode and multi-round conversations amplify the problem
Low-risk fix:
- avoid duplicate `emit_replace()` calls when content has not changed
- throttle / batch replace updates during streaming
- keep the same UX, just reduce full-message replaces
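One low-risk way to get both the dedupe and the throttle is a small wrapper around the emitter. This is a sketch only: `emit_replace` stands in for the tool's real replace-emitter, and the interval is an illustrative default, not a measured value.

```python
import time

class ThrottledReplacer:
    """Deduplicate and rate-limit full-transcript replaces during streaming.

    Hypothetical helper: `emit_replace` is whatever callable the tool
    uses to push the accumulated transcript to the UI.
    """

    def __init__(self, emit_replace, min_interval=0.25):
        self.emit_replace = emit_replace
        self.min_interval = min_interval  # seconds between mid-stream replaces
        self._last_sent = ""
        self._last_time = float("-inf")

    def push(self, total_emitted: str, force: bool = False):
        # Skip exact duplicates entirely.
        if total_emitted == self._last_sent:
            return
        # Rate-limit mid-stream updates; forced pushes always go through.
        now = time.monotonic()
        if not force and (now - self._last_time) < self.min_interval:
            return
        self.emit_replace(total_emitted)
        self._last_sent = total_emitted
        self._last_time = now

    def flush(self, total_emitted: str):
        # Always send the final state at the end of a turn.
        self.push(total_emitted, force=True)
```

The UX stays the same: the user still sees the transcript grow, but intermediate chunks that arrive within the interval are coalesced into the next replace instead of each triggering a full re-send.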
- Repeated per-turn model lookups/checks in the hot path
Inside the participant loop, the tool repeatedly does work that could be resolved once per model / participant at setup time, for example:
- calling `Models.get_model_by_id(model)`
- checking native function calling support
- re-reading model system prompt / params
- repeated feature/tool-related lookups
Why this matters:
- adds avoidable DB/runtime overhead every turn
- gets worse with more rounds and participants
Low-risk fix:
- cache per-participant/per-model metadata once at the start of the pipe run
- reuse cached values in the conversation loop
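The caching can be as simple as resolving each unique model once at the start of the run and handing the loop a dict. `get_model_by_id` mirrors the lookup named above; the cached field names here are illustrative assumptions, not the tool's actual schema.

```python
def build_model_cache(model_ids, get_model_by_id):
    """Resolve each unique model once per pipe run and return a lookup dict.

    Sketch only: the `supports_tools` attribute and the cache shape are
    placeholders for whatever per-turn checks the tool actually repeats.
    """
    cache = {}
    for model_id in model_ids:
        if model_id in cache:
            continue  # duplicate participant on the same model
        info = get_model_by_id(model_id)
        cache[model_id] = {
            "info": info,
            # Precompute checks that would otherwise run every turn,
            # e.g. native function-calling support.
            "supports_native_tools": bool(getattr(info, "supports_tools", False)),
        }
    return cache
```

Inside the conversation loop, each turn then reads `cache[participant_model]` instead of hitting the DB/runtime again.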
- Tool/model setup work is heavier than necessary per run
The participant setup phase merges tool IDs and features from multiple sources (`MODELS` runtime state + DB model info + metadata/body fallbacks), then loads built-in/imported tools.
This is correct functionally, but it does more work than necessary and could be simplified/cached within a single run.
Why this matters:
- adds startup latency before the conversation even begins
- especially painful in environments where models may also load/unload between turns
Low-risk fix:
- normalize and cache participant config once
- avoid reprocessing the same participant model more than needed in one execution
- keep current behavior, just reduce repeated setup work
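A run-scoped memo achieves this without changing behavior: the heavy merge/load runs at most once per model within a single execution. `load_tools_for` below is a hypothetical stand-in for the tool's real setup work.

```python
def make_run_scoped_setup(load_tools_for):
    """Return a setup function whose results are cached for one pipe run.

    Sketch under assumptions: `load_tools_for` represents the expensive
    tool-ID merging and built-in/imported tool loading done per participant.
    """
    resolved = {}  # model_id -> setup result, lives for this run only

    def setup_participant(model_id):
        if model_id not in resolved:
            # Heavy work happens once per unique model per run;
            # later participants on the same model reuse the result.
            resolved[model_id] = load_tools_for(model_id)
        return resolved[model_id]

    return setup_participant
```

Because the cache dict is created fresh inside `make_run_scoped_setup`, nothing leaks across runs, so stale-config risk stays the same as today.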
- Logging is too verbose in hot paths
There are many `logger.info(...)` calls in per-run / per-model / tool-loading paths. For a production interactive tool this creates unnecessary journal noise and some overhead.
Low-risk fix:
- downgrade repetitive operational logs from `info` to `debug`
- keep only important lifecycle events at `info`
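As a concrete illustration of the split (function names and messages are made up for the example, not taken from the tool's code):

```python
import logging

logger = logging.getLogger("multi_model_conversation")

def log_tool_load(model_id: str, tool_count: int) -> None:
    # Per-model operational detail: debug only, silent at the
    # INFO level a production deployment would typically run at.
    logger.debug("loaded %d tools for model %s", tool_count, model_id)

def log_run_start(participant_count: int) -> None:
    # Lifecycle event worth keeping visible at info.
    logger.info("conversation run started with %d participants", participant_count)
```

Using `%`-style lazy formatting (arguments passed to the logger, not pre-formatted f-strings) also means suppressed `debug` calls skip the string-building cost entirely.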
Not part of this issue
- Redesigning manager mode behavior
- Changing conversation logic / rounds / participant order
- Removing tool support
- Solving backend-specific model load/unload behavior completely
Note on environment-dependent slowness
There is also an environment/runtime factor: if different models are constantly loaded/unloaded between turns, that will always add latency. This issue is not asking to solve that completely, only to remove avoidable overhead from the tool itself so that those environments suffer less.
Suggested acceptance criteria
- Long conversations no longer trigger a full `emit_replace()` flood for every tiny chunk
- Per-participant model metadata is resolved once and reused
- No behavior change in conversation flow, tools, or manager mode
- Hot-path logs are reduced to reasonable levels