
Make LLM backend any OpenAI-compatible endpoint #11

Open
ethenotethan wants to merge 1 commit into main from chore/openai-compatible-endpoints

Conversation

@ethenotethan
Collaborator

Summary

  • Drops the hardcoded OpenRouter base URL and Claude Agent SDK framing in favor of generic OpenAI Chat Completions config, so Flashlight works against OpenAI, OpenRouter, vLLM, LM Studio, Ollama, Together, Groq, etc.
  • New env vars: OPENAI_API_KEY (required), OPENAI_BASE_URL (default https://api.openai.com/v1), OPENAI_MODEL (default gpt-4o-mini). No backwards compatibility for the old OPENROUTER_* variables.
  • README no longer references Claude Agent SDK / ANTHROPIC_API_KEY; adds an LLM configuration section with an env-var table and per-provider example configs.
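To make the new configuration concrete, here is a minimal `.env` sketch in the shape the PR describes. The variable names and defaults come from the summary above; the commented-out vLLM values are illustrative placeholders, not copied from the actual `.env.example`:

```shell
# Default provider: OpenAI
OPENAI_API_KEY=sk-your-key-here              # required
OPENAI_BASE_URL=https://api.openai.com/v1    # optional, this is the default
OPENAI_MODEL=gpt-4o-mini                     # optional, this is the default

# Example: pointing at a local vLLM server instead (illustrative values)
# OPENAI_API_KEY=dummy
# OPENAI_BASE_URL=http://localhost:8000/v1
# OPENAI_MODEL=meta-llama/Llama-3.1-8B-Instruct
```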

Changes

  • agent/burr_app.py — parametrized the base URL via get_base_url(); renamed the client helper (formerly call_openrouter_chat_completion) to _chat_completion, avoiding a collision with the existing Burr call_llm action; the OpenRouter-specific HTTP-Referer / X-Title attribution headers are now only emitted when the base URL actually contains openrouter.ai.
  • agent/cli.py, agent/agent.py — API-key checks swapped to OPENAI_API_KEY with provider-agnostic error messages.
  • .env.example — rewritten with OPENAI_* vars and example configs for OpenAI / OpenRouter / local vLLM / LM Studio / Ollama.
  • README.md — intro + Installation + Requirements + Contributing sections de-Claude'd; de-emoji'd verbose-logging bullets.
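A rough sketch of the burr_app.py change as described above. The env-var names, defaults, and the openrouter.ai substring check come from this PR; the function bodies, the `build_headers` helper name, and the referer value are hypothetical, not the actual diff:

```python
import os

DEFAULT_BASE_URL = "https://api.openai.com/v1"
DEFAULT_MODEL = "gpt-4o-mini"


def get_base_url() -> str:
    """Resolve the Chat Completions base URL from the environment."""
    return os.environ.get("OPENAI_BASE_URL", DEFAULT_BASE_URL).rstrip("/")


def build_headers(api_key: str, base_url: str) -> dict:
    """Bearer auth for everyone; attribution headers only for OpenRouter."""
    headers = {"Authorization": f"Bearer {api_key}"}
    if "openrouter.ai" in base_url:
        # OpenRouter-only attribution headers (values here are placeholders)
        headers["HTTP-Referer"] = "https://example.com/flashlight"
        headers["X-Title"] = "Flashlight"
    return headers
```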

Compatibility constraint

"OpenAI-compatible" here specifically means the server returns structured tool_calls on the response message (not JSON emitted as text in content). Flashlight's ReAct loop relies on response["tool_calls"] in _chat_completion; an endpoint that only emits tool-call intent as plain text will stall the loop after one turn. Works well with OpenAI, OpenRouter (Claude/GPT/Gemini), vLLM, LM Studio, mlx-omni-server, and other servers that implement the OpenAI tools schema.
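The constraint can be illustrated with a minimal sketch of the branch the loop takes on each response message (the dict shapes follow the OpenAI Chat Completions tools schema; the `next_step` helper is illustrative, not Flashlight's actual code):

```python
def next_step(message: dict):
    """Continue the loop on structured tool_calls; otherwise treat content as final.

    `message` is the assistant message from a Chat Completions response,
    e.g. response["choices"][0]["message"].
    """
    tool_calls = message.get("tool_calls")
    if tool_calls:
        # Structured tool calls: each entry has a function name and
        # JSON-encoded arguments the loop can dispatch on.
        return [(tc["function"]["name"], tc["function"]["arguments"])
                for tc in tool_calls]
    # No structured tool_calls field: the loop has nothing to execute.
    # A server that only *describes* tool calls as plain text in content
    # ends up here after one turn, which is why it stalls.
    return message.get("content")
```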

Test plan

  • Syntax check: python -m compileall agent/
  • Smoke test config resolution: OPENAI_API_KEY=dummy python -c "from agent.burr_app import get_api_key, get_base_url, DEFAULT_MODEL; print(get_api_key(), get_base_url(), DEFAULT_MODEL)"
  • Missing-key path raises with new message: python -c "from agent.burr_app import get_api_key; get_api_key()"
  • End-to-end run against a real endpoint (OpenAI or OpenRouter) was not done in this PR; reviewer assistance requested.

Drops the hardcoded OpenRouter base URL and Claude Agent SDK framing in
favor of generic OpenAI Chat Completions config (OPENAI_API_KEY,
OPENAI_BASE_URL, OPENAI_MODEL), so Flashlight works against OpenAI,
OpenRouter, vLLM, LM Studio, Ollama, and anything else that speaks the
same protocol + tool-calling schema.

- agent/burr_app.py: parametrize base URL, rename client helper to
  _chat_completion (avoid collision with the Burr call_llm action),
  only emit OpenRouter attribution headers when routing through
  openrouter.ai, default model -> gpt-4o-mini
- agent/cli.py, agent/agent.py: swap API-key checks to OPENAI_API_KEY
  with provider-agnostic error messages
- .env.example: document OPENAI_* vars + per-provider examples
- README.md: drop Claude Agent SDK references and add an LLM
  configuration section with an env-var table and example configs
