Skip to content

perf(executor): increase HTTP transport MaxIdleConnsPerHost#2707

Open
Sotatek-HieuBui2 wants to merge 1 commit intorouter-for-me:devfrom
Sotatek-HieuBui2:perf/http-transport-tuning
Open

perf(executor): increase HTTP transport MaxIdleConnsPerHost#2707
Sotatek-HieuBui2 wants to merge 1 commit intorouter-for-me:devfrom
Sotatek-HieuBui2:perf/http-transport-tuning

Conversation

@Sotatek-HieuBui2
Copy link
Copy Markdown

Increased MaxIdleConns from 100 to 200 and MaxIdleConnsPerHost from 2 to 20 to improve connection reuse and reduce latency for HTTP-based executors like Gemini CLI, Qwen, and others that share the proxy client. This also enables HTTP/2 by default.

@github-actions
Copy link
Copy Markdown

This pull request targeted main.

The base branch has been automatically changed to dev.

@github-actions github-actions bot changed the base branch from main to dev April 11, 2026 10:03
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces HTTP transport tuning for the proxy-aware HTTP client, specifically configuring connection limits and enabling HTTP/2. The feedback identifies a potential performance regression if clients are not reused, as each now receives a unique transport instance. Additionally, there are concerns regarding the unconditional overriding of settings on user-provided transports and a race condition in the tests caused by modifying the global http.DefaultTransport while running parallel tests.

Comment on lines +47 to +49
httpClient := &http.Client{
Transport: cloneDefaultTransport(),
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

By using cloneDefaultTransport(), each call to NewProxyAwareHTTPClient now creates a new http.Transport with its own connection pool. In the previous implementation, the global http.DefaultTransport was used when no proxy was configured, allowing connection reuse across different clients. This change is beneficial for performance only if the returned *http.Client is long-lived and reused. If clients are created per-request, this will lead to a performance regression due to the lack of connection pooling and potential socket exhaustion.

Comment on lines +79 to +82
if transport, ok := rt.(*http.Transport); ok {
cloned := transport.Clone()
applyTransportTuning(cloned)
httpClient.Transport = cloned
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When a custom RoundTripper is provided via context, this logic clones it (if it's a *http.Transport) and overrides its settings with the new tuning parameters. While cloning is good for isolation, unconditionally overriding settings like MaxIdleConns or ForceAttemptHTTP2 on a user-provided transport might be unexpected. If the caller provided a specifically configured transport, their settings will be lost.

Comment on lines +90 to +94
orig := http.DefaultTransport
defer func() { http.DefaultTransport = orig }()

http.DefaultTransport = nil
t.Cleanup(func() { http.DefaultTransport = orig })
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Modifying the global http.DefaultTransport in a test can cause race conditions and flaky failures when other tests in the same process are running in parallel. Since other tests in this file (e.g., TestNewProxyAwareHTTPClient_DefaultTransportTuned) use t.Parallel(), they might observe http.DefaultTransport as nil while this test is running, leading to non-deterministic behavior. Additionally, using both defer and t.Cleanup to restore the same global variable is redundant; t.Cleanup is generally preferred in modern Go tests.

Suggested change
orig := http.DefaultTransport
defer func() { http.DefaultTransport = orig }()
http.DefaultTransport = nil
t.Cleanup(func() { http.DefaultTransport = orig })
orig := http.DefaultTransport
t.Cleanup(func() { http.DefaultTransport = orig })
http.DefaultTransport = nil

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 79d69e8802

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +47 to +49
httpClient := &http.Client{
Transport: cloneDefaultTransport(),
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve shared default transport for no-proxy clients

When neither auth/cfg proxy nor context round-tripper is set, this now assigns Transport: cloneDefaultTransport() to every new client. I checked executor call sites (for example internal/runtime/executor/gemini_executor.go) and they construct a fresh client per request, so this change creates a separate connection pool each time and prevents cross-request keep-alive reuse that previously came from Transport == nil (shared http.DefaultTransport). That can materially increase dial/TLS overhead and connection churn in production.

Useful? React with 👍 / 👎.

Comment on lines +79 to +82
if transport, ok := rt.(*http.Transport); ok {
cloned := transport.Clone()
applyTransportTuning(cloned)
httpClient.Transport = cloned
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reuse context transport instead of cloning per request

The context round-tripper path now clones *http.Transport before use, which defeats the shared-transport caching strategy in sdk/cliproxy/rtprovider.go (cached per proxy URL). Since this helper is called repeatedly, cloning here fragments pools and loses socket reuse across executions for the same auth/proxy, increasing outbound connection pressure and latency under load.

Useful? React with 👍 / 👎.

@aeddk-bnh aeddk-bnh force-pushed the perf/http-transport-tuning branch from 79d69e8 to 76c5a2c Compare April 11, 2026 10:12
@aeddk-bnh
Copy link
Copy Markdown

Thank you for the detailed feedback! I've updated the PR to address the performance and safety concerns:

  • Connection Reuse: Removed cloneDefaultTransport() per request. Instead, I now tune http.DefaultTransport once using sync.Once. Clients without a proxy will now correctly reuse the global connection pool (with performance improvements applied).
  • Preserving User Settings: Removed cloning and tuning of context RoundTrippers. This ensures that user-provided transports and the rtprovider.go caching mechanism are preserved without unexpected overrides.
  • Fixed Flaky Tests: Removed the unsafe modification of http.DefaultTransport in parallel tests to prevent race conditions.
  • DRY Code: Kept the unified tuning logic via applyTransportTuning for both the global default and specifically created proxy transports.

I've force-pushed a single clean commit with these improvements.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 76c5a2c047

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

// Returns:
// - *http.Client: An HTTP client with configured proxy or transport
func NewProxyAwareHTTPClient(ctx context.Context, cfg *config.Config, auth *cliproxyauth.Auth, timeout time.Duration) *http.Client {
GlobalTuneDefaultTransport()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid mutating default transport in request path

Calling GlobalTuneDefaultTransport() here makes the first executor request mutate http.DefaultTransport lazily, but this repository already performs startup HTTP traffic before executor requests (for example registry.StartModelsUpdater in cmd/server/main.go and fetchModelsFromRemote in internal/registry/model_updater.go using &http.Client{Timeout: ...} with nil Transport). Reconfiguring a transport after it has been used is unsafe and can race with in-flight/default-client requests, leading to nondeterministic behavior under startup load; the tuning should happen during early init (before any HTTP use) or on dedicated cloned transports only.

Useful? React with 👍 / 👎.

@aeddk-bnh aeddk-bnh force-pushed the perf/http-transport-tuning branch from 76c5a2c to 8087c74 Compare April 11, 2026 10:25
@aeddk-bnh
Copy link
Copy Markdown

Updated the approach to be even safer and more robust:

  • Thread-safe Shared Transport: Instead of mutating http.DefaultTransport lazily (which could race with startup HTTP traffic), I've introduced a dedicated defaultTunedTransport variable. This transport is a tuned clone of the default, initialized once when the package loads.
  • Zero Impact on Global State: This approach completely avoids modifying the global http.DefaultTransport, ensuring zero side effects on other parts of the system or external libraries.
  • Optimized Connection Reuse: All executor requests that do not require a proxy now share this single, high-performance connection pool, preserving the efficiency of Keep-Alive while benefiting from the tuned parameters.
  • Clean Unit Tests: Reverted to parallel-safe unit tests that verify the use of the shared tuned transport without global state manipulation.

Force-pushed the updated commit.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8087c74106

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

}

// Default fallback: use shared tuned transport
httpClient.Transport = defaultTunedTransport
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve nil-transport fallback for Antigravity client

Setting a default transport unconditionally here changes the no proxy + no context round-tripper path from Transport == nil to always non-nil, which breaks the sentinel logic in internal/runtime/executor/antigravity_executor.go (newAntigravityHTTPClient) that uses nil to select the shared antigravityTransport singleton. After this change, that branch is never taken and the Antigravity path clones a new transport per call, fragmenting connection pools and increasing connection churn/latency under load.

Useful? React with 👍 / 👎.

Increased MaxIdleConns from 100 to 200 and MaxIdleConnsPerHost from 2 to 20
to improve connection reuse and reduce latency for HTTP-based executors
like Gemini CLI, Qwen, and others that share the proxy client. This also
enables HTTP/2 by default.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@aeddk-bnh aeddk-bnh force-pushed the perf/http-transport-tuning branch from 8087c74 to 6194488 Compare April 11, 2026 10:32
@aeddk-bnh
Copy link
Copy Markdown

Finalized the implementation with the most robust and idiomatic approach:

  • Early Initialization: Moved http.DefaultTransport tuning to the package init() function. This ensures that performance settings are applied at the earliest possible moment, completely eliminating race conditions with startup HTTP traffic (like models updater or registry).
  • Restored nil Transport Fallback: The NewProxyAwareHTTPClient helper now returns a client with Transport: nil when no proxy or custom round-tripper is provided. This:
    • Automatically falls back to the already-tuned http.DefaultTransport.
    • Preserves specialized singleton logic in other executors (e.g., Antigravity's sentinel checks for nil transport).
    • Maximizes connection reuse through the global system pool.
  • Strict Isolation: Confirmed through unit tests that context-provided RoundTrippers are reused as-is, preserving any specialized caller configuration.

This approach combines high performance with the highest level of safety and compatibility across the repository.

Copy link
Copy Markdown
Collaborator

@luispater luispater left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary:
This tuning currently escapes the executor helper and changes process-wide HTTP transport defaults.

Blocking findings:

  • internal/runtime/executor/helps/proxy_helpers.go:15 calls applyTransportTuning(http.DefaultTransport) during package init. That change does not stay scoped to NewProxyAwareHTTPClient.
  • sdk/proxyutil/proxy.go:71 clones http.DefaultTransport, so once helps is imported, every later proxyutil.BuildHTTPTransport(...) call inherits the new 200/20 pool settings as well.
  • I verified this in a detached PR worktree: BuildHTTPTransport("direct") reports 100/0 without importing helps, and 200/20 after importing it. So auth/management callers such as internal/util/proxy.go and sdk/cliproxy/rtprovider.go are retuned too.

Test plan:

  • go test ./internal/runtime/executor/helps ./sdk/proxyutil ./sdk/cliproxy -count=1
  • go build -o test-output ./cmd/server && rm -f test-output

Requested change:
Please keep the tuning local to the executor-specific client/transport construction instead of mutating http.DefaultTransport in package init, and add a regression test that proves unrelated proxyutil callers keep their previous defaults.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants