Replace ReadDeadlines in regionserver client with TCP keepalives#288

Draft
aaronbee wants to merge 3 commits into master from keepalive

aaronbee commented Feb 11, 2025

Remove ReadDeadlines in the regionserver client and replace them with TCP keepalives to detect half-closed connections.

HBase is not guaranteed to return a heartbeat within any given amount of time, so the regionserver client's ReadDeadlines can expire on a healthy connection and cause spurious errors.

aaronbee marked this pull request as draft February 27, 2025 18:28

Configures default KeepAliveConfig on regionserver connections,
including setting TCP_USER_TIMEOUT on Linux systems.

TCP_USER_TIMEOUT is an additional layer of protection. It handles the
case where the connection is dead but the send buffer is full, so
keepalive packets cannot be sent. Setting it to the total timeout of
the normal keepalives (idle time plus interval times probe count)
ensures it won't short-circuit the normal keepalives.

See https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/
This is now handled by TCP keepalives.
It has been replaced by TCP keepalives.