
Guard old colltrace baseline path against null ctranComm_ #2265

Open
minsii wants to merge 1 commit into meta-pytorch:main from minsii:export-D102289281


@minsii minsii commented Apr 24, 2026

Summary:

  • D101586421 moved ctranComm_ creation inside if (useCtran_), so ctranComm_ is null when ctran is disabled. The old colltrace baseline enqueue path dereferenced ctranComm_ unconditionally in two places, causing a SIGSEGV:
      • collTraceBaselineGetHandle accessed plan->comm->ctranComm_->collTrace_ to check whether old colltrace is active
      • colltraceHandle->trigger() in enqueue.cc was called on a nullptr handle returned by the above
  • Fix: return nullptr from collTraceBaselineGetHandle when NCCL_COLLTRACE is empty or ctranComm_ is null, and guard every colltraceHandle->trigger() call with a null check in all three versions

Differential Revision: D102289281
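
The guard pattern described in the summary can be sketched as follows. This is a minimal illustration, not the code from this PR: the types `CollTraceHandle`, `CollTrace`, `CtranComm`, `Comm`, and `Plan`, and the functions `getHandleGuarded` and `triggerIfActive`, are hypothetical stand-ins, and the `colltraceSetting` parameter stands in for the value of NCCL_COLLTRACE. The real definitions and signatures may differ.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the real types; names and layout are assumptions.
struct CollTraceHandle {
  int triggered = 0;
  void trigger() { ++triggered; }
};

struct CollTrace {
  CollTraceHandle handle;
};

struct CtranComm {
  std::unique_ptr<CollTrace> collTrace_;
};

struct Comm {
  // Stays null when ctran is disabled, mirroring the situation in the summary.
  std::unique_ptr<CtranComm> ctranComm_;
};

struct Plan {
  Comm* comm = nullptr;
};

// Returns nullptr instead of dereferencing a null ctranComm_.
// colltraceSetting is an assumed stand-in for the NCCL_COLLTRACE value.
CollTraceHandle* getHandleGuarded(const Plan* plan,
                                  const std::string& colltraceSetting) {
  if (colltraceSetting.empty()) {
    return nullptr;  // old colltrace is not requested
  }
  if (plan == nullptr || plan->comm == nullptr) {
    return nullptr;
  }
  const auto& ctranComm = plan->comm->ctranComm_;
  if (!ctranComm || !ctranComm->collTrace_) {
    return nullptr;  // ctran disabled, or no tracer attached
  }
  return &ctranComm->collTrace_->handle;
}

// Call-site guard: trigger only when a handle exists.
// Returns true if trigger() was actually called.
bool triggerIfActive(CollTraceHandle* handle) {
  if (handle == nullptr) {
    return false;
  }
  handle->trigger();
  return true;
}
```

With both guards in place, a communicator created without ctran yields a null handle and the trigger call becomes a no-op rather than a crash. Checking at both the lookup and the call site is redundant by design: either check alone would stop this particular crash, but the call-site check also covers handles that are null for other reasons.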

@meta-cla meta-cla Bot added the CLA Signed label (managed by the Meta Open Source bot) Apr 24, 2026
@minsii minsii force-pushed the export-D102289281 branch from 2b2a7c7 to fa00d1b April 24, 2026 21:56

meta-codesync Bot commented Apr 24, 2026

@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D102408648. (Because this pull request was imported automatically, there will not be any future comments.)

@minsii minsii force-pushed the export-D102289281 branch 2 times, most recently from 8a2bae1 to 8b428ba April 24, 2026 23:24
Summary:
- D101586421 moved `ctranComm_` creation inside `if (useCtran_)`, so `ctranComm_` is null when ctran is disabled. The old colltrace baseline enqueue path dereferenced `ctranComm_` unconditionally in two places, causing a SIGSEGV:
  - `collTraceBaselineGetHandle` accessed `plan->comm->ctranComm_->collTrace_` to check whether old colltrace is active
  - `colltraceHandle->trigger()` in enqueue.cc was called on a nullptr handle returned by the above
- Fix: return nullptr from `collTraceBaselineGetHandle` when `NCCL_COLLTRACE` is empty or `ctranComm_` is null, and guard every `colltraceHandle->trigger()` call with a null check in all three versions

Reviewed By: YulunW

Differential Revision: D102289281
@minsii minsii force-pushed the export-D102289281 branch from 8b428ba to da21730 April 25, 2026 03:05