Skip to content

Fixed heap-buffer-overflow in atom hash caused by Symbol().#1026

Merged
xeioex merged 2 commits intonginx:masterfrom
xeioex:fix_symbol_add
Feb 24, 2026
Merged

Fixed heap-buffer-overflow in atom hash caused by Symbol().#1026
xeioex merged 2 commits intonginx:masterfrom
xeioex:fix_symbol_add

Conversation

@xeioex
Copy link
Copy Markdown
Contributor

@xeioex xeioex commented Feb 17, 2026

To reproduce

for (var i = 0; i < 5000; i++) Symbol()

Previously, njs_atom_find_or_add() and njs_atom_find_or_add_string()
duplicated the lookup logic and coupled find with insert. Splitting
them into njs_atom_find() and njs_atom_add() separates concerns:
find does a pure lookup with raw bytes, add inserts a pre-built value.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes a heap-buffer-overflow bug that occurred when creating a large number of JavaScript symbols (e.g., 5000+ symbols). The root cause was that symbols and strings were stored in the same atom hash table without proper key separation, leading to potential hash collisions.

Changes:

  • Introduced high-bit marking to distinguish symbol keys from string keys in the atom hash (symbols use atom_id | 0x80000000, strings use hash & 0x7FFFFFFF)
  • Refactored njs_atom_find_or_add into separate njs_atom_find and njs_atom_add functions for clearer separation of concerns
  • Updated symbol insertion to use njs_flathsh_unique_insert instead of the generic njs_flathsh_insert, since symbols have guaranteed unique atom_ids

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
src/njs_atom.c Core fix: Added macros for symbol/string key separation, refactored atom lookup/insertion functions, moved and updated njs_atom_symbol_add
src/njs_atom.h Updated API signatures to expose njs_atom_find and njs_atom_add separately
src/njs_lexer.c Updated lexer word processing to use new find-then-add pattern
src/test/njs_unit_test.c Added regression test creating 6000 symbols to verify fix

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/njs_atom.c
Previously, there was a key_hash collision between strings and
symbols in the atom hash table that led to use of uninitialized memory
and desynchronization of vm->atom_id_generator.

In the atom hash, string entries used djb_hash(content) as key_hash
while symbol entries used raw atom_id. Since djb_hash produces arbitrary
uint32_t values and atom_ids grow monotonically from NJS_ATOM_SIZE,
these spaces overlapped. When a symbol's atom_id matched an existing
string's djb_hash, njs_flathsh_insert() invoked the test function which
read uninitialized fhq.key fields (the function expected string data,
but the caller never initialized it for symbols). Additionally,
atom_id_generator was incremented before the insert, so on failure the
counter diverged from the actual element count, causing subsequent
njs_atom_to_value() calls to read out of bounds.

The fix is to partition the key_hash space using bit 31.

Found by OSS-Fuzz.
Copy link
Copy Markdown
Contributor

@VadimZhestikov VadimZhestikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@xeioex xeioex merged commit 433478d into nginx:master Feb 24, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants