PRNG: use xoshiro256++ in random mode, keep LCG in repeatable #60

Open
rlseaman wants to merge 1 commit into Smithsonian:main from rlseaman:css/xoshiro256pp

Conversation

@rlseaman

Summary

  • Add xoshiro256++ (Blackman & Vigna 2018) as the PRNG for random mode.
  • Keep existing LCG pinned to repeatable mode so output remains bit-for-bit reproducible across digest2 versions.
  • No new user-facing config: the existing repeatable/random keywords select the PRNG.

Context

This work comes from the Catalina Sky Survey (CSS) — the MPC's most prolific tracklet-producing site and a heavy user of digest2 in production scoring pipelines. CSS has a long-running R&D interest in improvements for survey-scale parallel workloads, and this is the first of a small series of upstream-targeted PRs drawn from that work.

Motivation

The LCG (a=13^13, m=2^59, 59-bit state, 2^57 period) works for digest2's current workload but has three structural limits that matter at scale:

  1. Period 2^57 (~1.4×10¹⁷). Finite; gets tight in massively parallel scoring.
  2. Seed collisions. rand() = 0 and rand() = 1 both seed the LCG to state 1 after the 2*(k/2)+1 rounding (integer division), and likewise 2 and 3 both map to state 3: each even seed and its odd successor produce identical starts.
  3. No jump() primitive for provably-disjoint parallel substreams.
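
The collision in item 2 follows from integer division in the seeding expression. A minimal sketch, assuming the LCG is the multiplicative form x ← 13^13·x mod 2^59 with the parameters quoted above; the names `lcg_seed_state` and `lcg_next` are hypothetical and not taken from digest2:

```c
#include <stdint.h>

/* Multiplier 13^13 and a 59-bit modulus mask, per the parameters quoted above. */
#define LCG_A    UINT64_C(302875106592253)
#define LCG_MASK ((UINT64_C(1) << 59) - 1)

/* Force the seed odd, as in the 2*(k/2)+1 rounding. Integer division
 * discards the low bit, so seeds 2j and 2j+1 yield the same state. */
static uint64_t lcg_seed_state(uint64_t k)
{
    return 2 * (k / 2) + 1;
}

/* One LCG step. Unsigned overflow wraps mod 2^64, and 2^59 divides 2^64,
 * so masking the wrapped product gives the correct residue mod 2^59. */
static uint64_t lcg_next(uint64_t x)
{
    return (x * LCG_A) & LCG_MASK;
}
```

With these definitions, `lcg_seed_state(0)` and `lcg_seed_state(1)` both return 1, and `lcg_seed_state(2)` and `lcg_seed_state(3)` both return 3, so only half of the possible seed values give distinct streams.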

None of these are urgent bugs today. They gate future HPC / parallel deployment.

Why couple PRNG to mode rather than expose a `prng` keyword

repeatable mode promises deterministic output. Pinning LCG there extends the promise across versions: a saved digest2 score from 2023 should replay bit-for-bit in 2026. This is the reproducibility-across-versions contract relied on by CI pipelines, bug reports, and cross-survey validation.

The benefits of a better PRNG (longer period, seed decorrelation, parallel jumps) matter only in random mode. The patch therefore uses xoshiro256++ where it helps and keeps the LCG where stability matters.

Users don't choose the PRNG directly because they shouldn't have to: the repeatable vs random decision already captures the intent.
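
The mode-to-generator coupling can be sketched as a single branch. This is an illustration only: `sketch_tracklet`, `sketch_xoshiro_next`, and `sketch_rand` are hypothetical names, the xoshiro256++ step follows the public-domain reference algorithm, and the integer-to-double conversions (top 53 bits scaled by 2^-53) are an assumption rather than what the patch's tkRand does.

```c
#include <stdint.h>

/* Hypothetical stand-in for the RNG-related fields of the tracklet struct. */
typedef struct {
    uint64_t rand64;    /* LCG state, advanced only in repeatable mode */
    uint64_t rng_s[4];  /* xoshiro256++ state, advanced only in random mode */
} sketch_tracklet;

static inline uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* One xoshiro256++ step, following the public-domain reference algorithm. */
static uint64_t sketch_xoshiro_next(uint64_t s[4])
{
    const uint64_t result = rotl64(s[0] + s[3], 23) + s[0];
    const uint64_t t = s[1] << 17;
    s[2] ^= s[0];
    s[3] ^= s[1];
    s[1] ^= s[2];
    s[0] ^= s[3];
    s[2] ^= t;
    s[3] = rotl64(s[3], 45);
    return result;
}

/* Return a double in [0, 1), choosing the generator from the mode flag.
 * Assumption: both branches keep the top 53 bits so the result is < 1. */
static double sketch_rand(sketch_tracklet *tk, int repeatable)
{
    if (repeatable) {
        /* LCG: x <- 13^13 * x mod 2^59 */
        tk->rand64 = (tk->rand64 * UINT64_C(302875106592253))
                     & ((UINT64_C(1) << 59) - 1);
        return (double)(tk->rand64 >> 6) * 0x1.0p-53;
    }
    return (double)(sketch_xoshiro_next(tk->rng_s) >> 11) * 0x1.0p-53;
}
```

Each mode touches only its own state, so a repeatable run never advances the xoshiro words and its LCG sequence is unaffected by the new generator.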

What changes

| File | Change |
|---|---|
| d2math.c | Add splitmix64 + xoshiro256pp_next + tkSeed; tkRand branches on repeatable |
| digest2.h | Tracklet struct: add uint64_t rng_s[4] next to rand64; tkSeed prototype |
| digest2.c | tkSeed() at both seed sites; resetInvalid saves the full 4-word state |
| d2lib.c | tkSeed() at the Python-binding seed sites |
| OPERATION.md | One paragraph documenting PRNG-per-mode |

Net: 79+/11− across 5 changed files. d2cli.c is untouched.
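
For orientation, the seeding step can be sketched as follows. The body of `sketch_seed` mirrors the tkSeed lines quoted in the review thread on this page; the body of `sketch_splitmix64` is the widely published reference version and is assumed here, not copied from the patch. The `sketch_` names and the struct are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for the RNG-related fields of the tracklet struct. */
typedef struct {
    uint64_t rand64;    /* LCG state */
    uint64_t rng_s[4];  /* xoshiro256++ state */
} sketch_tracklet;

/* splitmix64: advance a 64-bit counter and return a mixed output.
 * Every mixing step is a bijection, so distinct counters give distinct outputs. */
static uint64_t sketch_splitmix64(uint64_t *x)
{
    uint64_t z = (*x += UINT64_C(0x9e3779b97f4a7c15));
    z = (z ^ (z >> 30)) * UINT64_C(0xbf58476d1ce4e5b9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94d049bb133111eb);
    return z ^ (z >> 31);
}

/* Seed both generators from one 64-bit value. */
static void sketch_seed(sketch_tracklet *tk, uint64_t seed)
{
    tk->rand64 = 2 * (seed / 2) + 1;            /* LCG: force an odd state */
    uint64_t s = seed;
    for (int i = 0; i < 4; i++)
        tk->rng_s[i] = sketch_splitmix64(&s);   /* xoshiro: expand the seed */
}
```

Seeds 0 and 1 give the same LCG state here but different xoshiro states, which is the seed-decorrelation property the PR refers to.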

What does NOT change

  • repeatable mode output is bit-for-bit identical to pre-patch. Verified on macOS: 0-line sorted diff against pre-patch main on a 500-tracklet benchmark.
  • The repeatable/random config keywords themselves are unchanged.
  • Per-thread seeding contract (CLI rand(), d2lib.c rand_r()) is preserved — both routed through tkSeed().
  • Tracklet struct layout: one new 32-byte field added; negligible memory impact (~300 bytes for default 9-core pool).

Verification

  • Build: clean on macOS 14 (aarch64, clang -O2) and RHEL 8.10 (x86_64, gcc 8.5), no warnings.
  • Bit-compat (repeatable): sorted output matches pre-patch sorted output exactly on 500-tracklet benchmark.
  • Random mode: runs cleanly on both platforms; scores within expected Monte Carlo noise of LCG results.
  • Cross-platform macOS ↔ RHEL bit-differences exist (ARM vs x86 libm rounding) but are pre-existing, unchanged by this patch.

Performance

Macro-benchmark on 3000 tracklets, repeatable mode, best-of-3:

  • LCG (current): 33.68s
  • xoshiro256++: 33.91s (+0.7%, within run-to-run noise)

This is not a speed PR. A full 6-way PRNG comparison confirms all modern PRNGs are within ~2% on real digest2 — the bottleneck isn't PRNG. This is a capability PR: period, parallel-stream primitives, statistical quality headroom.
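
As an illustration of the parallel-stream primitive mentioned above, here is a sketch of a jump function for xoshiro256++. It is not part of this patch as described. The constants are the ones I recall from the public-domain reference implementation, where a jump is equivalent to 2^128 calls of the step function; they should be checked against that source before use. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static inline uint64_t rotl64(uint64_t x, int k)
{
    return (x << k) | (x >> (64 - k));
}

/* One xoshiro256++ step, following the public-domain reference algorithm. */
static uint64_t sketch_xoshiro_next(uint64_t s[4])
{
    const uint64_t result = rotl64(s[0] + s[3], 23) + s[0];
    const uint64_t t = s[1] << 17;
    s[2] ^= s[0];
    s[3] ^= s[1];
    s[1] ^= s[2];
    s[0] ^= s[3];
    s[2] ^= t;
    s[3] = rotl64(s[3], 45);
    return result;
}

/* Advance the state by a fixed large stride (2^128 steps in the reference).
 * Assumption: jump constants recalled from the reference source. */
static void sketch_jump(uint64_t s[4])
{
    static const uint64_t JUMP[4] = {
        UINT64_C(0x180ec6d33cfd0aba), UINT64_C(0xd5a61266f0c9392c),
        UINT64_C(0xa9582618e03fc9aa), UINT64_C(0x39abdc4529b1661c)
    };
    uint64_t t[4] = {0, 0, 0, 0};

    for (size_t i = 0; i < 4; i++) {
        for (int b = 0; b < 64; b++) {
            if (JUMP[i] & (UINT64_C(1) << b)) {
                /* Accumulate the current state for each set bit. */
                for (size_t j = 0; j < 4; j++)
                    t[j] ^= s[j];
            }
            sketch_xoshiro_next(s);
        }
    }
    for (size_t j = 0; j < 4; j++)
        s[j] = t[j];
}
```

A worker pool could seed one state, copy it to worker 0, call `sketch_jump` once and copy to worker 1, and so on, giving each worker a non-overlapping stretch of the sequence.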

Further detail

Full design rationale, statistical comparison (LCG vs xoroshiro128+ vs xoshiro256++ with χ², moments, autocorrelation, 2D uniformity, bit-balance, gap tests), speed benchmark (micro + macro, 6 PRNGs), and deployment considerations (HPC / GPU / embedded / web / SQL) are in the trade study document:

PRNG_TRADE_STUDY.md (on CSS R&D branch)

Test plan

  • Clean build on macOS 14 (aarch64, clang -O2)
  • Clean build on RHEL 8.10 (x86_64, gcc 8.5)
  • repeatable mode byte-identical to pre-patch on 500-tracklet benchmark (macOS)
  • random mode runs cleanly on both platforms
  • resetInvalid correctly preserves 4-word xoshiro state across tracklet recycling
  • Maintainers may wish to verify on their own benchmark corpus
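
The state-preservation item in the test plan can be pictured with a small sketch. The real resetInvalid is not shown on this page, so everything here is hypothetical: the struct, its placeholder fields, and the save, clear, restore sequence are assumptions about what "preserves 4-word xoshiro state across tracklet recycling" means.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical tracklet: two placeholder working fields plus the RNG state. */
typedef struct {
    double   score;     /* placeholder for per-tracklet working data */
    int      n_obs;     /* placeholder */
    uint64_t rand64;    /* LCG state */
    uint64_t rng_s[4];  /* xoshiro256++ state */
} sketch_tracklet;

/* Clear a recycled tracklet while carrying both generator states forward,
 * so the next tracklet scored on this slot continues the same streams. */
static void sketch_reset(sketch_tracklet *tk)
{
    uint64_t lcg = tk->rand64;
    uint64_t xs[4];
    memcpy(xs, tk->rng_s, sizeof xs);

    memset(tk, 0, sizeof *tk);

    tk->rand64 = lcg;
    memcpy(tk->rng_s, xs, sizeof xs);
}
```

If only `rand64` were saved, a random-mode run would restart from an all-zero xoshiro state after the first recycle, and an all-zero state never leaves zero.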

repeatable mode preserves the historical LCG so output is bit-for-bit
reproducible across digest2 versions (the reproducibility contract).
random mode uses xoshiro256++ (Blackman & Vigna 2018) for its longer
period, better seed decorrelation, and parallel-stream primitives.

No user-facing config change: the PRNG follows the existing
repeatable/random mode choice. See PRNG_TRADE_STUDY.md for the trade
study and rationale.

Verified on macOS: sorted output in repeatable mode is byte-identical
to pre-patch upstream; random mode runs clean and produces scores
within expected Monte Carlo noise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@rlseaman
Author

Sorry for the Claude boilerplate description. It was a fairly interesting trade study and I've been planning to suggest improvements to the digest2 PRNG for a long time. The basic notion is to keep the current algorithm around for repeatable mode, while supporting a better, more modern PRNG for stochastic use cases.

Contributor

Copilot AI left a comment

Pull request overview

This PR upgrades digest2’s pseudo-random generation by using xoshiro256++ for random mode while preserving the legacy LCG for repeatable mode to maintain cross-version bit-for-bit reproducibility.

Changes:

  • Add splitmix64 seeding + xoshiro256++ generator and route tkRand() based on repeatable.
  • Extend tracklet to carry xoshiro state and preserve it across tracklet recycling.
  • Update CLI/library seeding sites and document the PRNG-by-mode behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| digest2/digest2/digest2.h | Adds xoshiro state to tracklet and exposes tkSeed() API. |
| digest2/digest2/digest2.c | Uses tkSeed() for per-thread seeding and preserves the 4-word RNG state in resetInvalid(). |
| digest2/digest2/d2math.c | Implements splitmix64 + xoshiro256++ and switches tkRand() behavior by mode. |
| digest2/digest2/d2lib.c | Routes library seeding through tkSeed() for both repeatable and random modes. |
| digest2/digest2/OPERATION.md | Documents that repeatable uses LCG while random uses xoshiro256++. |

Comment thread digest2/digest2/digest2.h
Comment on lines +97 to +98
uint64_t rand64; // LCG random number (also state slot 0 for xoshiro)
uint64_t rng_s[4]; // xoshiro256++ state (used when prngChoice != PRNG_LCG)
Comment thread digest2/digest2/d2math.c
tk->rand64 = 2 * (seed / 2) + 1; /* LCG: odd seed per NAG */
uint64_t s = seed;
for (int i = 0; i < 4; i++)
    tk->rng_s[i] = splitmix64(&s); /* xoshiro: expand via splitmix */
digest2's scores are insensitive to that difference in practice.

Users need not (and cannot) choose the PRNG directly: the `repeatable` vs.
`random` decision selects it. See `PRNG_TRADE_STUDY.md` for detail.
Comment thread digest2/digest2/digest2.h
void initGlobals(void);
void score(tracklet * tk);
double tkRand(tracklet * tk);
void tkSeed(tracklet * tk, uint64_t seed);
Comment thread digest2/digest2/d2math.c
/*
* xoshiro256++ (Blackman & Vigna 2018) — opt-in alternative PRNG.
* Period 2^256-1, passes BigCrush, has jump() / long_jump() primitives
* for provably-disjoint parallel substreams. See PRNG_TRADE_STUDY.md.
@federicaspoto removed the request for review from matthewjohnpayne April 22, 2026 11:40
@federicaspoto
Member

Hi @rlseaman ,
thank you for opening the PR.
I asked Copilot to do a first round of review (we always do that now).

One of us will review it when we have time, but we are quite busy at the moment, so this might have to wait for a while.

Thanks,
Federica

@federicaspoto added the Low priority label Apr 22, 2026