Skip to content

Conversation

@cloud-fan
Copy link
Contributor

@cloud-fan cloud-fan commented Jan 27, 2026

What changes were proposed in this pull request?

This PR refactors how doTryWithCallerStacktrace and getTryWithCallerStacktrace handle stacktrace stitching. Instead of adding an OriginalTryStackTraceException as a suppressed exception on the original exception, we now wrap the original exception inside OriginalTryStackTraceException.

Key changes:

  • OriginalTryStackTraceException now wraps the original exception (as its cause) and stores the pre-computed "below" stacktrace portion
  • doTryWithCallerStacktrace returns Failure(wrapper) instead of modifying the original exception's suppressed list
  • getTryWithCallerStacktrace unwraps the original exception, stitches the stacktrace, and throws the original (not the wrapper)

Why are the changes needed?

no suppressed exceptions visible to users

Does this PR introduce any user-facing change?

Yes, exceptions thrown from LazyTry.get will no longer have a suppressed OriginalTryStackTraceException. The stacktrace stitching behavior remains the same.

How was this patch tested?

Existing unit tests in UtilsSuite and LazyTrySuite updated and passing.

Was this patch authored or co-authored using generative AI tooling?

Yes.

@github-actions
Copy link

github-actions bot commented Jan 27, 2026

JIRA Issue Information

=== Improvement SPARK-55240 ===
Summary: Refactor LazyTry stacktrace handling to use wrapper instead of suppressed exception
Assignee: None
Status: Open
Affected: ["4.2.0"]


This comment was automatically generated by GitHub Actions

@github-actions github-actions bot added the CORE label Jan 27, 2026
@cloud-fan cloud-fan changed the title [SPARK-XXXXX][CORE] Refactor LazyTry stacktrace handling to use wrapper instead of suppressed exception [SPARK-55240][CORE] Refactor LazyTry stacktrace handling to use wrapper instead of suppressed exception Jan 27, 2026
@cloud-fan
Copy link
Contributor Author

This is kind of followup of #48882

cc @juliuszsompolski @yaooqinn

Comment on lines 1431 to 1451
// Stitch the stacktrace: keep the "below" part, replace "above" with current caller
originalEx.setStackTrace(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, at this point, the original above stack trace is lost, and also it's indistinguishable for the user, whether the exception happened here, or was only retrieved and stitched from an earlier call?
I do find it useful for debugging to know if the exception really happened on the stack trace of the current call, or it happened in a different call and it was only stitched to pretend as happening here, while in fact it was retrieved from a Try....

I like the wrapper change that makes it cleaner than searching for OriginalTryStackTraceException among the supressed, but I personally would prefer to preserve and show the information about the original place the exception was thrown... However, I can not think of a different way to add it than again adding it as a suppressed Exception, which would then have both a wrapper and a suppressed exception, so I don't think it would be an improvement to the current state...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The suppressed Exception is not ideal, because it's more the "cause", but we cannot make it the cause, because the original exception might have already been the cause... Having it as suppressed exception was the only way I could find to tag it along without disrupting the original exception...

Copy link
Contributor Author

@cloud-fan cloud-fan Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more common case is people see the error once and fix the query. It's better to optimize for the common case (no suppressed error is needed).

I agree that it's useful to know if the error was caused previously. Shall we insert a special frame in the error stacktrace to indicate it? e.g.

original error stacktrace
...
doTryWithCallerStacktrace (only present if it's not the first invocation)
...
current caller stacktrace

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They would still see the error message on top and their call stacktrace, and probably ignore the suppressed listing below.
In practical cases where I rely on it I would most likely be connected with a debugger anyway and then can get the original caller trace anyway... But if it's a bug report with no repro but only the log, the suppressed part can be useful in providing some triage... From user perspective it is indeed a longer and uglier error dump, something that @yaooqinn complained about in the past.
For me the suppressed full original trace is useful, and only a small ugliness for a casual end user, but I don't insist.

val originalEx = wrapper.getCause
// Stitch the stacktrace: keep the "below" part, replace "above" with current caller
originalEx.setStackTrace(
wrapper.belowStacktrace ++ Thread.currentThread().getStackTrace.drop(1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if going ahead with it, you don't need to store belowStacktrace at all, only the prefix length. The below part is already in originalEx, and is not changing, only the suffix of the current caller changes.

So

val belowStacktraceLength = origStackTrace.size - commonSuffixLen

in doTryWithCallerStacktrace and here

originalEx.setStackTrace(
  orignalEx.take(belowStacktraceLength) ++ Thread.currentThread().getStackTrace.drop(1))

should work.

Comment on lines 1449 to 1451
// We are modifying and throwing the original exception. It would be better if we could
// return a copy, but we can't easily clone it and preserve. If this is accessed from
// multiple threads that then look at the stack trace, this could break.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would preserve this code comment next to stack stitching because it still applies.

…er instead of suppressed exception

This PR refactors how `doTryWithCallerStacktrace` and `getTryWithCallerStacktrace` handle stacktrace stitching. Instead of adding an `OriginalTryStackTraceException` as a suppressed exception on the original exception, we now wrap the original exception inside `OriginalTryStackTraceException`.

Key changes:
- `OriginalTryStackTraceException` now wraps the original exception (as its cause) and stores the pre-computed "below" stacktrace portion
- `doTryWithCallerStacktrace` returns `Failure(wrapper)` instead of modifying the original exception's suppressed list
- `getTryWithCallerStacktrace` unwraps the original exception, stitches the stacktrace, and throws the original (not the wrapper)

Benefits:
- Cleaner exception output - no suppressed exceptions visible to users
- Simpler code - no need to track first access or manage suppressed exception lifecycle
- Original exception type is preserved when thrown
Copy link
Contributor

@juliuszsompolski juliuszsompolski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, small nit.

Comment on lines +59 to +61
val isFirstAccess = !materialized
materialized = true
Utils.getTryWithCallerStacktrace(tryT, isFirstAccess)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think it matters much as there are some atomicity problems with resetting the trace anyway, but maybe use AtomicBoolean and just call Utils.getTryWithCallerStacktrace(tryT, !materialized.getAndSet(true))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not thread-safe anyway, and AtomicBoolean takes more memory.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants