[AMORO-4044] Return correct partition for delete files in iceberg tables whose partition spec have changed #4047

Open
juiceyang wants to merge 1 commit into apache:master from juiceyang:issue-4044

@juiceyang juiceyang commented Jan 16, 2026

Why are the changes needed?

According to org.apache.iceberg.BaseEntriesTable#schema, the partition field of each fileRecord returned by org.apache.amoro.scan.TableEntriesScan#entries contains the columns of all PartitionSpecs of the table, not only those of the spec the file was written with. org.apache.amoro.scan.TableEntriesScan#buildDeleteFile creates a DeleteFile from that fileRecord. If the file was written with a newer PartitionSpec, then inside org.apache.iceberg.DataFiles#copyPartitionData the partition fields of the older PartitionSpec overwrite the columns of the newer PartitionSpec. As a result, the partition of the created DeleteFile is set incorrectly and its partition value becomes null.
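The failure mode described above can be illustrated with a small plain-Java model. This is only a sketch and does not use Iceberg: the class PositionalCopyModel, the method copyByPosition, and the column names dt and region are hypothetical, and the model assumes that the copy reads the first N values of the unified partition struct by position, where N is the number of fields in the file's own spec.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model, NOT the Iceberg implementation (all names are hypothetical).
final class PositionalCopyModel {

    // Assumption: the copy takes the first specFields.size() values of the
    // unified struct by position, regardless of which spec they belong to.
    static Object[] copyByPosition(List<String> specFields, Object[] unifiedPartition) {
        Object[] copied = new Object[specFields.size()];
        for (int i = 0; i < copied.length; i++) {
            copied[i] = unifiedPartition[i];
        }
        return copied;
    }

    public static void main(String[] args) {
        // Unified struct layout: [dt (older spec), region (newer spec)].
        // The file was written with the newer spec, so dt is null.
        Object[] unified = {null, "eu"};
        List<String> newerSpec = List.of("region");

        // Position 0 holds the older spec's column, so "eu" is lost.
        System.out.println(Arrays.toString(copyByPosition(newerSpec, unified))); // prints [null]
    }
}
```

In this model the file's real partition value sits at position 1, but the copy only looks at position 0, which is how a valid partition ends up as null.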

Closes #4044.

Brief change log

  • To avoid this issue, we no longer use TableEntriesScan to retrieve the full list of delete files. Instead, we iterate over the manifest files in the delete manifest list to obtain all DeleteFile objects. The DeleteFile objects retrieved this way have the correct partition values.

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before making a pull request

Documentation

  • Does this pull request introduce a new feature? (yes / no)
    no
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
    not documented

@juiceyang (Author) commented Feb 10, 2026

I ran the Core/hadoop2 CI with Maven workflow locally, and it passes when I execute act -W .github/workflows/core-hadoop2-ci.yml, even though the same job has failed previously.

I also noticed that others have reported similar failures:

  • issue-3985
  • issue-3558

Based on this, I don’t think the failure is related to the current PR.

@j1wonpark (Contributor)

Hi @juiceyang, thanks for working on this issue!

I've been looking into the same problem. I think the root cause is actually in TableEntriesScan itself — the entries metadata table returns a unified partition struct containing fields from all PartitionSpecs, but buildDataFile() and buildDeleteFile() pass this directly to withPartition() without projecting it to the spec-specific partition type. Your current approach works around this in getDanglingDeleteFiles() by reading manifests directly, but other callers of TableEntriesScan would still hit the same issue. I'd like to submit a fix at the TableEntriesScan level to address the root cause. Would that be okay with you, or would you prefer to update this PR to fix it there instead?
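The projection step mentioned above could look roughly like the following plain-Java sketch. It does not use Iceberg or Amoro: PartitionProjectionModel, projectToSpec, and the column names dt and region are hypothetical, and fields are matched by name here for brevity, whereas a real fix would more likely match partition fields by their field IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of projecting a unified partition struct onto one spec.
// NOT Iceberg or Amoro code; all names are hypothetical.
final class PartitionProjectionModel {

    // For each field of the file's own spec, look up its position in the
    // unified struct and copy that value, instead of copying by position.
    static List<Object> projectToSpec(
            List<String> unifiedFields, List<Object> unifiedValues, List<String> specFields) {
        List<Object> projected = new ArrayList<>();
        for (String field : specFields) {
            int pos = unifiedFields.indexOf(field);
            if (pos < 0) {
                throw new IllegalArgumentException("Field not in unified struct: " + field);
            }
            projected.add(unifiedValues.get(pos));
        }
        return projected;
    }

    public static void main(String[] args) {
        List<String> unifiedFields = List.of("dt", "region");
        // Arrays.asList is used because List.of rejects null elements.
        List<Object> unifiedValues = Arrays.asList(null, "eu");

        // The file was written with the newer spec, which only has "region".
        System.out.println(projectToSpec(unifiedFields, unifiedValues, List.of("region"))); // prints [eu]
    }
}
```

With a projection like this in place before the partition is handed to the file builder, every caller that goes through the scan would get the spec-specific value rather than a positionally misaligned one.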

@juiceyang (Author)

Hi @j1wonpark!

I agree that the root cause is in TableEntriesScan. I’ve only changed one spot for now, mainly to keep the change as small as possible and minimize impact. If you can fix it at the TableEntriesScan level, that would be a better and more thorough solution.

Please go ahead and submit a separate PR following your approach. Once your PR fixes the issue, I’ll close mine. This problem has been bothering us for a while—thanks a lot for your work and contribution!

Development

Successfully merging this pull request may close these issues.

[Bug]: For Iceberg tables whose PartitionSpec has been changed, Amoro will throw an error when executing the clean-dangling-delete-files operation.