feat: Maven repo allow-list with metadata-driven enforcement#1696
feat: Maven repo allow-list with metadata-driven enforcement#1696csasalu wants to merge 2 commits intoconforma:mainfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests.
🚀 New features to boost your workflow:
|
fc42082 to
f8e5e05
Compare
4d11bcd to
8aae2db
Compare
8aae2db to
c939895
Compare
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds Maven SBOM extraction utilities, a release policy enforcing allowed Maven repository URLs (with defaulting to Maven Central), tests for extraction and policy behavior, Antora documentation and nav entries, and a new empty Changes
Sequence Diagram(s)sequenceDiagram
participant SBOM as SBOM (CycloneDX / SPDX)
participant Lib as lib.sbom.maven
participant Policy as release.maven_repos
participant RuleData as rule_data.allowed_maven_repositories
participant Results as Policy Engine / Results
SBOM->>Lib: provide SBOM packages
Lib->>Policy: emit package objects {purl, name, repository_url}
Policy->>Policy: for each package -> _get_effective_url(repository_url) (defaults to Maven Central if empty)
Policy->>RuleData: query allowed_maven_repositories
RuleData-->>Policy: allowed list or missing/invalid
Policy->>Policy: _url_is_permitted(effective_url)
alt permitted
Policy->>Results: no deny for package
else not permitted
Policy->>Results: emit deny_unpermitted_urls result (includes term=purl)
else missing/invalid rule data
Policy->>Results: emit policy_data_missing result
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
policy/release/maven_repos/maven_repos.rego (1)
8-16: Package-level metadata appears misplaced or duplicated.The package-level metadata (lines 1-17) includes rule-specific fields like
short_name,failure_msg, andsolution, but this metadata won't be used by either deny rule since they each have their own# METADATAblocks withscope: rule. This package-level metadata withshort_name: deny_unpermitted_urls_1appears to be dead configuration.If this metadata is intended for the package description only, consider removing the rule-specific fields (
short_name,failure_msg,solution,effective_on) from the package-level metadata.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@policy/release/maven_repos/maven_repos.rego` around lines 8 - 16, The package-level metadata block in maven_repos.rego contains rule-specific fields that are duplicated by the per-rule `# METADATA` blocks (remove the dead config): delete the fields `short_name`, `failure_msg`, `solution`, and `effective_on` from the package-level comment header (you can leave package description/collections if needed) so only the individual deny rules' `# METADATA` sections define those rule-specific entries; alternatively move any intended package-wide description into a minimal package comment and keep rule metadata inside the `# METADATA` blocks for the deny rules.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@policy/release/collection/redhat_maven/redhat_rpms.rego`:
- Around line 1-8: The file name redhat_rpms.rego is inconsistent with the
declared package and metadata (package collection.redhat_maven,
title/description referencing Red Hat Maven); rename the file to
redhat_maven.rego (or alternatively change the package/title to match RPM
semantics) so filename, package name (collection.redhat_maven) and metadata
(title/description) are consistent; locate the file currently named
redhat_rpms.rego and perform the rename/update accordingly.
In `@policy/release/maven_repos/maven_repos_test.rego`:
- Around line 67-76: The test_spdx_default_fail currently only evaluates deny
and won't fail because it doesn't assert the expected outcome and the mocked
SPDX package puts the purl inside externalRefs so maven.packages'
startswith(item.purl, "pkg:maven/") check never triggers; update the test to (a)
include a top-level "purl" field on the package (e.g., "purl":
"pkg:maven/org.base/no-url@1.0") or adjust the mock so item.purl is present, and
(b) add an assertion that deny is non-empty or equals the expected set (use
lib.assert_equal or assert count(deny) > 0) so test_spdx_default_fail fails when
deny is produced by the maven.packages rule.
- Around line 36-45: In the test_default_maven_central_pass test, the mock
binding uses the whole cdx_input object instead of its components array; update
the binding so data.lib.cyclonedx.packages is bound to cdx_input.components
(i.e., replace the current with data.lib.cyclonedx.packages as cdx_input with
with data.lib.cyclonedx.packages as cdx_input.components) so the rule under test
receives the components array; reference test_default_maven_central_pass and the
cdx_input variable when making the change.
- Around line 11-20: The test_cyclonedx_permitted mock binds cdx_input as an
object with a components field but then injects the whole object into
data.lib.cyclonedx.packages; change the binding so the array is passed: use
cdx_input.components when applying the with-modifier to
data.lib.cyclonedx.packages (leave the cdx_input variable and structure as-is,
just pass its .components array), ensuring the rule under test sees the expected
array rather than the wrapper object.
- Around line 22-34: The test defines spdx_input as an object with a "packages"
field but the mock binding expects the packages array itself; update the
test_spdx_permitted so the mock binds correctly by either (A) setting spdx_input
to the packages array directly (i.e., spdx_input := [{ ... }] ) or (B) change
the with binding to use spdx_input.packages (with data.lib.spdx.packages as
spdx_input.packages) so the rule_data mock uses the packages array; adjust the
symbol references test_spdx_permitted, spdx_input, deny, and
data.lib.spdx.packages accordingly.
---
Nitpick comments:
In `@policy/release/maven_repos/maven_repos.rego`:
- Around line 8-16: The package-level metadata block in maven_repos.rego
contains rule-specific fields that are duplicated by the per-rule `# METADATA`
blocks (remove the dead config): delete the fields `short_name`, `failure_msg`,
`solution`, and `effective_on` from the package-level comment header (you can
leave package description/collections if needed) so only the individual deny
rules' `# METADATA` sections define those rule-specific entries; alternatively
move any intended package-wide description into a minimal package comment and
keep rule metadata inside the `# METADATA` blocks for the deny rules.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: c034ec79-4ec5-4f7e-a4b4-a0bfc9fead3a
📒 Files selected for processing (8)
antora/docs/modules/ROOT/pages/packages/release_maven_repos.adocantora/docs/modules/ROOT/pages/release_policy.adocantora/docs/modules/ROOT/partials/release_policy_nav.adocpolicy/lib/sbom/maven.regopolicy/lib/sbom/maven_test.regopolicy/release/collection/redhat_maven/redhat_rpms.regopolicy/release/maven_repos/maven_repos.regopolicy/release/maven_repos/maven_repos_test.rego
c939895 to
ba0a7b8
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (3)
policy/release/maven_repos/maven_repos_test.rego (3)
67-76:⚠️ Potential issue | 🟠 MajorThis SPDX failure case doesn't hit the Maven extractor.
Lines 68-72 omit the top-level
purlthatpolicy/lib/sbom/maven.regofilters on, so the mocked package is ignored. Also compare the expected deny set so this test actually proves the default-to-Maven-Central path.💡 Suggested fix
test_spdx_default_fail if { spdx_input := {"packages": [{ "name": "no-url", + "purl": "pkg:maven/org.base/no-url@1.0", "externalRefs": [{"referenceType": "purl", "referenceLocator": "pkg:maven/org.base/no-url@1.0"}], "downloadLocation": "NOASSERTION", }]} - deny with data.lib.spdx.packages as spdx_input.packages - with data.rule_data as {"allowed_maven_repositories": ["https://internal.repo/"]} + expected := {{ + "code": "release.maven_repos.deny_unpermitted_urls", + "msg": "Package \"pkg:maven/org.base/no-url@1.0\" (source: \"https://repo.maven.apache.org/maven2/\") is not in the permitted list", + "effective_on": "2026-05-10T00:00:00Z", + "term": "pkg:maven/org.base/no-url@1.0", + }} + + lib.assert_equal(deny, expected) with data.lib.spdx.packages as spdx_input.packages + with data.rule_data as {"allowed_maven_repositories": ["https://internal.repo/"]} }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@policy/release/maven_repos/maven_repos_test.rego` around lines 67 - 76, The test_spdx_default_fail case currently constructs a package without the top-level "purl" field so the matcher in policy/lib/sbom/maven.rego ignores it; update the mocked spdx_input in test_spdx_default_fail to include the top-level "purl" (matching the purl format used by data.lib.spdx.packages) and adjust the assertion for the deny set to check that the deny result includes the Maven Central default repository path (so the test verifies the default-to-Maven-Central behavior when using data.lib.spdx.packages with the rule_data allowed_maven_repositories override).
18-19:⚠️ Potential issue | 🟠 MajorBind the CycloneDX tests to
cdx_input.components.
policy/lib/sbom/maven.regoiteratesdata.lib.cyclonedx.packagesas the package array. Lines 19 and 44 pass the wrapper object instead, so both tests can miss the extraction path they are meant to cover.💡 Suggested fix
lib.assert_empty(deny) with data.rule_data as mock_data - with data.lib.cyclonedx.packages as cdx_input + with data.lib.cyclonedx.packages as cdx_input.components @@ lib.assert_empty(deny) with data.rule_data as mock_data - with data.lib.cyclonedx.packages as cdx_input + with data.lib.cyclonedx.packages as cdx_input.componentsAlso applies to: 43-44
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@policy/release/maven_repos/maven_repos_test.rego` around lines 18 - 19, The tests bind the CycloneDX fixture to the wrapper object instead of the actual components array, so update the test `with` bindings to pass the components array into the policy path used by `policy/lib/sbom/maven.rego`. Specifically, change occurrences that read `with data.lib.cyclonedx.packages as cdx_input` to `with data.lib.cyclonedx.packages.components as cdx_input` for the `lib.assert_empty(deny) with data.rule_data as mock_data with ... as cdx_input` assertions (apply the same change to the second occurrence around the later `lib.assert_empty(deny)`), leaving `lib.assert_empty`, `deny`, and `data.rule_data` as-is.
32-33:⚠️ Potential issue | 🟠 MajorBind the SPDX happy-path test to
spdx_input.packages.
policy/lib/sbom/maven.regoiteratesdata.lib.spdx.packagesas the package array. Line 33 passes the wrapper object, so this test does not exercise the intended branch.💡 Suggested fix
lib.assert_empty(deny) with data.rule_data as mock_data - with data.lib.spdx.packages as spdx_input + with data.lib.spdx.packages as spdx_input.packages🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@policy/release/maven_repos/maven_repos_test.rego` around lines 32 - 33, The test currently binds data.lib.spdx.packages to spdx_input (the whole wrapper) so the policy branch that iterates data.lib.spdx.packages isn't exercised; change the binding to use the inner array by replacing the second "with" to "with data.lib.spdx.packages as spdx_input.packages" so lib.assert_empty(deny) is run with data.lib.spdx.packages mapped to the mock's packages field (refer to lib.assert_empty(deny), data.lib.spdx.packages, and spdx_input.packages).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@policy/lib/sbom/maven.rego`:
- Around line 33-39: The helper _extract_cdx_repo(component) currently uses
"some ref" which can yield multiple urls; change it to select deterministically
(e.g., iterate by index: "some i" and set ref := component.externalRefs[i]" and
pick the smallest i that satisfies ref.type in
["distribution","artifact-repository"] or implement a precedence list and pick
the first match) so the function always returns a single URL for a component;
update the analogous occurrence at lines 52–55 the same way, and add a
regression test that constructs a component with multiple matching externalRefs
to assert only one package is produced with a stable repository_url.
In `@policy/release/maven_repos/maven_repos.rego`:
- Around line 62-73: The _rule_data_errors helper is checking data.rule_data
directly which misses sources that lib.rule_data("allowed_maven_repositories")
handles (config, custom data, defaults); update _rule_data_errors to call
lib.rule_data("allowed_maven_repositories") to retrieve permitted instead of
reading data.rule_data, then pass that result into _is_invalid_data (and keep
_is_invalid_data handling the "UNDEFINED_IN_DATA" sentinel), so the same
rule-data resolution used by _url_is_permitted is used when validating presence.
- Around line 44-56: The rule currently keys violations by
`_repo_url_errors[purl]` which causes unification conflicts when the same `purl`
has multiple different `source` values; change `_repo_url_errors` to emit a set
of error objects (e.g. { "purl": purl, "source": source, "msg": msg }) or use a
composite key like `_repo_url_errors[[purl, source]] := msg` instead of
`_repo_url_errors[purl]`, and update the `deny` construction (the `deny` rule
that uses `_repo_url_errors` and the `result`/`base` creation in `deny contains
result`) to iterate those objects/keys and include both `purl` and `source` (use
`_get_effective_url(pkg.repository_url)`, `_url_is_permitted`, and
`maven.packages` to locate the generation site) so each distinct (purl, source)
violation is reported without unification errors.
---
Duplicate comments:
In `@policy/release/maven_repos/maven_repos_test.rego`:
- Around line 67-76: The test_spdx_default_fail case currently constructs a
package without the top-level "purl" field so the matcher in
policy/lib/sbom/maven.rego ignores it; update the mocked spdx_input in
test_spdx_default_fail to include the top-level "purl" (matching the purl format
used by data.lib.spdx.packages) and adjust the assertion for the deny set to
check that the deny result includes the Maven Central default repository path
(so the test verifies the default-to-Maven-Central behavior when using
data.lib.spdx.packages with the rule_data allowed_maven_repositories override).
- Around line 18-19: The tests bind the CycloneDX fixture to the wrapper object
instead of the actual components array, so update the test `with` bindings to
pass the components array into the policy path used by
`policy/lib/sbom/maven.rego`. Specifically, change occurrences that read `with
data.lib.cyclonedx.packages as cdx_input` to `with
data.lib.cyclonedx.packages.components as cdx_input` for the
`lib.assert_empty(deny) with data.rule_data as mock_data with ... as cdx_input`
assertions (apply the same change to the second occurrence around the later
`lib.assert_empty(deny)`), leaving `lib.assert_empty`, `deny`, and
`data.rule_data` as-is.
- Around line 32-33: The test currently binds data.lib.spdx.packages to
spdx_input (the whole wrapper) so the policy branch that iterates
data.lib.spdx.packages isn't exercised; change the binding to use the inner
array by replacing the second "with" to "with data.lib.spdx.packages as
spdx_input.packages" so lib.assert_empty(deny) is run with
data.lib.spdx.packages mapped to the mock's packages field (refer to
lib.assert_empty(deny), data.lib.spdx.packages, and spdx_input.packages).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a15faa09-66d3-44be-8627-fd4e31d6d519
📒 Files selected for processing (8)
antora/docs/modules/ROOT/pages/packages/release_maven_repos.adocantora/docs/modules/ROOT/pages/release_policy.adocantora/docs/modules/ROOT/partials/release_policy_nav.adocpolicy/lib/sbom/maven.regopolicy/lib/sbom/maven_test.regopolicy/release/collection/redhat_maven/redhat_maven.regopolicy/release/maven_repos/maven_repos.regopolicy/release/maven_repos/maven_repos_test.rego
🚧 Files skipped from review as they are similar to previous changes (2)
- antora/docs/modules/ROOT/pages/release_policy.adoc
- antora/docs/modules/ROOT/pages/packages/release_maven_repos.adoc
c98ca2d to
3e83fd6
Compare
3e83fd6 to
fa5baca
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
policy/lib/sbom/maven_test.rego (1)
66-74: Consider adding a test for multiple matching externalRefs.The test coverage is solid for the common cases. Given the non-deterministic behavior noted in
_extract_cdx_repo/_extract_spdx_repowhen multiple refs match, consider adding a regression test that asserts expected behavior when a component has multipleexternalRefswith matching types.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@policy/lib/sbom/maven_test.rego` around lines 66 - 74, Add a regression test to cover the case where a CycloneDX/SPDX component has multiple matching externalRefs so we can lock in the current non-deterministic behavior of _extract_cdx_repo/_extract_spdx_repo; update or add a test (e.g., alongside test_combined_sources) that constructs mock_cdx and/or mock_spdx entries where a single component contains multiple externalRefs of the same matching type and then assert the output of maven.packages (via the same with-data pattern) equals the expected deterministic choice (or explicitly documents which ref is selected) to prevent future regressions in _extract_cdx_repo/_extract_spdx_repo behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@policy/lib/sbom/maven_test.rego`:
- Around line 66-74: Add a regression test to cover the case where a
CycloneDX/SPDX component has multiple matching externalRefs so we can lock in
the current non-deterministic behavior of _extract_cdx_repo/_extract_spdx_repo;
update or add a test (e.g., alongside test_combined_sources) that constructs
mock_cdx and/or mock_spdx entries where a single component contains multiple
externalRefs of the same matching type and then assert the output of
maven.packages (via the same with-data pattern) equals the expected
deterministic choice (or explicitly documents which ref is selected) to prevent
future regressions in _extract_cdx_repo/_extract_spdx_repo behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d2eb81fe-528e-4f7e-bdb4-de5c852a9e87
📒 Files selected for processing (8)
antora/docs/modules/ROOT/pages/packages/release_maven_repos.adocantora/docs/modules/ROOT/pages/release_policy.adocantora/docs/modules/ROOT/partials/release_policy_nav.adocpolicy/lib/sbom/maven.regopolicy/lib/sbom/maven_test.regopolicy/release/collection/redhat_maven/redhat_maven.regopolicy/release/maven_repos/maven_repos.regopolicy/release/maven_repos/maven_repos_test.rego
✅ Files skipped from review due to trivial changes (3)
- antora/docs/modules/ROOT/partials/release_policy_nav.adoc
- policy/release/collection/redhat_maven/redhat_maven.rego
- antora/docs/modules/ROOT/pages/packages/release_maven_repos.adoc
🚧 Files skipped from review as they are similar to previous changes (2)
- antora/docs/modules/ROOT/pages/release_policy.adoc
- policy/release/maven_repos/maven_repos_test.rego
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
policy/release/maven_repos/maven_repos_test.rego (1)
43-44:⚠️ Potential issue | 🟠 MajorFix CycloneDX mock binding to avoid vacuous pass
cdx_inputis a wrapper object, but the override targetsdata.lib.cyclonedx.packages(array). Passing the wrapper can make this test pass without evaluating any package.Proposed fix
- lib.assert_empty(deny) with data.rule_data as mock_data - with data.lib.cyclonedx.packages as cdx_input + lib.assert_empty(deny) with data.rule_data as mock_data + with data.lib.cyclonedx.packages as cdx_input.componentsAs per coding guidelines, "-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@policy/release/maven_repos/maven_repos_test.rego` around lines 43 - 44, The test overrides the array at data.lib.cyclonedx.packages with the wrapper object cdx_input, which is a mismatch and can make lib.assert_empty(deny) vacuously pass; change the mock binding to override the wrapper itself so the policy sees the correct structure — replace the second `with` in the failing test so it binds `data.lib.cyclonedx` to `cdx_input` (or alternatively bind `data.lib.cyclonedx.packages` to `cdx_input.packages`) so the `deny` evaluation uses the actual packages array.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@policy/release/maven_repos/maven_repos_test.rego`:
- Around line 112-132: The test test_cyclonedx_multiple_refs_behavior currently
only checks that a package's repository_url is one of the expected URLs; change
it to assert the exact set and count so regressions to a single URL fail: after
computing pkg_list := data.lib.sbom.maven.packages with data.lib.cyclonedx as
mock_cdx, find the specific package (e.g., by name or purl) and assert that the
set of repository_url values equals
{"https://first.repo.com","https://second.repo.com"} and that the
length/cardinality of that set is 2; update any similar test (the one around
lines 134-154) the same way so both the URL set and exact cardinality are
enforced.
---
Duplicate comments:
In `@policy/release/maven_repos/maven_repos_test.rego`:
- Around line 43-44: The test overrides the array at data.lib.cyclonedx.packages
with the wrapper object cdx_input, which is a mismatch and can make
lib.assert_empty(deny) vacuously pass; change the mock binding to override the
wrapper itself so the policy sees the correct structure — replace the second
`with` in the failing test so it binds `data.lib.cyclonedx` to `cdx_input` (or
alternatively bind `data.lib.cyclonedx.packages` to `cdx_input.packages`) so the
`deny` evaluation uses the actual packages array.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e2970946-3644-4f74-b76c-109af8356348
📒 Files selected for processing (3)
policy/lib/sbom/maven.regopolicy/lib/sbom/maven_test.regopolicy/release/maven_repos/maven_repos_test.rego
1d0d0dd to
207242c
Compare
83498db to
a3bc3a9
Compare
Implemented strict repository URL validation with time-based enforcement handled via policy metadata directives. - Restricted Maven builds to approved repositories only. - Added 'effective_from' metadata annotation to allow for a grace period before hard enforcement.All violations are flagged as warnings until the 'effective_from' date. - Support for SPDX and CycloneDX.
2ab23fd to
3cd2b68
Compare
Update the Maven SBOM library to process all repository URLs found in a component's external references instead of only the first match. This aligns with SPDX and CycloneDX schemas which allow 0..N external references per package.
3cd2b68 to
6aa4490
Compare
|
Hi @csasalu, do you have a Jira related to this PR? If so, can you link it? |
Hi @st3penta, Finished fixing all the code review items today. This is ready for review. |
Implemented strict repository URL validation with time-based enforcement handled via policy metadata directives.