Skip to content

Harden module install and pipeline test coverage#327

Merged
PrzemyslawKlys merged 2 commits intomainfrom
feature/module-pipeline-hardening-20260423
Apr 26, 2026
Merged

Harden module install and pipeline test coverage#327
PrzemyslawKlys merged 2 commits intomainfrom
feature/module-pipeline-hardening-20260423

Conversation

@PrzemyslawKlys
Copy link
Copy Markdown
Member

Summary

  • quarantine legacy flat module installs when ModuleVersion cannot be read before conversion
  • validate store submission authority hosts earlier during authentication checks
  • align module pipeline and build-host tests with current dependency and signing behavior
  • harden CLI/release tests against hangs and transient file cleanup races

Validation

  • dotnet test .\PowerForge.Tests\PowerForge.Tests.csproj -c Release --no-restore --filter "FullyQualifiedNameGitHubRunnerHousekeepingWorkflowTests|FullyQualifiedNameModuleBuildHostServiceTests|FullyQualifiedNameModulePipelineApprovedModulesTests|FullyQualifiedNameModulePipelineManifestRefreshTests"
  • dotnet test .\PowerForge.Tests\PowerForge.Tests.csproj -c Release --no-restore --filter "FullyQualifiedNamePowerForgeCliProjectReleaseTests|FullyQualifiedNamePowerForgeReleaseServiceTests"

@claude
Copy link
Copy Markdown

claude Bot commented Apr 23, 2026

Code Review – PR #327: Harden module install and pipeline test coverage

Overview

This PR addresses four distinct concerns:

  1. ModuleInstaller.cs — quarantine legacy flat modules when ModuleVersion cannot be read via text parsing, before attempting full metadata deserialization.
  2. StoreSubmissionService.cs — validate the authority host earlier in ValidateAuthentication.
  3. Test alignment — update assertions in several tests to match changed implementation behavior (signing flags, inbox-module filtering, Az.Accounts reclassification).
  4. Test robustness — fix a deadlock-prone CLI test and add retry logic for file-cleanup races.

The intent is clear and the changes are generally well-scoped. A few things are worth addressing before merge.


Issues & Suggestions

1. Bare catch in LegacyFlatManifestHasModuleVersion swallows fatal exceptions

// PowerForge/Services/ModuleInstaller.cs
catch
{
    return false;
}

A naked catch catches everything, including OutOfMemoryException, StackOverflowException, and ThreadAbortException. The intent is clearly to handle I/O failures gracefully. Prefer specific exceptions:

catch (IOException)
{
    return false;
}
catch (UnauthorizedAccessException)
{
    return false;
}

Or at minimum catch (Exception ex) when (ex is not OutOfMemoryException).

Also: when the file read fails for a non-obvious reason (encoding issue, permissions race), there is no log line. The caller will then quarantine the module with a Warn log, but the underlying cause is silently dropped. Consider logging at Debug or Warn level before returning false.

2. Double file read in HandleLegacyFlatInstall

LegacyFlatManifestHasModuleVersion reads the entire manifest file via File.ReadAllText, and then TryReadManifestMetadata reads it again. For the common success path, the file is read twice. A cleaner approach would be to either:

  • Eliminate the pre-check and let TryReadManifestMetadata return null (which the existing guard already handles), or
  • Pass the already-read content into TryReadManifestMetadata to avoid the double read.

This is a minor efficiency concern, but also a readability one — two consecutive reads of the same file with two separate null-checks makes the logic harder to follow.

3. _ = NormalizeAuthorityHost(...) — side-effect validation pattern

// PowerForge/Services/StoreSubmissionService.cs
_ = NormalizeAuthorityHost(authentication.AuthorityHost);

Using a function for its throwing side-effect and discarding the return value is a code smell. The reader has to look up NormalizeAuthorityHost to understand that its purpose here is validation. Options:

  • Introduce a dedicated ValidateAuthorityHost method with a void signature that makes the intent clear.
  • Actually use the normalized value downstream (if it's needed anyway, this is the right call).
  • At minimum, add an inline comment: // throws ArgumentException if host is not an allowed authority.

4. Weakened assertion in GitHubRunnerHousekeepingWorkflowTests.cs

// Before (precise):
Assert.Contains("fromJson(inputs['runner-labels'])", workflowYaml, StringComparison.Ordinal);

// After (relaxed):
Assert.Contains("fromJson(", workflowYaml, StringComparison.Ordinal);
Assert.Contains("inputs['runner-labels']", workflowYaml, StringComparison.Ordinal);

The split allows the test to pass even if fromJson and inputs['runner-labels'] appear in completely unrelated parts of the workflow YAML. If the underlying YAML was intentionally restructured, consider using a StringAssert-style substring check that captures the meaningful relationship, or add a comment explaining why the looser check is acceptable.

5. No new test for the quarantine-on-unreadable-manifest path

HandleLegacyFlatInstall now has a new code path: quarantine to "unknown" when ModuleVersion cannot be parsed from the manifest text. This is a non-trivial behavior (the directory is moved), but there is no corresponding test added to cover this scenario (e.g., a manifest file with no ModuleVersion key, or an unreadable file). Given the PR's stated goal of hardening, a test for this failure path would significantly improve confidence.


Positives

  • PowerForgeCliProjectReleaseTests.cs async/timeout fix is excellent. Reading stdout/stderr synchronously before WaitForExit is a well-known deadlock pattern; the async approach with a 120s timeout and tree-kill is the correct fix.
  • TryDelete retry logic in PowerForgeReleaseServiceTests.cs is clean and well-bounded (10 × 50ms, last exception propagates correctly with the when (attempt < 9) filter).
  • ModulePipelineApprovedModulesTests.cs assertion flips clearly reflect a deliberate behavioral change (inbox modules filtered from RequiredModules), and the PR description explains this is an alignment, not a regression.
  • ModuleBuildHostServiceTests.cs rename (UsesSharedSigningOverrideWrapperForwardsSigningFlags) better describes what the test actually verifies.

Summary

Severity Item
Medium Bare catch swallows fatal exceptions in LegacyFlatManifestHasModuleVersion
Low Double manifest file read in HandleLegacyFlatInstall
Low _ = NormalizeAuthorityHost(...) obscures intent
Low Weakened assertion in GitHubRunnerHousekeepingWorkflowTests
Suggestion Add test covering the new quarantine-on-unreadable-manifest path

The test robustness improvements (async CLI test, retry cleanup) are solid and should land. The production change in ModuleInstaller.cs needs the catch narrowed before it's safe to ship.

Review generated with Claude Code

@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Follow-up Code Review – PR #327 (current diff)

A prior Claude review flagged several issues. The current diff addresses most of them — this comment tracks what's resolved and what remains.


Resolved from prior review ✅

Issue Resolution
Bare catch swallowed fatal exceptions TryReadLegacyFlatManifestVersion now catches IOException, UnauthorizedAccessException, and DecoderFallbackException specifically, each with Verbose logging
Double manifest file read TryReadManifestMetadata is gone; the new method does a single File.ReadAllText
_ = NormalizeAuthorityHost(...) obscures intent Dedicated ValidateAuthorityHost(string) void method introduced — intent is now explicit
Bare catch in ValidatePathSegment call site Narrowed to catch (ArgumentException)

Remaining / new observations

1. ValidateAuthorityHost relies on NormalizeAuthorityHost throwing — document the contract (Low)

private static void ValidateAuthorityHost(string authorityHost)
{
    NormalizeAuthorityHost(authorityHost);   // return value discarded
}

The validation works because NormalizeAuthorityHost internally calls new Uri(...) which throws UriFormatException on a malformed host. This is non-obvious to a future reader. A one-line comment makes the intent clear:

// Throws UriFormatException if authorityHost is not a valid absolute URI.
NormalizeAuthorityHost(authorityHost);

2. Quarantine test covers missing version, not unreadable file (Suggestion)

InstallFromStaging_ConvertQuarantinesFlatItems_WhenModuleVersionMissing is now more precise (checks the unknown_ prefix, .psd1, and Public dir — good). But it still exercises only the content path (manifest exists, version key absent). The IOException / UnauthorizedAccessException / DecoderFallbackException branches in TryReadLegacyFlatManifestVersion have no test coverage. A separate test that passes an unreadable or binary file would close this gap.

3. runnerLabelFallbackExpression is still a partial YAML match (Low)

const string runnerLabelFallbackExpression =
    "fromJson(inputs.runner_labels_json != '' && inputs.runner_labels_json || " +
    "(inputs['runner-labels'] != '' && inputs['runner-labels'] ||";

This is meaningfully tighter than before (captures the entire beginning of the ternary expression), and it's fine for CI purposes. Just note that the closing ) and the fallback value are not asserted. If the expression is ever accidentally truncated in the YAML, this test would still pass. Acceptable tradeoff given the length, but worth noting.

4. legacyVersion!.Trim() — redundant null-forgiving operator (Nit)

legacyVersion = legacyVersion!.Trim();

TryReadLegacyFlatManifestVersion already guarantees version is non-null when it returns true (the !string.IsNullOrWhiteSpace(version) check). The ! operator silences the compiler warning but adds noise. Annotating the out string? version parameter as out string version (non-nullable) when returning true — via the [NotNullWhen(true)] attribute — would be the idiomatic fix:

private bool TryReadLegacyFlatManifestVersion(string manifestPath, [NotNullWhen(true)] out string? version)

Then the caller can use legacyVersion.Trim() without the suppression.


Summary

The substantive issues from the prior review are all resolved. What remains are two low-severity style points (document the throwing contract, remove the null-forgiving operator), one suggestion (test the I/O-failure branch), and one acknowledged-but-acceptable test gap (partial YAML match). The async/timeout fix and the retry-delete logic are solid. Ready to merge once the optional items are considered.

Review generated with Claude Code

@PrzemyslawKlys PrzemyslawKlys force-pushed the feature/module-pipeline-hardening-20260423 branch from af9251c to 4ec9a72 Compare April 26, 2026 18:51
@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Code Review — PR #327: Harden module install and pipeline test coverage

Overview

This PR delivers four targeted improvements: (1) quarantine legacy flat module installs when the manifest version cannot be read, (2) validate the authority host earlier during authentication, (3) fix a deadlock-prone sync stdout/stderr read in the CLI test, and (4) add a retry loop to TryDelete to handle file-cleanup races on Windows. The changes are well-scoped and readable.


✅ Strengths

  • Deadlock fix in PowerForgeCliProjectReleaseTests — The old sync ReadToEnd() + WaitForExit() pattern can deadlock when process output fills the OS pipe buffer. The new async approach with ReadToEndAsync tasks started before the wait is the correct pattern. Adding a 120-second timeout and Kill(entireProcessTree: true) is also good CI hygiene.

  • Exception specificity in ModuleInstaller.cs — Narrowing the broad catch (Exception) to IOException, UnauthorizedAccessException, and DecoderFallbackException, and downgrading the log from Warn to Verbose, is appropriate. IO failures are operational noise, not actionable warnings.

  • [NotNullWhen(true)] annotation — Correct use of the nullability contract for the Try-pattern.

  • ValidateAuthorityHost called earlier — Failing fast before credential resolution is a cleaner validation order and avoids surfacing an auth failure that is actually a config error.

  • TryDelete retry loop — The 10 × 50 ms retry is reasonable for Windows file handle races. Catching only IOException/UnauthorizedAccessException (not Exception) is appropriately narrow.


⚠️ Issues and Suggestions

1. ValidateAuthorityHost is a pure side-effect wrapper (minor smell)

private static void ValidateAuthorityHost(string authorityHost)
{
    // NormalizeAuthorityHost performs the HTTPS URI validation and throws on invalid input.
    NormalizeAuthorityHost(authorityHost);
}

This method exists solely to discard a return value for a side effect. The comment explaining why the call works is a sign the abstraction is leaky. Consider either inlining the call at the call site or extracting a dedicated AssertIsValidHttpsUri(string, string) helper that communicates intent directly. As-is, a future maintainer who looks at NormalizeAuthorityHost won't see any caller that actually needs the validation side-effect — they'll see it as a pure transform and might reasonably remove the throw paths.

2. DecoderFallbackException catch may be unreachable

var content = File.ReadAllText(manifestPath);

File.ReadAllText uses Encoding.UTF8, which applies a replacement fallback (U+FFFD) for invalid byte sequences — it does not throw DecoderFallbackException by default. The new test (WhenManifestCannotBeDecoded) writes bytes { 0x80, 0x80, 0x80 }, but those will silently become U+FFFD characters and TryGetQuotedStringValue will simply return false without an exception ever being raised. The catch clause is therefore dead code for this code path.

If the goal is truly to detect undecodable content as a distinct case, use an exception-throwing encoding:

var content = File.ReadAllText(manifestPath,
    Encoding.GetEncoding("utf-8",
        EncoderFallback.ExceptionFallback,
        DecoderFallback.ExceptionFallback));

Otherwise, the DecoderFallbackException catch can be removed and the test comment/name should clarify that binary-content manifests are handled by TryGetQuotedStringValue returning false.

3. ModuleBuildHostServiceTests test semantics change

The old test (ExecuteBuildAsync_UsesSharedSigningOverrideWrapper) validated that signing was suppressed by default — i.e., without the caller explicitly opting out. The new test (ExecuteBuildAsync_ForwardsSigningFlags) validates the behavior only when NoSign = true is explicitly set. If the default signing behavior was changed as part of a separate refactor, a regression could be silently undetected. Consider adding or retaining a test that asserts what happens when NoSign is not set, to prevent the default-path from regressing.

4. ModulePipelineApprovedModulesTests assertion reversal

-  Assert.Contains("Microsoft.PowerShell.Utility", requiredNames, StringComparer.OrdinalIgnoreCase);
+  Assert.DoesNotContain("Microsoft.PowerShell.Utility", requiredNames, StringComparer.OrdinalIgnoreCase);

This flip is described as "aligning with current behavior." That's fine if the filtering of inbox modules was an intentional improvement, but the PR does not include the corresponding production code change that drives this. If this change landed in a prior commit, the PR description should reference it. If it hasn't landed yet, merging the test flip first would cause tests to fail on main. Please confirm the production-side change is already merged before landing this.

5. TryDelete — consider using exponential backoff (optional, nit)

The fixed 50 ms sleep means the worst case is 500 ms of stalling. On a CI runner where another process holds a handle, exponential backoff (50, 100, 200, ...) would handle brief transients quickly while still tolerating longer delays. Not a blocker, just a robustness note.


Test Coverage

The new InstallFromStaging_ConvertQuarantinesFlatItems_WhenManifestCannotBeDecoded test is a good addition that covers the binary-manifest case. The refinement of the existing WhenModuleVersionMissing test to assert the specific unknown_ prefix and file presence is also welcome — it makes the test more precise without adding complexity.


Summary

The core changes (deadlock fix, authority host early validation, retry on delete, manifest quarantine hardening) are all solid. The two items that warrant attention before merge are the DecoderFallbackException catch being unreachable (minor, but misleading) and confirming the DoesNotContain assertion flip has a corresponding production-side change already in place. The ValidateAuthorityHost wrapper comment is a low-priority cleanup.

@PrzemyslawKlys PrzemyslawKlys force-pushed the feature/module-pipeline-hardening-20260423 branch 2 times, most recently from 4ec02e6 to 67e23c9 Compare April 26, 2026 19:07
@claude
Copy link
Copy Markdown

claude Bot commented Apr 26, 2026

Code Review

Overview

This PR delivers focused hardening across three areas: (1) defensive handling of unreadable/corrupt legacy flat module manifests, (2) earlier authority-host validation in the store submission auth flow, and (3) test robustness improvements against process hangs and transient file-lock races. The changes are well-scoped and the PR description is accurate.


Strengths

  • ModuleInstaller.cs manifest reading — switching from ModuleManifestMetadataReader to a direct File.ReadAllText with StrictUtf8 (throwOnInvalidBytes: true) is a clean, purposeful improvement. Binary/corrupt .psd1 files will now be caught as DecoderFallbackException and quarantined rather than causing unexpected failures downstream.
  • Specific exception handling — replacing the bare catch in HandleLegacyFlatInstall with catch (ArgumentException) is strictly better; unexpected exceptions now propagate instead of being silently swallowed.
  • Early auth validation — calling ValidateAuthorityHost before ResolveCredential in StoreSubmissionService is a good defense-in-depth move; it fails fast on a misconfigured authority host instead of proceeding into credential resolution or an HTTP request to an invalid host.
  • Process I/O deadlock fix — starting ReadToEndAsync() tasks before WaitForExit in PowerForgeCliProjectReleaseTests is the correct fix for the classic stdout/stderr buffer-fill deadlock.
  • New WhenManifestCannotBeDecoded test — good addition; the binary-bytes fixture (0x80 0x80 0x80) directly exercises the new DecoderFallbackException path in ModuleInstaller.cs.

Issues & Suggestions

1. TryDelete retry loop can throw on the 10th attempt (test cleanup)

File: PowerForgeReleaseServiceTests.cs

The when (attempt < 9) guard means attempt 9 (the last one) re-throws. For test cleanup this is unusual — a failure to delete a temp directory after 10 retries would fail the test rather than just leave behind a stale folder. Compare with the consistent best-effort pattern already used in ModuleInstallerLegacyFlatHandlingTests.cs:

finally
{
    try { temp.Delete(recursive: true); } catch { /* best effort */ }
}

Consider swallowing on the final attempt, since cleanup failure is not a meaningful test outcome here.

2. ReadToEndAsync tasks are awaited without a timeout after WaitForExit

File: PowerForgeCliProjectReleaseTests.cs

After WaitForExit(120_000) returns the stream tasks should drain quickly, but they are awaited without any guard:

var stdout = await stdoutTask;   // could theoretically hang
var stderr = await stderrTask;

The .NET docs note that the integer-overload WaitForExit(int) does not guarantee all redirected stream data has been flushed; only the zero-argument overload does. Calling process.WaitForExit() (no timeout) after the timed wait, or using a CancellationTokenSource with a short deadline, would close this edge case.

3. ValidateAuthorityHost silently accepts an empty authorityHost

File: PowerForge/Services/StoreSubmissionService.cs

var normalized = string.IsNullOrWhiteSpace(authorityHost)
    ? "https://login.microsoftonline.com"
    : authorityHost.Trim();
ValidateHttpsUrl(normalized, nameof(authorityHost));

When authorityHost is empty the default URL is validated instead of the actual value, so the method always returns without error for empty input. This is semantically correct if downstream NormalizeAuthorityHost also substitutes the same default — but the relationship is implicit. A brief comment would help future maintainers understand why a blank value is accepted here.

4. Brittle multi-line constant in GitHubRunnerHousekeepingWorkflowTests.cs

The runnerLabelFallbackExpression constant must exactly match the YAML literal, including whitespace and quote style. Any cosmetic reformatting of the workflow file would fail this test with a confusing message. Consider splitting into two or three smaller Assert.Contains checks for the logical sub-expressions to make the test more resilient and failure messages more diagnostic.

5. Behavior-change assertions deserve a comment

Files: ModulePipelineApprovedModulesTests.cs, ModulePipelineManifestRefreshTests.cs

Several assertions were flipped from Contains to DoesNotContain for Microsoft.PowerShell.Utility and Az.Accounts. This reflects a real policy change (inbox modules are filtered out of RequiredModules). A brief inline comment (e.g., // inbox modules are filtered out) would make the intent clear to reviewers who only read the tests without the PR description.


Minor Notes

  • The using System.Text; addition in ModuleInstaller.cs is correctly placed.
  • Assert.Single instead of Assert.Equal(2, ...) in ModulePipelineManifestRefreshTests.cs is idiomatic xUnit.
  • The positive/negative test pair in ModuleBuildHostServiceTests.cs (ForwardsSigningFlags / DoesNotForwardSigningFlags_WhenUnset) follows good practice.

Summary

The core production changes (manifest decoding hardening, early authority-host validation) are solid and the new tests cover the added paths well. The main actionable items are: the retry-loop swallow-on-last-attempt gap in TryDelete, the missing stream-drain guarantee after WaitForExit, and the brittle YAML constant assertion. Everything else is minor polish.

@PrzemyslawKlys PrzemyslawKlys force-pushed the feature/module-pipeline-hardening-20260423 branch from 67e23c9 to 3e4bdf5 Compare April 26, 2026 19:30
@PrzemyslawKlys PrzemyslawKlys merged commit de5b81e into main Apr 26, 2026
6 checks passed
@PrzemyslawKlys PrzemyslawKlys deleted the feature/module-pipeline-hardening-20260423 branch April 26, 2026 19:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant