
Refactor input sources and main workflow #133

Merged
scottstanie merged 66 commits into isce-framework:main from scottstanie:v0.2-rewrite
Apr 13, 2026

Conversation

@scottstanie
Member

Large overdue update to sweets:

  • Refactors the inputs to allow three interchangeable InSAR input sources (burst-subset S1 + COMPASS, OPERA CSLC, NISAR GSLC) feeding a common downstream workflow
  • Adds S1 burst subsetting (using burst2safe)
  • Leverages dolphin for most of the timeseries processing after geocoded SLC creation
  • Simplifies the CLI and prunes poorly designed/redundant config parameters

More minor fixes:

  • Switch to Pixi for project management for easier multi-environment setup (CPU/GPU)
  • Fix the broken NASA water mask creation, use ASF's pre-made tiles

Issues fixed:

scottstanie and others added 30 commits April 9, 2026 22:54
Adds a tabled (work-in-progress) FastAPI/uvicorn web UI under src/sweets/web/, a `sweets
server` subcommand that launches it, and the matching pyproject [web]
optional-dependency / pixi feature. Excludes web/ from mypy in both
pyproject and the pre-commit hook (sqlmodel's table=True trips mypy
without its plugin).

Front-end is not built yet; this is the bare server-side scaffolding
so it can be revisited later in the v0.2 cycle.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two small in-flight fixes from the working tree:

- _unzip.py: glob for any S1[AB] SAFE/zip, not just IW. Lets users
  process SM/EW data through the same unzip path.
- core.py: log a warning in _create_burst_interferograms when bursts
  do not share a common set of dates, since uneven coverage causes
  geometry/footprint inconsistencies in stitched outputs.

Also adds CLAUDE.md project guidelines (NumPyDoc style, ruff/black/
mypy/pytest, parse-don't-validate, no needless error handling).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Reorder so [tool.pixi.*] is the canonical environment definition; pip
  [project.dependencies] is mirrored from it for non-pixi users.
- Add burst2safe and tyro to both the pip deps and the conda env.
- Add an explicit s1reader pin to scottstanie/s1-reader@develop-scott
  (upstream isce-framework/s1-reader has a numpy 2 polyfit regression,
  see isce-framework#132).
- Bump pixi python pin to >=3.11 to match [project] requires-python.
- Switch the pixi environments to the named-feature form so they all
  share one solve group.
- Bump the project classifier from Pre-Alpha to Alpha.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the hand-rolled interferogram / stitch / unwrap pipeline with
a thin orchestration around two libraries:

- src/sweets/download.py — new BurstSearch model that wraps
  burst2safe.burst2stack. Takes a small bbox (or WKT polygon), date
  range and track, and returns just the SAFEs whose bursts intersect
  the AOI. Replaces ASFQuery/wget/aria2c.

- src/sweets/_dolphin.py — new DolphinOptions model + run_displacement
  shim that builds a dolphin.workflows.config.DisplacementWorkflow
  from sweets-friendly knobs (half-window, ministack-size, strides,
  unwrap method, etc.) and calls dolphin.workflows.displacement.run.
  dolphin owns phase linking, network selection, stitching, unwrapping,
  timeseries inversion and velocity from here on.

- src/sweets/core.py — slimmed Workflow now does:
  1. download bursts (burst2safe)
  2. fetch DEM, water mask, burst-db, orbits (parallel)
  3. geocode each burst with COMPASS (unchanged)
  4. stitch geometry (unchanged)
  5. run dolphin displacement
  starting_step now means {1: download, 2: geocode, 3: dolphin}.
  Cross-fills bbox between Workflow and BurstSearch so AOI can be
  specified at either level.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The new CLI is three small dataclasses (ConfigCmd, RunCmd, ServerCmd)
fed to tyro.extras.subcommand_cli_from_dict. Drops ~140 lines of
argparse boilerplate, gives proper rich help, and the `sweets server`
command from the WIP web UI commit is preserved.

Heavy imports (sweets.core, uvicorn) are deferred to inside the .run()
methods so `sweets --help` is snappy.
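The three-dataclass shape can be sketched as below. The class names match the commit; the field names are illustrative, and the tyro dispatch call is left as a comment so the snippet stands alone:

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ConfigCmd:
    """Write a default sweets config file."""

    out_file: Path = Path("sweets_config.yaml")

    def run(self) -> None:
        # Heavy imports (sweets.core) would be deferred to here,
        # keeping `sweets --help` fast.
        print(f"writing {self.out_file}")


@dataclass
class RunCmd:
    """Run the workflow from a config file."""

    config_file: Path = Path("sweets_config.yaml")

    def run(self) -> None:
        print(f"running {self.config_file}")


# Dispatch (requires tyro):
# tyro.extras.subcommand_cli_from_dict({"config": ConfigCmd, "run": RunCmd}).run()
```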

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- tests/test_core.py: drop ASFQuery references; exercise BurstSearch
  cross-fill, YAML round-trip, default-factory order, and the new
  bbox/wkt validation errors. 6 tests, all passing.
- scripts/demo_sweet.py: rewritten as a Workflow.model_validate call
  against the pecos bbox so it doubles as a smoke test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- REVIVAL.md: notes-to-self at the repo root: what changed, which
  open issues this branch closes, what is still loose, and a smoke
  test recipe targeting the pecos bbox.
- CHANGELOG.md: an "Unreleased — v0.2 rewrite" section with the
  major changes and removals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Smoke-tested the burst2safe download path against the pecos AOI on
2026-04-09: 2 SAFEs, ~770 MB each (one IW2 measurement TIFF + all-swath
annotations), ~4 min wall time end-to-end. Workflow.from_yaml +
existing_safes round-trip detects them.

Discovered that burst2safe rejects bboxes that span more than one IW
subswath (raises "Products from swaths IW1 and IW2 do not overlap").
Document the workaround (--swaths IW2 in the smoke-test recipe) and
flag it as something to surface in user-facing docs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
mypy was complaining that sys.exit(main()) passes None — main() doesn't
need a wrapper here because tyro raises SystemExit on errors and the
.run() methods raise on failure. Plain main() is enough for `python -m
sweets` to work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use tyro.conf.Positional via Annotated[Path, tyro.conf.Positional] so
the config file can be passed as `sweets run sweets_config.yaml`
instead of `sweets run --config-file sweets_config.yaml`. Matches the
shape of every other CLI in the SAR ecosystem.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The legacy NASA_WATER path is broken on macOS in two independent ways:

1. Newer sardem hard-asserts NASA data sources only support ENVI
   output ("Use COP or 3DEP for GTiff output").
2. sardem's `_unzip_file` runs `"unzip -o -d {}".format(self.cache_dir).split(" ")`,
   which mangles any cache path containing spaces — including the macOS
   `~/Library/Application Support/sweets` default that sweets.utils.get_cache_dir()
   used to return.

Fix:

- dem.create_water_mask now downloads a Copernicus DEM via the same COP
  source as create_dem, then derives a uint8 land(1)/water(0) GTiff by
  thresholding heights > 0. Coarse for coastal AOIs but fine for the
  inland areas sweets is mostly used for, and avoids needing a second
  remote source.
- Default water_mask_filename is now `watermask.tif` (was `watermask.flg`).
- utils.get_cache_dir() now returns `$XDG_CACHE_HOME/sweets`
  (`~/.cache/sweets` if unset) on every platform, sidestepping the
  unzip-cmd-split bug entirely. Also drops the unused force_posix flag.
- tests/test_core.py: update the default-factory check to .tif.
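A minimal sketch of the heights > 0 thresholding (the real code reads the Copernicus DEM from disk and writes a GTiff; here a plain array stands in):

```python
import numpy as np


def dem_heights_to_water_mask(heights: np.ndarray) -> np.ndarray:
    """uint8 land(1)/water(0) mask by thresholding DEM heights > 0."""
    return (heights > 0).astype(np.uint8)


heights = np.array([[-3.0, 0.0], [12.5, 0.5]])
mask = dem_heights_to_water_mask(heights)
# -> [[0, 0], [1, 1]]: anything at or below 0 m is treated as water
```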

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two COMPASS-side issues that show up running the new sweets workflow
end-to-end:

1. COMPASS still uses np.string_ / np.unicode_ aliases that were
   removed in numpy 2.0, crashing s1_geocode_slc.run() inside the
   ProcessPool. Restore them as thin shims before importing compass
   so we don't need a parallel COMPASS fork just for this. TODO is
   to land a real fix on scottstanie/COMPASS and pin to it (like we
   already do for s1-reader).
2. _get_cfg_setup built the static-layers HDF5 path as
   `static_layers_<burst>_<date>.h5`, but COMPASS actually writes
   them per-burst (no date), so the file lookup later in the workflow
   missed the on-disk file and stitch_geometry blew up. Strip the
   trailing date from the stem before adding the prefix.
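The shim described in (1) amounts to something like this (a sketch; the aliases point at the same types the removed numpy 1.x names referred to):

```python
import numpy as np

# numpy 2.0 removed the np.string_ / np.unicode_ aliases; restore them
# before importing compass so its module-level uses don't crash.
if not hasattr(np, "string_"):
    np.string_ = np.bytes_
if not hasattr(np, "unicode_"):
    np.unicode_ = np.str_

# import compass  # now safe on numpy >= 2.0
```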

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues that surfaced running dolphin against COMPASS-produced CSLCs:

1. dolphin's DisplacementWorkflow validation now requires
   input_options.subdataset for HDF5/NetCDF inputs. COMPASS writes the
   CSLC at /data/VV (or HH/etc.); default to /data/VV here since the
   rest of sweets is co-pol. Users with cross-pol CSLCs can override
   the field on the returned config before run().

2. _existing_gslcs() was happy to count an empty 6-KB CSLC shell as
   "already done". COMPASS creates that shell early in s1_geocode_slc
   and only populates it after the heavy lifting, so any crash mid-run
   leaves a file that looks valid to a stat-based check but breaks
   dolphin downstream. This is exactly what was reported in isce-framework#107.
   Filter out anything below ~1 MB and require the static_ prefix
   not to match.
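A sketch of the tightened check (the ~1 MB threshold mirrors the text; the function and file names here are hypothetical):

```python
from pathlib import Path

MIN_CSLC_BYTES = 1_000_000  # ~1 MB; an empty COMPASS shell is only ~6 KB


def existing_gslcs(out_dir: Path) -> list[Path]:
    """CSLCs that are plausibly complete: big enough, and not static layers."""
    return sorted(
        p
        for p in out_dir.glob("*.h5")
        if not p.name.startswith("static_") and p.stat().st_size >= MIN_CSLC_BYTES
    )
```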

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pecos AOI runs cleanly through download → COMPASS → geometry → dolphin
in ~5.8 min, producing real interferograms, unwrapped phase, timeseries
and velocity outputs. Document the wall-time breakdown, the five
fixes that came out of running it, and add a TODO for landing a real
COMPASS numpy 2 fix on scottstanie/COMPASS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The np.string_ → np.bytes_ / np.unicode_ → np.str_ rewrite landed on
scottstanie/COMPASS@develop-scott (commit a91a9aa, 64 sites across
s1_geocode_slc.py, s1_geocode_metadata.py, h5_helpers.py).

- pyproject.toml: pin compass to that branch via
  [tool.pixi.pypi-dependencies] alongside the existing s1reader pin,
  and remove the conda-forge `compass = ">=0.4.1"` entry so the pip
  fork is the canonical install.
- src/sweets/_geocode_slcs.py: drop the runtime monkey-patch and the
  noqa: E402 dance — clean import block again.
- CHANGELOG.md / REVIVAL.md: update notes to reflect the real fix.

Verified locally by deleting cached CSLCs and re-running sweets:
COMPASS produces fresh ~273 MB CSLC HDF5s without the shim.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…search

Move opera-utils from conda-forge to a git pin under
[tool.pixi.pypi-dependencies] so sweets gets:

- the new high-level `create_tropo_corrections_for_stack` workflow that
  the tropo correction step builds on (`apply_tropo` + `crop_tropo` +
  `search_tropo` + the SLC-stack reader registry)
- numpy 2 fixes that haven't landed on opera-adt main yet

Also add `asf_search` as an explicit pip dep — we use it through
`opera_utils.download.search_cslcs` for the OPERA CSLC source path,
and pixi can grab it from conda-forge directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a second SLC source class beside BurstSearch. OperaCslcSearch:

- takes the same shape (bbox/wkt + dates + track + out_dir) as BurstSearch
- carries a `kind: Literal["opera-cslc"]` discriminator so the new
  Workflow.search Union can dispatch on it
- resolves OPERA burst IDs by querying ASF DAAC via
  opera_utils.download.search_cslcs
- downloads CSLC HDF5s via download_cslcs and the matching CSLC-STATIC
  layers via download_cslc_static_layers (into a `static_layers/`
  subdirectory)
- exposes existing_cslcs() / existing_static_layers() for Workflow's
  skip-if-exists logic

BurstSearch picks up a `kind: Literal["safe"]` field for symmetry.

_geometry.py: also stitches `local_incidence_angle` now (needed by the
optional tropo correction step that runs after dolphin).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two new modules:

- src/sweets/_tropo.py:
  * TropoOptions Pydantic model (enabled flag, height_max, margin_deg,
    interp_method, num_workers).
  * OperaCslcReader: SLCReader implementation that pulls
    `zero_doppler_start_time` and `bounding_polygon` straight out of
    OPERA CSLC HDF5 `/identification/`. Registered eagerly with
    opera_utils.tropo's sensor registry under both `opera-cslc` and
    `sentinel1` so the create_tropo_corrections_for_stack workflow
    knows how to parse our CSLC stack.
  * Forces aiohttp's ThreadedResolver at import time, sidestepping the
    aiodns DNS-timeout failure that bites both burst2safe and
    crop_tropo on networks where c-ares can't reach a usable resolver.
  * create_tropo_corrections() — thin wrapper over
    opera_utils.tropo.create_tropo_corrections_for_stack with the
    sweets-side TropoOptions knobs.
  * apply_tropo_to_unwrapped() — given dolphin's unwrapped phase
    rasters and per-date tropo correction GeoTIFFs, computes the
    differential per ifg pair (`tropo[date2] - tropo[date1]`),
    converts metres of LOS delay to radians of phase via
    `4*pi/wavelength`, and writes a corrected raster alongside.
  * run_tropo_correction() chains them together.

- src/sweets/_dolphin_yaml_compat.py:
  Side-effect import that monkey-patches dolphin.workflows.config._yaml_model._add_comments
  to recognize anyOf entries containing $refs (Pydantic unions of
  submodels). Without this, dolphin's commented-yaml emitter raises
  KeyError when serializing the new Workflow.search Union. Real fix is
  a 1-liner upstream in dolphin (see REVIVAL.md).
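The delay-to-phase step in `apply_tropo_to_unwrapped()` can be sketched as follows (the C-band wavelength value is an assumption, not taken from the source):

```python
import math

C_BAND_WAVELENGTH_M = 0.05546576  # Sentinel-1 C-band wavelength (assumed value)


def differential_tropo_phase(
    tropo_date1_m: float,
    tropo_date2_m: float,
    wavelength_m: float = C_BAND_WAVELENGTH_M,
) -> float:
    """Differential tropospheric delay for an ifg pair, in radians of phase."""
    # metres of LOS delay -> radians of phase, two-way path: 4*pi / wavelength
    return (tropo_date2_m - tropo_date1_m) * (4 * math.pi / wavelength_m)
```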

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- core.Workflow.search becomes a Union[BurstSearch, OperaCslcSearch];
  the cross-fill model_validator pushes the outer bbox down into either
  variant and defaults `kind` to "safe" for backwards compat with existing
  configs that have no discriminator.
- Workflow.run() branches on the source kind. For BurstSearch the path
  is unchanged (download SAFEs -> orbits -> COMPASS -> stitch geometry
  -> dolphin). For OperaCslcSearch we skip burst-db / orbits / COMPASS
  entirely: download CSLC HDF5s + CSLC-STATIC layers, stitch geometry
  from the static layers, then run dolphin against the pre-geocoded
  CSLCs directly.
- New Workflow.tropo: TropoOptions field, plus a `_run_tropo` post-step
  that fires after dolphin if `tropo.enabled` is true.
- core also picks up a top-of-file `from . import _dolphin_yaml_compat`
  side-effect import so the discriminated-union schema serializes
  cleanly through dolphin's commented-yaml emitter.

CLI:
- `sweets config --source {safe,opera-cslc}` picks the source.
- `sweets config --do-tropo` flips on the tropo correction step.
- SAFE-only flags (`--polarizations`, `--swaths`) are now documented as
  ignored when `--source opera-cslc`.

Tests:
- New tests for the default-kind backwards-compat shim, the OPERA-CSLC
  discriminator, and OPERA-CSLC YAML round-trip with tropo enabled.
- Updated default water-mask filename check to .tif (was already done
  in the dem.py fix earlier).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CHANGELOG: new bullets for OperaCslcSearch (--source opera-cslc),
  the optional tropo correction (--do-tropo), the opera-utils fork pin,
  and the local_incidence_angle stitch addition.
- REVIVAL: extend the "what changed" table with the OPERA CSLC source
  row, the COMPASS / opera-utils fork rows, and the tropo correction
  row. Add a "Smoke test results, round 2" section with end-to-end
  timings (~7.9 min) and the 4 bugs caught/fixed during the run
  (rasterio Float16, aiohttp DNS, dolphin yaml, stale intermediates).
  Mark isce-framework#88 as done. Drop the now-handled OperaCslcSearch and
  smoke-test items from the loose-work list and add 3 follow-ups
  (notebook, README, mintpy export, dolphin yaml upstream fix,
  tropo per-burst averaging).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
dolphin's `_yaml_model._add_comments` crashed on Union-of-submodels
fields because Pydantic 2 emits sub-model entries as bare `$ref`s with
no `type` key. Fixed upstream on scottstanie/dolphin@develop-scott
(commit 46762c6: fall back to the $ref's last path segment).

Move dolphin from the conda-forge entry to a git pin under
[tool.pixi.pypi-dependencies] alongside the other forks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- DolphinOptions.gpu_enabled defaults to True. Harmless on machines
  without a GPU — dolphin falls back to CPU — and removes the footgun
  of forgetting to opt in on a GPU host.
- build_displacement_config / run_displacement now take a `subdataset`
  kwarg (default `/data/VV` for COMPASS / OPERA CSLC layouts). Callers
  with NISAR GSLCs will pass
  `/science/LSAR/GSLC/grids/frequencyA/HH` or similar.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two related fixes to the tropo post-step.

- **Per-burst averaging.** An OPERA burst stack produces ~9 per-burst
  tropo correction rasters per date, one for each burst's sensing time
  (opera_utils' crop_tropo already time-interpolates the 6-hourly
  OPERA L4 TROPO-ZENITH products down to each burst's exact time).
  The previous `_index_tropo_files_by_date` dict-inserted them under
  a YYYYMMDD key, so one arbitrary burst's correction won per date —
  silently throwing away the other 8. Replace with
  `_group_tropo_files_by_date`, which buckets the per-burst files and
  feeds them through `_mean_tropo_on_grid`; that helper reprojects each
  file to the target grid and takes a nanmean of the stack before
  differencing.
- **Apply to timeseries, not just unwrapped ifgs.** `run_tropo_correction`
  now reaches into `dolphin_work_dir/{unwrapped,timeseries}/` itself
  and corrects every `<date1>_<date2>*.tif` it finds. The unwrapped
  rasters are in radians of phase (scale = 4*pi/wavelength); the
  timeseries rasters are already in metres of LOS displacement after
  dolphin's radians-to-metres conversion (scale = 1). Pass a
  `units: Literal["radians", "meters"]` through apply_tropo_to_pairs
  to pick the right scale.

Also suppress the "Mean of empty slice" RuntimeWarning — NaN for
pixels the tropo grid doesn't cover is exactly what we want.
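The averaging fix can be sketched like so (the reproject-to-target-grid step is elided; plain same-shape arrays stand in for the per-burst rasters):

```python
import warnings

import numpy as np


def mean_tropo_on_grid(burst_rasters: list[np.ndarray]) -> np.ndarray:
    """nanmean a stack of per-burst corrections; all-NaN pixels stay NaN."""
    stack = np.stack(burst_rasters)
    with warnings.catch_warnings():
        # "Mean of empty slice" is expected where no burst covers a pixel.
        warnings.simplefilter("ignore", category=RuntimeWarning)
        return np.nanmean(stack, axis=0)
```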

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a third source class alongside BurstSearch and OperaCslcSearch:

- `sweets.download.NisarGslcSearch` wraps `opera_utils.nisar.run_download`
  to search CMR for NISAR GSLC products, fetch matching HDF5s and
  bbox-subset each one in a single pass. Fields: bbox/wkt, start/end,
  optional track_frame_number, frequency (`A` or `B`), polarizations,
  short_name. `kind: Literal["nisar-gslc"]` discriminator for the
  Workflow.search Union. `.hdf5_subdataset` computes the
  `/science/LSAR/GSLC/grids/frequency<X>/<POL>` path dolphin needs.
- `sweets.core.Workflow.search` becomes Union of three variants.
  `run()` branches: NISAR path skips COMPASS, burst-db, orbits AND
  the geometry stitching step (NISAR GSLCs are already geocoded and
  carry no separate static-layers product); dolphin reads the grid
  from the HDF5 directly via `_run_dolphin`'s new subdataset kwarg.
  Tropo is refused with a clear warning on NISAR — there's no
  stitched incidence-angle raster to feed apply_tropo.
- `sweets.cli.ConfigCmd` gains `--source nisar-gslc`,
  `--track-frame-number` and `--frequency`. `--track` becomes optional
  for OPERA CSLC and invalid for NISAR.
- Two new tests exercising the NISAR discriminator + YAML round-trip.
- `src/sweets/_dolphin_yaml_compat.py` deleted — the fix landed
  upstream in scottstanie/dolphin@develop-scott (commit 46762c6) so
  the monkey-patch is no longer needed.
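The `.hdf5_subdataset` path construction amounts to (the path template is taken from the text above; the standalone helper name is hypothetical):

```python
def nisar_gslc_subdataset(frequency: str, polarization: str) -> str:
    """HDF5 path dolphin needs to read a NISAR GSLC grid."""
    return f"/science/LSAR/GSLC/grids/frequency{frequency}/{polarization}"
```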

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- NISAR GSLC source bullet in CHANGELOG + REVIVAL's "what changed" table.
- Bump the fork-pin bullet to mention dolphin's develop-scott (for the
  yaml_model fix) alongside the existing s1-reader / COMPASS / opera-utils.
- Drop the now-handled "land dolphin yaml fix" item and the
  "average tropo across burst sensing times" item from the loose-work
  list. Add NISAR smoke test + NISAR tropo incidence angle as new
  loose items.
- Replace the "deliberately did NOT do" bullet about dolphin with the
  correct description of which upstreams we left alone.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…SF Vertex

The previous CLI flag `--track-frame-number` only set the FRAME (the
TTT digits in the granule filename, e.g. `71`) and the TRACK
(relative orbit, the RRR digits, e.g. `13`) was not exposed at all —
confusing because ASF Vertex labels these "Track" and "Frame"
respectively.

Fix:
- `NisarGslcSearch` now exposes two fields:
  - `track` (alias `relative_orbit_number`) — the `Track` field on
    ASF Vertex / `RRR` in the filename
  - `frame` (alias `track_frame_number`) — the `Frame` field on ASF
    Vertex / `TTT` in the filename
  Both are forwarded to `opera_utils.nisar.run_download` under their
  canonical opera-utils names.
- CLI: `--track-frame-number` is gone; `--frame` and `--track` (now
  valid for `--source nisar-gslc` too) replace it. Help text references
  the ASF Vertex labels so users can map directly from the website.
- Tests: pin both `track` and `frame` round-trip and add a
  `test_nisar_gslc_alias_field_names` that verifies the canonical
  opera-utils field names also validate.

Verified against the granule
NISAR_L2_PR_GSLC_008_013_D_071_0005_NADV_A_..._001.h5: parses out as
track=13, frame=71 (cycle=8, direction=D), matching the ASF Vertex
display.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two related bugs surfaced when running with `--wkt` instead of `--bbox`:

1. **`_sync_aoi` only forwarded `bbox` to the inner `search` dict**, never
   `wkt`. So `sweets config --wkt ... --source nisar-gslc` produced an
   inner `NisarGslcSearch` with bbox=None, wkt=None and the source's
   own `_check_aoi_and_dates` validator died with "Must provide either
   `bbox` or `wkt`". Mirror the bbox push for wkt.

2. **`Workflow.search` was a plain `Union[...]`** because `Field(discriminator=...)`
   used to crash dolphin's commented-yaml walker. Pydantic's plain-Union
   validation tried each variant in order and reported failures from all
   three at once — a 10-line traceback even when the user only got the
   `kind` discriminator wrong. Now that dolphin handles both `anyOf`
   and `oneOf`+`$ref` (scottstanie/dolphin@54e9cb7), re-add the
   `Annotated[Union[...], Field(discriminator="kind")]` so Pydantic
   dispatches directly and errors point at exactly the wrong field.
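A minimal sketch of the re-added discriminated union (models stripped down to the `kind` field; the real classes carry the bbox/date/track fields described elsewhere in this PR):

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field


class BurstSearch(BaseModel):
    kind: Literal["safe"] = "safe"


class OperaCslcSearch(BaseModel):
    kind: Literal["opera-cslc"] = "opera-cslc"


class NisarGslcSearch(BaseModel):
    kind: Literal["nisar-gslc"] = "nisar-gslc"


class Workflow(BaseModel):
    # Pydantic dispatches on `kind` directly, so a bad config gets one
    # targeted error instead of failures from all three variants.
    search: Annotated[
        Union[BurstSearch, OperaCslcSearch, NisarGslcSearch],
        Field(discriminator="kind"),
    ] = Field(default_factory=BurstSearch)
```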

Also drop the bbox -> wkt auto-fill in `_set_bbox_and_wkt`. Nothing
downstream reads the outer `self.wkt` past that validator (everything
uses `self.bbox`), and a computed-from-bbox wkt was contaminating
YAML round-trip equality once the wkt push-down was added: the inner
search would gain a wkt on reload that it never had on the first pass.

Tests: tightened `test_bbox_wkt_cross_fill` to assert that supplying
outer wkt now propagates down into `w.search.wkt`, and dropped the
unused `_iou` helper.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The NISAR HDF5 layout drifts between OPERA releases — early BETA
products carry frequencyA with HH/HV; the current PR products serve
frequencyB with VV/VH (e.g.
NISAR_L2_PR_GSLC_008_013_D_071_..._A_..._001.h5 — the trailing `A`
is unrelated to the frequency band). Hardcoding `--frequency A
--polarizations HH` blew up inside opera_utils' `_get_rowcol_slice`
with `KeyError: "object 'frequencyA' doesn't exist"`.

Fix:

- `NisarGslcSearch.frequency` is now `Optional[Literal["A", "B"]]`
  defaulting to `None` (auto-detect).
- New `_resolve_frequency_and_pols()` peeks at the first cached HDF5
  in `out_dir` (or, if there isn't one yet, the first remote CMR hit
  via `opera_utils.nisar.search()`) and returns the frequency letter
  + the polarizations actually present under it.
- New `_reconcile()` reduces user overrides against what's available,
  logging a warning when the user-asked-for values don't match. The
  user keeps any pol they asked for that's actually present; misses
  fall through to whatever's in the file.
- `download()` calls the resolver before forwarding to
  `run_download`, so opera-utils never sees a frequency/pol that
  isn't in the file.
- `hdf5_subdataset` is now a method (not a property) that goes through
  the same resolver, so `Workflow._dolphin_subdataset()` always feeds
  dolphin a path that exists in the HDF5.
- Helper `_peek_nisar_grid` / `_peek_nisar_grid_from_handle` factored
  out so the resolver can use either an h5py.File handle (for the
  remote-search case) or a Path (for the cached-file case).

Tests: tighten the existing NISAR test to assert the user's explicit
choices flow through construction, and rely on the smoke test for the
runtime resolver path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ASF hosts a high-resolution water mask product mosaicked from
OpenStreetMap and ESA WorldCover at
https://asf-dem-west.s3.amazonaws.com/WATER_MASK/TILES/, served as
5-degree GeoTIFFs with native posting around 0.0001 deg. Strictly
better than what sweets had:

- old: SRTMSWBD via sardem (`NASA_WATER`) — broken on macOS (sardem's
  unzip_cmd splits on space, the cache path has "Application Support"
  in it) and restricted to ENVI output.
- old: Copernicus DEM heights > 0 hack — coarse, no shoreline detail,
  miscategorizes inland sub-sea-level basins.
- new: real water mask at native (~10 m equivalent) resolution from
  the actual OSM coastline.

New module `src/sweets/_water_mask.py`:

- `_coord_to_tile_name(lon, lat)` -> ASF 5-deg tile filename
  (e.g. `n55w165.tif`).
- `_get_tile_urls(west, south, east, north)` -> list of `/vsicurl`
  URLs covering the AOI; midpoint sampling for AOIs wider than 5 deg.
- `WaterValue` enum: `ZERO` = water=0/land=1 (dolphin convention,
  default) or `ONE` = water=1/land=0 (ASF native).
- `_buffer_mask(mask, buffer_pixels, water_value)` — circular
  morphological dilation to grow the water class into adjacent land
  pixels (handles erode-the-land case for the inverted convention).
- `create_water_mask(bounds, output, resolution=None, buffer_meters=0,
  water_value=ZERO, overwrite=False)` — pulls tiles via
  `gdal.BuildVRT`, warps to the AOI extent at native resolution,
  inverts if needed, optionally buffers, and writes a uint8 GTiff
  with `LZW + TILED` and a band description noting the convention.
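The tile-name mapping can be sketched as below; the lower-left-corner naming convention is inferred from the single `n55w165.tif` example above, so treat it as an assumption:

```python
import math


def coord_to_tile_name(lon: float, lat: float) -> str:
    """ASF 5-degree water-mask tile containing (lon, lat), e.g. n55w165.tif."""
    lat0 = int(math.floor(lat / 5) * 5)  # lower-left corner, snapped to 5 deg
    lon0 = int(math.floor(lon / 5) * 5)
    ns = "n" if lat0 >= 0 else "s"
    ew = "e" if lon0 >= 0 else "w"
    return f"{ns}{abs(lat0):02d}{ew}{abs(lon0):03d}.tif"
```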

`sweets.dem.create_water_mask` is now a thin shim over the new module
with the dolphin convention as the default. Drops the rasterio /
numpy / sardem imports it no longer needs.

Verified end-to-end against the LA test bbox: produced a 3664x3886
uint8 mask in ~2s, ~51% water (Pacific + Long Beach harbor), with
the band description set correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NISAR PR products ship heterogeneous (frequency, polarization) modes
across cycles, and each product's grid extent can be narrower than its
advertised bounding polygon. That combination broke two things at once:

1. The old `_choose_signature` scoring weighted a user-pinned `frequency`
   match at +100 and a `polarizations` match at only +50, so a config
   pinning `frequency: A, polarizations: [VV]` would pick a single-cycle
   frequencyA/[HH,HV] group over a single-cycle frequencyB/[VV,VH] group
   — choosing the one that had zero overlap with the requested pol.

2. `download()` processed exactly one group and bailed out with 0
   GeoTIFFs when that group's products happened to produce empty stubs
   (e.g. bbox outside the actual frequencyA grid), even though another
   group had perfectly good data for the AOI.

This replaces the scoring-and-pick with a rank-and-iterate flow:

- `_rank_signatures` sorts groups by (stack size, pol match, freq match)
  so the polarization pin always beats the frequency pin on ties and
  neither can outvote a larger coherent stack.
- `download()` iterates the ranked list; for each signature it runs the
  full per-product subset + HDF5 -> GeoTIFF conversion via a new
  `_download_group` helper. The first signature that produces >=1 usable
  GeoTIFF wins and the rest are skipped.
- If every signature yields zero outputs, raise a clear error pointing
  at the most likely causes (AOI outside grid extent, over-specific
  pins) instead of silently writing an empty directory and letting
  dolphin trip over it later.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
scottstanie and others added 18 commits April 12, 2026 18:47
R1 from FUTURE_IDEAS — the self-contained run summary. Walks a work
directory's `dolphin/` output tree and renders one HTML file with:

- Summary table parsed from `sweets_config.yaml` (AOI, dates, source,
  track/frame, polarizations, estimated wall time from raster mtimes,
  dolphin version).
- Velocity raster (`timeseries/velocity.tif`, symmetric percentile
  color scale, in the raster's own units).
- Temporal coherence raster (`interferograms/temporal_coherence_*.tif`,
  viridis colormap with a fixed 0..1 color range so a mostly-saturated
  AOI doesn't collapse the colorbar).
- Longest-baseline cumulative displacement from `timeseries/*.tif`.
- Per-pair coherence bar chart + mean-coherence table from
  `interferograms/*.int.cor.tif`.
- Output inventory grouped by subdirectory.

All rasters are rendered to PNG with matplotlib and embedded as
base64 data URIs, so the HTML is fully portable — no external image
files, no JS, no network fetches at view time. ~200-270 KB total for
a typical run.
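The embedding step boils down to a small helper (the matplotlib usage in the comment is assumed, not taken verbatim from the source):

```python
import base64


def png_to_data_uri(png_bytes: bytes) -> str:
    """Inline PNG bytes as an <img>-ready data URI (no external files)."""
    return "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")


# With matplotlib (assumed usage):
#   buf = io.BytesIO(); fig.savefig(buf, format="png")
#   html += f'<img src="{png_to_data_uri(buf.getvalue())}">'
```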

Raster reads go through GDAL directly (new `_read_raster` helper)
rather than rasterio because rasterio's `_band_dtype` map doesn't
handle `GDT_Float16` (code 15), which dolphin uses for the temporal
coherence output. Same issue as the opera-utils tropo fix.

CLI: `sweets report <work_dir> [--output ...] [--config-file ...]`.
Defaults to `<work_dir>/sweets_report.html` and auto-picks the first
`*.yaml` in the work dir (excluding dolphin's own generated config)
for the summary table.

Verified against all three cached smoke runs (S1 burst, OPERA CSLC,
NISAR GSLC). Every one produces a clean 6-section report.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

fmt
The `src/sweets/web/` FastAPI + Svelte scaffold was committed early
in the revival but never tested or exercised by any code path, CI, or
smoke test. It carried untested deps (sqlmodel, titiler, uvicorn,
etc.) and a mypy exclude. Rather than ship it untested in the v0.3 PR
and take on the maintenance surface, park it on its own branch
(`web-ui-scaffold`) and strip it from the PR branch.

Removed:
- `src/sweets/web/` directory (14 files)
- `ServerCmd` from `cli.py` (the `sweets server` subcommand)
- `[tool.pixi.feature.web.dependencies]` + the `web` pixi environment
- The `exclude = "src/sweets/web/"` mypy workaround

Updated references in REVIVAL.md, FUTURE_IDEAS.md, and
`docs/sweets-demo.ipynb` to point at the branch instead of a live
directory.

The `sweets report` HTML (R1) and the bowser integration (FUTURE_IDEAS
R2) are the two lighter-weight alternatives for browser-based result
viewing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.DS_Store, the NISAR spec PDF, and two email PDFs were swept up by
`git add -A` in the web-removal commit. Remove them from tracking and
add .gitignore rules so they don't come back.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>