Add NISAR loading and auxiliary stack support#1487
Conversation
Reviewer's GuideRefactors and extends NISAR ingestion to require a real DEM, validate inputs earlier, align DEM/mask warping precisely to the NISAR grid, and add support for geometry enhancements plus new ionosphere, troposphere, SET, and standalone water mask stacks while tightening subset and metadata handling. Sequence diagram for the updated NISAR loading and stack generation workflowsequenceDiagram
actor User
participant MintpyCLI as mintpy.load (CLI)
participant LoadData as load_data.prepare_metadata
participant PrepNisar as prep_nisar.load_nisar
participant PrepGeom as prepare_geometry
participant PrepWater as prepare_water_mask
participant PrepStackIFG as prepare_stack_ifgram
participant PrepStackION as prepare_stack_ion
participant PrepStackTRO as prepare_stack_tropo
participant PrepStackSET as prepare_stack_set
User->>MintpyCLI: run with processor=nisar
MintpyCLI->>LoadData: prepare_metadata(iDict)
Note over LoadData: Resolve DEM path
LoadData->>LoadData: validate demFile not in auto/none/no
LoadData->>LoadData: glob demFile
LoadData-->>User: raise FileNotFoundError if no DEM
Note over LoadData: Validate GUNW inputs
LoadData->>LoadData: glob GUNW unwFile pattern
LoadData-->>User: raise FileNotFoundError if no GUNW
Note over LoadData: Optional water mask
LoadData->>LoadData: only add --mask if waterMaskFile exists and not auto/none/no
LoadData->>PrepNisar: call prep_nisar.py -i GUNW -d DEM [--mask MASK]
PrepNisar->>PrepNisar: load_nisar(inps)
PrepNisar->>PrepNisar: glob input_glob for GUNW
PrepNisar-->>User: raise FileNotFoundError if no GUNW
PrepNisar->>PrepNisar: validate dem_file string and file existence
PrepNisar-->>User: raise ValueError/FileNotFoundError on invalid DEM
PrepNisar->>PrepNisar: extract_metadata(input_files, bbox, polarization)
PrepNisar->>PrepGeom: prepare_geometry(geometryGeo.h5, metaFile, bbox, demFile, maskFile, polarization)
PrepGeom->>PrepGeom: read_and_interpolate_geometry
PrepGeom-->>PrepNisar: geometryGeo.h5 metadata
alt mask_file provided
PrepNisar->>PrepWater: prepare_water_mask(waterMask.h5, metaFile, metadata, bbox, maskFile, polarization)
PrepWater->>PrepWater: warp mask to NISAR grid and apply valid_unw mask
PrepWater-->>PrepNisar: waterMask.h5
end
PrepNisar->>PrepStackIFG: prepare_stack(ifgramStack.h5, inp_files, metadata, demFile, bbox, date12_list, polarization, ifgram)
PrepStackIFG->>PrepStackIFG: _read_stack_observation(stack_type=ifgram)
PrepStackIFG-->>PrepNisar: ifgramStack.h5
PrepNisar->>PrepStackION: prepare_stack(ionStack.h5, ..., stack_type=ion)
PrepStackION->>PrepStackION: _read_stack_observation(stack_type=ion)
PrepStackION-->>PrepNisar: ionStack.h5
PrepNisar->>PrepStackTRO: prepare_stack(tropoStack.h5, ..., stack_type=tropo)
PrepStackTRO->>PrepStackTRO: read_and_interpolate_troposphere
PrepStackTRO-->>PrepNisar: tropoStack.h5
PrepNisar->>PrepStackSET: prepare_stack(setStack.h5, ..., stack_type=set)
PrepStackSET->>PrepStackSET: read_and_interpolate_SET
PrepStackSET-->>PrepNisar: setStack.h5
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- The logic in
prepare_stackandload_nisarbranches on stack type by matching substrings of the output path (e.g.,'inputs/ifgramStack.h5' in outfile), which is brittle; consider passing an explicitstack_typeenum/flag to make the behavior independent of filename conventions. - DEM warping and radar-grid interpolation are implemented separately in
read_and_interpolate_geometry,read_and_interpolate_troposphere, andread_and_interpolate_SET; you could factor out the common warp and interpolation scaffolding to reduce duplication and keep the behavior consistent across geometry/tropo/SET products. - In
_read_raster_epsg, the code assumes a projection with an EPSGAUTHORITY; for robustness, consider handling cases where the projection is missing or uses a different authority (e.g., derive CRS via WKT parsing or raise a clearer error that includes the raw projection string).
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The logic in `prepare_stack` and `load_nisar` branches on stack type by matching substrings of the output path (e.g., `'inputs/ifgramStack.h5' in outfile`), which is brittle; consider passing an explicit `stack_type` enum/flag to make the behavior independent of filename conventions.
- DEM warping and radar-grid interpolation are implemented separately in `read_and_interpolate_geometry`, `read_and_interpolate_troposphere`, and `read_and_interpolate_SET`; you could factor out the common warp and interpolation scaffolding to reduce duplication and keep the behavior consistent across geometry/tropo/SET products.
- In `_read_raster_epsg`, the code assumes a projection with an EPSG `AUTHORITY`; for robustness, consider handling cases where the projection is missing or uses a different authority (e.g., derive CRS via WKT parsing or raise a clearer error that includes the raw projection string).
## Individual Comments
### Comment 1
<location path="src/mintpy/prep_nisar.py" line_range="281-266" />
<code_context>
+ polarization=pol,
)
+ # ifgram stack
prepare_stack(
outfile=stack_file,
inp_files=input_files,
metadata=metadata,
+ demFile=inps.dem_file,
+ bbox=bounds,
+ date12_list=date12_list,
+ polarization=pol,
+ )
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** The same `prepare_stack` helper is used for multiple stack types but relies on filename substrings, which is brittle and hard to maintain.
Inside `prepare_stack`, behavior is selected by checking substrings in `outfile` (e.g. `"inputs/ifgramStack.h5"`, `"inputs/ionStack.h5"`). This couples logic to file naming and will silently break if filenames or directory layout change.
Instead, pass an explicit `stack_type` argument (e.g. `"ifgram" | "ion" | "tropo" | "set"`) and branch on that. This makes the function more robust, readable, and easier to extend.
Suggested implementation:
```python
prepare_stack(
outfile=stack_file,
inp_files=input_files,
metadata=metadata,
demFile=inps.dem_file,
bbox=bounds,
date12_list=date12_list,
polarization=pol,
stack_type="ifgram",
)
```
```python
def prepare_stack(
outfile,
inp_files,
metadata,
demFile=None,
bbox=None,
date12_list=None,
polarization=None,
stack_type=None,
):
```
```python
# Determine stack type: prefer explicit stack_type, fall back to legacy
# filename-based inference for backward compatibility.
effective_stack_type = stack_type
if effective_stack_type is None:
if "inputs/ifgramStack.h5" in outfile:
effective_stack_type = "ifgram"
elif "inputs/ionStack.h5" in outfile:
effective_stack_type = "ion"
elif "inputs/tropoStack.h5" in outfile:
effective_stack_type = "tropo"
elif "inputs/setStack.h5" in outfile:
effective_stack_type = "set"
else:
raise ValueError(
f"Unable to infer stack_type from outfile '{outfile}'. "
"Please pass stack_type explicitly (e.g. 'ifgram', 'ion', 'tropo', 'set')."
)
if effective_stack_type == "ifgram":
```
```python
elif effective_stack_type == "ion":
```
```python
elif effective_stack_type == "tropo":
```
```python
elif effective_stack_type == "set":
```
There are likely additional call sites to `prepare_stack` in `src/mintpy/prep_nisar.py` (for ion, tropo, and set stacks) that are not visible in the snippet. To fully adopt the new API and avoid relying on filename inference:
1. Find all remaining `prepare_stack(` calls and add an explicit `stack_type` keyword argument:
- For ionosphere stacks, pass `stack_type="ion"`.
- For troposphere stacks, pass `stack_type="tropo"`.
- For set stacks, pass `stack_type="set"`.
2. If any code outside this file calls `prepare_stack`, consider updating those callers to pass an explicit `stack_type` as well. The current implementation keeps backward compatibility, but explicit types are preferred going forward.
3. If there are additional stack types beyond `ifgram`, `ion`, `tropo`, and `set`, extend the `effective_stack_type` inference block and the branching logic accordingly, and use consistent string identifiers when calling `prepare_stack`.
</issue_to_address>
### Comment 2
<location path="src/mintpy/prep_nisar.py" line_range="718-727" />
<code_context>
+def read_and_interpolate_troposphere(
</code_context>
<issue_to_address>
**suggestion:** The `mask_file` parameter in `read_and_interpolate_troposphere` is unused, which suggests either incomplete masking logic or an unnecessary argument.
`mask_file` is accepted but never used, unlike in `read_and_interpolate_geometry`, so callers may assume masking is happening when it is not.
Please either remove `mask_file` from this function and its callers, or implement masking here in line with `read_and_interpolate_geometry` so the helpers behave consistently.
Suggested implementation:
```python
def read_and_interpolate_troposphere(
gunw_file, dem_file, xybbox, polarization="HH"
):
```
You will need to:
1. Find all call sites of `read_and_interpolate_troposphere` in the codebase.
2. Remove any `mask_file` argument passed to this function, so that calls follow the updated signature:
- Before: `read_and_interpolate_troposphere(gunw_file, dem_file, xybbox, polarization, mask_file)`
- After: `read_and_interpolate_troposphere(gunw_file, dem_file, xybbox, polarization)`
If any callers relied on the presence of `mask_file` for planned behavior, you may instead decide to implement masking (mirroring `read_and_interpolate_geometry`) and reintroduce it with concrete usage.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Removed redundant return statement from the function.
|
@sourcery-ai review |
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
prepare_stack,meta['FILE_TYPE']is hard-coded to'ifgramStack'even when generating ionosphere, troposphere, or SET stacks; consider setting this dynamically perstack_typeso downstream consumers can distinguish stack types correctly. _read_raster_epsgusesgdal.osr.SpatialReference(), butosris not imported anywhere; switch tofrom osgeo import osrand useosr.SpatialReference()(or otherwise ensure the OSR module is correctly imported) to avoid runtime errors.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `prepare_stack`, `meta['FILE_TYPE']` is hard-coded to `'ifgramStack'` even when generating ionosphere, troposphere, or SET stacks; consider setting this dynamically per `stack_type` so downstream consumers can distinguish stack types correctly.
- `_read_raster_epsg` uses `gdal.osr.SpatialReference()`, but `osr` is not imported anywhere; switch to `from osgeo import osr` and use `osr.SpatialReference()` (or otherwise ensure the OSR module is correctly imported) to avoid runtime errors.
## Individual Comments
### Comment 1
<location path="src/mintpy/prep_nisar.py" line_range="150" />
<code_context>
+ if not projection:
+ raise ValueError(f"Raster has no projection metadata: {path}")
+
+ srs = gdal.osr.SpatialReference()
+ if srs.ImportFromWkt(projection) != 0:
+ raise ValueError(
</code_context>
<issue_to_address>
**issue (bug_risk):** Using gdal.osr.SpatialReference may fail if osr is not exposed under gdal in the installed GDAL bindings.
The Python bindings usually expose `osr` via `from osgeo import osr`, not `gdal.osr`. On many setups `gdal.osr` won’t exist and this will raise an AttributeError at runtime. Import `osr` explicitly (e.g. at the module top) and use `osr.SpatialReference()` here to keep `_read_raster_epsg` working across environments.
</issue_to_address>
### Comment 2
<location path="src/mintpy/prep_nisar.py" line_range="1063" />
<code_context>
}
- # initiate HDF5 file
meta["FILE_TYPE"] = "ifgramStack"
+
writefile.layout_hdf5(outfile, ds_name_dict, metadata=meta)
</code_context>
<issue_to_address>
**issue (bug_risk):** FILE_TYPE is hard-coded to ifgramStack even when preparing ion/tropo/set stacks.
Since `meta["FILE_TYPE"]` is always set to `"ifgramStack"`, stacks written as `ionStack.h5`, `tropoStack.h5`, or `setStack.h5` will carry incorrect metadata, which can break any logic branching on `FILE_TYPE`.
Instead, derive `FILE_TYPE` from `effective_stack_type`, for example:
```pythonile_type_map = {
"ifgram": "ifgramStack",
"ion": "ionStack",
"tropo": "tropoStack",
"set": "setStack",
}
meta["FILE_TYPE"] = file_type_map[effective_stack_type]
```
so the metadata consistently reflects the actual stack type.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
I ran this on a non-redundant stack of 4 nearest-neighbor GUNWs and confirmed that the dataset was loaded. It produced the expected ifgramStack.h5 and geometryGeo.h5, as well as auxiliary outputs including ionStack.h5, setStack.h5, and tropoStack.h5. |
|
@hfattahi Summary of changes:
Responses to specific review items:
Testing performed:
|
hfattahi
left a comment
There was a problem hiding this comment.
Thanks @ehavazli . The changes look good to me. I have one more comment on the mask. Please see details below.
Also one question for you. I obviously have forgotten when in the pipeline an "average spatial coherence" is written to disk. I managed to run your PR but I did not see any avgSpatialCoherence.h5 being generated.
| return _read_unwrapped_phase_valid_mask(gunw_file, xybbox, pol, frequency) | ||
|
|
||
| dset = ds[path] | ||
| mask = dset[xybbox[1] : xybbox[3], xybbox[0] : xybbox[2]] |
There was a problem hiding this comment.
Thanks for adding the mask reader. I think this is fine for the current version of GUNW products. However in a recent ISCE3 PR here we are modifying the mask. With the upcoming release (X05014) the mask will chage the data type from 8 bit Unsigned to 32 bits Unsigned Int and have the following description:
Combination of a water mask, a mask of subswaths of valid samples, and data anomalies"
" in the reference RSLC and the geometrically coregistered secondary RSLC."
" Each pixel value is encoded as a 32-bit unsigned integer."
" Bits 0–7 represent subswath encoding, where the most significant digit represents"
" the water flag of that pixel in the reference RSLC, where 1 is water"
" and 0 is non-water; the second most significant digit corresponds to"
" the subswath number of the reference RSLC, and the least significant digit"
" corresponds to the subswath number of the secondary RSLC;"
" a value of 0 in either digit indicates an invalid sample in the corresponding RSLC."
" Bits 8–15 represent bitwise anomaly flags for the secondary RSLC, and"
" bits 16–23 represent bitwise anomaly flags for the reference RSLC,"
" with each bit corresponding to a specific anomaly condition."
" A value of 0 in the anomaly bits indicates that no anomaly is detected in the corresponding RSLC."
" Bits 24–31 are reserved for future use"
As you can see the first 8 bits of the new 32 bits mask is identical to the current 8 bits mask. So with a very minor change you can keep your reader as is for now. In particular you can simply modify the following to extract the first 8 bits which contains water and subswath masks:
| mask = dset[xybbox[1] : xybbox[3], xybbox[0] : xybbox[2]] | |
| mask_bits = dset[xybbox[1] : xybbox[3], xybbox[0] : xybbox[2]] | |
| # Bits 0–7: Subswath and Water encoding | |
| mask = mask_bits & 0xFF |
With this change your reader should work for current and future GUNW.
| prepare_water_mask( | ||
| outfile=water_mask_file, | ||
| metaFile=input_files[0], |
There was a problem hiding this comment.
@ehavazli it looks to me that the valid data mask is read only for the first interferogram. This can be problematic. The valid data mask may be slightly different between different acquisitions and therefore may be different between different GUNWs.
|
@hfattahi Summary of changes:
|
Summary
This PR improves the NISAR data-loading workflow in MintPy by tightening input validation in
load_data.pyand expandingprep_nisar.pyto generate additional derived products.What changed
processor=nisarWhy
These changes make the NISAR ingestion path more robust and more complete. They reduce configuration errors, improve failure messages, and produce a richer set of MintPy-ready outputs for downstream analysis.
Testing
prep_nisar.pywrites geometry, interferogram, ionosphere, troposphere, SET, and water mask outputs with the updated workflowSummary by Sourcery
Tighten NISAR ingestion by validating DEM/GUNW inputs, aligning DEM/mask warping to the NISAR grid, and generating additional NISAR-derived MintPy stacks and water masks.
New Features:
Bug Fixes:
Enhancements: