Tracking dev progress #451

Draft
HaiwangYu wants to merge 337 commits into master from apply-pointcloud
@HaiwangYu
Member

@HaiwangYu HaiwangYu commented Dec 2, 2025

Opening this new PR because #444 does not seem to work for tracking purposes.

lastgeorge and others added 30 commits April 12, 2026 07:25
Adds walk-history to proto_extend_point and uses the halfway walk point
as break_pt when dot(dir1, dir1_prev) < -0.5, preventing spurious
zig-zag stubs from near-front kinks. Raises dl_vtx_cut default to 2.5 cm
(was 2.0 cm in TaggerCheckNeutrino default_configuration) and removes the
redundant dl_vtx_cut parameter from clus.jsonnet so C++ is the single
source of truth.
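
The reversal test described above can be sketched as follows. This is a minimal illustration, not the code in proto_extend_point: `Vec3`, `is_reversal`, and `halfway_point` are hypothetical names, the directions are assumed to be unit vectors, and "halfway" is assumed to mean the middle entry of the recorded walk history.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Vec3 = std::array<double, 3>;

// Dot product of two 3-vectors.
inline double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// True when the current unit direction has turned back on the previous one.
// For unit vectors, dot < -0.5 corresponds to an opening angle above 120 degrees.
inline bool is_reversal(const Vec3& dir1, const Vec3& dir1_prev) {
    return dot(dir1, dir1_prev) < -0.5;
}

// Hypothetical helper: take the middle entry of the walk history as the break point.
inline Vec3 halfway_point(const std::vector<Vec3>& walk_history) {
    assert(!walk_history.empty());
    return walk_history[walk_history.size() / 2];
}
```

A caller would test `is_reversal(dir1, dir1_prev)` at each step and, when it fires, use `halfway_point(history)` as break_pt instead of the current walk position.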

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…e_shower_1

In examine_shower_1, when building a candidate shower starting from a weakly-
directed muon track (sg) at main_vertex, complete_structure_with_start_segment
flood-fills the graph and pulls in any same-cluster neighbor — including a
Michel electron shower (Shower_C) attached at sg's far-end junction.  The
inner associated-showers loop then finds Shower_C via the angle/distance test
(angle ≈ 0° and distance = 0 because they share a vertex), satisfies the
energy/length commit condition, and flips sg's pdg from 13 → 11, converting
the stopping muon into an electron.

Fix: in the low-energy else-branch of the associated-showers loop, skip any
conn_type=1 shower whose start_vertex is already inside shower1's vertex set.
Such a shower is a downstream decay product (e.g. a Michel electron starting
at the muon's stopping vertex), not an external EM shower that should cause the
start segment to be reclassified.  High-energy (>80 MeV) downstream showers
are handled by the existing if-branch above and are unaffected.

Also carries forward two earlier fixes committed to examine_shower_1:
  - set_mass(electron) when flipping start_segment pdg to 11 (mass consistency)
  - dedup loop to remove stale single-segment muon showers that share a
    start_segment with the newly created shower1 (prevents duplicate bee ids)
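
The skip rule in the low-energy branch could look roughly like this. `ShowerSketch` and `skip_as_downstream_product` are hypothetical stand-ins, and vertices are represented by integer ids purely for illustration; the real code works on the toolkit's own shower and vertex types.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for a shower; only the fields this check needs.
struct ShowerSketch {
    int conn_type;     // 1 = directly connected
    int start_vertex;  // id of the vertex where the shower starts
};

// Returns true when a directly connected candidate starts at a vertex that
// shower1 already owns, i.e. it is a downstream product (such as a Michel
// electron) and must not trigger reclassification of the start segment.
inline bool skip_as_downstream_product(const ShowerSketch& candidate,
                                       const std::set<int>& shower1_vertices) {
    return candidate.conn_type == 1 &&
           shower1_vertices.count(candidate.start_vertex) > 0;
}
```

In this sketch the predicate is meant to be evaluated only in the low-energy else-branch; the high-energy if-branch is left as it was.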

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
In fill_bee_pf_tree, indirect-connection showers (conn_type 2/3) get a
pseudo mother node to bridge the gap to the main vertex.  The prototype
(fill_psuedo_reco_tree, fill_ssmsp_psuedo overloads 1 & 3) labels this
pseudo mother as gamma (22) for EM showers and neutron (2112) for all
others — the physical reading being that an isolated proton activity
visible only weakly must have been produced by an unseen neutral (neutron).

The toolkit's SSM tagger (fill_ssmsp_pseudo_1/_3) already applied this
rule correctly.  The Bee PF tree helper (append_gamma_shower) did not: it
always wrote "gamma" regardless of the underlying shower's particle type,
silently mislabeling isolated proton activities.  The pf_pdg_to_name
helper already had case 2112 → "n" but it was never reached.

Fix: rename append_gamma_shower → append_pseudo_shower and add the PDG
test (e/γ → 22, otherwise → 2112) matching the SSM and prototype logic.
Pi0 children are unaffected (they are always EM, condition evaluates to 22).
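
A minimal sketch of the PDG test, assuming that "e/γ" means |pdg| == 11 or pdg == 22 (the treatment of positrons is an assumption here). `pseudo_mother_pdg` is a hypothetical name, not the signature of append_pseudo_shower.

```cpp
#include <cassert>
#include <cstdlib>

// PDG codes: 11 electron, 22 photon, 2112 neutron.
// EM showers get a gamma pseudo mother; everything else gets a neutron.
inline int pseudo_mother_pdg(int shower_pdg) {
    assert(shower_pdg != 0 && "PDG code 0 is not a particle");
    const bool is_em = (std::abs(shower_pdg) == 11) || (shower_pdg == 22);
    return is_em ? 22 : 2112;
}
```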

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…n break_segments

segment_search_kink's flag_switch direction-swap trigger had no minimum-window
guard: when a kink sat ~2-3 cm from a track endpoint, only 3-4 post-kink fit
points existed and were trivially collinear (path≈chord to within 0.03),
causing flag_switch to fire on degeneracy rather than real geometry.  This
dispatched proto_extend_point backward into the track body, producing spurious
~2-3 cm tail segments at track endpoints.

Fix 1 (PRSegmentFunctions.cxx, segment_search_kink): require the full 9-point
post-kink window AND an absolute chord > 3 cm before trusting the straightness
ratio.  Both branches (flag_switch and flag_search else-if) are guarded
symmetrically.

Fix 2 (NeutrinoPatternBase.cxx, break_segments geometry check): for degree-1
end vertices, extend the replace_segment_and_vertex condition from
(min_dis < 1.5 cm && angle > 120°) to (min_dis < 2.0 cm &&
kink_angle_at_break > 30°).  kink_angle_at_break measures the approach-vs-
departure angle in the steiner path at break_wcp, which is a more physically
meaningful discriminant than the overall-segment angle used previously.  This
absorbs the residual ~1.7 cm tail that survives after Fix 1 when flag_continue
is false and the walker snaps to the kink itself.
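
The guard in Fix 1 can be illustrated with the sketch below. It assumes the straightness measure is the relative excess of path length over chord length and that 0.03 is a relative tolerance; `Pt` and `trusted_straight` are hypothetical names, and coordinates are in cm.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y, z; };

inline double dist(const Pt& a, const Pt& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// True only when the post-kink window is long enough to be meaningful AND
// nearly straight. Short or tiny windows are never trusted.
inline bool trusted_straight(const std::vector<Pt>& window_cm) {
    const std::size_t kMinPoints = 9;  // full post-kink window
    const double kMinChordCm = 3.0;    // absolute chord requirement
    const double kTolerance = 0.03;    // assumed relative tolerance
    if (window_cm.size() < kMinPoints) return false;
    const double chord = dist(window_cm.front(), window_cm.back());
    if (chord <= kMinChordCm) return false;
    double path = 0.0;
    for (std::size_t i = 1; i < window_cm.size(); ++i) {
        path += dist(window_cm[i - 1], window_cm[i]);
    }
    return (path - chord) / chord < kTolerance;
}
```

With this shape, a 3-4 point window near a track endpoint returns false regardless of how collinear it is, which is the degeneracy the commit targets.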

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nalysis

Replace the previous 13-component (5 live, 8 commented-out) composite score
with a clean 7-component formula tuned on 36 annotated events:

  score = s_dl + s_snap + s_fwd_z + s_clen + s_isol + s_main + s_fv

Key changes vs. the previous scheme:
- Re-enable s_clen (cluster length bonus, +2.0 max at 60 cm) and boost
  s_main (+2.0) and s_isol (-2.0) so geometric signals can compete in the
  uncertain-DL regime (where s_dl ≈ 5 for all candidates).
- Cap s_fwd_z at ±0.25 (was -(z-z_min)/200cm, reaching -3 on wide events)
  so it can no longer swamp the main-cluster bonus.
- Replace the hard 2*dl_vtx_cut snap-distance gate with a soft s_snap
  penalty (-min(2, snap/5cm)), fixing 3 false rejections on long main-cluster
  candidates at 5-8 cm snap distance.
- Delete all dead commented-out scoring code (segs, ltrk, mult, flg_in,
  conf, ptbk) per "delete unused code" policy.
- Add DL score mean/std/regime log line for future diagnosis.
- Bump dl_vtx_min_accept_score default 0.0 → 4.0: empirically separates
  correct uncertain-regime picks (8-12) from true-failure events (3-5).

Net effect on the 36-event log: +6 correct outcomes (3 wrong picks flipped,
3 false rejections accepted), 0 regressions. Also includes Phase 1 changes
(top-K DL payload from Python, dl_vtx_rerank toggle, dl_vtx_score_scale knob).
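
A rough sketch of the score terms whose form is stated above. Only the s_snap formula is taken directly from this message; the clamped form of s_fwd_z and the linear ramp for s_clen are assumptions, as is the use of `>=` for the acceptance cut. All function and struct names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Soft snap-distance penalty: -min(2, snap / 5 cm).
inline double snap_penalty(double snap_cm) {
    return -std::min(2.0, snap_cm / 5.0);
}

// Assumed capped form: the old -(z - z_min)/200 cm term clamped to [-0.25, 0.25].
inline double fwd_z_term(double z_cm, double z_min_cm) {
    const double raw = -(z_cm - z_min_cm) / 200.0;
    return std::max(-0.25, std::min(0.25, raw));
}

// Assumed linear ramp reaching the +2.0 maximum at 60 cm.
inline double clen_bonus(double cluster_length_cm) {
    assert(cluster_length_cm >= 0.0);
    return 2.0 * std::min(1.0, cluster_length_cm / 60.0);
}

struct ScoreTerms {
    double s_dl, s_snap, s_fwd_z, s_clen, s_isol, s_main, s_fv;
};

// score = s_dl + s_snap + s_fwd_z + s_clen + s_isol + s_main + s_fv
inline double total_score(const ScoreTerms& t) {
    return t.s_dl + t.s_snap + t.s_fwd_z + t.s_clen + t.s_isol + t.s_main + t.s_fv;
}

// Acceptance against dl_vtx_min_accept_score (new default 4.0).
inline bool accept(double score, double min_accept_score = 4.0) {
    return score >= min_accept_score;
}
```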

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The degree-1 terminus branch of use_replace in break_segments (introduced
in 7725a9c to absorb ~1.7 cm fold-back tail artifacts) was also absorbing
genuine Michel electrons.  Diagnostics on two events revealed the key
discriminant:

  Event 6852 (fold-back tail):    tv1-vs-tv2 angle = 27.8°, use_replace=1 ✓
  Event 6528 (12.94 MeV Michel):  tv1-vs-tv2 angle = 79.2°, use_replace=1 ✗

tv1 = end_v - start_v (whole segment direction)
tv2 = end_v - break_wcp (local stub direction)

A fold-back artifact at a track endpoint has the stub pointing in roughly
the same direction as the main track (small angle), because the fitter has
simply extended a few points past the true end.  A real physical secondary
(Michel electron, delta ray) diverges from the parent axis at a significant
angle.  kink_angle_at_break cannot discriminate (both cases ≈50–90°).

Add `angle < 45°` to the terminus condition:
  - fold-back tail (27.8°): still absorbed ✓
  - Michel electron (79.2°): falls through to break_segment_into_two ✓

Also removes the temporary std::cout diagnostic prints and #include<iostream>
added to PRSegmentFunctions.cxx during investigation.
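
The tv1-vs-tv2 test can be sketched as below, combining it with the earlier min_dis and kink_angle_at_break thresholds. `Vec3`, `stub_angle_deg`, and `absorb_terminus_stub` are hypothetical names; the real condition lives inside break_segments.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

inline Vec3 sub(const Vec3& a, const Vec3& b) {
    return Vec3{{a[0] - b[0], a[1] - b[1], a[2] - b[2]}};
}

// Opening angle between two vectors in degrees (0 if either has zero length).
inline double angle_deg(const Vec3& a, const Vec3& b) {
    const double d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    const double na = std::sqrt(a[0] * a[0] + a[1] * a[1] + a[2] * a[2]);
    const double nb = std::sqrt(b[0] * b[0] + b[1] * b[1] + b[2] * b[2]);
    if (na == 0.0 || nb == 0.0) return 0.0;
    double c = d / (na * nb);
    if (c > 1.0) c = 1.0;
    if (c < -1.0) c = -1.0;
    return std::acos(c) * 180.0 / std::acos(-1.0);
}

// tv1 = end_v - start_v (whole segment), tv2 = end_v - break_wcp (local stub).
inline double stub_angle_deg(const Vec3& start_v, const Vec3& end_v, const Vec3& break_wcp) {
    return angle_deg(sub(end_v, start_v), sub(end_v, break_wcp));
}

// Terminus condition: absorb the stub only when it is short, kinked, and
// roughly aligned with the parent track.
inline bool absorb_terminus_stub(double min_dis_cm, double kink_angle_at_break_deg,
                                 double tv_angle_deg) {
    return min_dis_cm < 2.0 && kink_angle_at_break_deg > 30.0 && tv_angle_deg < 45.0;
}
```

With the two angles quoted above, a 27.8° stub is absorbed and a 79.2° stub is not, provided the other two thresholds pass.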

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…a output

Previously m_fitted_charge_2d was cleared at the start of every
fill_fitted_charge_2d() call, so T_proj_data only captured the last
cluster-filtered do_multi_tracking() pass — all other beam-flash clusters
were silently dropped.

Add a per-cluster snapshot cache m_cluster_fitted_charge_2d that stores the
result of each fill_fitted_charge_2d() call keyed by m_cluster_filter.
Overwriting the same key on each re-fit gives "latest fit wins per cluster",
handling the case where a cluster is re-fit during examine_structure /
merge_vertices.  A new assemble_fitted_charge_2d() method merges all
snapshots into m_fitted_charge_2d at the end; TaggerCheckNeutrino::visit
calls it just before grouping.set_track_fitting() so write_proj_data sees
all clusters' cells with correct cluster_id and pred_charge.

Also consolidate kPlaneChOffset usage in UbooneMagnifyTrackingVisitor and
apply it consistently to pu/pv/pw fields in T_rec_charge.
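
The "latest fit wins per cluster" cache can be sketched with a keyed map. `ChargeCell` and `FittedChargeCacheSketch` are hypothetical, and an integer key stands in for m_cluster_filter; the real members hold the toolkit's own cell type.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for one fitted 2D charge cell.
struct ChargeCell {
    int cluster_id;
    int channel;
    int time_slice;
    double pred_charge;
};

class FittedChargeCacheSketch {
  public:
    // Store (or overwrite) the snapshot for one cluster: latest fit wins.
    void store_snapshot(int cluster_key, const std::vector<ChargeCell>& cells) {
        m_snapshots[cluster_key] = cells;
    }

    // Merge all per-cluster snapshots into one flat list, ordered by key.
    std::vector<ChargeCell> assemble() const {
        std::vector<ChargeCell> merged;
        for (const auto& kv : m_snapshots) {
            merged.insert(merged.end(), kv.second.begin(), kv.second.end());
        }
        return merged;
    }

  private:
    std::map<int, std::vector<ChargeCell>> m_snapshots;
};
```

Because the map is keyed by cluster, a re-fit during a later pass replaces that cluster's cells without touching the others.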

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New IEnsembleVisitor that opens the existing tracking ROOT file in
UPDATE mode and adds T_tagger (all TaggerInfo fields + nu_x/y/z) and
T_kine (all KineInfo fields), matching prototype output tree names.
Also adds tagger_output() factory to clus.jsonnet and a tagger
validation plan document.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New comparison binary that reads T_tagger and T_kine from a prototype
ROOT file and a toolkit ROOT file, compares all scalar (Float_t/Int_t)
and vector (vector<float>/vector<int>) branches event by event, and
groups results by originating tagger function (cosmic, gap, mip, ssm,
shw_sp, pio_family, stem, lem, br, tro, hol_lol, vis, numu, nue,
kine, etc.).  Writes per-category histograms and a T_summary tree to
report.root; prints a compact per-category table and exits non-zero if
any branch differs.  Replaces the earlier separate build-verify and
single-event manual inspection steps.
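
The per-branch bookkeeping behind such a report could be sketched as below, without ROOT. `BranchStat`, `compare_branch`, and `exit_code` are hypothetical names, values are assumed to be already read into vectors of double, and an exact comparison is assumed since the message only says "differs".

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct BranchStat {
    std::size_t n_cmp = 0;      // number of compared entries
    std::size_t n_diff = 0;     // number of entries that differ
    double max_abs_diff = 0.0;  // largest |proto - toolkit|
};

// Compare one branch entry by entry (exact comparison assumed).
inline BranchStat compare_branch(const std::vector<double>& proto,
                                 const std::vector<double>& toolkit) {
    BranchStat st;
    const std::size_t n = std::min(proto.size(), toolkit.size());
    for (std::size_t i = 0; i < n; ++i) {
        const double d = std::fabs(proto[i] - toolkit[i]);
        ++st.n_cmp;
        if (d > 0.0) {
            ++st.n_diff;
            st.max_abs_diff = std::max(st.max_abs_diff, d);
        }
    }
    return st;
}

// Non-zero exit status if any branch differs.
inline int exit_code(const std::vector<BranchStat>& stats) {
    for (const BranchStat& st : stats) {
        if (st.n_diff > 0) return 1;
    }
    return 0;
}
```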

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Section 5.4 now documents the actual C++ comparison app
(wire-cell-uboone-tagger-compare) replacing the earlier Python script
placeholder.  Implementation sequence table updated with DONE/TODO
status for each step.  Key file reference table extended with the new
visitor header/impl and comparison app.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ration

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Reclassify all SPDLOG_LOGGER_DEBUG/->debug() calls in clus sub-loggers
(clus.Cluster, clus.TrackFitting, clus.NeutrinoPattern, and other clus
component loggers) to TRACE level. MultiAlgBlobClustering high-level
flow logs remain at DEBUG.

With -L clus:debug (default), only high-level flow is shown. To re-enable
detail per subsystem, add e.g. -L clus.NeutrinoPattern:trace or -L clus:trace.
Requires building with --with-spdlog-active-level=trace (new wscript default).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Keep DEBUG only for configure-time roll-call, event milestones (loading
tensor set, Produce pctrees, performance summary, EOS), and the rare
diagnostic skip. Demote to TRACE: the dump() ensemble printouts (fired
before/after every visitor, ~60 lines/event), per-segment fitted-point
counts, and bee-points fill summaries.

With -L clus:debug the MABC log footprint drops from ~72 to ~22 lines/event.
Full detail returns with -L clus:trace.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pass "!V:Silent" to all TMVA::Reader constructors in UbooneNueBDTScorer
and UbooneNumuBDTScorer to suppress the verbose booking/loading messages
that fire at startup for every BDT reader.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- TaggerCheckNeutrino: add per-step DEBUG-level wall-clock timing in
  visit() gated by m_perf, covering preload, PR, vertex determination,
  taggers, kine fill, and a TOTAL line (~12 lines/event when enabled)
- NeutrinoPatternBase: fix init_first_segment always passing &cluster as
  main_cluster causing false flag/pointer mismatch WARN for non-main
  clusters; pass nullptr for non-main clusters instead; demote the
  nullptr fallback message from WARN to TRACE
- BlobSampler: demote wire-index out-of-range from WARN to TRACE since
  the boundary clamping is already correct; consolidate into one TRACE line

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends the existing `struct Perf` in MultiAlgBlobClustering with
std::chrono so each perf checkpoint emits a DEBUG-level line showing
the step name, its wall-clock duration, and cumulative time since the
start of operator(). Gated by `m_perf` (existing config knob).
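
A minimal sketch of a checkpoint timer of this kind, using only the standard library. `PerfSketch` is a hypothetical class, not the existing `struct Perf`; it returns the formatted line instead of logging it so that no logger is needed here, and the line format is invented for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <sstream>
#include <string>

class PerfSketch {
  public:
    using Clock = std::chrono::steady_clock;

    explicit PerfSketch(bool enabled)
      : m_enabled(enabled), m_start(Clock::now()), m_last(m_start) {}

    // Returns one line with the step name, its duration, and the cumulative
    // time since construction; returns an empty string when disabled.
    std::string checkpoint(const std::string& step) {
        if (!m_enabled) return "";
        const Clock::time_point now = Clock::now();
        const double step_ms =
            std::chrono::duration<double, std::milli>(now - m_last).count();
        const double total_ms =
            std::chrono::duration<double, std::milli>(now - m_start).count();
        m_last = now;
        std::ostringstream os;
        os << "perf: " << step << " step=" << step_ms << " ms total=" << total_ms << " ms";
        return os.str();
    }

  private:
    bool m_enabled;
    Clock::time_point m_start;
    Clock::time_point m_last;
};
```

`steady_clock` is used here because it is monotonic, which suits interval timing better than the wall-clock `system_clock`.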

Also minor whitespace cleanup in SteinerGrapher.cxx and
ClusteringRecoveringBundle.cxx (trailing spaces, missing newline).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rename three fields to match prototype output tree names:
  shw_sp_br3_7_shower_main_length → shw_sp_br3_7_main_length
  br3_7_shower_main_length        → br3_7_main_length
  numu_cc_3_acc_track_length      → numu_cc_3_track_length

Also remove match_isFC from the output visitor (not written by prototype).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
numu_tagger3.weights.xml was trained with "numu_cc_3_acc_track_length"
so the numu3 Reader::AddVariable must keep that label, even though the
TaggerInfo field and xgboost reader now use the prototype name
"numu_cc_3_track_length".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When -v is given and a branch differs, print the prototype and toolkit
values from the first differing event alongside the existing stats.
Scalars show inline; vectors print on separate indented lines.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add -n<NeutrinoTaggerInfo.h> option: parses the header at runtime to
build a branch→default map (float/int fields read {init}, vectors → []).
Verbose output now prints a fixed-width table with aligned columns for
n_diff/n_cmp, max|diff|, proto, toolkit, and default.  Vectors get a
two-row proto/toolkit dump below the stats line.
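
One way to build such a branch→default map is a line-by-line regex scan. The sketch below assumes each field sits on its own line in the form `float name{init};`, `int name{init};`, or `std::vector<float> name;`; the actual header layout and the parser in the comparison app may differ. `parse_defaults` is a hypothetical name.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Scan header text and map each field name to its default, as a string.
// Scalars take the text inside {...}; vectors map to "[]".
inline std::map<std::string, std::string> parse_defaults(const std::string& header_text) {
    static const std::regex scalar_re(R"(^\s*(?:float|int)\s+(\w+)\s*\{([^}]*)\}\s*;)");
    static const std::regex vector_re(R"(^\s*std::vector<\s*(?:float|int)\s*>\s+(\w+)\s*;)");
    std::map<std::string, std::string> defaults;
    std::istringstream in(header_text);
    std::string line;
    std::smatch m;
    while (std::getline(in, line)) {
        if (std::regex_search(line, m, scalar_re)) {
            defaults[m[1].str()] = m[2].str();
        } else if (std::regex_search(line, m, vector_re)) {
            defaults[m[1].str()] = "[]";
        }
    }
    return defaults;
}
```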

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
detect_type now returns SCALAR_D for Double_t leaves.  setup_branch_stat
allows mixed Double_t/Float_t pairs (canonicalised as SCALAR_D) by
reading each side into its native buffer and promoting to double in
compare_one.  Verbose output prints SCALAR_D values the same way as
SCALAR_F.  Fixes [SKIP unsupported type] for nu_x/y/z after prototype
changed their type to Double_t.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
nu_x/y/z are written under alias names in the output visitor but live
as kine_nu_x/y/z_corr in NeutrinoTaggerInfo.h.  Add a small static
alias table so the default-value lookup follows the indirection.
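
A small static alias table of this kind might look like the following. `resolve_alias` is a hypothetical name; the three mappings are the ones described above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Map an output branch name to the field name used in NeutrinoTaggerInfo.h.
// Names without an alias are returned unchanged.
inline std::string resolve_alias(const std::string& branch) {
    static const std::map<std::string, std::string> aliases = {
        {"nu_x", "kine_nu_x_corr"},
        {"nu_y", "kine_nu_y_corr"},
        {"nu_z", "kine_nu_z_corr"},
    };
    const auto it = aliases.find(branch);
    return it == aliases.end() ? branch : it->second;
}
```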

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The header is installed at <prefix>/include/WireCellClus/NeutrinoTaggerInfo.h
alongside the binary at <prefix>/bin/.  Use readlink(/proc/self/exe) to
find the prefix at runtime so defaults are shown without needing -n.
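
The prefix lookup can be sketched as below (Linux only, since it relies on /proc/self/exe). `self_exe_path` and `header_path_from_exe` are hypothetical names; the sketch assumes the binary sits exactly one directory below the prefix, in bin/.

```cpp
#include <cassert>
#include <cstddef>
#include <limits.h>
#include <string>
#include <unistd.h>

// Resolve the running executable's path; empty string on failure.
// readlink() does not null-terminate, so the returned length is used.
inline std::string self_exe_path() {
    char buf[PATH_MAX];
    const ssize_t n = ::readlink("/proc/self/exe", buf, sizeof(buf) - 1);
    if (n <= 0) return "";
    return std::string(buf, static_cast<std::size_t>(n));
}

// Given <prefix>/bin/<binary>, return the installed header path under <prefix>.
inline std::string header_path_from_exe(const std::string& exe_path) {
    const std::size_t slash1 = exe_path.rfind('/');
    if (slash1 == std::string::npos || slash1 == 0) return "";
    const std::size_t slash2 = exe_path.rfind('/', slash1 - 1);
    if (slash2 == std::string::npos) return "";
    return exe_path.substr(0, slash2) + "/include/WireCellClus/NeutrinoTaggerInfo.h";
}
```

A caller would try `header_path_from_exe(self_exe_path())` first and fall back to the -n option when the result is empty or the file does not exist.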

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…se mode

Two additions to the -v output:

1. Fingerprint cluster report (before the per-branch table): groups scalar
   branches that share identical (proto, toolkit) first-diff values and prints
   any cluster of >= 3 branches.  This surfaces the common pattern where one
   upstream quantity (e.g. main-shower energy, main-shower length, main vertex)
   drives dozens of apparent diffs.  For the nue_5384_132_6604 event the
   energy cluster alone has 45 members, collapsing the apparent 16-category
   diff into ~5 root causes.

2. 'note' column in the per-branch table: sentinel_note() flags rows where one
   side is a known sentinel and the other is not:
     |v| >= 1e7  ->  [SENT proto] or [SENT tool]  (no-segment-found style)
     int == -1   ->  [UNINIT proto] or [UNINIT tool]
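
Both additions can be sketched in a few lines. The names below (`float_note`, `int_note`, `FirstDiff`, `fingerprint_clusters`) are hypothetical and the real sentinel_note() signature is not shown in this message; the thresholds are the ones listed above. NaN values are not handled in this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

inline bool is_sentinel(double v) { return std::fabs(v) >= 1e7; }

// Flag rows where exactly one side holds a large-magnitude sentinel.
inline std::string float_note(double proto, double toolkit) {
    if (is_sentinel(proto) && !is_sentinel(toolkit)) return "[SENT proto]";
    if (!is_sentinel(proto) && is_sentinel(toolkit)) return "[SENT tool]";
    return "";
}

// Flag rows where exactly one side is the -1 "uninitialized" marker.
inline std::string int_note(int proto, int toolkit) {
    if (proto == -1 && toolkit != -1) return "[UNINIT proto]";
    if (proto != -1 && toolkit == -1) return "[UNINIT tool]";
    return "";
}

struct FirstDiff {
    std::string branch;
    double proto;
    double toolkit;
};

// Group branches sharing identical (proto, toolkit) first-diff values and
// keep only groups with at least min_size members.
inline std::map<std::pair<double, double>, std::vector<std::string>>
fingerprint_clusters(const std::vector<FirstDiff>& diffs, std::size_t min_size = 3) {
    std::map<std::pair<double, double>, std::vector<std::string>> groups;
    for (const FirstDiff& d : diffs) {
        groups[std::make_pair(d.proto, d.toolkit)].push_back(d.branch);
    }
    for (auto it = groups.begin(); it != groups.end();) {
        if (it->second.size() < min_size) it = groups.erase(it);
        else ++it;
    }
    return groups;
}
```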

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
3 participants