Conversation
New API pipeline (mothbot_insect_orders_2025) pairing the Mothbot YOLO11m-OBB detector with the existing InsectOrderClassifier2025. Architecture: detector_cls class attribute on APIMothClassifier, CLASSIFIER_CHOICES renamed to PIPELINE_CHOICES, optional rotation field on DetectionResponse. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
YOLO weights and ultralytics are AGPL-3 (compatible with our AGPL-3 project). Mothbot repo has no explicit license, so we re-implement rather than verbatim-port; one adapted snippet (torch 2.6 weights_only fallback) will be attributed inline. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step-by-step plan derived from the spec at docs/superpowers/specs/2026-04-14-mothbot-detection-pipeline-design.md. Nine tasks, TDD-style, each ending with a green test run and a commit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The dict maps pipeline slug to the classifier class, but it's used as the pipeline registry. Rename for honesty. No behavior change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces a detector_cls class attribute on APIMothClassifier, defaulting to APIMothDetector (FasterRCNN 2023). The /process and /info handlers now read Classifier.detector_cls instead of a hardcoded reference. No behavior change — every existing pipeline keeps the default. Enables pairing a non-FasterRCNN detector with a specific classifier in a future commit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
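The class-attribute pattern described above can be sketched with stand-in classes (a minimal illustration, not the actual repo code; the real detector classes carry weights, category maps, and inference machinery):

```python
class APIMothDetector:
    """Stand-in for the default FasterRCNN 2023 detector."""
    name = "fasterrcnn_for_ami_moth_traps_2023"

class MothbotDetector:
    """Stand-in for a non-default detector (e.g. YOLO11m-OBB)."""
    name = "mothbot_yolo11m_creature_detector"

class APIMothClassifier:
    # Every existing pipeline keeps this default unless it overrides it.
    detector_cls = APIMothDetector

class MothbotInsectOrderClassifier(APIMothClassifier):
    # A pipeline that pairs a different detector with this classifier.
    detector_cls = MothbotDetector

def resolve_detector(classifier_cls):
    # Handlers read the class attribute instead of a hardcoded reference.
    return classifier_cls.detector_cls
```

Because resolution goes through the class attribute, existing pipelines see no behavior change, while a subclass can swap detectors with a one-line override.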
Forward-looking schema addition for detectors that produce oriented bounding boxes (first consumer: Mothbot YOLO11m-OBB in a follow-up commit). Existing detectors leave it None. The downstream classifier still crops axis-aligned; this field is preserved so a future species classifier can use it for rotated crops without re-running detection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Required for the Mothbot YOLO11m detector (follow-up commit). No code imports it in this commit. Note: ultralytics is AGPL-3.0. This is not a license escalation — the project is already AGPL-3 (PR #137). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implements MothObjectDetector_YOLO11m_Mothbot, a single-class
('creature') insect detector trained by Digital Naturalism
Laboratories. Weights hosted on Arbutus and lazily downloaded
via the existing InferenceBaseClass machinery.
Adds a YoloDetection dataclass and a _corners_to_yolo_detection
helper that converts OBB corners into an axis-aligned envelope +
rotation angle, with unit tests on the coordinate math.
The torch 2.6 weights_only fallback is adapted from
Mothbot_Process/pipeline/detect.py (unlicensed repo; pattern is
standard ultralytics PyTorch 2.6 compat handling).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
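The corner-to-envelope conversion described above can be sketched roughly as follows (a pure-Python illustration; the real helper's name, signature, and angle convention come from the repo and may differ, e.g. it may derive the angle via cv2.minAreaRect):

```python
import math

def corners_to_envelope(corners):
    """Convert four OBB corner points [(x, y), ...] into an axis-aligned
    envelope (x1, y1, x2, y2) plus a rotation angle in degrees.
    Illustrative sketch only."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    x1, y1, x2, y2 = min(xs), min(ys), max(xs), max(ys)
    # Angle of the first box edge relative to the x-axis.
    dx = corners[1][0] - corners[0][0]
    dy = corners[1][1] - corners[0][1]
    rotation = math.degrees(math.atan2(dy, dx))
    return x1, y1, x2, y2, rotation

# An axis-aligned box yields rotation 0; a tilted box does not.
print(corners_to_envelope([(0, 0), (4, 0), (4, 2), (0, 2)]))  # (0, 0, 4, 2, 0.0)
```

The min/max envelope is what the downstream classifier crops; the angle is carried separately so rotated crops remain possible later.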
Wraps MothObjectDetector_YOLO11m_Mothbot for the /process endpoint:
consumes SourceImage objects, builds DetectionResponses with the new
rotation field populated from the YOLO-OBB angle. No pipeline uses
this detector yet — registration follows in the next commit.
Also fixes a bug in MothObjectDetector_YOLO11m_Mothbot.get_labels:
the base class __init__ called get_labels(None) and overwrote the
class-level category_map with {}; the override returns the class-level
map when no labels_path is set.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pairs the Mothbot YOLO11m detector with the existing InsectOrderClassifier2025 (ConvNeXt-T, 16 insect orders). Binary prefilter is skipped — same policy as the existing insect_orders_2025 pipeline, since the order classifier already distinguishes non-moth insects. Also tightens should_filter_detections() to use issubclass() so subclasses of the exempt classifier set inherit the policy. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
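The issubclass()-based policy can be illustrated with stand-in classes (the real exempt set and function live in the repo; names here are simplified):

```python
class InsectOrderClassifier:
    """Stand-in for a classifier exempt from the binary prefilter."""

class MothbotInsectOrderClassifier(InsectOrderClassifier):
    """A subclass inherits the exemption automatically."""

class SpeciesClassifier:
    """Any other classifier still gets the binary prefilter."""

# Order-level classifiers already separate non-moth insects, so the
# moth/non-moth prefilter is redundant for them.
NO_FILTER_CLASSIFIERS = (InsectOrderClassifier,)

def should_filter_detections(classifier_cls):
    # issubclass() (rather than an identity or membership check) lets
    # subclasses of an exempt classifier inherit the policy.
    return not issubclass(classifier_cls, NO_FILTER_CLASSIFIERS)
```

With a plain `classifier_cls in NO_FILTER_CLASSIFIERS` check, the Mothbot subclass would wrongly fall back to filtering; `issubclass()` keeps the policy inherited.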
Sends one test image through the /process endpoint with the mothbot_insect_orders_2025 slug, asserts detections are returned, at least one has a populated rotation field, and each has an order-level terminal classification. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
📝 Walkthrough

This PR adds a new pipeline slug, `mothbot_insect_orders_2025`.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant APIServer as API Server\n/process
    participant Detector as YOLO11m Detector
    participant Classifier as InsectOrder Classifier
    participant Response as PipelineResponse

    Client->>APIServer: POST /process (pipeline=mothbot_insect_orders_2025)
    APIServer->>Detector: Classifier.detector_cls.run(source images)
    Detector->>Detector: get_model() -> ultralytics.YOLO
    Detector->>Detector: predict_batch(imgs) -> ultralytics.predict(imgsz, conf, max_det)
    Detector->>Detector: post_process_single() -> YoloDetection (x1,y1,x2,y2,rotation,score)
    Detector-->>APIServer: DetectionResponse[] (rotation populated when available)
    APIServer->>Classifier: classify detections
    Classifier-->>APIServer: classifications (including terminal flags)
    APIServer-->>Response: assemble PipelineResponse
    Response-->>Client: 200 OK with detections + classifications
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
trapdata/api/api.py (1)
143-163: ⚠️ Potential issue | 🟠 Major

`/info` generation still cold-loads pipeline models at startup.

`make_pipeline_config_response()` constructs both the detector and classifier just to read metadata. Because `initialize_service_info()` calls this for every entry in `PIPELINE_CHOICES`, adding the Mothbot pipeline now makes startup/readiness depend on downloading and loading its weights too. That turns metadata generation into a heavy, failure-prone cold-start path instead of keeping model loading lazy.

A safer pattern here is to build `PipelineConfigResponse` from class-level metadata (name, description, weights path, category map, detector pairing) without instantiating the models.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@trapdata/api/api.py` around lines 143 - 163, The pipeline config builder is cold-instantiating models (via Classifier(), detector = Classifier.detector_cls(), MothClassifierBinary()) which forces weights download; change make_pipeline_config_response / initialize_service_info to assemble PipelineConfigResponse from class-level metadata instead of constructing instances: read attributes like Classifier.name / description / weights_path / category_map and Classifier.detector_cls (use its class-level metadata, not detector_cls(...)), and whether should_filter_detections(Classifier) is true to add a binary entry using MothClassifierBinary's class attributes (not instantiation); update make_algorithm_config_response callers to accept class-metadata objects or build AlgorithmConfigResponse directly from those class attributes so no model constructors are called during /info generation.
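The lazy-metadata pattern the comment recommends could look roughly like this (hypothetical attribute and function names; the actual response models and classifier attributes in the repo differ):

```python
def make_pipeline_config(classifier_cls):
    """Build pipeline metadata from class attributes alone, so the /info
    path never constructs a model or downloads weights. Sketch only."""
    return {
        "name": classifier_cls.name,
        "detector": classifier_cls.detector_cls.name,
        "num_classes": len(classifier_cls.category_map),
    }

class DummyDetector:
    name = "dummy-detector"

class DummyClassifier:
    name = "dummy-classifier"
    detector_cls = DummyDetector
    category_map = {0: "a", 1: "b"}

    def __init__(self):
        # Stands in for the heavy weights download; the point of the
        # pattern is that this is never called during /info generation.
        raise RuntimeError("would download weights")

config = make_pipeline_config(DummyClassifier)
```

Since only class attributes are read, startup stays cheap even if a pipeline's weights are unreachable.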
🧹 Nitpick comments (1)
trapdata/api/models/classification.py (1)
40-43: Widen `detector_cls`'s type annotation to `APIInferenceBaseClass`.

`APIMothDetector_YOLO11m_Mothbot` is not a subclass of `APIMothDetector` — they are siblings inheriting separately from `APIInferenceBaseClass`. The type annotation on line 43 (`type[APIMothDetector]`) does not match the override on line 248 (`detector_cls = APIMothDetector_YOLO11m_Mothbot`), creating a type contract mismatch. Change the annotation to `type[APIInferenceBaseClass]` to accurately reflect the supported detector types.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@trapdata/api/models/classification.py` around lines 40 - 43, Change the type annotation of the detector_cls attribute from type[APIMothDetector] to type[APIInferenceBaseClass] so subclasses like APIMothDetector_YOLO11m_Mothbot (which inherit from APIInferenceBaseClass, not APIMothDetector) can override it without violating the type contract; update the annotation on the detector_cls declaration and ensure APIInferenceBaseClass is imported or accessible in classification.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pyproject.toml`:
- Line 34: The pyproject.toml license metadata is incorrect (declares MIT) and
must match the repository LICENSE (AGPL-3); update the license field in
pyproject.toml (change the value of the license key from "MIT" to "AGPL-3.0" or
"AGPL-3") so the package metadata reflects the actual AGPL-3 license.
In `@trapdata/antenna/worker.py`:
- Around line 428-429: The code currently indexes PIPELINE_CHOICES[pipeline]
which can raise KeyError for stale/invalid pipeline slugs; update the processing
path to guard resolution by checking membership first (e.g., if pipeline not in
PIPELINE_CHOICES) and handle unknown slugs by logging a warning and
returning/acknowledging/skipping the message instead of proceeding, otherwise
set classifier_class = PIPELINE_CHOICES[pipeline] and continue to call
should_filter_detections(classifier_class); ensure the skip path uses the same
failure/ack behavior as other non-retriable errors to avoid repeated failed
processing cycles.
In `@trapdata/ml/models/localization.py`:
- Around line 395-400: The rotation is computed from cv2.minAreaRect but pts are
being cast to int32, losing sub-pixel precision; remove the pts.astype(np.int32)
cast and pass the float32 pts directly to cv2.minAreaRect so the rect angle
preserves sub-pixel accuracy (update the call that creates rect from pts and
keep the rest returning YoloDetection(x1, y1, x2, y2, rotation=angle,
score=...)).
---
Outside diff comments:
In `@trapdata/api/api.py`:
- Around line 143-163: The pipeline config builder is cold-instantiating models
(via Classifier(), detector = Classifier.detector_cls(), MothClassifierBinary())
which forces weights download; change make_pipeline_config_response /
initialize_service_info to assemble PipelineConfigResponse from class-level
metadata instead of constructing instances: read attributes like Classifier.name
/ description / weights_path / category_map and Classifier.detector_cls (use its
class-level metadata, not detector_cls(...)), and whether
should_filter_detections(Classifier) is true to add a binary entry using
MothClassifierBinary's class attributes (not instantiation); update
make_algorithm_config_response callers to accept class-metadata objects or build
AlgorithmConfigResponse directly from those class attributes so no model
constructors are called during /info generation.
---
Nitpick comments:
In `@trapdata/api/models/classification.py`:
- Around line 40-43: Change the type annotation of the detector_cls attribute
from type[APIMothDetector] to type[APIInferenceBaseClass] so subclasses like
APIMothDetector_YOLO11m_Mothbot (which inherit from APIInferenceBaseClass, not
APIMothDetector) can override it without violating the type contract; update the
annotation on the detector_cls declaration and ensure APIInferenceBaseClass is
imported or accessible in classification.py.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 472a8862-393f-4f62-ba32-c2cdd6bff870
⛔ Files ignored due to path filters (1)
`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (17)
- docs/superpowers/plans/2026-04-14-mothbot-detection-pipeline.md
- docs/superpowers/specs/2026-04-14-mothbot-detection-pipeline-design.md
- pyproject.toml
- trapdata/antenna/registration.py
- trapdata/antenna/worker.py
- trapdata/api/api.py
- trapdata/api/models/classification.py
- trapdata/api/models/localization.py
- trapdata/api/schemas.py
- trapdata/api/tests/test_api.py
- trapdata/api/tests/test_mothbot_pipeline.py
- trapdata/api/tests/utils.py
- trapdata/cli/base.py
- trapdata/cli/worker.py
- trapdata/ml/models/localization.py
- trapdata/ml/models/tests/__init__.py
- trapdata/ml/models/tests/test_mothbot_yolo.py
```python
classifier_class = PIPELINE_CHOICES[pipeline]
use_binary_filter = should_filter_detections(classifier_class)
```

Guard unknown pipeline slugs before classifier resolution.

Line 428 can throw `KeyError` for stale/invalid queue payloads, which turns into repeated failed processing cycles instead of a controlled skip path.

Suggested resilience fix:

```diff
-classifier_class = PIPELINE_CHOICES[pipeline]
+classifier_class = PIPELINE_CHOICES.get(pipeline)
+if classifier_class is None:
+    logger.error(f"Unknown pipeline slug received from queue: {pipeline}")
+    return False
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```python
classifier_class = PIPELINE_CHOICES.get(pipeline)
if classifier_class is None:
    logger.error(f"Unknown pipeline slug received from queue: {pipeline}")
    return False
use_binary_filter = should_filter_detections(classifier_class)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@trapdata/antenna/worker.py` around lines 428 - 429, The code currently
indexes PIPELINE_CHOICES[pipeline] which can raise KeyError for stale/invalid
pipeline slugs; update the processing path to guard resolution by checking
membership first (e.g., if pipeline not in PIPELINE_CHOICES) and handle unknown
slugs by logging a warning and returning/acknowledging/skipping the message
instead of proceeding, otherwise set classifier_class =
PIPELINE_CHOICES[pipeline] and continue to call
should_filter_detections(classifier_class); ensure the skip path uses the same
failure/ack behavior as other non-retriable errors to avoid repeated failed
processing cycles.
Claude says: Valid concern but deferring — this behavior is pre-existing (same code as pre-rename CLASSIFIER_CHOICES[pipeline]), not introduced by this PR. Unknown slugs in the queue indicate an out-of-sync registration/worker deployment, which today fails loudly with a KeyError + retry loop. Converting to a quiet skip is a behavior change worth its own PR with explicit thought about how to surface the misconfiguration (metrics? dead-letter?). Leaving for a follow-up.
@mihow — Understood, that's a reasonable call. The pre-existing loud failure on an unknown slug does at least surface the misconfiguration immediately (stack trace in logs), whereas a quiet skip could silently drop queue messages without any signal. A dedicated follow-up that considers the full observability story (metrics, dead-letter, alerting) is the right place for this.
Would you like me to open a follow-up GitHub issue to track this, so it doesn't get lost?
- pyproject.toml: license metadata MIT -> AGPL-3.0 to match LICENSE (AGPLv3 since PR #137). Metadata was stale, not specific to this PR, but directly relates to the licensing claims in this PR's description.
- trapdata/ml/models/localization.py: drop int32 cast before cv2.minAreaRect. cv2 accepts float points and the cast was dropping ~0.2 deg of rotation precision on shallow-angled boxes. Verified empirically.
- trapdata/api/models/classification.py: widen detector_cls annotation from type[APIMothDetector] to type[APIInferenceBaseClass]. APIMothDetector_YOLO11m_Mothbot is a sibling of APIMothDetector, not a subclass — common ancestor is APIInferenceBaseClass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Claude says: Addressed 3 of the 4 CodeRabbit findings in 3665064. Summary:

Not addressed — 🟠 valid and real, but pre-existing — the proper fix (building `PipelineConfigResponse` from class-level metadata) is deferred to a follow-up.
🧹 Nitpick comments (1)
trapdata/ml/models/localization.py (1)
466: Consider making `category_map` immutable or annotating it as `ClassVar`.

Ruff flags this mutable dict as a class attribute (RUF012). While the code never mutates it, accidental modification would affect all instances. Two options:

- Annotate with `ClassVar` to document intent
- Use an immutable mapping

♻️ Option 1: ClassVar annotation

```diff
+from typing import ClassVar
+
 class MothObjectDetector_YOLO11m_Mothbot(ObjectDetector):
     ...
-    category_map = {0: "creature"}
+    category_map: ClassVar[dict[int, str]] = {0: "creature"}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@trapdata/ml/models/localization.py` at line 466, The class-level mutable dict category_map should be made immutable or explicitly annotated as a class variable to avoid accidental instance-wide mutation: either annotate it as ClassVar[Dict[int, str]] (add "from typing import ClassVar, Dict" and change the declaration to "category_map: ClassVar[Dict[int, str]] = {0: 'creature'}") or convert it to an immutable mapping (e.g., wrap with types.MappingProxyType by importing MappingProxyType and doing "category_map = MappingProxyType({0: 'creature'})"), or mark it Final (from typing import Final) if you prefer; update imports accordingly and ensure references use the new symbol name category_map.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@trapdata/ml/models/localization.py`:
- Line 466: The class-level mutable dict category_map should be made immutable
or explicitly annotated as a class variable to avoid accidental instance-wide
mutation: either annotate it as ClassVar[Dict[int, str]] (add "from typing
import ClassVar, Dict" and change the declaration to "category_map:
ClassVar[Dict[int, str]] = {0: 'creature'}") or convert it to an immutable
mapping (e.g., wrap with types.MappingProxyType by importing MappingProxyType
and doing "category_map = MappingProxyType({0: 'creature'})"), or mark it Final
(from typing import Final) if you prefer; update imports accordingly and ensure
references use the new symbol name category_map.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cd952a92-0fae-49e5-9021-e0c1aab2f512
📒 Files selected for processing (3)
- pyproject.toml
- trapdata/api/models/classification.py
- trapdata/ml/models/localization.py
🚧 Files skipped from review as they are similar to previous changes (2)
- pyproject.toml
- trapdata/api/models/classification.py
Antenna's pipeline registry keys on the pipeline name. Without this, `mothbot_insect_orders_2025` and `insect_orders_2025` both registered as "Insect Order Classifier" and the second POST silently dedupes against the existing record, so the mothbot pipeline never appears in the Antenna UI. Override name on MothbotInsectOrderClassifier so Antenna treats it as a separate pipeline. The terminal classifier's algorithm key changes to `insect_order_classifier_mothbot_yolo_detector`; updated the integration test to match. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
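The silent dedupe described above follows from a name-keyed registry; a stand-in sketch (not Antenna's actual registration code):

```python
registry = {}

def register_pipeline(pipeline_cls):
    """Stand-in for registration keyed on pipeline name. A second POST
    with an already-seen name is silently deduped against the existing
    record, so a name collision hides the new pipeline."""
    registry.setdefault(pipeline_cls.name, pipeline_cls)

class InsectOrderClassifier:
    name = "Insect Order Classifier"

class MothbotInsectOrderClassifier(InsectOrderClassifier):
    # The fix: override name so the subclass registers as a
    # separate pipeline instead of colliding with its parent.
    name = "Mothbot Insect Order Classifier"

register_pipeline(InsectOrderClassifier)
register_pipeline(MothbotInsectOrderClassifier)
```

Without the `name` override, the subclass inherits the parent's name and the second `register_pipeline` call is a no-op.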
… APIMothDetector The original detector_cls refactor in #141 updated the API path (api.py) but missed the antenna worker path. When a pipeline declares a non-default detector (e.g. mothbot_insect_orders_2025 with YOLO11m), the worker was still running FasterRCNN detection, then posting results with the FasterRCNN algorithm key, and Antenna rejected them because the /info endpoint declared only the YOLO and order-classifier algorithms. NATS re-delivered the message in a loop. Discovered when testing job 1492 on the local Antenna at beast: Error processing pipeline result for job 1492: Detection algorithm fasterrcnn_for_ami_moth_traps_2023 is not a known algorithm. Known: ['insect_order_classifier_mothbot_yolo_detector', 'mothbot_yolo11m_creature_detector']. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Mothbot YOLO detector's predict_batch strictly required list[PIL.Image], which works for the API/ml-layer path but fails for the antenna worker path. The antenna RESTDataset applies torchvision.transforms.ToTensor() to images before they reach the detector, producing tensor batches that the original check rejected with: Mothbot YOLO11m Creature Detector expects a list of PIL images from the collate fn; got <class 'torch.Tensor'> FasterRCNN's detection path already accepts both, which is why this only surfaced when wiring a YOLO-based pipeline into the worker. Accept three input forms now: - list[PIL.Image] (ML-layer dataloader, unchanged) - torch.Tensor (B, C, H, W) from REST stacked batches - list[torch.Tensor] (C, H, W) from REST mixed-size fallback For tensor inputs, convert back to HWC uint8 numpy so ultralytics does its own letterboxing / normalization at imgsz=1600, matching the PIL path's behavior. Discovered running job 1493 through the beast Antenna worker. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
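The three accepted input forms can be sketched with numpy standing in for torch tensors (the real code converts torch tensors and PIL images; shapes and dtype semantics follow the commit message, the function name is hypothetical):

```python
import numpy as np

def to_hwc_uint8_batch(batch):
    """Normalize detector input to a list of (H, W, C) uint8 arrays.
    Accepts: a stacked (B, C, H, W) array, or a list of (C, H, W)
    float arrays in [0, 1] (mixed-size fallback). Sketch only."""
    if isinstance(batch, np.ndarray) and batch.ndim == 4:
        items = list(batch)              # stacked batch -> per-image arrays
    elif isinstance(batch, (list, tuple)):
        items = list(batch)              # already a list of images
    else:
        raise TypeError(f"unsupported batch type: {type(batch)}")

    out = []
    for img in items:
        if img.ndim == 3 and img.shape[0] in (1, 3) and img.dtype != np.uint8:
            # (C, H, W) float [0, 1] -> (H, W, C) uint8, so the detector
            # can do its own letterboxing/normalization downstream.
            img = (np.transpose(img, (1, 2, 0)) * 255.0).round().astype(np.uint8)
        out.append(img)
    return out
```

Handing the detector plain HWC uint8 images keeps the tensor path consistent with the PIL path, as the commit describes.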
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@trapdata/ml/models/localization.py`:
- Around line 575-583: The override of save_results in
MothObjectDetector_YOLO11m_Mothbot is turning the ML-layer save path into a
silent no-op and dropping detections; instead, convert the YOLO-specific
batch_output (YoloDetection entries) into the base bbox payload and call the
base persistence logic (e.g., call super().save_results or reuse
ObjectDetector.save_results) so non-API code still persists results like
FasterRCNN; locate save_results in MothObjectDetector_YOLO11m_Mothbot, map each
YoloDetection -> bbox format expected by ObjectDetector (preserving score,
class, and coords), and then invoke the base save routine rather than logging a
no-op.
- Around line 420-449: The fallback patch that monkey-patches _torch.load and
toggles TORCH_FORCE_WEIGHTS_ONLY_LOAD / TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD must be
serialized to avoid races: add a module-level threading.Lock (e.g.,
_torch_load_patch_lock) and acquire it before saving
original_load/original_force_* and before setting _torch.load, os.environ,
constructing YOLO(str(weights_path)); release the lock in the finally block
after restoring _torch.load and env vars; also import threading at top if not
present. This ensures the _patched_load, _torch.load replacement, and
environment variable changes around YOLO(...) are atomic across threads.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 88b1a7ad-3b51-4184-aaa3-a1804f51d258
📒 Files selected for processing (4)
- trapdata/antenna/worker.py
- trapdata/api/models/classification.py
- trapdata/api/tests/test_mothbot_pipeline.py
- trapdata/ml/models/localization.py
✅ Files skipped from review due to trivial changes (1)
- trapdata/api/tests/test_mothbot_pipeline.py
🚧 Files skipped from review as they are similar to previous changes (2)
- trapdata/antenna/worker.py
- trapdata/api/models/classification.py
```python
except Exception as err:
    if "Weights only load failed" not in str(err):
        raise

    logger.info(
        "Retrying YOLO load with torch.load(weights_only=False) compatibility "
        "(trusted local checkpoint)"
    )
    original_load = _torch.load
    original_force_wo = os.environ.get("TORCH_FORCE_WEIGHTS_ONLY_LOAD")
    original_force_no_wo = os.environ.get("TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD")

    def _patched_load(*args, **kwargs):
        kwargs["weights_only"] = False
        return original_load(*args, **kwargs)

    _torch.load = _patched_load
    try:
        os.environ["TORCH_FORCE_WEIGHTS_ONLY_LOAD"] = "0"
        os.environ["TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD"] = "1"
        return YOLO(str(weights_path))
    finally:
        _torch.load = original_load
        if original_force_wo is None:
            os.environ.pop("TORCH_FORCE_WEIGHTS_ONLY_LOAD", None)
        else:
            os.environ["TORCH_FORCE_WEIGHTS_ONLY_LOAD"] = original_force_wo
        if original_force_no_wo is None:
            os.environ.pop("TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD", None)
        else:
```
🧩 Analysis chain
🌐 Web query:
For Ultralytics 8.3 with PyTorch 2.5/2.6, is there a supported way to load local YOLO checkpoints without temporarily monkey-patching `torch.load` or changing `TORCH_FORCE_WEIGHTS_ONLY_LOAD`/`TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD` process-wide?
💡 Result:
No, there is no supported way to load local YOLO checkpoints in Ultralytics 8.3 with PyTorch 2.5/2.6 without monkey-patching or environment variables. Ultralytics addresses the PyTorch weights_only=True default (introduced in 2.6) by internally patching torch.load calls with a torch_load wrapper that sets weights_only=False only where needed (PR #21260, merged in v8.3.162). This is a localized patch, not a global monkey-patch of torch.load or process-wide env var change. Standard usage like model = YOLO('path/to/local.pt') works seamlessly without user intervention, as confirmed in source code importing from ultralytics.utils.patches import torch_load and model loading via load_checkpoint which uses the patched loader. Issues mentioning errors were due to outdated Ultralytics versions; upgrading resolves them without user-side patches.
Citations:
- 1: ultralytics 8.3.162: "Replace `torch.load` calls with patched `torch_load` method that defaults to `weights_only=False`" (ultralytics/ultralytics#21260)
- 3: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/utils/__init__.py
- 4: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/engine/model.py
🏁 Script executed:

```shell
# Find Ultralytics version requirement
find . -type f \( -name "requirements*.txt" -o -name "pyproject.toml" -o -name "setup.py" -o -name "setup.cfg" \) | head -20
```

Repository: RolnickLab/ami-data-companion
🏁 Script executed:

```shell
# Look at the full _load_ultralytics_yolo function
rg -n "_load_ultralytics_yolo" -A 100 trapdata/ml/models/localization.py | head -150
```

Repository: RolnickLab/ami-data-companion
🏁 Script executed:
# Check for synchronization mechanisms (locks, etc) in the file
rg -n "Lock|lock|mutex|RLock|threading" trapdata/ml/models/localization.pyRepository: RolnickLab/ami-data-companion
Length of output: 231
🏁 Script executed:
# Check Ultralytics version in pyproject.toml
cat pyproject.toml | grep -i ultralyticsRepository: RolnickLab/ami-data-companion
Length of output: 96
Add synchronization to the torch.load fallback path for concurrent cold loads.
This fallback only triggers on "Weights only load failed" errors. If two models fail to load and simultaneously enter this path, both threads will, without any protection: (1) replace `_torch.load` globally, (2) modify the process env vars `TORCH_FORCE_WEIGHTS_ONLY_LOAD` / `TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD`, and (3) restore them in the finally block. A concurrent load observing partial restoration can see inconsistent state, making startup nondeterministic.
Protect this block with a module-level lock, or consider upgrading Ultralytics to 8.3.162+, which includes a built-in localized torch_load patch (PR #21260) that avoids user-side monkey-patching entirely. Standard YOLO(weights_path) should then work without fallback.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@trapdata/ml/models/localization.py` around lines 420 - 449, The fallback
patch that monkey-patches _torch.load and toggles TORCH_FORCE_WEIGHTS_ONLY_LOAD
/ TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD must be serialized to avoid races: add a
module-level threading.Lock (e.g., _torch_load_patch_lock) and acquire it before
saving original_load/original_force_* and before setting _torch.load,
os.environ, constructing YOLO(str(weights_path)); release the lock in the
finally block after restoring _torch.load and env vars; also import threading at
top if not present. This ensures the _patched_load, _torch.load replacement, and
environment variable changes around YOLO(...) are atomic across threads.
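The serialization the comment asks for can be sketched generically (a hypothetical helper; the real fix would wrap the torch.load/env-var block in localization.py with a module-level lock):

```python
import threading
from types import SimpleNamespace

_patch_lock = threading.Lock()

def with_patched_attr(obj, attr, replacement, fn):
    """Temporarily replace obj.attr with `replacement` while running fn().
    The lock serializes the patch so concurrent callers never observe a
    half-restored process-wide state."""
    with _patch_lock:
        original = getattr(obj, attr)
        setattr(obj, attr, replacement)
        try:
            return fn()
        finally:
            setattr(obj, attr, original)

# Demo with a fake torch module standing in for the real one.
fake_torch = SimpleNamespace(load=lambda: "weights_only=True path")
result = with_patched_attr(
    fake_torch, "load", lambda: "weights_only=False path",
    lambda: fake_torch.load(),
)
```

The env-var toggles would be saved and restored inside the same locked block, so the patch-run-restore sequence is atomic across threads.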
```python
def save_results(self, item_ids, batch_output, *args, **kwargs):
    """The ML-layer base class expects a save method. The API wrapper
    overrides this, so the DB path is never hit when used via the API.
    Provide a no-op that logs, for symmetry with the FasterRCNN class.
    """
    logger.info(
        f"{self.name} ML-layer save_results called with {len(item_ids)} items "
        "(no-op; API wrapper handles persistence)"
    )
```
Don't turn the ML-layer save path into a silent no-op.
ObjectDetector.save_results() persists detections, but this override drops everything on the floor. The API wrapper already has its own save_results(), so this only hurts non-API uses of MothObjectDetector_YOLO11m_Mothbot, where detections will now disappear silently instead of being written like the FasterRCNN detectors. Please either map YoloDetection into the base bbox payload here or keep the no-op confined to the API-only wrapper.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@trapdata/ml/models/localization.py` around lines 575 - 583, The override of
save_results in MothObjectDetector_YOLO11m_Mothbot is turning the ML-layer save
path into a silent no-op and dropping detections; instead, convert the
YOLO-specific batch_output (YoloDetection entries) into the base bbox payload
and call the base persistence logic (e.g., call super().save_results or reuse
ObjectDetector.save_results) so non-API code still persists results like
FasterRCNN; locate save_results in MothObjectDetector_YOLO11m_Mothbot, map each
YoloDetection -> bbox format expected by ObjectDetector (preserving score,
class, and coords), and then invoke the base save routine rather than logging a
no-op.
YOLO-OBB can produce detections where all 4 corners share a y coordinate (e.g. at the image edge), yielding an axis-aligned envelope with H=0. These propagate to ClassificationImageDataset.crop() and crash torchvision's Resize with "Input and output sizes should be greater than 0". Filter them out in post_process_single() with a warning log, so the mothbot_insect_orders_2025 pipeline handles edge-of-frame detections gracefully rather than erroring the entire job. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
YOLO-OBB emits corners outside the image for detections near edges (observed live: y1=-274.39 on a 2464-tall Panama diopsis image). The min/max envelope then carries negative coords downstream, where the antenna worker's int() cast preserves the sign and PyTorch's tensor slicing treats it as end-relative indexing -- yielding an empty crop and an H=0 RuntimeError from the classifier's Resize. Fix at the source by clamping the envelope to [0, width] x [0, height] using result.orig_shape in post_process_single. Adds an optional image_shape kwarg on _corners_to_yolo_detection so the clamp is opt-in (existing callers that don't know the image dimensions stay backward- compatible). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
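The corner-to-envelope conversion with the opt-in clamp described above can be sketched as follows. The function name and signature are simplified stand-ins for `_corners_to_yolo_detection`; `image_shape` mirrors ultralytics' `result.orig_shape` convention of `(height, width)`.

```python
def corners_to_clamped_envelope(corners, image_shape=None):
    """Collapse four OBB corners into an axis-aligned (x1, y1, x2, y2)
    envelope, optionally clamped to [0, width] x [0, height].

    corners: iterable of (x, y) pairs.
    image_shape: optional (height, width). The clamp is opt-in so callers
    that do not know the image dimensions stay backward-compatible.
    """
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    x1, y1, x2, y2 = min(xs), min(ys), max(xs), max(ys)
    if image_shape is not None:
        height, width = image_shape
        x1 = min(max(x1, 0.0), width)
        x2 = min(max(x2, 0.0), width)
        y1 = min(max(y1, 0.0), height)
        y2 = min(max(y2, 0.0), height)
    return x1, y1, x2, y2
```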
Defence-in-depth for the YOLO-OBB out-of-bounds issue now fixed in the detector. If any other detector ever returns negative coords or coords past the image bounds, the previous guard (y1 >= y2 or x1 >= x2) would not catch it -- PyTorch's tensor[:, y1:y2, x1:x2] silently produces an empty crop when y1 is negative (treated as end-relative) or when y1 is past the tensor height (clamped to height, making the slice empty). Clamp to [0, H] x [0, W] before the bounds check so the guard works on the effective slice range. The bounds-check warning now also logs the raw float coords, which makes this class of bug obvious next time. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
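The worker-side defence-in-depth can be sketched as a crop helper that clamps before the bounds check, so the guard operates on the effective slice range; the function name and return convention here are assumptions, not the worker's actual code.

```python
import torch


def safe_crop(image, x1, y1, x2, y2):
    """Defence-in-depth crop for a CHW image tensor.

    Clamp coords to [0, W] x [0, H] *before* the emptiness check: a
    negative y1 would otherwise be treated as end-relative by tensor
    slicing, and a y1 past the tensor height would be clamped to height,
    both silently producing an empty crop. Returns None for empty crops.
    """
    _, height, width = image.shape
    ix1 = min(max(int(x1), 0), width)
    ix2 = min(max(int(x2), 0), width)
    iy1 = min(max(int(y1), 0), height)
    iy2 = min(max(int(y2), 0), height)
    if ix1 >= ix2 or iy1 >= iy2:
        # Log the raw float coords so out-of-bounds detector output
        # is obvious the next time this class of bug appears.
        print(f"Skipping empty crop, raw coords: {(x1, y1, x2, y2)}")
        return None
    return image[:, iy1:iy2, ix1:ix2]
```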
…lytics YOLO11m-Mothbot was trained on cv2.imread-loaded images (BGR). When we feed numpy arrays directly, ultralytics' LoadPilAndNumpy._single_check explicitly does NOT reorder channels (it only converts PIL RGB to BGR, not numpy). Our antenna path was going PIL -> ToTensor (RGB) -> permute -> numpy (still RGB), which meant YOLO was seeing swapped R and B channels. The detector still produced output, but at much lower quality: median detection width was 1227px on a 3280px image -- i.e. roughly full-image boxes. After flipping to BGR, median detection width drops to ~245px, matching expected moth sizes. Extracted the RGB->BGR conversion into _tensor_to_bgr_numpy and covered it with a unit test so the channel order won't silently regress. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
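The channel-order fix can be sketched as below: a CHW float tensor in `[0, 1]` (RGB, as produced by torchvision's `ToTensor`) becomes the HWC uint8 BGR array that the BGR-trained model expects. This is a stand-in for the real `_tensor_to_bgr_numpy` helper, assuming a CPU tensor.

```python
import numpy as np
import torch


def tensor_to_bgr_numpy(image):
    """Convert a CHW float RGB tensor in [0, 1] to an HWC uint8 BGR array.

    ultralytics' LoadPilAndNumpy._single_check passes numpy inputs through
    without reordering channels, so the caller must hand it BGR, matching
    the cv2.imread-loaded images the model was trained on.
    """
    array = (image.permute(1, 2, 0).numpy() * 255).astype(np.uint8)  # CHW -> HWC, uint8
    return array[:, :, ::-1]  # RGB -> BGR channel flip
```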
…data Adds a second pipeline that pairs the Mothbot YOLO11m-OBB detector with the existing 2023 Panama moth species classifier. The full chain is Mothbot YOLO -> binary moth/non-moth prefilter -> Panama 2023 species classifier. Verified end-to-end against the local Antenna instance (1015 detections / 1665 classifications across 55 images). Pipeline-level cleanup: - Introduce APIMothClassifier.pipeline_description, used by make_pipeline_config_response as the pipeline's description when set. Falls back to the classifier's own description otherwise, so every existing pipeline keeps its current /info payload. - Rename MothbotInsectOrderClassifier from "Insect Order Classifier (Mothbot YOLO detector)" to "Mothbot YOLO + Insect Orders 2025". The parenthetical read as if the classifier was the Mothbot-specific component, but actually only the detector changes; the new name makes the combo explicit. Also fills in pipeline_description so the /info combo reads sensibly. - The new MothbotMothClassifierPanama follows the same naming + pipeline_description pattern. Also refreshes the integration test to cover both pipelines. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers the full iteration loop: worker startup, pipeline registration via /info sync, triggering test_ml_job_e2e, and detection quality verification. Documents the Antenna create_pipelines stale-name gotcha. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…run, and ami worker run alias - `ami worker register --pipeline <slug>` now only advertises the specified pipelines instead of all. Filters the pipeline_configs list before POSTing to Antenna. - `ami worker [run] --project <id>` limits which project jobs the worker pulls via `project__id__in` query param on the /jobs endpoint. - `ami worker run` is now an explicit subcommand alias for `ami worker`. - Registration logging now reports processing service ID, created vs updated pipeline counts, and specific slugs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
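The `--pipeline` filtering described in the first bullet can be sketched as a plain list filter applied to the pipeline configs before the POST to Antenna; the function name and the `{"slug": ...}` dict shape are assumptions for this sketch.

```python
def filter_pipeline_configs(configs, slugs=None):
    """Keep only the pipelines named via repeated --pipeline flags before
    registering with Antenna; with no flags, advertise everything.
    Each config is assumed to carry a "slug" key in this sketch."""
    if not slugs:
        return configs
    return [config for config in configs if config["slug"] in slugs]
```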
Summary
Adds the Mothbot YOLO11m-OBB detector to the AMI Data Companion and pairs it with existing classifiers in two new pipelines:
- `mothbot_insect_orders_2025` — YOLO-OBB → ConvNeXt insect-order classifier (no binary prefilter).
- `mothbot_panama_moths_2023` — YOLO-OBB → binary moth/non-moth prefilter → Panama 2023 species classifier (148 species).

Both are verified end-to-end via the local Antenna NATS loop (55 images → 1015 detections → 1665 classifications on the Panama pipeline).
What changed
Core features
- `MothObjectDetector_YOLO11m_Mothbot` in `trapdata/ml/models/localization.py`. Wraps `ultralytics.YOLO`, converts OBB output (4 rotated corners) to a `YoloDetection` dataclass via `_corners_to_yolo_detection`. Coordinates are clamped to image bounds via `result.orig_shape` to prevent downstream crop failures.
- `APIMothDetector_YOLO11m_Mothbot` in `trapdata/api/models/localization.py`. Populates the new `rotation` field on `DetectionResponse` from the OBB angle.
- `MothbotInsectOrderClassifier` and `MothbotMothClassifierPanama`, each inheriting 100% of classifier logic from the existing pipeline and only overriding `detector_cls = APIMothDetector_YOLO11m_Mothbot`.
- `detector_cls` class attribute on `APIMothClassifier`: each pipeline pairs a classifier with a specific detector. All pre-existing pipelines inherit the default (FasterRCNN 2023) — no behavior change.
- `pipeline_description` class attribute on `APIMothClassifier`: lets the pipeline level carry a description of the full detector+classifier combo for `/info`, distinct from the classifier algorithm's own description.
- `rotation: float | None` field on `DetectionResponse`. Populated by the YOLO detector; no consumer reads it yet. Future species classifiers could use it for tighter rotated crops.
- `CLASSIFIER_CHOICES` → `PIPELINE_CHOICES` across 7 files. Honest naming — the dict is used as a pipeline registry, not just classifiers.
- `should_filter_detections()` tightened from `in` to `issubclass()` so subclasses of exempt classes inherit the exemption.

Bug fixes discovered during integration testing
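The `detector_cls` pairing pattern can be sketched with stand-in classes; everything except the `detector_cls` attribute name and the `APIMothClassifier`/`MothbotInsectOrderClassifier` names is a hypothetical placeholder for the real detector and classifier classes.

```python
class FasterRCNNDetector:
    """Stand-in for the default detector (FasterRCNN 2023)."""


class MothbotYoloDetector:
    """Stand-in for APIMothDetector_YOLO11m_Mothbot."""


class APIMothClassifier:
    # Default detector; every pre-existing pipeline inherits this,
    # so their behavior is unchanged.
    detector_cls = FasterRCNNDetector


class InsectOrderClassifier(APIMothClassifier):
    """Stand-in for the existing order classifier; keeps the default detector."""


class MothbotInsectOrderClassifier(InsectOrderClassifier):
    # The only override needed to pair the same classifier logic
    # with a different detector.
    detector_cls = MothbotYoloDetector
```

The `/process` and `/info` handlers then read `Classifier.detector_cls` instead of instantiating a hardcoded detector, which is what makes per-pipeline detector pairing possible.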
- `0726b23`: ultralytics' `LoadPilAndNumpy._single_check` passes numpy inputs as-is (no channel conversion), but the model was trained on cv2-loaded BGR images. Our antenna path fed RGB tensors, causing the detector to produce large, low-quality full-image boxes. Median detection width dropped from ~1227 px to ~245 px after the fix.
- `68e0c16`: YOLO-OBB emits corners outside image bounds for detections near edges (observed: `y1=-274` on a 2464-tall image). Negative coords caused PyTorch slicing to treat them as end-relative indices, yielding empty (H=0) crops that crash the classifier's `Resize((128,128))`. Fixed at the detector level using `result.orig_shape`, with defence-in-depth in the antenna worker.
- `7edc9d9`: OBB results with all 4 corners collinear produce zero-area axis-aligned envelopes. Filtered in `post_process_single`.
- `0df47bf`: antenna's REST dataloader passes tensors (not PIL); added torch→numpy conversion in `predict_batch` for the YOLO path.
- `9f2b5f9`: worker was instantiating `APIMothDetector` directly instead of using `classifier_class.detector_cls`.

Licensing
Both new dependencies are AGPL-3.0, matching this project's license:

- `yolo11m_4500_imgsz1600_b1_2024-01-18.pt` weights
- `ultralytics` library

No verbatim-ported code from Mothbot's unlicensed repo — the detection wrapper is a fresh implementation.
Test plan
- `test_mothbot_yolo.py` — 9 tests covering OBB→envelope math, degenerate/negative/out-of-bounds coords, RGB→BGR conversion, `post_process_single` filtering.
- `test_mothbot_pipeline.py` — sends one Vermont image through each pipeline via `/process`, asserts detections, rotation field, correct algorithm keys, and the binary prefilter for Panama.
- `uv run pytest trapdata/` — all tests pass.
- `test_ml_job_e2e` — 55 images through `mothbot_panama_moths_2023`, 16 through `mothbot_insect_orders_2025`, zero failures.

Follow-ups
- `cv2.warpAffine` + `cv2.getRectSubPix` (matching Mothbot's `crop_rect()` approach) to produce tighter, rotation-corrected crops instead of the current axis-aligned envelope. Would improve classification accuracy for angled specimens.
- Antenna `create_pipelines` rename handling: sync inserts new rows but doesn't rename existing ones, so pipeline name changes require manual DB fixup on each deployment. Worth a fix in the Antenna `create_pipelines` logic (update name/description when the slug matches).
- Algorithms register with `description: ''` for their algorithm rows in Antenna. The description field is set on the Python classes but not flowing through registration. Pre-existing issue, not mothbot-specific.
- The `_apply_binary_classification` function in `worker.py` has the same unclamped `int(bbox.y1)` pattern that was fixed in the main classifier crop loop. Currently safe because the detector now clamps at the source, but worth adding defence-in-depth for consistency.
- `docs/claude/runbook-mothbot-local-testing.md` documents the local testing loop used during development.

🤖 Generated with Claude Code