@rbbozkurt
This patch promotes the existing Isolation Forest implementation from staging to builtin status, with improvements. It provides two builtins: `outlierByIsolationForest`, which trains iForest models, and `outlierByIsolationForestApply`, which scores samples against a trained model. The harmonic number computation is now vectorized for better scalability. The patch also extends the test coverage in `staging/isolationForestTest.dml` and adds Python API integration tests. Refer to JIRA for detailed discussions.
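For readers unfamiliar with where harmonic numbers enter Isolation Forest scoring, here is a minimal Python sketch of the standard formulas from the original iForest paper: the average path length c(n) = 2H(n-1) - 2(n-1)/n and the anomaly score s = 2^(-E[h(x)]/c(n)). It is not the DML code in this patch; all function names below are hypothetical, and the single cumulative sum only stands in for the kind of vectorization the description mentions.

```python
import math
from itertools import accumulate

# Euler-Mascheroni constant, used in the approximation H(i) ~ ln(i) + gamma.
EULER_GAMMA = 0.5772156649015329


def harmonic_numbers(max_n):
    """Return [H(0), H(1), ..., H(max_n)] from one cumulative sum.

    Computing every H(i) in a single pass avoids re-summing 1/k
    for each tree node size (hypothetical helper, not from the patch).
    """
    return [0.0] + list(accumulate(1.0 / i for i in range(1, max_n + 1)))


def harmonic_approx(i):
    """Logarithmic approximation of H(i), valid for i >= 1."""
    return math.log(i) + EULER_GAMMA


def average_path_length(n, harmonic):
    """c(n) = 2*H(n-1) - 2*(n-1)/n, with c(n) = 0 for n <= 1."""
    if n <= 1:
        return 0.0
    return 2.0 * harmonic[n - 1] - 2.0 * (n - 1) / n


def anomaly_score(mean_path_length, subsample_size, harmonic):
    """s = 2^(-E[h(x)] / c(n)); values near 1 suggest an outlier."""
    c_n = average_path_length(subsample_size, harmonic)
    return 2.0 ** (-mean_path_length / c_n)


if __name__ == "__main__":
    H = harmonic_numbers(256)
    # A sample whose mean path length equals c(n) scores exactly 0.5.
    print(anomaly_score(average_path_length(256, H), 256, H))
    # A sample isolated after very few splits scores closer to 1.
    print(anomaly_score(2.0, 256, H))
```

Precomputing the table once trades a small amount of memory for constant-time lookups during scoring; the logarithmic approximation is the alternative when the subsample size is too large to tabulate.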

Related to #1980

Co-authored-by: keremaras1 <[email protected]>
Co-authored-by: denizzqq <[email protected]>
@github-project-automation github-project-automation bot moved this to In Progress in SystemDS PR Queue Jan 29, 2026
@rbbozkurt rbbozkurt marked this pull request as ready for review January 30, 2026 18:22