Skip to content

Refactor models to Pydantic#320

Open
NoodlesNZ wants to merge 14 commits intoOWASP:masterfrom
NoodlesNZ:pydantic
Open

Refactor models to Pydantic#320
NoodlesNZ wants to merge 14 commits intoOWASP:masterfrom
NoodlesNZ:pydantic

Conversation

@NoodlesNZ
Copy link
Copy Markdown
Contributor

Replaces the custom var descriptor system with Pydantic v2 BaseModel. This simplifies validation and makes it easier to extend. The public API is unchanged, existing threat model scripts should work without modification.

Changes:

  • Split the monolithic pytm.py into per-class modules (element.py, actor.py, asset.py, dataflow.py, tm.py, etc.)
  • All model classes now use Field() for attribute definitions and validators for coercion
  • Added LLM element with supporting threat conditions
  • Removed sqlDump / --sqldump (dropped the pydal dependency)
  • Dropped Python 3.8 support; added 3.12–3.14

Bug fixes:

  • Threat conditions targeting base classes (e.g. Asset) now correctly use isinstance rather than string comparison
  • Finding used as an override stub no longer registers a phantom element in the global state
  • Legacy string assignment to a dataflow's data field was incorrectly classifying data as PUBLIC instead of UNKNOWN
  • Threat.apply() returned None instead of False on type mismatch

Added Tests:
Added tests/test_pydantic_models.py covering the new validation and coercion behaviour (DataSet, field validators, Threat legacy field mapping). All existing tests pass.

@colesmj colesmj assigned colesmj and unassigned colesmj Apr 11, 2026
@colesmj colesmj self-requested a review April 11, 2026 20:37
Copy link
Copy Markdown
Collaborator

@colesmj colesmj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for submitting this refactoring of the pytm base code; it was long overdue. The change to pydantic instead of the legacy var mechanism will make for cleaner code and also create some new possibilities for extending pytm.

No threats needed to change to support this refactor (e.g. the condition evals were all "safe" with the checks you added)?

class Lambda(Asset):
"""A lambda function running in a Function-as-a-Service (FaaS) environment."""

onAWS: bool = Field(default=True, description="Is this lambda on AWS")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to verify - did you intend to recreate this Field in the subclass vs setting the Field inherited?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just following the original code where onAWS defaults to True for Lambda. To be honest I'm not a fan of using "onAWS" or calling this "Lambda" but I didn't want to change the API for now.

default=None,
description="Is a response to this data flow"
)
srcPort: int = Field(default=-1, description="Source TCP port")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more general comment, not specific to this review but convenient: The srcPort and dstPort descriptions should drop 'TCP' to give them some flexibility to be dataflows for more than just TCP networks.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, especially when we have a protocol field here too. What are your thoughts about setting the protocol field to default to TCP?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a threat perspective that is probably less useful than setting it to HTTP or HTTPS. In truth, any data flow may have layered protocols (e.g. TLS over HTTP over TCP over IPv4); obviously some are more informative for threat analysis than others.

logger = logging.getLogger(__name__)


class var(object):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally....

@NoodlesNZ
Copy link
Copy Markdown
Contributor Author

Thank you for submitting this refactoring of the pytm base code; it was long overdue. The change to pydantic instead of the legacy var mechanism will make for cleaner code and also create some new possibilities for extending pytm.

No threats needed to change to support this refactor (e.g. the condition evals were all "safe" with the checks you added)?

Thanks for the review and the comments! There are a bunch of changes I'd like to make but I wanted to keep it invisible to most end users. Maybe with more drastic changes this might become part of a 2.0 release?

No threats needed to change for now. I did think about moving the threat library to python code, but there was not a lot to gain other than using enums and classes (to save string matching).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants