
Split run identity from configuration #122

Open
frazane wants to merge 2 commits into main from feat/split-run-identity-from-config


@frazane frazane commented Mar 26, 2026

Summary

Separates environment identity from run configuration to allow inference environments to be reused across configuration changes, eliminating unnecessary rebuilds of venv and squashfs images. Closes #111

Changes

  • Add ENV_FIELDS and HASH_EXCLUDE ClassVars to RunConfig documenting the identity contract
  • Split hashing logic: env_entry_hash() for environment-level changes, run_specific_hash() for configuration changes
  • Refactor register_run() to compute both env_id and run_id with nested directory structure: data/runs/{env_id}/{config_hash}/
  • Update inference rules to use {env_id} wildcard for environment artifacts (in data/envs/{env_id}/) and {run_id} for run outputs
  • Add ENV_CONFIGS global dict and collect_all_envs() function
  • Add comprehensive unit tests for identity separation
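
To illustrate the split described above, here is a minimal sketch of how two separate hashes could be derived from one run entry. The function names `env_entry_hash` and `run_specific_hash` and the field names come from this PR; the `_digest` helper, the use of SHA-256 over canonical JSON, the 8-character truncation, and the `config` key are assumptions made for illustration and may differ from the actual implementation in `workflow/rules/common.smk`.

```python
import hashlib
import json

# Field names as listed in this PR; everything else here is an illustrative assumption.
ENV_FIELDS = ("checkpoint", "extra_requirements", "disable_local_eccodes_definitions")
HASH_EXCLUDE = ("label", "inference_resources")


def _digest(payload: dict, length: int = 8) -> str:
    """Hypothetical helper: hash a dict deterministically via sorted-key JSON."""
    blob = json.dumps(payload, sort_keys=True, default=str).encode()
    return hashlib.sha256(blob).hexdigest()[:length]


def env_entry_hash(entry: dict) -> str:
    """Hash only the fields that determine the inference environment."""
    return _digest({k: entry.get(k) for k in ENV_FIELDS})


def run_specific_hash(entry: dict) -> str:
    """Hash the remaining fields, skipping environment and excluded fields."""
    skip = set(ENV_FIELDS) | set(HASH_EXCLUDE)
    return _digest({k: v for k, v in entry.items() if k not in skip})


# Two entries that share a checkpoint but differ in inference config and label.
base = {"checkpoint": "ckpt-a", "extra_requirements": [], "config": "a.yaml", "label": "x"}
changed = {**base, "config": "b.yaml", "label": "y"}
```

Under these assumptions, `base` and `changed` share an environment hash, so the venv and squashfs image would be reused, while their run hashes differ.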

Benefits

  • Reuses environments across config changes (no squashfs rebuild)
  • Reduces disk I/O burden on shared filesystems
  • Clear separation of concerns: environment identity vs. run configuration
  • Nested directory structure aligns with the design proposed in issue #111

Testing

  • All existing tests pass
  • 5 new tests verify identity separation behavior

Separates environment identity (env_id) from run configuration (run_id) to
allow inference environments to be reused across configuration changes. This
prevents unnecessary rebuilding of venv and squashfs images when only the
inference config YAML or steps are modified.

Changes:

src/evalml/config.py:
- Add RunConfig.ENV_FIELDS ClassVar documenting fields that determine the
  inference environment (checkpoint, extra_requirements, disable_local_eccodes_definitions)
- Add RunConfig.HASH_EXCLUDE ClassVar for fields never included in hashing
  (label, inference_resources)
- Export module-level constants RUN_ENV_FIELDS and RUN_HASH_EXCLUDE
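
A rough sketch of what the identity contract on RunConfig might look like follows. The names ENV_FIELDS, HASH_EXCLUDE, RUN_ENV_FIELDS and RUN_HASH_EXCLUDE and the field names are taken from this commit message; the use of a plain dataclass, the field types, and the defaults are assumptions, and the real class in src/evalml/config.py may be built differently.

```python
from dataclasses import dataclass, field, fields
from typing import ClassVar


@dataclass
class RunConfig:
    # Instance fields: names follow the commit message; types and defaults are assumed.
    checkpoint: str
    label: str = ""
    extra_requirements: list = field(default_factory=list)
    disable_local_eccodes_definitions: bool = False
    inference_resources: dict = field(default_factory=dict)

    # ClassVar annotations are skipped by dataclasses.fields(), so these
    # constants document the contract without becoming config fields.
    ENV_FIELDS: ClassVar[frozenset] = frozenset(
        {"checkpoint", "extra_requirements", "disable_local_eccodes_definitions"}
    )
    HASH_EXCLUDE: ClassVar[frozenset] = frozenset({"label", "inference_resources"})


# Module-level aliases, mirroring the exported constants named above.
RUN_ENV_FIELDS = RunConfig.ENV_FIELDS
RUN_HASH_EXCLUDE = RunConfig.HASH_EXCLUDE

# Names of the actual config fields, useful for checking the contract in tests.
declared = {f.name for f in fields(RunConfig)}
```

Keeping the two sets disjoint and checking that every listed name is a declared field is one way a unit test could guard against the contract drifting from the model.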

workflow/rules/common.smk:
- Add ENV_HASH_FIELDS and RUN_HASH_EXCLUDE constants
- Split hashing logic into two functions:
  - env_entry_hash(): hashes only environment-determining fields
  - run_specific_hash(): hashes run-specific fields (config YAML, steps)
- Refactor register_run() to compute and store both env_id and run_id in
  each run config entry. Format: run_id = {env_id}/{config_hash}
- Add collect_all_envs() function and ENV_CONFIGS global dict
- Update master_hash() to hash both env and run components separately

workflow/rules/inference.smk:
- Rules using {env_id} wildcard (outputs in data/envs/{env_id}/):
  - prepare_checkpoint
  - extract_checkpoint_requirements
  - create_inference_venv
  - make_squashfs_image
- Rules using {run_id} wildcard with nested config directories:
  - prepare_inference_forecaster
  - prepare_inference_interpolator
  - execute_inference (references env via lookup)
  - create_inference_sandbox

Directory structure change:
- Environment artifacts: data/envs/{env_id}/
- Run-specific outputs: data/runs/{env_id}/{config_hash}/{init_time}/
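
As a sketch of the layout above, the helpers below build both paths and recover the environment directory from a run_id of the form {env_id}/{config_hash}. The helper names (env_dir, run_dir, split_run_id, env_dir_for_run), the DATA_ROOT constant, and the example IDs are hypothetical and not part of this PR.

```python
from pathlib import Path

DATA_ROOT = Path("data")  # assumed root, matching the paths listed above


def env_dir(env_id: str) -> Path:
    """Directory holding environment artifacts (checkpoint, venv, squashfs)."""
    return DATA_ROOT / "envs" / env_id


def run_dir(env_id: str, config_hash: str, init_time: str) -> Path:
    """Directory holding the outputs of one run at one initialisation time."""
    return DATA_ROOT / "runs" / env_id / config_hash / init_time


def split_run_id(run_id: str) -> tuple:
    """Split a run_id of the form '{env_id}/{config_hash}' into its two parts."""
    env_id, config_hash = run_id.split("/", 1)
    return env_id, config_hash


def env_dir_for_run(run_id: str) -> Path:
    """Look up the environment directory that a given run depends on."""
    env_id, _ = split_run_id(run_id)
    return env_dir(env_id)
```

Two runs whose run_id values share the same env_id prefix resolve to the same environment directory, which is what lets them reuse one environment.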

Benefits:
- Reuses environments across config changes (no squashfs rebuild)
- Reduces disk I/O on shared filesystems
- Documents identity contract via ClassVars
- Nested directory structure clearly separates concerns

Tests:
- Add test_run_identity.py with 5 tests validating identity separation
- All existing tests pass

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
@frazane frazane requested a review from dnerini March 26, 2026 09:20
@frazane frazane changed the title from "Split run identity from configuration (issue #111)" to "Split run identity from configuration" Mar 26, 2026
Development

Successfully merging this pull request may close these issues.

New inference environment is computed whenever inference config is updated
