Gen sweep #208

Open
Sidney-Lisanza wants to merge 30 commits into main from gen_sweep

@Sidney-Lisanza
Collaborator

Description

Brief description of changes made

Type of Change

  • [ ] Bug fix
  • [X] New feature
  • [X] Documentation update
  • [ ] Performance improvement
  • [ ] Code refactoring

Lisanza and others added 22 commits November 6, 2025 14:55
…nd components

Co-authored-by: Cursor <cursoragent@cursor.com>
…-ligand pipeline

Co-authored-by: Cursor <cursoragent@cursor.com>
…-ligand training

- Add max_cluster_replicates parameter to StructureLightningDataModule to cap
  upsampling of small datasets in balanced training mode
- Add data configs: structure_ligand_all (7-dataset combined), PLINDER baseline,
  distillation, and intermediate configs for protein-ligand training
- Fix elif→if in gen_ume protein-ligand model to allow simultaneous IF/FF eval
- Fix PDB loading edge cases in latent_generator io
- Add structure transforms for protein-ligand data handling
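
For reviewers, a minimal sketch of what capping upsampling with max_cluster_replicates could look like in balanced mode. This is not the code in StructureLightningDataModule: the function name `replicate_factors`, the dataset names, and the "repeat until matching the largest dataset" rule are all assumptions made for illustration.

```python
import math


def replicate_factors(dataset_sizes, max_cluster_replicates=None):
    """Return how many times each dataset is repeated under balanced sampling.

    Assumption: balancing repeats every dataset until it roughly matches the
    largest one, and max_cluster_replicates caps that repeat count.
    """
    target = max(dataset_sizes.values())
    factors = {}
    for name, size in dataset_sizes.items():
        factor = math.ceil(target / size)
        if max_cluster_replicates is not None:
            factor = min(factor, max_cluster_replicates)
        factors[name] = factor
    return factors


# Hypothetical dataset sizes, chosen only to show the effect of the cap.
sizes = {"plinder": 100_000, "distillation": 20_000, "small_set": 500}
uncapped = replicate_factors(sizes)
capped = replicate_factors(sizes, max_cluster_replicates=10)
print(uncapped)  # {'plinder': 1, 'distillation': 5, 'small_set': 200}
print(capped)    # {'plinder': 1, 'distillation': 5, 'small_set': 10}
```

Without the cap, the 500-example set would be seen 200 times per epoch, which is the kind of over-replication the parameter is meant to limit.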
…line eval

- Add compute_protein_ligand_contacts and compute_aligned_ligand_rmsd to
  generation utils as reusable standalone functions
- Add contact-based ligand_in_pocket metric to forward folding evaluator:
  checks if predicted ligand contacts GT pocket residues (replaces centroid-based)
- Add ligand_contacts_protein metric (any protein-ligand contact at 6 Å)
- Allow skipping ESMFold in conditioned generation (plm_fold=None)
- Add best-of-N display and ligand placement stats to FF cmdline output
- Add LigandMPNN inverse folding baseline evaluator and cmdline script
- Update inverse folding evaluator with pocket-aware metrics
- Update conditioned gen cmdline with additional generation parameters
- Update forward folding and inverse folding callbacks with ligand support
- Update hydra callback configs with protein-ligand evaluation parameters
- Add save_structures and minimize_ligand options to callback configs
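
A rough sketch of the contact-based placement metrics described in this commit, assuming NumPy coordinate arrays with one representative atom (e.g. CA) per residue. The helper names `contact_residues`, `ligand_contacts_protein`, and `ligand_in_pocket` below are illustrative stand-ins, not the signatures of compute_protein_ligand_contacts in generation utils, and the "at least one GT pocket residue contacted" rule is an assumption.

```python
import numpy as np


def contact_residues(protein_xyz, ligand_xyz, cutoff=6.0):
    """Indices of residues with any ligand atom within `cutoff` angstroms.

    protein_xyz: (n_res, 3) array, one representative atom per residue.
    ligand_xyz:  (n_lig, 3) array of ligand atom coordinates.
    """
    diff = protein_xyz[:, None, :] - ligand_xyz[None, :, :]
    dists = np.linalg.norm(diff, axis=-1)  # shape (n_res, n_lig)
    return set(np.flatnonzero((dists <= cutoff).any(axis=1)).tolist())


def ligand_contacts_protein(protein_xyz, ligand_xyz, cutoff=6.0):
    """True if the ligand touches any residue at all."""
    return len(contact_residues(protein_xyz, ligand_xyz, cutoff)) > 0


def ligand_in_pocket(pred_protein_xyz, pred_ligand_xyz, gt_pocket, cutoff=6.0):
    """True if the predicted ligand contacts at least one GT pocket residue.

    Assumption: a single shared residue counts as "in pocket".
    """
    pred = contact_residues(pred_protein_xyz, pred_ligand_xyz, cutoff)
    return len(pred & set(gt_pocket)) > 0


# Toy example: three residues on a line, ligand sitting near the first two.
protein = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [30.0, 0.0, 0.0]])
ligand = np.array([[1.0, 2.0, 0.0], [2.0, 3.0, 0.0]])
contacts = contact_residues(protein, ligand)
print(contacts)                                   # {0, 1}
print(ligand_in_pocket(protein, ligand, {0, 1}))  # True
print(ligand_in_pocket(protein, ligand, {2}))     # False
```

Unlike a centroid-distance check, this stays meaningful for elongated ligands whose centroid can sit far from the residues they actually touch.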
…ct-based ligand placement

- Add good_fold_and_in_pocket_fraction (TM > 0.5 AND ligand in correct pocket)
  to FF evaluator summary and cmdline output
- Update merge_cofold_results.py to use contact-based ligand_in_pocket
  (CA within 6 Å of GT pocket residues) instead of centroid distance
- Add cofold_ligand_contacts_protein and cofold_n_pocket_contacts metrics
- Report good_fold_and_in_pocket in merge summary
- Restructure run_full_eval.sh: Phase 2 supports rf3, boltz, or both
  backends with configurable task selection (COFOLD_TASKS=if,ff,cg,lmpnn)
- RF3 co-folding runs in parallel chunks across multiple GPUs
- Boltz2 co-folding uses SLURM array jobs (one per sample)
- Phase 3 merges co-fold results from either backend
- Add benchmark_conditioned_gen.py for Gen-UME vs Proteina-Complexa comparison
  with ESMFold pre-filtering and per-design timing
- Add run_rf3_ff_baseline.py for RF3 co-folding on designed sequences
- Add submit_cofold_batch.py and run_cofold_local.py for batch co-folding
… docs

- Add ligand-conditioned generation and LigandMPNN baseline sections
- Document evaluation pipeline (Phase 1-3) with RF3/Boltz2 co-folding
- Document contact-based ligand placement metrics and good_fold_and_in_pocket
- Add training data configs and training commands
- Document benchmark script for Gen-UME vs Proteina-Complexa
- Add best-of-N forward folding and aligned ligand RMSD
- Update PoseBusters benchmark description
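
One way to read "aligned ligand RMSD" with best-of-N selection is sketched below: superimpose the predicted protein onto the ground truth with the Kabsch algorithm, apply the same transform to the predicted ligand, and take the lowest ligand RMSD over N samples. This is an interpretation, not the body of compute_aligned_ligand_rmsd; it assumes matched atom ordering between predicted and GT ligands and no symmetry correction, and the function names `kabsch` and `aligned_ligand_rmsd` are illustrative.

```python
import numpy as np


def kabsch(mobile, target):
    """Rotation R and translation t so that mobile @ R.T + t best fits target."""
    mob_c = mobile.mean(axis=0)
    tgt_c = target.mean(axis=0)
    H = (mobile - mob_c).T @ (target - tgt_c)  # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ mob_c
    return R, t


def aligned_ligand_rmsd(pred_protein, pred_ligand, gt_protein, gt_ligand):
    """Ligand RMSD after aligning the predicted protein onto the GT protein."""
    R, t = kabsch(pred_protein, gt_protein)
    moved = pred_ligand @ R.T + t
    return float(np.sqrt(((moved - gt_ligand) ** 2).sum(axis=1).mean()))


# Synthetic check: a rigidly moved copy of the GT should give RMSD near 0,
# and a ligand displaced by 2 angstroms should give RMSD near 2.
rng = np.random.default_rng(0)
gt_protein = rng.normal(size=(10, 3)) * 5.0
gt_ligand = rng.normal(size=(4, 3))
a = np.deg2rad(30.0)
Rz = np.array([[np.cos(a), -np.sin(a), 0.0],
               [np.sin(a), np.cos(a), 0.0],
               [0.0, 0.0, 1.0]])
shift = np.array([5.0, -3.0, 2.0])
pred_protein = gt_protein @ Rz.T + shift
pred_ligand_good = gt_ligand @ Rz.T + shift
pred_ligand_off = pred_ligand_good + np.array([2.0, 0.0, 0.0])

rmsds = [
    aligned_ligand_rmsd(pred_protein, lig, gt_protein, gt_ligand)
    for lig in (pred_ligand_off, pred_ligand_good)
]
best_of_n = min(rmsds)  # best-of-N: keep the sample with the lowest RMSD
print(rmsds, best_of_n)
```

Aligning on the protein rather than on the ligand itself means the number reflects ligand placement relative to the pocket, not just ligand conformation.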