🧬 BioReason
Incentivizing Multimodal Biological Reasoning
within a DNA-LLM Model

Note

BioReason-Pro is now released!

Building on this work, BioReason-Pro integrates ESM3 embeddings, GO-GPT, and reinforcement learning to generate expert-level protein function annotations, preferred over UniProt entries 79% of the time.

Abstract

Unlocking deep, interpretable biological reasoning from complex genomic data is a major AI challenge that hinders scientific discovery. Current DNA foundation models, despite strong sequence representation, struggle with multi-step reasoning and lack transparent, biologically intuitive explanations. We introduce BioReason, an architecture that, for the first time, deeply integrates a DNA foundation model with a large language model (LLM). This connection enables the LLM to directly process and reason with genomic information as a fundamental input, fostering a new form of multimodal biological understanding. BioReason's multi-step reasoning is developed through supervised fine-tuning and targeted reinforcement learning, guiding the system to generate logical, biologically coherent deductions. Across biological reasoning benchmarks, BioReason significantly improves performance, raising accuracy on KEGG-based disease pathway prediction from 86% to 98% and delivering an average 15% gain over strong single-modality baselines in variant effect prediction tasks. BioReason reasons over unseen biological entities and articulates its decision-making through interpretable, step-by-step biological traces, offering an approach for AI in biology that enables deeper mechanistic insights and accelerates testable hypothesis generation from genomic data. Data, code, and checkpoints are publicly available at https://github.com/bowang-lab/BioReason
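The abstract describes the LLM taking genomic information as a direct input, but this README does not spell out how the two models are wired together. One common pattern for this kind of multimodal setup, shown here purely as an assumption, is to map each DNA embedding into the LLM's embedding space with a linear projection and place the result in the same sequence as the text token embeddings. The names `project` and `build_multimodal_input`, the toy dimensions, and the hand-written weight matrix are all hypothetical and are not taken from the BioReason code.

```python
from typing import List

Vector = List[float]


def project(vec: Vector, weight: List[Vector]) -> Vector:
    """Linear map from DNA-embedding space to LLM-embedding space.

    `weight` has shape (llm_dim, dna_dim); in a real model it would be learned.
    """
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def build_multimodal_input(
    dna_embeddings: List[Vector],
    text_embeddings: List[Vector],
    weight: List[Vector],
) -> List[Vector]:
    """Project each DNA embedding, then prepend the results to the text embeddings."""
    projected = [project(vec, weight) for vec in dna_embeddings]
    return projected + text_embeddings


# Toy sizes (assumed for illustration only): dna_dim = 2, llm_dim = 3.
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dna_embeddings = [[1.0, 2.0], [0.5, 0.5]]  # two DNA positions
text_embeddings = [[0.0, 0.0, 1.0]]        # one text token
sequence = build_multimodal_input(dna_embeddings, text_embeddings, weight)
print(len(sequence), [len(v) for v in sequence])  # 3 [3, 3, 3]
```

After projection every element of the sequence has the LLM's embedding width, so the language model can attend over DNA and text positions alike.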

Key Contributions

• Novel multimodal architecture: The first successful integration of a DNA foundation model with an LLM, establishing a new methodology for AI-driven biological studies.

• Advanced reasoning methodology: A systematic training approach combining supervised fine-tuning and reinforcement learning that incentivizes multi-step biological reasoning.

• New biological reasoning benchmarks: Development and curation of novel benchmarks for evaluating biological reasoning capabilities, including an annotated reasoning dataset for gene pathway and disease prediction from KEGG.

• Empirical performance improvements: Demonstration that BioReason outperforms both DNA foundation models and LLMs used independently or in simple combination, with average performance gains of 15%+ over baseline.

• Interpretable reasoning traces: A mechanism for generating step-by-step biological reasoning traces that provide interpretable predictions, enhancing scientific insight and hypothesis generation.
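The reinforcement learning stage appears in the results tables as "+GRPO" (Group Relative Policy Optimization). The core idea of GRPO is to sample several completions for the same prompt, score each one, and normalize every reward against the mean and standard deviation of its own group. The sketch below illustrates only that normalization step. The reward function `exact_match_reward` is a hypothetical stand-in, the choice of population standard deviation is an assumption (implementations differ), and none of this is taken from the repository's `train_grpo.py`.

```python
from statistics import mean, pstdev
from typing import List


def exact_match_reward(predicted: str, gold: str) -> float:
    """Hypothetical reward: 1.0 if the final answer matches the label, else 0.0."""
    return 1.0 if predicted.strip().lower() == gold.strip().lower() else 0.0


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the other samples drawn for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, see lead-in
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers for one prompt whose gold label is "A".
answers = ["A", "B", "A", "C"]
rewards = [exact_match_reward(a, "A") for a in answers]
advantages = group_relative_advantages(rewards)
print(rewards)                              # [1.0, 0.0, 1.0, 0.0]
print([round(a, 3) for a in advantages])    # [1.0, -1.0, 1.0, -1.0]
```

Completions that beat their group's average receive a positive advantage and are reinforced; when every sample in a group earns the same reward, all advantages are zero and that group contributes no learning signal.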

Datasets

The datasets used to train and evaluate BioReason are available in our Hugging Face collection, along with detailed download and usage instructions.

Checkpoints

We will release the checkpoints soon!

Installation

Prerequisites

• Python 3.11+
• A CUDA-capable GPU (recommended for best performance)

Installation Steps

# Clone the repository
git clone https://github.com/bowang-lab/BioReason.git
cd BioReason

# Install package
pip install -e .
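After installing, a short script can confirm that the prerequisites listed above are met. This is a minimal sketch that assumes PyTorch is the deep learning backend (this README does not name one); `check_environment` is a hypothetical helper and is not part of the BioReason package.

```python
import importlib.util
import sys
from typing import Dict, Optional, Tuple


def check_environment(min_version: Tuple[int, int] = (3, 11)) -> Dict[str, Optional[bool]]:
    """Report whether Python is recent enough and whether a CUDA GPU is visible.

    `cuda_available` is None when PyTorch is not installed (assumed backend).
    """
    python_ok = sys.version_info >= min_version
    cuda_available: Optional[bool] = None
    if importlib.util.find_spec("torch") is not None:
        try:
            import torch  # imported lazily so the check works without PyTorch

            cuda_available = bool(torch.cuda.is_available())
        except ImportError:
            cuda_available = None
    return {"python_ok": python_ok, "cuda_available": cuda_available}


if __name__ == "__main__":
    print(check_environment())
```

A result with `python_ok` set to False means the interpreter is older than 3.11; `cuda_available` set to False or None means training and inference would fall back to CPU or that PyTorch could not be imported.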

Results

KEGG-Derived Biological Reasoning Task

Performance comparison on 290 test datapoints for multi-step mechanistic reasoning (all metrics in %):

Model	Accuracy	F1-Score	Precision	Recall
[DNA] NT - 500M	86.55	69.76	73.23	66.61
[DNA] Evo2 - 1B	88.28	72.43	75.23	69.83
[LLM] Qwen3 - 1B	85.17	65.71	71.39	64.19
[LLM] Qwen3 - 4B	90.00	79.66	88.24	75.08
[DNA-LLM] NT + Qwen3 - 1B	89.31	81.46	88.24	77.30
[DNA-LLM] NT + Qwen3 - 1B (+GRPO)	91.72	75.06	79.41	72.89
[DNA-LLM] NT + Qwen3 - 4B	95.86	86.25	88.24	84.95
[DNA-LLM] NT + Qwen3 - 4B (+GRPO)	98.28	90.15	91.18	89.62
[DNA-LLM] Evo2 + Qwen3 - 1B	90.42	75.62	77.42	73.91
[DNA-LLM] Evo2 + Qwen3 - 4B	95.17	86.14	91.18	83.33
[DNA-LLM] Evo2 + Qwen3 - 4B (+GRPO)	98.28	93.05	94.12	92.48
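To make the comparison in this table concrete, the snippet below copies the accuracy column and computes the gap, in percentage points, between the best DNA-LLM row and the best single-modality row. It is an illustrative calculation on the reported numbers only and does not reproduce how the authors derived their summary statistics.

```python
# Accuracy (%) copied from the KEGG table above.
kegg_accuracy = {
    "[DNA] NT - 500M": 86.55,
    "[DNA] Evo2 - 1B": 88.28,
    "[LLM] Qwen3 - 1B": 85.17,
    "[LLM] Qwen3 - 4B": 90.00,
    "[DNA-LLM] NT + Qwen3 - 1B": 89.31,
    "[DNA-LLM] NT + Qwen3 - 1B (+GRPO)": 91.72,
    "[DNA-LLM] NT + Qwen3 - 4B": 95.86,
    "[DNA-LLM] NT + Qwen3 - 4B (+GRPO)": 98.28,
    "[DNA-LLM] Evo2 + Qwen3 - 1B": 90.42,
    "[DNA-LLM] Evo2 + Qwen3 - 4B": 95.17,
    "[DNA-LLM] Evo2 + Qwen3 - 4B (+GRPO)": 98.28,
}

# Split rows into combined DNA-LLM models and single-modality baselines.
combined = {k: v for k, v in kegg_accuracy.items() if k.startswith("[DNA-LLM]")}
single = {k: v for k, v in kegg_accuracy.items() if not k.startswith("[DNA-LLM]")}

best_single = max(single.values())
best_combined = max(combined.values())
gain = best_combined - best_single

print(f"best single-modality accuracy: {best_single:.2f}")  # 90.00
print(f"best DNA-LLM accuracy:         {best_combined:.2f}")  # 98.28
print(f"gain in percentage points:     {gain:.2f}")           # 8.28
```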

Variant Effect Prediction Benchmarks

Performance on pathogenic/benign classification (all metrics in %):

Model	Coding Accuracy	Coding F1-Score	Non-SNV Accuracy	Non-SNV F1-Score
[DNA] NT - 500M	60.91	45.20	67.93	65.97
[DNA] Evo2 - 1B	70.07	49.19	76.17	66.51
[LLM] Qwen3 - 1B	46.55	34.82	70.67	76.21
[LLM] Qwen3 - 4B	48.99	39.58	61.86	67.60
[DNA-LLM] NT + Qwen3 - 1B	55.58	54.50	72.82	76.93
[DNA-LLM] NT + Qwen3 - 4B	60.94	55.66	65.59	73.00
[DNA-LLM] Evo2 + Qwen3 - 1B	72.83	68.90	88.20	89.91
[DNA-LLM] Evo2 + Qwen3 - 4B	80.21	80.00	83.85	85.02
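For readers who want to score their own predictions on a pathogenic/benign task, the sketch below computes accuracy, precision, recall, and F1 as percentages. It treats "pathogenic" as the positive class, which is an assumption: this README does not state the positive class or the averaging scheme behind the reported numbers, so results from this helper may not match the tables exactly. `binary_metrics` is a hypothetical function, not part of the repository.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str = "pathogenic"
) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 (in %) for one positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": 100 * correct / len(pairs),
        "precision": 100 * precision,
        "recall": 100 * recall,
        "f1": 100 * f1,
    }


y_true = ["pathogenic", "pathogenic", "benign", "benign", "pathogenic"]
y_pred = ["pathogenic", "benign", "benign", "pathogenic", "pathogenic"]
scores = binary_metrics(y_true, y_pred)
print({name: round(value, 2) for name, value in scores.items()})
# {'accuracy': 60.0, 'precision': 66.67, 'recall': 66.67, 'f1': 66.67}
```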

Citation

If you find this work useful, please cite our paper:

@misc{fallahpour2025bioreasonincentivizingmultimodalbiological,
      title={BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model}, 
      author={Adibvafa Fallahpour and Andrew Magnuson and Purav Gupta and Shihao Ma and Jack Naimer and Arnav Shah and Haonan Duan and Omar Ibrahim and Hani Goodarzi and Chris J. Maddison and Bo Wang},
      year={2025},
      eprint={2505.23579},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2505.23579}, 
}

@article {Fallahpour2026.03.19.712954,
	author = {Fallahpour, Adibvafa and Seyed-Ahmadi, Arman and Idehpour, Parsa and Ibrahim, Omar and Gupta, Purav and Naimer, Jack and Zhu, Kevin and Shah, Arnav and Ma, Shihao and Adduri, Abhinav and G{\"u}loglu, Talu and Liu, Nuo and Cui, Haotian and Jain, Arihant and de Castro, Max and Fallahpour, Amirfaham and Cembellin-Prieto, Antonio and Stiles, John S. and Nem{\v c}ko, Filip and Nevue, Alexander A. and Moon, Hyungseok C. and Sosnick, Lucas and Markham, Olivia and Duan, Haonan and Lee, Michelle Y. Y. and Salvador, Andrea F. M. and Maddison, Chris J. and Thaiss, Christoph A. and Ricci-Tam, Chiara and Plosky, Brian S. and Burke, Dave P. and Hsu, Patrick D. and Goodarzi, Hani and Wang, Bo},
	title = {BioReason-Pro: Advancing Protein Function Prediction with Multimodal Biological Reasoning},
	elocation-id = {2026.03.19.712954},
	year = {2026},
	doi = {10.64898/2026.03.19.712954},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2026/03/20/2026.03.19.712954},
	eprint = {https://www.biorxiv.org/content/early/2026/03/20/2026.03.19.712954.full.pdf},
	journal = {bioRxiv}
}

Authors

Adibvafa Fallahpour¹²³⁵ * (adibvafa.fallahpour@mail.utoronto.ca)
Andrew Magnuson¹² *
Purav Gupta¹² *
Shihao Ma¹²³
Jack Naimer¹²³
Arnav Shah¹²³
Haonan Duan¹²
Omar Ibrahim³
Hani Goodarzi†⁴⁶
Chris J. Maddison†¹²⁷
Bo Wang†¹²³

¹ University of Toronto ² Vector Institute ³ University Health Network (UHN)
⁴ Arc Institute ⁵ Cohere ⁶ University of California, San Francisco ⁷ Google DeepMind

* Equal contribution
† Equal advising

Made with ❤️ at University of Toronto, Vector Institute, and University Health Network
