A collection of reproducible benchmarks for nail
You'll need the following available on your system path:
- nail
- MMseqs2
- HMMER3
- easel (comes with HMMER3 distributions)
- the create-profmark binary (comes with HMMER3 distributions)
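Before going further, it can save time to confirm that everything in the list above resolves on your path. The snippet below is a minimal sketch of such a check, not part of this repository: check_tools is a hypothetical helper, and the executable names in the example call (mmseqs for MMseqs2, hmmsearch for HMMER3, esl-sfetch for easel) are assumptions about which binaries the scripts use.

```shell
#!/bin/sh
# check_tools (hypothetical helper, not shipped with this repository):
# report which of the given commands resolve on PATH.
# Returns 0 if every command is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

After sourcing the snippet, a call might look like this (adjust the names to your installation):
$ check_tools nail mmseqs hmmsearch esl-sfetch create-profmark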
This benchmark was originally run using Pfam version 36.0 and Swissprot release-2023_05.
To download the data, you can run
$ ./scripts/download-data.sh
which will place Pfam seed alignments & Swissprot sequences in the data/ directory:
$ tree data/
data
├── long-seq
│   ├── query
│   │   ├── 1.query.fa
│   │   ├── 2.query.fa
│   │   ├── 3.query.fa
│   │   ├── 4.query.fa
│   │   ├── 5.query.fa
│   │   └── 6.query.fa
│   └── target
│       ├── 1.target.fa
│       ├── 2.target.fa
│       ├── 3.target.fa
│       ├── 4.target.fa
│       ├── 5.target.fa
│       └── 6.target.fa
├── pfam.sto
├── uniprot.tar.gz
├── uniprot_sprot.dat.gz
├── uniprot_sprot.fasta
├── uniprot_sprot.fasta.ssi
├── uniprot_sprot.xml.gz
└── uniprot_sprot_varsplic.fasta.gz
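A quick way to sanity-check the download is to count the records in the two main files. The sketch below relies only on the file formats: every alignment in a Stockholm file ends with a "//" line, and every FASTA record starts with ">". The function names count_stockholm and count_fasta are hypothetical and not part of this repository; no expected counts are given here because they depend on the releases you download.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with this repository.

# Count alignments in a Stockholm file: each alignment ends with a "//" line.
count_stockholm() {
    grep -c '^//' "$1"
}

# Count sequences in a FASTA file: each record header starts with ">".
count_fasta() {
    grep -c '^>' "$1"
}
```

After sourcing the snippet, for example:
$ count_stockholm data/pfam.sto
$ count_fasta data/uniprot_sprot.fasta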
To build the benchmark, run
$ ./scripts/build-benchmark.sh
To run the benchmark, run
$ ./scripts/run-all.sh
To produce the plots, run
$ python ./scripts/plots.py ./benchmark/
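If you prefer to run the whole pipeline unattended, the steps above can be chained so that a failure in one step stops the rest. This is only a sketch: run_steps is a hypothetical helper that is not part of this repository, and it assumes each step signals failure through a non-zero exit status.

```shell
#!/bin/sh
# run_steps (hypothetical helper, not shipped with this repository):
# run each argument as a shell command, in order, and stop at the
# first one that exits with a non-zero status.
run_steps() {
    for step in "$@"; do
        echo "==> $step"
        if ! sh -c "$step"; then
            echo "step failed: $step" >&2
            return 1
        fi
    done
}
```

After sourcing the snippet, the full sequence could be started with:
$ run_steps ./scripts/download-data.sh ./scripts/build-benchmark.sh ./scripts/run-all.sh "python ./scripts/plots.py ./benchmark/"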