PALACE is a computational framework that combines deep learning models with conjugate graph theory to assemble high-quality, high-confidence phage genomes from metagenomic sequencing data. PALACE currently supports standard paired-end reads. The assembled phage genomes analyzed in the manuscript are available on Zenodo.
conda create -n palace_env
conda activate palace_env
conda install -c delta2cityu -c conda-forge -c bioconda palace
or
#mamba is recommended
mamba create -n palace_env
mamba activate palace_env
mamba install -c delta2cityu -c conda-forge -c bioconda palace
- pysam==0.17.0
- numpy==1.20.2
- scikit-learn==1.1.1
- biopython==1.78
- matplotlib==3.4.2
See https://pytorch.org/get-started/previous-versions/ for instructions on installing the pinned PyTorch version:
- torch==1.7.1
- torch-cluster==1.5.9
- torch-geometric==1.7.0
- torch-scatter==2.0.6
- torch-sparse==0.6.9
- torch-spline-conv==1.2.1
- torch-summary==1.4.5
- torchvision==0.8.0a0
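To confirm that an environment matches the pinned versions, a small check along these lines may help. It is a minimal sketch: `PINNED` copies only some of the pins listed above, `check_pins` is a hypothetical helper (not part of PALACE), and the exact string comparison will flag builds that report a local suffix (for example a CUDA tag on `torch`), so treat mismatches as prompts to look closer rather than hard errors.

```python
from importlib.metadata import PackageNotFoundError, version

# A subset of the pins listed above; extend as needed.
PINNED = {
    "pysam": "0.17.0",
    "numpy": "1.20.2",
    "biopython": "1.78",
    "matplotlib": "3.4.2",
    "torch": "1.7.1",
    "torch-geometric": "1.7.0",
}


def check_pins(pins, get_version=version):
    """Return {package: (expected, found)} for packages that are missing or differ.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Run it inside the activated environment; no output means every listed package matches its pin.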
- bwa: read mapping.
- samtools: reading, writing, editing, indexing, and viewing SAM/BAM/CRAM files.
- fastp: fast all-in-one preprocessing of FASTQ files.
- spades: pre-assembly.
- ncbi-blast: sequence alignment.
- htslib
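Before running the pipeline it can be useful to confirm that the external tools are reachable on `PATH`. The sketch below uses only the standard library. The executable names in `TOOLS` are assumptions based on what these packages typically install (`spades.py` for spades, `blastn` for ncbi-blast, `bgzip` for htslib); the source does not state which binaries PALACE actually calls, so adjust the list as needed. `missing_tools` is a hypothetical helper.

```python
import shutil

# Assumed executable names for the packages listed above (not confirmed by PALACE).
TOOLS = ["bwa", "samtools", "fastp", "spades.py", "blastn", "bgzip"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```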
Install the prerequisites first, then clone the repository and enter the directory:
git clone https://github.com/deepomicslab/PALACE
#create a new mamba(conda) env
mamba create -n palace ## or conda create -n palace
mamba activate palace ## or conda activate palace
cd ./PALACE/
cd bin
chmod u+x ./*
cd ../share/palace/scripts/
python setup.py build_ext --inplace
- Configure the config.txt file; here is a demo file.
- fastq1: Read 1 of the paired FASTQ files.
- fastq2: Read 2 of the paired FASTQ files.
- phagedb: Phage reference database; it can be downloaded from Google Drive.
- protein_db: Phage protein database directory; the database file can be downloaded from Google Drive.
- gcn_model: Deep learning model for phage contig prediction; it can be downloaded from Google Drive.
- threads: Number of threads to use.
- out_dir: Output directory.
- prefix: Prefix for intermediate files; can be the sample name.
- ENV_PREFIX: Path to the conda environment; can be left empty if the environment is already activated.
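A quick sanity check of the settings before launching a run can catch missing entries early. The sketch below is illustrative only: the exact syntax of config.txt is not shown here, so `parse_config` assumes one `key=value` pair per line with `#` comments, which may not match the real demo file. `parse_config` and `config_problems` are hypothetical helpers; only the key names come from the list above.

```python
# Keys described above; ENV_PREFIX is left out because it may be empty.
REQUIRED_KEYS = [
    "fastq1", "fastq2", "phagedb", "protein_db",
    "gcn_model", "threads", "out_dir", "prefix",
]


def parse_config(text):
    """Parse `key=value` lines (an assumed format) into a dict of strings."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {raw!r}")
        settings[key.strip()] = value.strip()
    return settings


def config_problems(settings):
    """Return human-readable problems; an empty list means the basics look fine."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if not settings.get(key)]
    threads = settings.get("threads", "")
    if threads and not (threads.isdigit() and int(threads) > 0):
        problems.append(f"threads must be a positive integer, got {threads!r}")
    return problems


if __name__ == "__main__":
    with open("config.txt") as handle:
        for problem in config_problems(parse_config(handle.read())):
            print(problem)
```

The check covers only presence and the thread count; it does not verify that the database or model paths exist.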
- Running PALACE.
palace --config config.txt
- 01-qc/: fastp output.
- 02-assembly/: Raw assembly produced by SPAdes with --meta.
- 03-search/: This directory contains three main intermediate files:
  - hit_seqs.out: contigs that contain phage proteins.
  - node_scores.out: the second column is the score predicted by the deep learning network.
  - {prefix}_ref_names.txt: phage references identified by k-mer alignment.
- 04-match/: This directory contains the structure of the conjugate graph ({prefix}_filtered_graph.txt) and the results of the graph decomposition ({prefix}_all_result.txt).
- 05-furth: This directory contains the local matching results based on the phage reference.
- final_result: This directory contains the final results: the final contig paths for phages ({prefix}_final.txt) and the phage sequences in FASTA format ({prefix}_final.fasta).
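Once a run finishes, the final FASTA file can be summarized with a few lines of plain Python. This is a minimal sketch, not part of PALACE: `fasta_lengths` is a hypothetical helper, and the example path assumes that `final_result` sits directly under `out_dir` with `out_dir=out` and `prefix=sample`, which should be adapted to the actual layout.

```python
def fasta_lengths(lines):
    """Map each FASTA record name (first word of its header) to its sequence length."""
    lengths = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)  # sequences may span several lines
    return lengths


if __name__ == "__main__":
    # Hypothetical path: out_dir=out, prefix=sample.
    with open("out/final_result/sample_final.fasta") as handle:
        for record, length in fasta_lengths(handle).items():
            print(f"{record}\t{length}")
```

Biopython, already listed among the dependencies, offers `Bio.SeqIO.parse` for the same task when full record objects are needed.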
PALACE is developed by the DeepOmics lab under the supervision of Dr. Li Shuaicheng, City University of Hong Kong, Hong Kong, China. If you have any questions, please contact us at gzpan2-c@my.cityu.edu.hk or ruohawang2-c@my.cityu.edu.hk.
This project is licensed under the MIT License - see the LICENSE.txt file for details.
