Documentation


Installation


Protal can be installed via conda. To do so, first add the bioconda and conda-forge channels. Because protal has several dependencies, we recommend installing it in a separate conda environment, as shown below.

conda create -n protal protal -c bioconda
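
The channel setup mentioned above is not spelled out, so here is a minimal sketch of one way to do it. The helper names `run` and `setup_protal_env` and the `DRY_RUN` variable are hypothetical and only used for illustration. The `conda config` calls are standard conda commands, and the strict channel priority follows the general bioconda recommendation rather than anything stated by protal itself.

```shell
# Helper (hypothetical): print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Sketch: add the two channels named in the text, then create the environment.
# `conda config --add` prepends, so conda-forge (added last) gets top priority.
setup_protal_env() {
    run conda config --add channels bioconda || return 1
    run conda config --add channels conda-forge || return 1
    run conda config --set channel_priority strict || return 1
    # Environment name "protal" matches the command shown above.
    run conda create -n protal protal
}
```

Calling `setup_protal_env` with `DRY_RUN=1` set only prints the four commands, which is a cheap way to review them first. After a real run, activate the environment with `conda activate protal`.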

Protal can also be installed by compiling the sources from GitHub, but for now conda is the preferred way to install protal. If you are unable to use conda on your cluster, a precompiled static binary can be downloaded from https://github.com/4less/protal/releases. This can also be useful if your cluster has separate nodes and the nodes with internet access differ from the ones running protal, since differences in the installed libraries between those nodes could otherwise lead to problems.

Run the following to test your installation:

protal --help
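
If `protal --help` fails, the most likely cause is that the binary is not on your `PATH`, for example because the conda environment is not activated. The following is a small sketch of a check you could put at the top of a job script. The function name `check_protal` is hypothetical.

```shell
# Sketch: report where protal is installed, or fail with a hint.
check_protal() {
    if command -v protal >/dev/null 2>&1; then
        echo "protal found at: $(command -v protal)"
    else
        echo "protal not found on PATH; is the conda environment activated?" >&2
        return 1
    fi
}
```

In a script, `check_protal || exit 1` stops early instead of failing halfway through a run.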

Downloads


Clone the protal repository to download the latest GitHub version:
git clone https://github.com/4less/protal.git

Usage


Test data

Download small test datasets from the SRA to test protal:
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR879/002/SRR8797712/SRR8797712_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR879/002/SRR8797712/SRR8797712_2.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR879/003/SRR8797713/SRR8797713_1.fastq.gz
wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR879/003/SRR8797713/SRR8797713_2.fastq.gz
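
The four URLs above follow a regular pattern, so they can be generated from the accession instead of being typed out. The sketch below assumes the directory layout seen in those URLs (first six characters of the accession, then `00` plus its last digit), which is only inferred here for accessions of this length. The function names `ena_fastq_url` and `list_test_urls` are hypothetical.

```shell
# Sketch: build the FASTQ URL for an accession such as SRR8797712.
# $1: run accession, $2: mate number (1 or 2)
ena_fastq_url() {
    acc="$1"
    mate="$2"
    prefix=$(printf '%s' "$acc" | cut -c1-6)   # e.g. SRR879
    last=${acc#"${acc%?}"}                     # last character, e.g. 2
    printf 'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/%s/00%s/%s/%s_%s.fastq.gz\n' \
        "$prefix" "$last" "$acc" "$acc" "$mate"
}

# Print the URLs of both mates for the two test samples used in this guide.
list_test_urls() {
    for acc in SRR8797712 SRR8797713; do
        ena_fastq_url "$acc" 1
        ena_fastq_url "$acc" 2
    done
}

list_test_urls
```

To download instead of just listing, pipe the output into wget: `list_test_urls | wget -i -`.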

Simple run

You can run protal with the following command:
# --db       path to the protal database
# -1         path to forward reads
# -2         path to reverse reads
# -3         sample name for output
# --outdir   output directory
# --threads  number of threads
protal \
    --db protal_r214/ \
    -1 SRR8797712_1.fastq.gz \
    -2 SRR8797712_2.fastq.gz \
    -3 SRR8797712 \
    --outdir ./ \
    --threads 4
Or condensed into one line:
protal --db protal_r214/ -1 SRR8797712_1.fastq.gz -2 SRR8797712_2.fastq.gz -3 SRR8797712 --outdir ./ --threads 4 --profile
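
To process both test samples with the same settings, the command can be wrapped in a loop. This is a minimal sketch that only uses the options shown above. It assumes the reads are named `<sample>_1.fastq.gz` and `<sample>_2.fastq.gz`, and the names `run`, `profile_samples`, and `DRY_RUN` are hypothetical helpers, not part of protal.

```shell
DB="protal_r214/"   # path to the protal database
OUTDIR="./"         # output directory
THREADS=4           # number of threads

# Helper (hypothetical): print the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Run protal once per sample name given as an argument.
profile_samples() {
    for sample in "$@"; do
        run protal \
            --db "$DB" \
            -1 "${sample}_1.fastq.gz" \
            -2 "${sample}_2.fastq.gz" \
            -3 "$sample" \
            --outdir "$OUTDIR" \
            --threads "$THREADS" || return 1
    done
}
```

With `DRY_RUN=1` set, `profile_samples SRR8797712 SRR8797713` prints the two commands without running them. Without it, the samples are processed one after the other and the loop stops at the first failure.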

More complex runs

For more complex runs, it can be tedious to provide all parameters on the command line. Protal therefore accepts a mapping file that lists the input and output files in a tabular format. The mapping file for the previous example looks like this:
Example mapping file
#STRAIN_OUTPUT_DIR	/path/to/output/strain
#OUTPUT_DIR	/path/to/output/
#SAM_OUTPUT_DIR	/path/to/output/sam
#PROFILE_OUTPUT_DIR	/path/to/output/profiles
#INPUT_DIR	/path/to/input/
#SAMPLEID	FIRST	SECOND	SAM	PREFIX	PROFILE
sample1	SRR8797712_1.fastq.gz	SRR8797712_2.fastq.gz	sample1.sam	sample1	sample1.profile
Once the mapping file is saved (here as test.map), all we need to do is pass it to protal. This is especially useful for large runs with many samples.
protal \
    --db protal_r214/ \
    --map test.map \
    --threads 4
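
For many samples, the mapping file itself is easiest to generate with a script. The sketch below reproduces the header lines and column order of the example mapping file. It assumes that fields are tab-separated, that reads are named `<sample>_1.fastq.gz` and `<sample>_2.fastq.gz`, and that the output names follow the `<sample>.sam` and `<sample>.profile` pattern of the example row. The function name `write_protal_map` is hypothetical.

```shell
# Sketch: print a mapping file to standard output.
# $1: output root directory, $2: input directory, remaining args: sample IDs
write_protal_map() {
    out_root="$1"
    in_dir="$2"
    shift 2
    printf '#STRAIN_OUTPUT_DIR\t%s\n' "$out_root/strain"
    printf '#OUTPUT_DIR\t%s\n' "$out_root/"
    printf '#SAM_OUTPUT_DIR\t%s\n' "$out_root/sam"
    printf '#PROFILE_OUTPUT_DIR\t%s\n' "$out_root/profiles"
    printf '#INPUT_DIR\t%s\n' "$in_dir"
    printf '#SAMPLEID\tFIRST\tSECOND\tSAM\tPREFIX\tPROFILE\n'
    # One row per sample, in the column order given by the header line.
    for s in "$@"; do
        printf '%s\t%s\t%s\t%s\t%s\t%s\n' \
            "$s" "${s}_1.fastq.gz" "${s}_2.fastq.gz" "$s.sam" "$s" "$s.profile"
    done
}
```

For example, `write_protal_map /path/to/output /path/to/input/ SRR8797712 SRR8797713 > test.map` writes a file with two sample rows that can then be passed to protal with `--map test.map`.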

Build GTDB database


Right now, building a GTDB database is not fully automated. The following steps are required to build a GTDB database for protal. Note that building your own database is currently not recommended. ....