FAQ
Since this database supports several analyses, and ideally will be enriched with additional ones, the name "The DataBase" was considered (maybe with a bit of overconfidence) to be the proper one. zDB abbreviates how French people would pronounce "TheDB": zeDéBé. The name also reflects the aim of making comparative bacterial genomics eaZy.
How the pipeline works
150 is the maximum number of genomes benchmarked with zDB, although in principle there is no upper limit.
The pipeline performs the following analyses:
Orthologous proteins identified with OrthoFinder
Detailed annotation of proteins based on:
KO (KEGG orthologs) - (optional)
COG (clusters of orthologous groups) - (optional)
Pfam domains - (optional)
Precomputed homology searches against the SwissProt (optional, manually annotated proteins) and RefSeq (optional, but highly time- and memory-consuming; use with care) databases
Precomputed phylogenetic reconstructions of orthologous groups
Generation of customized databases for blast search
For a detailed description link (Bastian’s documentation)
The user selects which annotations to include in the configuration file when running the analysis. A new analysis must therefore be run to include additional features (with the Nextflow argument '-resume', the steps already performed are taken into account when the new analysis is run).
On the home page, the box "Status of your database" shows in red which analyses were not performed.
How to use the comparative database web interface
There are several ways to do so. The easiest is to search for the name of the orthogroup of interest in the search bar. Alternatively, you can find all the orthogroups of a genome by clicking on that genome in the Genomes page, or reach the Orthogroup overview page as described in the tutorial.
You can BLAST several sequences (either protein or nucleotide) at a time against the genomes of the database.
The search bar accepts single terms or combinations of them (refer to the Tutorial for further information) and can be used to search for any term in the database. For example, a COG, a KEGG or an accession term can be accessed directly by typing it in the search bar.
Biological aspects
The distinction between chromosomes and plasmids is based on the annotations reported in the input file.
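As an illustration of what such an annotation can look like, the sketch below assumes the input is a GenBank-style record whose source feature carries a /plasmid="..." qualifier when the sequence is a plasmid. The function name classify_record and the sample records are hypothetical, and this is not necessarily the exact rule applied by the pipeline.

```python
import re

# Assumption: plasmids are flagged by a GenBank-style source qualifier,
# e.g. /plasmid="pEX1". Records without it are treated as chromosomal.
PLASMID_QUALIFIER = re.compile(r'^\s*/plasmid="([^"]*)"', re.MULTILINE)


def classify_record(record_text):
    """Return ("plasmid", name) or ("chromosome", None) for one record."""
    match = PLASMID_QUALIFIER.search(record_text)
    if match:
        return ("plasmid", match.group(1))
    return ("chromosome", None)


# Hypothetical, heavily shortened records used only for illustration.
CHROMOSOME_RECORD = '''FEATURES             Location/Qualifiers
     source          1..5000
                     /organism="Example bacterium"
                     /mol_type="genomic DNA"
'''

PLASMID_RECORD = '''FEATURES             Location/Qualifiers
     source          1..800
                     /organism="Example bacterium"
                     /mol_type="genomic DNA"
                     /plasmid="pEX1"
'''

if __name__ == "__main__":
    print(classify_record(CHROMOSOME_RECORD))
    print(classify_record(PLASMID_RECORD))
```

If the input file lacks such an annotation, a sequence classified this way would be reported as chromosomal, so the quality of the distinction depends on the input annotation.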
Orthogroups (or orthologous groups) are identified with OrthoFinder. This tool identifies orthogroups from BLASTp results (parameters: -evalue 0.001) using the MCL clustering software.
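To give a feel for how pairwise hits become groups, here is a deliberately simplified sketch. It applies the same e-value cutoff (0.001) but then groups proteins by connected components (union-find), which is a much cruder stand-in for MCL and not what OrthoFinder actually does. The function name group_proteins, the tuple layout of the hits and the protein identifiers are all hypothetical.

```python
from collections import defaultdict

EVALUE_CUTOFF = 0.001  # same threshold as the BLASTp parameter quoted above


def group_proteins(hits, evalue_cutoff=EVALUE_CUTOFF):
    """Group proteins connected by hits at or below the e-value cutoff.

    `hits` is an iterable of (query_id, subject_id, evalue) tuples.
    Grouping is done by connected components (union-find), NOT by MCL.
    """
    parent = {}

    def find(x):
        # Register unseen proteins as their own group, then walk to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        root_q, root_s = find(query), find(subject)
        # Only hits passing the cutoff link two proteins together.
        if evalue <= evalue_cutoff and root_q != root_s:
            parent[root_s] = root_q

    groups = defaultdict(set)
    for protein in parent:
        groups[find(protein)].add(protein)
    # Sort members and groups so the output is deterministic.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


if __name__ == "__main__":
    # Hypothetical hits: the last one fails the cutoff and links nothing.
    example_hits = [
        ("A1", "B1", 1e-50),
        ("B1", "C1", 1e-20),
        ("A2", "B2", 1e-5),
        ("A1", "A2", 0.5),
    ]
    print(group_proteins(example_hits))
```

Connected components merge any two proteins linked by a chain of hits, however weak the chain. MCL instead weighs the similarity graph and tends to split loosely connected clusters, which is one reason it is preferred for orthogroup inference.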