Gene and gene product predictions were downloaded together with the genomes from the NCBI (when available) and JCVI websites, except for the genome of X. axonopodis pv. manihotis str. CIO151 (unpublished), for which coding sequences (CDS) were predicted using Glimmer 3 [71] trained with the X. euvesicatoria str. 85-10 CDS [46]. All the genomes are referred to as stated in the abbreviation column in Table 1.

Generation of Unus, a new library for the execution of phylogenomic workflows

Unus is a Perl library that enables the easy execution of phylogenomic workflows, including the detection of groups of orthologous genes, batch alignment of sequences, generation of files in a variety of formats, and integration of accessory tests for recombination and models of evolution. The various possible workflows the user can go through in order to obtain a phylogenomic inference of the group of bacteria of interest are depicted in Figure 6. The fourteen Perl modules that make up the Unus package are available for download and code browsing at http://github.com/lmrodriguezr/Unus/. Figure 6 summarizes the different pipelines implemented with Unus and alternative programs that can be used.

Figure 6 Workflows executable with the Unus libraries. The workflow on the left depicts the multiple steps allowed by the Unus library. Each step has multiple alternative methods or formats listed on the right side of the diagram.

Detection of orthologous groups

For the detection of Orthologous Groups (OG), we used the distribution of the Bit Score Ratio (BSR), a BLAST-based metric [72], essentially as previously described [10]. Briefly, the BSR is defined as the ratio of the Bit Score of the alignment of the query sequence with the subject sequence to the Bit Score of the alignment of the query sequence with itself (i.e., the maximum Bit Score for a given query). The histogram of BSR values is usually bimodal (Additional file 6), and Unus detects the valley of the distribution as the threshold to accept a hit for each paired comparison. To avoid spurious results in distributions with shallow valleys or with no evident valley, the threshold for these distributions was set as the average threshold (as calculated for the other paired comparisons). This method accounts for the problems previously observed when considering the best hit only [73, 74], as in widely used methods such as the BLAST Reciprocal Best Match (RBM), also implemented for comparison (see Additional file 7 for the annotated pseudo-code).

Phylogenetic inference

Multiple sequence alignments were performed using MUSCLE [75] on each detected OG. Alignments were discarded when a strong signal of recombination was detected in the Phi test [76], i.e., p-value ≤ 0.01 under the null model of no recombination. Phylogenetic inference based on whole genomes used the Maximum Likelihood (ML) optimality criterion, as implemented in RAxML v7.2.
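The BSR thresholding described under "Detection of orthologous groups" can be illustrated with a minimal Python sketch. This is not the Unus code (Unus is written in Perl, and its annotated pseudo-code is in Additional file 7). The function names, the number of histogram bins, and the peak-and-valley heuristic below are all assumptions made for illustration only. The sketch computes the BSR, looks for the least populated bin between a low-BSR peak and a high-BSR peak, and falls back to the average threshold of the other paired comparisons when no evident valley is found.

```python
def bit_score_ratio(hit_score, self_score):
    """BSR: bit score of query vs. subject over bit score of query vs. itself."""
    if self_score <= 0:
        raise ValueError("self bit score must be positive")
    return hit_score / self_score


def histogram(values, bins=20):
    """Count BSR values in equal-width bins over [0, 1] (bin count is an assumption)."""
    counts = [0] * bins
    for v in values:
        idx = min(max(int(v * bins), 0), bins - 1)  # clamp out-of-range values
        counts[idx] += 1
    return counts


def valley_threshold(values, bins=20):
    """Return the BSR at the valley between two peaks, or None if no valley is evident.

    Hypothetical heuristic: take the tallest bin in each half of [0, 1] as
    the two peaks, then the least populated bin between them as the valley.
    """
    counts = histogram(values, bins)
    half = bins // 2
    low_peak = max(range(half), key=lambda i: counts[i])
    high_peak = max(range(half, bins), key=lambda i: counts[i])
    between = range(low_peak + 1, high_peak)
    if not between:
        return None
    lowest = min(counts[i] for i in between)
    candidates = [i for i in between if counts[i] == lowest]
    valley = candidates[len(candidates) // 2]  # middle of a flat valley
    if counts[valley] >= min(counts[low_peak], counts[high_peak]):
        return None  # unimodal or shallow distribution
    return (valley + 0.5) / bins  # centre of the valley bin


def thresholds_with_fallback(per_pair_values, bins=20):
    """One threshold per paired comparison; pairs without a valley get the average."""
    found = {pair: valley_threshold(v, bins) for pair, v in per_pair_values.items()}
    usable = [t for t in found.values() if t is not None]
    if not usable:
        raise ValueError("no paired comparison has a detectable valley")
    mean = sum(usable) / len(usable)
    return {pair: (t if t is not None else mean) for pair, t in found.items()}
```

Under these assumptions, a hit in a given paired comparison would be accepted when its BSR is at or above the threshold returned for that pair. A real implementation would likely smooth the histogram and apply a stricter test for what counts as a shallow valley.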
