Multiple genome alignments

Multiple alignments are calculated between groups of genomes.

Alignments available

Name | Genomes | Method used
8 rice | Oryza barthii, Oryza glaberrima, Oryza glumipatula, Oryza meridionalis, Oryza nivara, Oryza rufipogon, Oryza sativa Indica Group, Oryza sativa Japonica Group | EPO
11 rice | Oryza barthii, Oryza brachyantha, Oryza glaberrima, Oryza glumipatula, Oryza longistaminata, Oryza meridionalis, Oryza nivara, Oryza punctata, Oryza rufipogon, Oryza sativa Indica Group, Oryza sativa Japonica Group | EPO-Extended

Alignment methods

PECAN Multiple Alignment

Pecan is used to provide global multiple genomic alignments. First, Mercator is used to build a synteny map between the genomes and then Pecan builds alignments in these syntenic regions.

Pecan is a global multiple sequence alignment program that makes the probabilistic consistency methodology practical for large numbers of sequences of practically arbitrary length. As input it takes a set of sequences and a phylogenetic tree. The parameters and heuristics it employs are highly user-configurable. It is written entirely in Java and also requires Exonerate to be installed.
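
Since Pecan takes both a set of sequences and a phylogenetic tree, it can help to check that the two agree before starting a run. The following is a minimal sketch of such a check, not part of Pecan itself: the helper names (newick_leaves, check_inputs), the genome labels, the toy sequences and the tree with its branch lengths are all hypothetical and chosen only for illustration.

```python
import re


def newick_leaves(newick: str) -> list[str]:
    """Return the leaf names of a Newick tree in left-to-right order."""
    # A leaf label directly follows "(" or ","; branch lengths follow ":"
    # and internal node labels follow ")", so neither is captured here.
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


def check_inputs(sequences: dict[str, str], newick: str) -> list[str]:
    """Check that every tree leaf has a sequence and vice versa.

    Returns the leaf order of the tree, or raises ValueError on a mismatch.
    """
    leaves = newick_leaves(newick)
    missing = [name for name in leaves if name not in sequences]
    extra = [name for name in sequences if name not in leaves]
    if missing or extra:
        raise ValueError(f"missing sequences: {missing}; not in tree: {extra}")
    return leaves


# Hypothetical toy input: three genomes, made-up topology and branch lengths.
tree = "((genome_a:0.01,genome_b:0.01):0.02,genome_c:0.03);"
sequences = {
    "genome_a": "ACGTACGT",
    "genome_b": "ACGTTCGT",
    "genome_c": "ACGAACGT",
}

leaf_order = check_inputs(sequences, tree)
```

The returned leaf order can be used to list the sequences in the same order as the tree's leaves, under the assumption that the aligner expects the two to correspond.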

EPO Multiple Alignment

The EPO (Enredo, Pecan, Ortheus) pipeline is a three-step pipeline for whole-genome multiple alignments.

  1. Enredo produces colinear segments from extant genomes, handling rearrangements, deletions and duplications.
  2. Pecan, as described above, is used to align these segments.
  3. Finally, Ortheus is used to create genome-wide ancestral sequence reconstructions.
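
To give a feel for what "colinear segments" means in step 1, the sketch below splits a list of anchors into runs that keep the same order in two genomes. It is a deliberately simplified toy and not Enredo's algorithm: it compares only two genomes, uses strict adjacency, and the function name colinear_segments and the example anchor order are assumptions made for illustration.

```python
def colinear_segments(order_in_b: list[int]) -> list[list[int]]:
    """Split anchors into maximal colinear runs.

    Anchors are numbered 0..n-1 by their order along genome A.
    `order_in_b` lists the same anchor numbers in the order in which
    they occur along genome B. A run continues while consecutive
    anchors differ by +1 (same orientation) or by -1 (inverted).
    """
    if not order_in_b:
        return []
    segments = []
    current = [order_in_b[0]]
    step = 0  # 0 means the direction of the current run is not fixed yet
    for prev, nxt in zip(order_in_b, order_in_b[1:]):
        diff = nxt - prev
        if diff in (1, -1) and step in (0, diff):
            current.append(nxt)
            step = diff
        else:
            # Order is broken here (rearrangement, missing or repeated
            # anchor), so close the current run and start a new one.
            segments.append(current)
            current = [nxt]
            step = 0
    segments.append(current)
    return segments


# Hypothetical example: anchors 3-6 are inverted in genome B.
segments = colinear_segments([0, 1, 2, 6, 5, 4, 3, 7, 8])
# segments == [[0, 1, 2], [6, 5, 4, 3], [7, 8]]
```

In this toy, a repeated anchor (a duplication) or a skipped anchor (a deletion) simply starts a new run; a real segmenter is more tolerant than that.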

The pipeline requires alignments of so-called anchor sequences, which are explained here. Further details on all these methods can be found in the paper "Enredo and Pecan: Genome-wide mammalian consistency-based multiple alignment with paralogs".

EPO-Extended Multiple Alignment

Because Ortheus is difficult to run on fragmented assemblies, the pipeline comes in two flavours.

  1. The plain EPO pipeline is run on the chromosome-level genomes; these alignments are listed as EPO in the table above.
  2. The scaffold-level genomes are then projected onto the EPO alignments using LastZ-net alignments; the results are listed as EPO-Extended.

By construction, each pair of EPO and EPO-Extended alignments represents the exact same alignment of the chromosome-level genomes.
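
The projection idea can be illustrated with a small sketch. It assumes a toy multiple alignment stored as gapped strings and a single pairwise alignment between one chromosome-level genome (the reference) and a scaffold-level genome. The function name project, the genome labels and all sequences are hypothetical, and the real pipeline, which works from LastZ-net alignments, is considerably more involved.

```python
def project(epo: dict[str, str], ref_name: str,
            ref_aln: str, qry_aln: str, qry_name: str) -> dict[str, str]:
    """Add a query genome as a new row of an existing multiple alignment.

    `ref_aln` and `qry_aln` are the two gapped rows of a pairwise
    alignment between the reference and the query. Existing rows of
    `epo` are copied unchanged.
    """
    # Map each ungapped reference position to the query character aligned to it.
    ref_to_qry = {}
    pos = 0
    for r, q in zip(ref_aln, qry_aln):
        if r == "-":
            continue  # query insertion relative to the reference: dropped in this toy
        ref_to_qry[pos] = q
        pos += 1

    # Walk the columns of the multiple alignment along the reference row.
    row = []
    pos = 0
    for r in epo[ref_name]:
        if r == "-":
            row.append("-")  # no reference base in this column
        else:
            row.append(ref_to_qry.get(pos, "-"))
            pos += 1

    extended = dict(epo)
    extended[qry_name] = "".join(row)
    return extended


# Hypothetical toy data: two chromosome-level genomes already aligned.
epo = {"chrom_genome_1": "ACG-TAC", "chrom_genome_2": "ACGGTA-"}

# Pairwise alignment of chrom_genome_1 against a scaffold-level genome.
extended = project(epo, "chrom_genome_1",
                   ref_aln="ACGT-AC", qry_aln="A-GTTAC",
                   qry_name="scaffold_genome")
# extended["scaffold_genome"] == "A-G-TAC"
```

Because the projection only adds rows and never changes existing columns, removing the added row from the extended alignment gives back the original alignment of the chromosome-level genomes.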