Repeat feature annotation

Several software programs are run to annotate three types of repeats:

Low-complexity regions (Dust [1])
Tandem repeats (TRF [2])
Complex repeats:
- RepeatMasker [3]
- Repeat Detector (Red) [4] and Ensembl/plant-scripts [5]

Annotating repeats with RepeatMasker requires a repeat library. Repeat libraries from the following sources are used and combined where possible:

The MIPS Repeat Database (REdat).
nrTEplants, a curated library of repeated sequences annotated in REdat, RepetDB, TREP and other collections [5].

Viewing and accessing repeat features

By default, repeat features are not displayed in the genome browser. To display them, use the "Configure this page" option, where you can turn on all repeats or only a subset selected by type.

The repeat annotations can be programmatically accessed using the Ensembl API. See the RepeatFeature and RepeatFeatureAdaptor documentation for further details.
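The RepeatFeature and RepeatFeatureAdaptor classes mentioned above belong to the Ensembl Perl API. As an alternative route, not the method described on this page, the sketch below queries the Ensembl REST service from Python through its overlap/region endpoint with feature=repeat. The species name, the region, the helper function names and the "description" field read from each returned record are assumptions made for illustration; check them against the REST documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

REST_SERVER = "https://rest.ensembl.org"  # public Ensembl REST server


def build_overlap_url(species, region, server=REST_SERVER):
    """Build the URL requesting repeat features that overlap a region.

    `region` uses the chromosome:start-end form, e.g. "1:10000-20000".
    """
    path = "/overlap/region/{}/{}".format(
        urllib.parse.quote(species),
        urllib.parse.quote(region, safe=":"),  # keep the ':' separator readable
    )
    return server + path + "?feature=repeat"


def fetch_repeats(species, region):
    """Download repeat features as a list of dicts (needs network access)."""
    request = urllib.request.Request(
        build_overlap_url(species, region),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_by_description(features):
    """Tally features by their 'description' field (assumed field name)."""
    return Counter(f.get("description", "unknown") for f in features)


if __name__ == "__main__":
    # Hypothetical query: species and coordinates are placeholders.
    repeats = fetch_repeats("arabidopsis_thaliana", "1:10000-20000")
    for description, n in count_by_description(repeats).most_common():
        print(description, n)
```

The URL construction and the tallying are kept in separate functions so they can be checked without a network connection; only fetch_repeats contacts the server.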

References

[1] Morgulis A et al. (2006) A fast and symmetric DUST implementation to mask low-complexity DNA sequences. J Comput Biol. 13:1028-40
[2] Benson G (1999) Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27:573-580
[3] Smit AFA, Hubley R, Green P (2015) RepeatMasker Open-4.0. http://www.repeatmasker.org
[4] Girgis HZ (2015) Red: an intelligent, rapid, accurate tool for detecting repeats de-novo on the genomic scale. BMC Bioinformatics 16:227
[5] Contreras-Moreira B, Filippi CV, Naamati G, García Girón C, Allen JE, Flicek P (2021) Efficient masking of plant genomes by combining kmer counting and curated repeats. Preprint from bioRxiv, DOI: 10.1101/2021.03.22.436504
