Pisum sativum Assembly and Gene Annotation

About Pisum sativum

Pea (Pisum sativum L., 2n = 14) is the second most important grain legume in the world after common bean and is an important green vegetable, with 14.3 million tonnes of dry pea and 19.9 million tonnes of green pea produced in 2016. Pea belongs to the Leguminosae (or Fabaceae), which includes cool-season grain legumes from the Galegoid clade, such as pea, lentil (Lens culinaris Medik.), chickpea (Cicer arietinum L.) and faba bean (Vicia faba L.), and tropical grain legumes from the Millettioid clade, such as common bean (Phaseolus vulgaris L.), cowpea (Vigna unguiculata (L.) Walp.) and mungbean (Vigna radiata (L.) R. Wilczek). Pea provides significant ecosystem services: it is a valuable source of dietary proteins, mineral nutrients, complex starch and fibers with demonstrated health benefits, and its symbiosis with nitrogen-fixing soil bacteria reduces the need for applied nitrogen fertilizers, thereby mitigating greenhouse gas emissions. Pea was domesticated ~10,000 years ago by Neolithic farmers of the Fertile Crescent, along with cereals and other grain legumes. The large reservoir of genetic diversity in Pisum has facilitated its spread throughout Asia, Europe, Africa, the Americas and Oceania, where it has adapted to diverse environments and culinary practices.

Assembly

Complementary approaches were combined to obtain the pea reference genome assembly. Whole-genome Illumina short-read sequences were assembled into contigs using SOAPdenovo, then combined into scaffolds using long-range PacBio RSII sequences and whole-genome profiling of a bacterial artificial chromosome (BAC) library. Scaffolds were manually curated for inter- and intrachromosomal chimeras using sequences obtained from single chromosomes isolated by flow cytometry and an ultra-high-density skim genotyping-by-sequencing genetic map. Curated scaffolds were then integrated into 24,623 super-scaffolds (N50 of 415 kilobases (kb)) using BioNano maps. The seven pseudomolecules representing the pea chromosomes were obtained by anchoring super-scaffolds onto high-density genetic maps. Pseudomolecules were named according to the reference pea genetic map and chromosome numbering.
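The 415 kb figure quoted for the super-scaffolds is a contiguity statistic. The following is a minimal sketch of how such a statistic is commonly computed, assuming it is the length-weighted median usually reported as N50, with L50 being the number of sequences needed to reach it. The function name and the toy lengths are illustrative only and are not taken from the pea assembly.

```python
def n50_and_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of all bases.
    L50 is the number of sequences in that set.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example (hypothetical lengths, in bp): total is 1500, half is 750.
# 500 + 400 = 900 reaches the halfway point after two sequences.
print(n50_and_l50([100, 200, 300, 400, 500]))  # (400, 2)
```

Because the statistic is weighted by length, a few very long super-scaffolds raise it much more than many short ones, which is why it is preferred over a plain mean when comparing assemblies.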

Annotation

Ab initio and homology-based methods were combined to annotate protein-coding sequences. In total, 44,756 complete and 29 truncated genes were predicted, with an average gene length, coding sequence length and exon number of 2,784 base pairs (bp), 1,016 bp and 6.33 exons, respectively. The vast majority of gene models were supported by complementary DNA/expressed sequence tag evidence.
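Summary figures such as average gene length, coding sequence length and exon number are typically derived from an annotation file. The sketch below shows one way to compute them from GFF3-style records. It is an illustration under stated assumptions, not the procedure used for the pea annotation: the toy records, the `summarize` function and the choice to average per transcript are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy annotation in GFF3 layout (tab-separated, 9 columns).
TOY_GFF3 = """\
chr1\ttoy\tgene\t1\t1000\t.\t+\t.\tID=g1
chr1\ttoy\tmRNA\t1\t1000\t.\t+\t.\tID=t1;Parent=g1
chr1\ttoy\texon\t1\t300\t.\t+\t.\tParent=t1
chr1\ttoy\texon\t601\t1000\t.\t+\t.\tParent=t1
chr1\ttoy\tCDS\t101\t300\t.\t+\t0\tParent=t1
chr1\ttoy\tCDS\t601\t910\t.\t+\t1\tParent=t1
chr1\ttoy\tgene\t2001\t2500\t.\t-\t.\tID=g2
chr1\ttoy\tmRNA\t2001\t2500\t.\t-\t.\tID=t2;Parent=g2
chr1\ttoy\texon\t2001\t2500\t.\t-\t.\tParent=t2
chr1\ttoy\tCDS\t2101\t2400\t.\t-\t0\tParent=t2
"""


def summarize(gff3_text):
    """Compute mean gene length, mean CDS length and mean exon count.

    Assumes one transcript per gene, so per-transcript averages can be
    read as per-gene averages. Coordinates are 1-based and inclusive.
    """
    gene_lengths = []
    cds_per_transcript = defaultdict(int)
    exons_per_transcript = defaultdict(int)
    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cols = line.split("\t")
        feature, start, end = cols[2], int(cols[3]), int(cols[4])
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if kv)
        length = end - start + 1
        if feature == "gene":
            gene_lengths.append(length)
        elif feature == "exon":
            exons_per_transcript[attrs["Parent"]] += 1
        elif feature == "CDS":
            cds_per_transcript[attrs["Parent"]] += length
    return {
        "mean_gene_length": mean(gene_lengths),
        "mean_cds_length": mean(list(cds_per_transcript.values())),
        "mean_exon_count": mean(list(exons_per_transcript.values())),
    }


print(summarize(TOY_GFF3))
```

For genes with several transcripts, a real pipeline would first have to decide which transcript represents each gene (for example the longest one); that choice changes the averages and is not specified here.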

References

A reference genome for pea provides insight into legume genome evolution.
Jonathan Kreplak, Mohammed-Amin Madoui, Petr Cápal .... Jaroslav Doležel, Patrick Wincker & Judith Burstin.
Nature Genetics 51 (2019).


Statistics

Summary

Assembly	Pisum_sativum_v1a, INSDC Assembly GCA_900700895.2, Jul 2019
Database version	115.1
Golden Path Length	3,920,131,294
Genebuild by	URGI
Genebuild method	External annotation import
Data source	Genoscope

Gene counts

Coding genes (genes or transcripts that contain an open reading frame, ORF)	44,756
Gene transcripts (the operational units of genes: one or more exons joined after the introns are spliced out; they may or may not encode a protein)	57,835
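The counts on this page can be combined into a few rough derived figures. The arithmetic below is a back-of-the-envelope sketch using only numbers stated on this page. It assumes that all 57,835 transcripts belong to the 44,756 coding genes and that the 1,016 bp average coding sequence length applies to every coding gene; neither assumption is confirmed by the page, and the variable names are illustrative.

```python
GENOME_LENGTH_BP = 3_920_131_294  # Golden Path Length
CODING_GENES = 44_756             # Coding genes
TRANSCRIPTS = 57_835              # Gene transcripts
MEAN_CDS_BP = 1_016               # average coding sequence length

# Gene density: coding genes per megabase of assembled sequence.
genes_per_mb = CODING_GENES / (GENOME_LENGTH_BP / 1_000_000)

# Average number of transcripts per coding gene (assumption: every
# transcript belongs to a coding gene).
transcripts_per_gene = TRANSCRIPTS / CODING_GENES

# Approximate share of the assembly occupied by coding sequence.
cds_percent = 100 * CODING_GENES * MEAN_CDS_BP / GENOME_LENGTH_BP

print(f"{genes_per_mb:.2f} genes per Mb")                   # 11.42
print(f"{transcripts_per_gene:.2f} transcripts per gene")   # 1.29
print(f"{cds_percent:.2f}% of the assembly is coding")      # 1.16
```

These values are estimates only; they mainly illustrate how small a share of the roughly 3.9 Gb assembly is made up of coding sequence.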
