Vigna unguiculata Assembly and Gene Annotation
About Vigna unguiculata
Cowpea (Vigna unguiculata [L.] Walp.) is one of the most important food and nutritional security crops, providing the main source of protein to millions of people in developing countries. In sub‐Saharan Africa, smallholder farmers are the major producers and consumers of cowpea, which is grown for its grains, tender leaves and pods as food for human consumption, with the crop residues being used for fodder or added back to the soil to improve fertility. Cowpea was domesticated in Africa, from where it spread to all continents; it is now commonly grown in many parts of Asia, Europe, the USA, and Central and South America. One of the strengths of cowpea is its high resilience to harsh conditions, including hot and dry environments and poor soils.
Assembly
Eight draft assemblies were generated from PacBio data, and each contributed a fraction of its contigs to the final assembly through "stitching". Scaffolds were obtained by mapping the stitched and polished assembly to two optical maps using the Kansas State University pipeline. A total of 519.4 Mb of scaffolded sequence was generated, with an N50 of 16.4 Mb. Finally, 10 genetic maps containing 44 003 unique Illumina iSelect SNPs were used to anchor and orient the sequence scaffolds into 11 pseudochromosomes via ALLMAPS. More details about the process can be found in the paper cited below.
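For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the scaffolds are sorted from longest to shortest, and the N50 is the length of the scaffold at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not the cowpea scaffold lengths or the code used by the assembly authors.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # total is 300; 80 + 70 reaches 150, so this prints 70
```

Applied to the real scaffold lengths, the same calculation would be expected to give the reported value of about 16.4 Mb, meaning that at least half of the scaffolded sequence lies in scaffolds of that size or larger.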
Annotation
The assembled genome was annotated using de novo gene prediction and transcript evidence based on cowpea ESTs and RNA-seq data from leaf, stem, root, flower and seed tissue, together with protein sequences of Arabidopsis, common bean, soybean, Medicago, poplar, rice and grape. In total, 29 773 protein-coding loci were annotated, along with 12 514 alternatively spliced transcripts. Most (95.9%) of the 1440 expected plant genes in BUSCO v3 were identified in the cowpea gene set, indicating that the genome assembly and annotation are largely complete.
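The BUSCO figure above is given as a percentage, so the underlying gene count has to be inferred. A minimal sketch follows, assuming the 95.9% was rounded to one decimal place from a whole number of genes out of 1440; the helper name `counts_consistent_with` is hypothetical, and the result is an implied count, not a number reported by the source.

```python
def counts_consistent_with(percent, total, decimals=1):
    """List the integer counts k whose share of `total`, rounded to
    `decimals` places, equals the reported percentage."""
    return [
        k
        for k in range(total + 1)
        if abs(round(100 * k / total, decimals) - percent) < 1e-9
    ]


# 95.9% of the 1440 expected plant genes (values taken from the text above).
print(counts_consistent_with(95.9, 1440))  # prints [1381]
```

Under that rounding assumption, 1381 of the 1440 expected genes were identified, leaving 59 not identified.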
- The genome of cowpea (Vigna unguiculata [L.] Walp.).
Lonardi S, Muñoz-Amatriaín M, Liang Q, Shu S, Wanamaker SI, Lo S, Tanskanen J, Schulman AH, Zhu T, Luo MC, Alhakami H, Ounit R, Hasan AM, Verdier J, Roberts PA, Santos JRP, Ndeve A, Doležel J, Vrána J, Hokin SA, Farmer AD, Cannon SB, Close TJ. Plant J 98 (5)
Statistics
Summary
Assembly | ASM411807v1, INSDC Assembly GCA_004118075.1, Jan 2019 |
Database version | 113.1 |
Golden Path Length | 519,066,764 |
Genebuild by | JGI |
Genebuild method | External annotation import |
Data source | ucr DOI |
Gene counts
Coding genes | 31,814 |
Gene transcripts | 54,348 |