Setaria italica Assembly and Gene Annotation

About Setaria italica

Setaria italica (foxtail millet) is a grain crop widely grown in Asia, with particular significance in the semi-arid regions of northern China. It is also grown on a moderate scale in other parts of the world as a forage crop. It is one of the oldest domesticated crops, with archaeological remains in northern China dating from 5,500 to 5,900 years BC. One motivation for sequencing foxtail millet is its close relationship, both genetic and physiological, to the biofuel crop switchgrass (Panicum virgatum). Direct study of switchgrass is complicated by its large genome size and polyploidy, so data from the foxtail millet genome assist in the study and improvement of switchgrass and related biofuel crops. The nuclear genome (~490 Mb) is diploid with nine chromosomes (2n=18).

Assembly

Setaria italica cv. Yugu1 was sequenced and assembled by the Joint Genome Institute (JGI) in collaboration with community researchers. Sanger sequencing was performed on whole-genome shotgun clone libraries with a range of insert sizes. Reads totalling 8.29x coverage were assembled with Arachne, giving scaffolds that were arranged predominantly into nine pseudomolecules.

Annotation

Protein-coding genes were predicted using the standard JGI plant gene annotation pipeline. The pipeline drew on ESTs and homologous peptides from Arabidopsis, Brachypodium, rice and sorghum, aligned with PASA (using BLAT) and EXONERATE, together with gene predictions from GenomeScan, FGENESH+ and FGENESH_EST. BACs and fosmids were annotated using AUGUSTUS with maize parameters to predict gene models, which were compared to GenBank, TAIR and IRGSP/RAP proteins and manually inspected. Gene models were validated with RNA-seq data.

References

  1. Reference genome sequence of the model plant Setaria.
    Bennetzen JL, Schmutz J, Wang H, Percifield R, Hawkins J, Pontaroli AC, Estep M, Feng L, Vaughn JN, Grimwood J et al. 2012. Nat. Biotechnol. 30:555-561.
  2. Genome sequence of foxtail millet (Setaria italica) provides insights into grass evolution and biofuel potential.
    Zhang G, Liu X, Quan Z, Cheng S, Xu X, Pan S, Xie M, Zeng P, Yue Z, Wang W et al. 2012. Nat. Biotechnol. 30:549-554.
  3. Image credit: Markus Hagenlocher, via WikiCommons.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: Setaria_italica_v2.0, INSDC Assembly GCA_000263155.2, Oct 2015
Database version: 98.2
Base Pairs: 400,908,904
Golden Path Length: 405,732,883
Genebuild by: JGI
Genebuild method: Import
Data source: Joint Genome Institute
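
The Base Pairs and Golden Path Length figures above differ by a few megabases. The short Python sketch below works out that difference. It assumes, since this page does not define the terms, that Golden Path Length is the full length of the reference assembly including gaps and that Base Pairs counts only sequenced bases. The variable names are illustrative.

```python
# Figures copied from the Summary table above.
base_pairs = 400_908_904
golden_path_length = 405_732_883

# Assumption: positions in the golden path that are not counted as
# base pairs correspond to unsequenced regions such as gaps.
unsequenced = golden_path_length - base_pairs
fraction_sequenced = base_pairs / golden_path_length

print(f"Unsequenced positions: {unsequenced:,}")
print(f"Fraction sequenced: {fraction_sequenced:.2%}")
```

Under that assumption, roughly 4.8 Mb of the golden path is not represented by sequenced bases.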

Gene counts

Coding genes: 35,831
Non coding genes: 779
Small non coding genes: 748
Long non coding genes: 31
Gene transcripts: 41,802
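
The gene counts above can be cross-checked with a few lines of arithmetic. This Python sketch confirms that the small and long non coding counts add up to the non coding total, then derives an average number of transcripts per gene. It assumes that the transcript count covers both coding and non coding genes, which this page does not state explicitly. The variable names are illustrative.

```python
# Figures copied from the Gene counts table above.
coding_genes = 35_831
non_coding_genes = 779
small_non_coding_genes = 748
long_non_coding_genes = 31
gene_transcripts = 41_802

# The two non coding subcategories should account for the non coding total.
subcategories_match = (
    small_non_coding_genes + long_non_coding_genes == non_coding_genes
)

# Assumption: the transcript count spans coding and non coding genes alike.
total_genes = coding_genes + non_coding_genes
transcripts_per_gene = gene_transcripts / total_genes

print(f"Subcategories sum to non coding total: {subcategories_match}")
print(f"Total genes: {total_genes:,}")
print(f"Average transcripts per gene: {transcripts_per_gene:.2f}")
```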

About this species