Setaria italica Assembly and Gene Annotation
About Setaria italica
Setaria italica (foxtail millet) is a grain crop widely grown in Asia, with particular significance in the semi-arid regions of northern China. It is also grown on a moderate scale in other parts of the world as a forage crop. It is one of the oldest domesticated crops, with archaeological remains in northern China dated to 5,500 to 5,900 years BC. One motivation for sequencing foxtail millet is its close genetic and physiological relationship to the biofuel crop switchgrass (Panicum virgatum), whose direct study is complicated by its large genome size and polyploidy. Data from the foxtail millet genome assist in the study and improvement of switchgrass and related biofuel crops. The nuclear genome (~490 Mb) is diploid, with a haploid set of nine chromosomes (2n=18).
Setaria italica cv. Yugu1 was sequenced and assembled by the Joint Genome Institute (JGI) in collaboration with community researchers. Sanger sequencing was performed on whole-genome shotgun clone libraries with different insert sizes. Reads totalling 8.29x coverage were assembled with Arachne, giving scaffolds that were arranged predominantly into nine pseudomolecules.
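To give a feel for what 8.29x coverage means in raw sequence, the short sketch below multiplies the fold coverage by the genome size. It assumes that the coverage figure is measured against the ~490 Mb genome size estimate given above; the source does not state which reference size was used, so the result is only a rough order-of-magnitude figure. The function and constant names are illustrative.

```python
# Rough estimate of the total sequenced bases implied by a fold coverage.
# ASSUMPTION: 8.29x is relative to the ~490 Mb estimated genome size;
# the source does not say which reference size the coverage refers to.

GENOME_SIZE_BP = 490_000_000  # approximate nuclear genome size (~490 Mb)
COVERAGE = 8.29               # fold coverage reported for the Sanger reads


def total_read_bases(coverage: float, genome_size_bp: int) -> float:
    """Return the approximate number of sequenced bases for a fold coverage."""
    return coverage * genome_size_bp


bases = total_read_bases(COVERAGE, GENOME_SIZE_BP)
print(f"Approximate sequenced bases: {bases / 1e9:.2f} Gb")
```

Under that assumption the reads amount to roughly four gigabases of Sanger sequence.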
Protein-coding genes were predicted using the standard JGI plant gene annotation pipeline. The pipeline used ESTs and homologous peptides from Arabidopsis, Brachypodium, rice and sorghum, mapped with BLAT alignments of PASA and EXONERATE, together with GenomeScan, FGENESH+ and FGENESH_EST. BACs and fosmids were annotated using AUGUSTUS with maize parameters to predict gene models, which were compared to GenBank, TAIR and IRGSP/RAP proteins and manually inspected. Gene models were validated with RNA-seq data.
- Reference genome sequence of the model plant Setaria.
Bennetzen JL, Schmutz J, Wang H, Percifield R, Hawkins J, Pontaroli AC, Estep M, Feng L, Vaughn JN, Grimwood J et al. 2012. Nat. Biotechnol. 30:555-561.
- Genome sequence of foxtail millet (Setaria italica) provides insights into grass evolution and biofuel potential.
Zhang G, Liu X, Quan Z, Cheng S, Xu X, Pan S, Xie M, Zeng P, Yue Z, Wang W et al. 2012. Nat. Biotechnol. 30:549-554.
- Image credit: Markus Hagenlocher, via WikiCommons.
General information about this species can be found in Wikipedia.
|Assembly||Setaria_italica_v2.0, INSDC Assembly GCA_000263155.2, Oct 2015|
|Golden Path Length||405,732,883 bp|
|Genebuild method||Imported from ENA|
|Data source||Joint Genome Institute|
|Non-coding genes||779|
|Small non-coding genes||748|
|Long non-coding genes||31|
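
The figures in the table can be cross-checked with a few lines of arithmetic. The sketch below compares the golden path length with the ~490 Mb genome size estimate and confirms that the small and long non-coding gene counts add up to the total. Treating ~490 Mb as exactly 490,000,000 bp is an assumption made for illustration, and the variable and function names are hypothetical.

```python
# Cross-check of the assembly statistics listed in the table.
# ASSUMPTION: "~490 Mb" is taken as exactly 490,000,000 bp for illustration.

ESTIMATED_GENOME_BP = 490_000_000  # approximate nuclear genome size
GOLDEN_PATH_BP = 405_732_883       # golden path length from the table
NON_CODING_GENES = 779
SMALL_NON_CODING_GENES = 748
LONG_NON_CODING_GENES = 31


def assembled_fraction(golden_path_bp: int, estimated_genome_bp: int) -> float:
    """Return the share of the estimated genome covered by the golden path."""
    return golden_path_bp / estimated_genome_bp


fraction = assembled_fraction(GOLDEN_PATH_BP, ESTIMATED_GENOME_BP)
counts_agree = SMALL_NON_CODING_GENES + LONG_NON_CODING_GENES == NON_CODING_GENES

print(f"Golden path covers about {fraction:.1%} of the estimated genome")
print(f"Small + long non-coding counts match the total: {counts_agree}")
```

On these numbers the golden path accounts for a little over four fifths of the estimated genome, and the two non-coding categories sum exactly to the reported total.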