Camelina sativa Assembly and Gene Annotation

About Camelina sativa

Camelina sativa (false flax, gold of pleasure, German sesame) is a relict oilseed crop of the Crucifer family (Brassicaceae) with centres of origin in SE Europe and SW Asia. C. sativa was cultivated in Europe as an important oilseed crop for many centuries before being displaced by higher-yielding crops such as canola and wheat. It has several agronomic advantages, including early maturity, low water and nutrient requirements, adaptability to adverse environmental conditions and resistance to common cruciferous pests and pathogens. It is currently being re-embraced as an industrial oil platform crop. C. sativa is diploid (2n=40) with an estimated genome size of 785 Mb, retaining a well-preserved hexaploid genome as a result of a whole-genome triplication event.

Assembly

The genome of a homozygous doubled haploid line (DH55) was sequenced using a hybrid Illumina and Roche 454 next-generation sequencing (NGS) approach. Filtered sequence data (96.53 Gb) provided 123x coverage of the estimated genome size and were assembled into 37,871 scaffolds using a hierarchical assembly strategy. A high-density genetic map based on 3,575 polymorphic markers allowed 608.54 Mb of the assembled genome, represented by 588 scaffolds, to be anchored to the 20 chromosomes of C. sativa, producing a highly contiguous final assembly with an N50 size of >30 Mb. The final genome assembly contains 641.45 Mb of sequence, covering 82% of the estimated genome size, 95% of which is in the 20 chromosomes.
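The coverage and completeness figures above follow from simple ratios, and N50 has a compact definition. The sketch below is illustrative only: it re-derives the quoted percentages from the numbers in the paragraph and shows one common way to compute N50. The function names and the toy scaffold lengths are hypothetical and are not taken from the genome project.

```python
from typing import Sequence

# Figures quoted in the Assembly paragraph (all sizes in megabases).
ESTIMATED_GENOME_MB = 785.0
FILTERED_READS_MB = 96.53 * 1000  # 96.53 Gb of filtered sequence data
ASSEMBLY_MB = 641.45
ANCHORED_MB = 608.54


def fold_coverage(read_mb: float, genome_mb: float) -> float:
    """Sequencing depth: total read bases divided by genome size."""
    return read_mb / genome_mb


def n50(lengths: Sequence[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the total."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return min(lengths)  # not reached for positive lengths


coverage = fold_coverage(FILTERED_READS_MB, ESTIMATED_GENOME_MB)
assembled_fraction = ASSEMBLY_MB / ESTIMATED_GENOME_MB
anchored_fraction = ANCHORED_MB / ASSEMBLY_MB

# Hypothetical toy scaffold lengths, used only to exercise n50().
toy_n50 = n50([50, 40, 30, 20, 10])

print(f"coverage: {coverage:.0f}x")
print(f"assembled: {assembled_fraction:.0%} of the estimated genome")
print(f"anchored: {anchored_fraction:.0%} of the assembly")
print(f"toy N50: {toy_n50}")
```

An N50 above 30 Mb means that at least half of the assembled sequence lies in sequences of 30 Mb or longer, which is consistent with most of the assembly being anchored to chromosomes.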

Annotation

RNA-seq data (78.5 Gb) were generated from tissue samples collected at 12 different growth stages to assist with annotation of protein-coding genes. Based on a comprehensive strategy combining ab initio gene prediction with homology evidence from proteome data sets, ESTs and RNA-seq transcripts, 89,418 non-redundant genes were predicted, of which 4,753 (5.3%) encoded two or more alternatively spliced isoforms. More than 95% (85,274) of these annotated genes were located on the pseudochromosomes, with the remainder on unanchored scaffolds. Based on sequence identity, 97% of the predicted C. sativa genes have homologues in UniProt. RNA-seq evidence suggested that >90% of the genes were expressed (FPKM>0) in one or more developmental stages.

Repeats were annotated with the Ensembl Genomes repeat feature pipeline. There are 1,298,859 Low complexity (Dust) features, covering 84 Mb (13.1% of the genome); 333,331 RepeatMasker features (with the REdat library), covering 128 Mb (20.0% of the genome); 5,216 RepeatMasker features (with the RepBase library), covering 1 Mb (0.1% of the genome); 333,890 Tandem repeats (TRF) features, covering 30 Mb (4.7% of the genome); and Repeat Detector repeats with a total length of 230 Mb (35.9% of the genome).
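As a rough cross-check, the repeat percentages can be recomputed from the spans quoted above. This sketch assumes the percentages are relative to the Golden Path Length given in the Statistics section (641,356,059 bp); that denominator is an assumption, not something the page states. Because the spans are rounded to whole megabases, the recomputed values agree only approximately, most visibly for the small RepBase span.

```python
# Assumed denominator: Golden Path Length from the Statistics section.
GOLDEN_PATH_BP = 641_356_059

# (span in Mb, percentage stated on the page) for each repeat class.
REPEATS = {
    "Low complexity (Dust)": (84, 13.1),
    "RepeatMasker (REdat)": (128, 20.0),
    "RepeatMasker (RepBase)": (1, 0.1),
    "Tandem repeats (TRF)": (30, 4.7),
    "Repeat Detector": (230, 35.9),
}


def percent_of_genome(span_mb: float, genome_bp: int = GOLDEN_PATH_BP) -> float:
    """Convert a span in megabases to a percentage of the genome length."""
    return 100.0 * span_mb * 1_000_000 / genome_bp


recomputed = {name: percent_of_genome(mb) for name, (mb, _) in REPEATS.items()}

for name, (mb, stated) in REPEATS.items():
    print(f"{name}: {mb} Mb, stated {stated}%, recomputed {recomputed[name]:.2f}%")
```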

References

The emerging biofuel crop Camelina sativa retains a highly undifferentiated hexaploid genome structure.
Kagale S, Koh C, Nixon J, Bollina V, Clarke WE, Tuteja R, Spillane C, Robinson SJ, Links MG, Clarke C, Higgins EE, Huebert T, Sharpe AG, Parkin IA.

Image credit: Fornax CC BY-SA 3.0

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly	Cs, INSDC Assembly GCA_000633955.1
Database version	115.1
Golden Path Length	641,356,059
Genebuild by	Camelina sativa Genome Project
Genebuild method	External annotation import
Data source	Agriculture & AgriFood Canada

Gene counts

Coding genes	89,402
Gene transcripts	94,479
