Chara braunii Assembly and Gene Annotation
About Chara braunii
Chara braunii belongs to class Charophyceae, a class of charophyte green algae, commonly known as stoneworts and brittleworts. Charophyceae have the most complex body plans among charophytic algae. The haploid thallus body plan encompasses a shoot-like axis consisting of nodes with whorls, internodes, a simplex apical meristem, and multicellular rhizoids. Their genomes have been compared to those of land plants to identify evolutionary novelties for plant terrestrialization.
Assembly
C. braunii has a haplontic life cycle, so this draft sequence represents a haploid genome. The observed chromosome number (n=14) corresponds to the base chromosome number of Chara species. Genomic DNA of the uni-algal strain S276, isolated from Lake Kasumigaura (Ibaraki, Japan), was sequenced as the reference genome using Illumina technology, and the sequences were compared with those of strain S277, isolated from a pond in Ehime (Japan). Approximately 0.25 Gbp of scaffolds were present in only one of the two datasets and were found to be of bacterial origin. After removal of these prokaryotic sequences, 1.75 Gbp of scaffold data (N50 size of 2.26 Mbp) were obtained, of which 1.43 Gbp were assembled into contigs. The contig total corresponds to ∼74% of the genome size measured by flow cytometry (1.89-1.96 Gbp) and to ∼61% of the 2.35 Gbp estimated by k-mer analysis.
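The two percentages above can be reproduced from the quoted sizes. The sketch below is only an illustration: it assumes the ∼74% figure is taken against the midpoint of the flow cytometry range, which the text does not state, and the variable and function names are made up for this example.

```python
# Sizes quoted in the Assembly section, all in Gbp.
CONTIG_GBP = 1.43                  # sequence assembled into contigs
FLOW_CYTOMETRY_GBP = (1.89, 1.96)  # genome size range from flow cytometry
KMER_GBP = 2.35                    # genome size estimated by k-mer analysis


def percent_of(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Assumption: use the midpoint of the flow cytometry range as the reference.
flow_midpoint = sum(FLOW_CYTOMETRY_GBP) / 2

vs_flow = percent_of(CONTIG_GBP, flow_midpoint)
vs_kmer = percent_of(CONTIG_GBP, KMER_GBP)

print(f"Contigs vs flow cytometry estimate: {vs_flow:.0f}%")
print(f"Contigs vs k-mer estimate: {vs_kmer:.0f}%")
```

Taking either end of the flow cytometry range instead of the midpoint shifts the first figure by one to two percentage points, so the rounded value depends on that choice.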
Annotation
RNA sequencing (RNA-seq) of vegetative and reproductive stages was used together with full-length cDNA sequences to annotate the genome with AUGUSTUS. In total, 23,546 putative protein-coding gene models were identified, of which 53% are supported by RNA-seq data. At least 94% of the genes in several conserved core gene sets are encoded by the genome, indicating its suitability for genomic and comparative analyses.
Repeats were annotated with the Ensembl Genomes repeat feature pipeline:
- Low complexity (Dust): 3,043,731 features, covering 436 Mb (24.9% of the genome)
- RepeatMasker (REdat library): 817,800 features, covering 126 Mb (7.2% of the genome)
- RepeatMasker (RepBase library): 207,059 features, covering 26 Mb (1.5% of the genome)
- Tandem repeats (TRF): 1,784,326 features, covering 160 Mb (9.1% of the genome)
- Repeat Detector: repeats totalling 788 Mb (45.05% of the genome)
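The coverage percentages quoted for each repeat class can be checked against the assembly length. The sketch below assumes the denominator is the Golden Path Length given in the statistics on this page (1,751,211,849 bp) and that 1 Mb means 10^6 bp; neither is stated explicitly in the text, and the names used are illustrative only.

```python
# Assumption: percentages are relative to the Golden Path Length (in bp).
GOLDEN_PATH_BP = 1_751_211_849

# Covered lengths in Mb as quoted above (rounded values, so results are approximate).
REPEAT_MB = {
    "Low complexity (Dust)": 436,
    "RepeatMasker (REdat)": 126,
    "RepeatMasker (RepBase)": 26,
    "Tandem repeats (TRF)": 160,
    "Repeat Detector": 788,
}


def coverage_percent(covered_mb: float, genome_bp: int = GOLDEN_PATH_BP) -> float:
    """Percentage of the genome covered, assuming 1 Mb = 1,000,000 bp."""
    return 100.0 * covered_mb * 1_000_000 / genome_bp


coverage = {name: coverage_percent(mb) for name, mb in REPEAT_MB.items()}

for name, pct in coverage.items():
    print(f"{name}: {pct:.1f}% of the genome")
```

Because the covered lengths are rounded to whole megabases, the recomputed values agree with the quoted percentages only to within about 0.1 percentage points.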
References
- The Chara Genome: Secondary Complexity and Implications for Plant Terrestrialization.
Nishiyama T, Sakayama H, de Vries J, Buschmann H, Saint-Marcoux D et al. 2018. Cell. 174(2):448-464.
Picture credit: Show_ryu CC BY-SA 3.0
Statistics
Summary
Assembly | Cbr_1.0, INSDC Assembly GCA_003427395.1, Jul 2018 |
Database version | 113.1 |
Golden Path Length | 1,751,211,849 bp |
Genebuild by | KU |
Genebuild method | Import |
Data source | Kanazawa University |
Gene counts
Coding genes | 34,718 |
Pseudogenes | 2 |
Gene transcripts | 35,666 |