A quantitative reference transcriptome for Nematostella vectensis early embryonic development : a pipeline for de novo assembly in emerging model systems

dc.contributor.author Tulin, Sarah
dc.contributor.author Aguiar, Derek
dc.contributor.author Istrail, Sorin
dc.contributor.author Smith, Joel
dc.date.accessioned 2013-07-29T15:22:09Z
dc.date.available 2013-07-29T15:22:09Z
dc.date.issued 2013-06-03
dc.description © The Author(s), 2013. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in EvoDevo 4 (2013): 16, doi:10.1186/2041-9139-4-16. en_US
dc.description.abstract The de novo assembly of transcriptomes from short shotgun sequences raises challenges due to random and non-random sequencing biases and inherent transcript complexity. We sought to define a pipeline for de novo transcriptome assembly to aid researchers working with emerging model systems where well-annotated genome assemblies are not available as a reference. To detail this experimental and computational method, we used early embryos of the sea anemone, Nematostella vectensis, an emerging model system for studies of animal body plan evolution. We performed RNA-seq on embryos up to 24 h of development using Illumina HiSeq technology and evaluated independent de novo assembly methods. The resulting reads were assembled using either the Trinity assembler on all quality-controlled reads or both the Velvet and Oases assemblers on reads passing a stringent digital normalization filter. A control set of mRNA standards from the National Institute of Standards and Technology (NIST) was included in our experimental pipeline to invest our transcriptome with quantitative information on absolute transcript levels and to provide additional quality control. We generated >200 million paired-end reads from directional cDNA libraries representing well over 20 Gb of sequence. The Trinity assembler pipeline, including preliminary quality control steps, resulted in more than 86% of reads aligning with the reference transcriptome thus generated. Nevertheless, digital normalization combined with assembly by Velvet and Oases required far less computing power and decreased processing time while still mapping 82% of reads. We have made the raw sequencing reads and assembled transcriptome publicly available.
Nematostella vectensis was chosen for its strategic position in the tree of life for studies into the origins of the animal body plan; however, the challenge of reference-free transcriptome assembly is relevant to all systems for which well-annotated gene models and an independently verified genome assembly may not be available. To navigate this new territory, we have constructed a pipeline for library preparation and computational analysis for de novo transcriptome assembly. The gene models in this reference transcriptome define the set of genes transcribed in early Nematostella development and will provide a valuable dataset for further gene regulatory network investigations. en_US
dc.format.mimetype text/plain
dc.format.mimetype application/pdf
dc.format.mimetype application/vnd.ms-excel
dc.format.mimetype application/postscript
dc.identifier.citation EvoDevo 4 (2013): 16 en_US
dc.identifier.doi 10.1186/2041-9139-4-16
dc.identifier.uri https://hdl.handle.net/1912/6121
dc.language.iso en_US en_US
dc.publisher BioMed Central en_US
dc.relation.haspart https://hdl.handle.net/1912/5613
dc.relation.uri https://doi.org/10.1186/2041-9139-4-16
dc.rights Attribution 2.0 Generic *
dc.rights.uri http://creativecommons.org/licenses/by/2.0 *
dc.subject Transcriptome en_US
dc.subject Gene regulatory network en_US
dc.subject Nematostella embryonic development en_US
dc.subject Body plan evolution en_US
dc.subject Next-generation sequencing en_US
dc.subject Illumina HiSeq en_US
dc.subject Trinity en_US
dc.subject Oases en_US
dc.subject RNA-seq en_US
dc.title A quantitative reference transcriptome for Nematostella vectensis early embryonic development : a pipeline for de novo assembly in emerging model systems en_US
dc.type Article en_US
dspace.entity.type Publication
relation.isAuthorOfPublication eab81a74-4c1f-4242-8d5c-c09ee0f19544
relation.isAuthorOfPublication 4f9be058-b0e2-4941-b8e9-7b565e1faf28
relation.isAuthorOfPublication f5f5348e-aea3-45ce-a550-11be015e4d10
relation.isAuthorOfPublication 137d1176-d835-47a1-8997-79cff8560f49
relation.isAuthorOfPublication.latestForDiscovery eab81a74-4c1f-4242-8d5c-c09ee0f19544
Files
Original bundle

Name: 2041-9139-4-16.pdf
Size: 2.34 MB
Format: Adobe Portable Document Format
Description: Article

Name: 2041-9139-4-16-s1.txt
Size: 1.18 KB
Format: Plain Text
Description: Additional file 1: diginorm_velvet_oases_commands.txt

Name: 2041-9139-4-16-s2.txt
Size: 1.28 KB
Format: Plain Text
Description: Additional file 2: qc_trinity_commands.txt

Name: 2041-9139-4-16-s3.ps
Size: 3.34 MB
Format: Postscript Files
Description: Additional file 3: Ordinary least square regression plots

Name: 2041-9139-4-16-s4.xlsx
Size: 32.84 KB
Format: Microsoft Excel
Description: Additional file 4: Gene ontology (GO) term definitions