Identification and removal of contaminant sequences from ribosomal gene databases : lessons from the Census of Deep Life
Sheik, Cody S.
Kiel Reese, Brandi
Twing, Katrina I.
Sylvan, Jason B.
Grim, Sharon L.
Schrenk, Matthew O.
Sogin, Mitchell L.
Colwell, Frederick S.
Earth’s subsurface environment is one of the largest, yet least studied, biomes on Earth, and many questions remain regarding what microorganisms are indigenous to the subsurface. Through the activity of the Census of Deep Life (CoDL) and the Deep Carbon Observatory, an open access 16S ribosomal RNA gene sequence database from diverse subsurface environments has been compiled. However, due to low quantities of biomass in the deep subsurface, the potential for incorporation of contaminants from reagents used during sample collection, processing, and/or sequencing is high. Thus, to understand the ecology of subsurface microorganisms (i.e., the distribution, richness, or survival), it is necessary to minimize, identify, and remove contaminant sequences that will skew the relative abundances of all taxa in the sample. In this meta-analysis, we identify putative contaminants associated with the CoDL dataset, recommend best practices for removing contaminants from samples, and propose a series of best practices for subsurface microbiology sampling. The most abundant putative contaminant genera observed, independent of evenness across samples, were Propionibacterium, Aquabacterium, Ralstonia, and Acinetobacter, while the top five most frequently observed genera were Pseudomonas, Propionibacterium, Acinetobacter, Ralstonia, and Sphingomonas. The majority of the most frequently observed genera (high evenness) were associated with reagent or potential human contamination. Additionally, in DNA extraction blanks, we observed potential archaeal contaminants, including methanogens, which have not been discussed in previous contamination studies. Such contaminants would directly affect the interpretation of subsurface molecular studies, as methanogenesis is an important subsurface biogeochemical process.
Utilizing previously identified contaminant genera, we found that ∼27% of the total dataset consisted of contaminant sequences that likely originate from DNA extraction and DNA cleanup methods. Thus, controls must be included at every step of the collection and processing procedure when working with low biomass environments such as, but not limited to, portions of Earth’s deep subsurface. Taken together, we stress that the CoDL dataset is an incredible resource for the broader research community interested in subsurface life, and steps to remove contamination-derived sequences must be taken prior to using this dataset.
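The genus-based screening described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the input layout (a mapping from genus to read count), and the example counts are assumptions made for illustration only. The contaminant list contains just the genera named in the abstract, and a real analysis would rely on a fuller curated list and on sequenced blanks.

```python
# Genera named in the abstract as putative contaminants (illustrative subset,
# not a complete or authoritative contaminant list).
PUTATIVE_CONTAMINANT_GENERA = {
    "Propionibacterium",
    "Aquabacterium",
    "Ralstonia",
    "Acinetobacter",
    "Pseudomonas",
    "Sphingomonas",
}


def remove_contaminants(genus_counts, contaminants=PUTATIVE_CONTAMINANT_GENERA):
    """Split per-genus read counts into retained and removed genera.

    Returns (retained, removed, fraction_removed), where fraction_removed is
    the share of all reads assigned to genera on the contaminant list.
    """
    retained = {g: n for g, n in genus_counts.items() if g not in contaminants}
    removed = {g: n for g, n in genus_counts.items() if g in contaminants}
    total = sum(genus_counts.values())
    fraction_removed = sum(removed.values()) / total if total else 0.0
    return retained, removed, fraction_removed


def relative_abundance(counts):
    """Convert read counts to relative abundances that sum to 1."""
    total = sum(counts.values())
    return {g: n / total for g, n in counts.items()} if total else {}


# Hypothetical sample: these counts are invented for the example, not CoDL data.
sample = {
    "Ralstonia": 15,
    "Pseudomonas": 10,
    "Methanobacterium": 35,
    "Desulfovibrio": 40,
}

retained, removed, fraction_removed = remove_contaminants(sample)
print(f"fraction of reads removed: {fraction_removed:.2f}")
print(relative_abundance(retained))
```

Recomputing relative abundances after filtering matters because removing contaminant reads changes the denominator for every remaining taxon. Note that a name-based filter would also discard any indigenous members of the listed genera, so it is best treated as a first pass alongside extraction and reagent blanks.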
© The Author(s), 2018. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Frontiers in Microbiology 9 (2018): 840, doi:10.3389/fmicb.2018.00840.
Suggested Citation
Article: Sheik, Cody S., Kiel Reese, Brandi, Twing, Katrina I., Sylvan, Jason B., Grim, Sharon L., Schrenk, Matthew O., Sogin, Mitchell L., Colwell, Frederick S., "Identification and removal of contaminant sequences from ribosomal gene databases : lessons from the Census of Deep Life", Frontiers in Microbiology 9 (2018): 840, DOI:10.3389/fmicb.2018.00840, https://hdl.handle.net/1912/10327