TIAMMAt: leveraging biodiversity to revise protein domain models, evidence from innate immunity
Date
2021-08-30
Authors
Tassia, Michael G.
David, Kyle T.
Townsend, James P.
Halanych, Kenneth M.
DOI
10.1093/molbev/msab258
Keywords
Protein evolution
Domain annotation
Animal evolution
Innate immunity
Abstract
Sequence annotation is fundamental for studying the evolution of protein families, particularly when working with nonmodel species. Given the rapidly increasing number of species receiving high-quality genome sequencing, accurate domain modeling that is representative of species diversity is crucial for understanding the sequence evolution of protein families and their inferred function(s). Here, we describe a bioinformatic tool called Taxon-Informed Adjustment of Markov Model Attributes (TIAMMAt), which revises domain profile hidden Markov models (HMMs) by incorporating homologous domain sequences from underrepresented and nonmodel species. Using innate immunity pathways as a case study, we show that revising profile HMM parameters to directly account for variation in homologs among underrepresented species provides valuable insight into the evolution of protein families. Following adjustment by TIAMMAt, domain profile HMMs exhibit changes in their per-site amino acid state emission probabilities and insertion/deletion probabilities while maintaining the overall structure of the consensus sequence. Our results show that domain revision can heavily impact evolutionary interpretations for some families (i.e., NLR’s NACHT domain), whereas impact on other domains (e.g., rel homology domain and interferon regulatory factor domains) is minimal due to high levels of sequence conservation across the sampled phylogenetic depth (i.e., Metazoa). Importantly, TIAMMAt revises target domain models to reflect homologous sequence variation using the taxonomic distribution under consideration by the user. TIAMMAt’s flexibility to revise any subset of the Pfam database using a user-defined taxonomic pool will make it a valuable tool for future protein evolution studies, particularly when incorporating (or focusing on) nonmodel species.
Description
© The Author(s), 2021. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Tassia, M. G., David, K. T., Townsend, J. P., & Halanych, K. M. TIAMMAt: leveraging biodiversity to revise protein domain models, evidence from innate immunity. Molecular Biology and Evolution, 38(12), (2021): 5806–5818, https://doi.org/10.1093/molbev/msab258.
Citation
Tassia, M. G., David, K. T., Townsend, J. P., & Halanych, K. M. (2021). TIAMMAt: leveraging biodiversity to revise protein domain models, evidence from innate immunity. Molecular Biology and Evolution, 38(12), 5806–5818.