Mol. Biol. Evol. 37(7):2034–2044; doi:10.1093/molbev/msaa065
Molecular Biology and Evolution provides yearly recognition of recently published manuscripts that have made strong impressions on our research community since their publication. Below, we highlight ten discoveries, five methods, and five resources as “Emerging Classics” based on citations accrued per fractional year since print publication. Articles are listed alphabetically by the first author’s family name. Total citation counts were obtained from Web of Science on November 6, 2020. We congratulate these authors on the significance of their contributions and look forward to seeing new classics emerge in the years to come.
Abstract: Presumably due to a rapid early diversification, major parts of the higher-level phylogeny of birds are still resolved controversially in different analyses or are considered unresolvable. To address this problem, we produced an avian tree of life, which includes molecular sequences of one or several species of ∼90% of the currently recognized family-level taxa (429 species, 379 genera) including all 106 family-level taxa of the nonpasserines and 115 of the passerines (Passeriformes). The unconstrained analyses of noncoding 3-prime untranslated region (3′-UTR) sequences and those of coding sequences yielded different trees. In contrast to the coding sequences, the 3′-UTR sequences resulted in a well-resolved and stable tree topology. The 3′-UTR contained, unexpectedly, transcription factor binding motifs that were specific for different higher-level taxa. In this tree, grebes and flamingos are the sister clade of all other Neoaves, which are subdivided into five major clades. All nonpasserine taxa were placed with robust statistical support including the long-time enigmatic hoatzin (Opisthocomiformes), which was found to be the sister taxon of the Caprimulgiformes. The comparatively late radiation of family-level clades of the songbirds (oscine Passeriformes) contrasts with the attenuated diversification of nonpasseriform taxa since the early Miocene. This correlates with the evolution of vocal production learning, an important speciation factor, which is ancestral for songbirds and evolved convergently only in hummingbirds and parrots. As 3′-UTR-based phylotranscriptomics resolved the avian family-level tree of life, we suggest that this procedure will also resolve the all-species avian tree of life.
Abstract: Reconstructing the evolutionary history of island biotas is complicated by unusual morphological evolution in insular environments. However, past human-caused extinctions limit the use of molecular analyses to determine origins and affinities of enigmatic island taxa. The Caribbean formerly contained a morphologically diverse assemblage of caviomorph rodents (33 species in 19 genera), ranging from ∼0.1 to 200 kg and traditionally classified into three higher-order taxa (Capromyidae/Capromyinae, Heteropsomyinae, and Heptaxodontidae). Few species survive today, and the evolutionary affinities of living and extinct Caribbean caviomorphs to each other and to mainland taxa are unclear: Are they monophyletic, polyphyletic, or paraphyletic? We use ancient DNA techniques to present the first genetic data for extinct heteropsomyines and heptaxodontids, as well as for several extinct capromyids, and demonstrate through analysis of mitogenomic and nuclear data sets that all sampled Caribbean caviomorphs represent a well-supported monophyletic group. The remarkable morphological and ecological variation observed across living and extinct caviomorphs from Cuba, Hispaniola, Jamaica, Puerto Rico, and other islands was generated through within-archipelago evolutionary radiation following a single Early Miocene overwater colonization. This evolutionary pattern contrasts with the origination of diversity in many other Caribbean groups. All living and extinct Caribbean caviomorphs comprise a single biologically remarkable subfamily (Capromyinae) within the morphologically conservative living Neotropical family Echimyidae. Caribbean caviomorphs represent an important new example of insular mammalian adaptive radiation, where taxa retaining “ancestral-type” characteristics coexisted alongside taxa occupying novel island niches. Diversification was associated with the greatest insular body mass increase recorded in rodents and possibly the greatest for any mammal lineage.
Abstract: Substantial progress has been made globally to control malaria; however, there is a growing need for innovative new tools to ensure continued progress. One approach is to harness genetic sequencing and accompanying methodological approaches as have been used in the control of other infectious diseases. However, to utilize these methodologies for malaria, we first need to extend the methods to capture the complex interactions between parasites, human and vector hosts, and environment, which all impact the level of genetic diversity and relatedness of malaria parasites. We develop an individual-based transmission model to simulate malaria parasite genetics parameterized using estimated relationships between complexity of infection and age from five regions in Uganda and Kenya. We predict that cotransmission and superinfection contribute equally to within-host parasite genetic diversity at 11.5% PCR prevalence, above which superinfections dominate. Finally, we characterize the predictive power of six metrics of parasite genetics for detecting changes in transmission intensity, before grouping them in an ensemble statistical model. The model predicted malaria prevalence with a mean absolute error of 0.055. Different assumptions about the availability of sample metadata were considered, with the most accurate predictions of malaria prevalence made when the clinical status and age of sampled individuals are known. Parasite genetics may provide a novel surveillance tool for estimating the prevalence of malaria in areas in which prevalence surveys are not feasible. However, the findings presented here reinforce the need for patient metadata to be recorded and made available within all future attempts to use parasite genetics for surveillance.
Abstract: The genus Acropora comprises the most diverse and abundant scleractinian corals (Anthozoa, Cnidaria) in coral reefs, the most diverse marine ecosystems on Earth. However, the genetic basis for the success and wide distribution of Acropora is unknown. Here, we sequenced complete genomes of 15 Acropora species and 3 other acroporid taxa belonging to the genera Montipora and Astreopora to examine genomic novelties that explain their evolutionary success. We successfully obtained reasonable draft genomes of all 18 species. Molecular dating indicates that the Acropora ancestor survived warm periods without sea ice from the mid or late Cretaceous to the Early Eocene and that diversification of Acropora may have been enhanced by subsequent cooling periods. In general, the scleractinian gene repertoire is highly conserved; however, coral- or cnidarian-specific possible stress response genes are tandemly duplicated in Acropora. Enzymes that cleave dimethylsulfoniopropionate into dimethyl sulfide, which promotes cloud formation and combats greenhouse gases, are the most duplicated genes in the Acropora ancestor. These may have been acquired by horizontal gene transfer from algal symbionts belonging to the family Symbiodiniaceae, or from coccolithophores, suggesting that although functions of this enzyme in Acropora are unclear, Acropora may have survived warmer marine environments in the past by enhancing cloud formation. In addition, possible antimicrobial peptides and symbiosis-related genes are under positive selection in Acropora, perhaps enabling adaptation to diverse environments. Our results suggest unique Acropora adaptations to ancient, warm marine environments and provide insights into its capacity to adjust to rising seawater temperatures.
Abstract: Correspondence between evolution and development has been discussed for more than two centuries. Recent work reveals that phylogeny–ontogeny correlations are indeed present in developmental transcriptomes of eukaryotic clades with complex multicellularity. Nevertheless, it has been largely ignored that the pervasive presence of phylogeny–ontogeny correlations is a hallmark of development in eukaryotes. This perspective opens a possibility to look for similar parallelisms in biological settings where developmental logic and multicellular complexity are more obscure. For instance, it has been increasingly recognized that multicellular behavior underlies biofilm formation in bacteria. However, it remains unclear whether bacterial biofilm growth shares some basic principles with development in complex eukaryotes. Here we show that the ontogeny of growing Bacillus subtilis biofilms recapitulates phylogeny at the expression level. Using time-resolved transcriptome and proteome profiles, we found that biofilm ontogeny correlates with the evolutionary measures, in a way that evolutionarily younger and more diverged genes were increasingly expressed toward later timepoints of biofilm growth. Molecular and morphological signatures also revealed that biofilm growth is highly regulated and organized into discrete ontogenetic stages, analogous to those of eukaryotic embryos. Together, this suggests that biofilm formation in Bacillus is a bona fide developmental process comparable to organismal development in animals, plants, and fungi. Given that most cells on Earth reside in the form of biofilms and that biofilms represent the oldest known fossils, we anticipate that the widely adopted vision of the first life as a single-cell and free-living organism needs rethinking.
Abstract: Population genetic theory and empirical evidence indicate that deleterious alleles can be purged in small populations. However, this viewpoint remains controversial. It is unclear whether natural selection is powerful enough to purge deleterious mutations when wild populations continue to decline. Pheasants are terrestrial birds facing a long-term risk of extinction as a result of anthropogenic perturbations and exploitation. Nevertheless, there are scant genomic resources available for conservation management and planning. Here, we analyzed comparative population genomic data for the three extant isolated populations of the Brown eared pheasant (Crossoptilon mantchuricum) in China. We showed that C. mantchuricum has low genome-wide diversity and a contracting effective population size because of persistent declines over the past 100,000 years. We compared genome-wide variation in C. mantchuricum with that of its closely related sister species, the Blue eared pheasant (C. auritum), for which the conservation concern is low. There were detrimental genetic consequences across all C. mantchuricum genomes including extended runs of homozygous sequences, slow rates of linkage disequilibrium decay, excessive loss-of-function mutations, and loss of adaptive genetic diversity at the major histocompatibility complex region. To the best of our knowledge, this study is the first to perform a comprehensive conservation genomic analysis on this threatened pheasant species. Moreover, we demonstrated that natural selection may not suffice to purge deleterious mutations in wild populations undergoing long-term decline. The findings of this study could facilitate conservation planning for threatened species and help recover their population size.
Abstract: It has been suggested that, due to the structure of the genetic code, nonsynonymous transitions are less likely than transversions to cause radical changes in amino acid physicochemical properties and so are on average less deleterious. This view was supported by some but not all mutagenesis experiments. Because laboratory measures of fitness effects have limited sensitivities and relative frequencies of different mutations in mutagenesis studies may not match those in nature, we here revisit this issue using comparative genomics. We extend the standard codon model of sequence evolution by adding the parameter η that quantifies the ratio of the fixation probability of transitional nonsynonymous mutations to that of transversional nonsynonymous mutations. We then estimate η from the concatenated alignment of all protein-coding DNA sequences of two closely related genomes. Surprisingly, η ranges from 0.13 to 2.0 across 90 species pairs sampled from the tree of life, with 51 incidences of η < 1 and 30 incidences of η > 1 that are statistically significant. Hence, whether nonsynonymous transversions are overall more deleterious than nonsynonymous transitions is species-dependent. Because the corresponding groups of amino acid replacements differ between nonsynonymous transitions and transversions, η is influenced by the relative exchangeabilities of amino acid pairs. Indeed, an extensive search reveals that the large variation in η is primarily explainable by the recently reported among-species disparity in amino acid exchangeabilities. These findings demonstrate that genome-wide nucleotide substitution patterns in coding sequences have species-specific features and are more variable among evolutionary lineages than are currently thought.
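The distinction that η is built on can be made concrete with a small sketch. The Python below is an illustration only, not the authors' codon model: it assumes the standard genetic code (NCBI translation table 1) and simply labels a single-nucleotide codon change as synonymous or nonsynonymous and as a transition or transversion. The function names `is_transition` and `classify_change` are hypothetical, and nothing here estimates η.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (NCBI translation table 1).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}
PURINES = {"A", "G"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return a != b and ((a in PURINES) == (b in PURINES))


def classify_change(codon_from, codon_to):
    """Classify a single-nucleotide codon change.

    Returns a pair such as ("nonsynonymous", "transition"), or None when the
    codons are identical, differ at more than one site, or involve a stop codon.
    """
    diffs = [(x, y) for x, y in zip(codon_from, codon_to) if x != y]
    if len(diffs) != 1:
        return None
    aa_from, aa_to = CODON_TABLE[codon_from], CODON_TABLE[codon_to]
    if "*" in (aa_from, aa_to):
        return None
    effect = "synonymous" if aa_from == aa_to else "nonsynonymous"
    kind = "transition" if is_transition(*diffs[0]) else "transversion"
    return effect, kind
```

For example, `classify_change("GAT", "AAT")` labels the Asp-to-Asn change as a nonsynonymous transition, whereas `classify_change("GAT", "GAA")` labels the Asp-to-Glu change as a nonsynonymous transversion. In the abstract's terms, η compares the fixation probabilities of these two classes.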
Abstract: Cytoplasmic incompatibility is a selfish reproductive manipulation induced by the endosymbiont Wolbachia in arthropods. In males, Wolbachia modifies sperm, leading to embryonic mortality in crosses with Wolbachia-free females. In females, Wolbachia rescues the cross and allows development to proceed normally. This provides a reproductive advantage to infected females, allowing the maternally transmitted symbiont to spread rapidly through host populations. We identified homologs of the genes underlying this phenotype, cifA and cifB, in 52 of 71 new and published Wolbachia genome sequences. They are strongly associated with cytoplasmic incompatibility. There are up to seven copies of the genes in each genome, and phylogenetic analysis shows that Wolbachia frequently acquires new copies due to pervasive horizontal transfer between strains. In many cases, the genes have subsequently acquired loss-of-function mutations to become pseudogenes. As predicted by theory, this tends to occur first in cifB, whose sole function is to modify sperm, and then in cifA, which is required to rescue the cross in females. Although cif genes recombine, recombination is largely restricted to closely related homologs. This is predicted under a model of coevolution between sperm modification and embryonic rescue, where recombination between distantly related pairs of genes would create a self-incompatible strain. Together, these patterns of gene gain, loss, and recombination support evolutionary models of cytoplasmic incompatibility.
Abstract: In correctly predicting that selection efficiency is positively correlated with the effective population size (Ne), the nearly neutral theory provides a coherent understanding of between-species variation in numerous genomic parameters, including heritable error (germline mutation) rates. Does the same theory also explain variation in phenotypic error rates and in abundance of error mitigation mechanisms? Translational read-through provides a model to investigate both issues as it is common, mostly nonadaptive, and has a good proxy for rate (TAA being the least leaky stop codon) and potential error mitigation via “fail-safe” 3′ additional stop codons (ASCs). Prior theory of translational read-through has suggested that when population sizes are high, weak selection for local mitigation can be effective, thus predicting a positive correlation between ASC enrichment and Ne. Contrary to prediction, we find that ASC enrichment is not correlated with Ne. ASC enrichment, although highly phylogenetically patchy, is, however, more common both in unicellular species and in genes expressed in unicellular modes in multicellular species. By contrast, Ne does positively correlate with TAA enrichment. These results imply that local phenotypic error rates, not local mitigation rates, are consistent with a drift barrier/nearly neutral model.
Abstract: Amino acid substitutions at nonconserved protein positions can have noncanonical and “long-distance” outcomes on protein function. Such outcomes might arise from changes in the internal protein communication network, which is often accompanied by changes in structural flexibility. To test this, we calculated flexibilities and dynamic coupling for positions in the linker region of the lactose repressor protein. This region contains nonconserved positions for which substitutions alter DNA-binding affinity. We first chose to study 11 substitutions at position 52. In computations, substitutions showed long-range effects on flexibilities of DNA-binding positions, and the degree of flexibility change correlated with experimentally measured changes in DNA binding. Substitutions also altered dynamic coupling to DNA-binding positions in a manner that captured other experimentally determined functional changes. Next, we broadened calculations to consider the dynamic coupling between 17 linker positions and the DNA-binding domain. Experimentally, these linker positions exhibited a wide range of substitution outcomes: Four conserved positions tolerated hardly any substitutions (“toggle”), ten nonconserved positions showed progressive changes from a range of substitutions (“rheostat”), and three nonconserved positions tolerated almost all substitutions (“neutral”). In computations with wild-type lactose repressor protein, the dynamic couplings between the DNA-binding domain and these linker positions showed varied degrees of asymmetry that correlated with the observed toggle/rheostat/neutral substitution outcomes. Thus, we propose that long-range and noncanonical substitution outcomes at nonconserved positions arise from rewiring long-range communication among functionally important positions. Such calculations might enable predictions for substitution outcomes at a range of nonconserved positions.
Abstract: Divergence of gene function and expression during development can give rise to phenotypic differences at the level of cells, tissues, organs, and ultimately whole organisms. To gain insights into the evolution of gene expression and novel genes at spatial resolution, we compared the spatially resolved transcriptomes of two distantly related nematodes, Caenorhabditis elegans and Pristionchus pacificus, that diverged 60–90 Ma. The spatial transcriptomes of adult worms show little evidence for strong conservation at the level of single genes. Instead, regional expression is largely driven by recent duplication and emergence of novel genes. Estimation of gene ages across anatomical structures revealed an enrichment of novel genes in sperm-related regions. This provides the first evidence in nematodes for the “out of testis” hypothesis that has been previously postulated based on studies in Drosophila and mammals. “Out of testis” genes represent a mix of products of pervasive transcription as well as fast-evolving members of ancient gene families. Strikingly, numerous novel genes have known functions during meiosis in Caenorhabditis elegans, indicating that even universal processes such as meiosis may be targets of rapid evolution. Our study highlights the importance of novel genes in generating phenotypic diversity and explicitly characterizes gene origination in sperm-related regions. Furthermore, it proposes new functions for previously uncharacterized genes and establishes the spatial transcriptome of Pristionchus pacificus as a catalog for future studies on the evolution of gene expression and function.
Abstract: Telomerase RNA (TR) is a noncoding RNA essential for the function of telomerase ribonucleoprotein. TRs from vertebrates, fungi, ciliates, and plants exhibit extreme diversity in size, sequence, secondary structure, and biogenesis pathway. However, the evolutionary pathways leading to such unusual diversity among eukaryotic kingdoms remain elusive. Within the metazoan kingdom, the study of TR has been limited to vertebrates and echinoderms. To understand the origin and evolution of TR across the animal kingdom, we employed a phylogeny-guided, structure-based bioinformatics approach to identify 82 novel TRs from eight previously unexplored metazoan phyla, including the basal-branching sponges. Synthetic TRs from two representative species, a hemichordate and a mollusk, reconstitute active telomerase in vitro with their corresponding telomerase reverse transcriptase components, confirming that they are authentic TRs. Comparative analysis shows that three functional domains, template-pseudoknot (T-PK), CR4/5, and box H/ACA, are conserved between vertebrates and the basal metazoan lineages, indicating a monophyletic origin of the animal TRs with a snoRNA-related biogenesis mechanism. Nonetheless, TRs along separate animal lineages evolved with divergent structural elements in the T-PK and CR4/5 domains. For example, TRs from echinoderms and protostomes lack the canonical CR4/5 and have independently evolved functionally equivalent domains with different secondary structures. In the T-PK domain, a P1.1 stem common in most metazoan clades defines the template boundary, which is replaced by a P1-defined boundary in vertebrates. This study provides unprecedented insight into the divergent evolution of detailed TR secondary structures across broad metazoan lineages, revealing ancestral and later-diversified elements.
Abstract: The recent technological advances underlying the screening of large combinatorial libraries in high-throughput mutational scans deepen our understanding of adaptive protein evolution and boost its applications in protein design. Nevertheless, the large number of possible genotypes requires suitable computational methods for data analysis, the prediction of mutational effects, and the generation of optimized sequences. We describe a computational method that, trained on sequencing samples from multiple rounds of a screening experiment, provides a model of the genotype–fitness relationship. We tested the method on five large-scale mutational scans, yielding accurate predictions of the mutational effects on fitness. The inferred fitness landscape is robust to experimental and sampling noise and exhibits high generalization power in terms of broader sequence space exploration and higher fitness variant predictions. We investigate the role of epistasis and show that the inferred model provides structural information about the 3D contacts in the molecular fold.
Abstract: The evolutionary transition from outcrossing to selfing can have important genomic consequences. Decreased effective population size and the reduced efficacy of selection are predicted to play an important role in the molecular evolution of the genomes of selfing species. We investigated evidence for molecular signatures of the genomic selfing syndrome using 66 species of Primula including distylous (outcrossing) and derived homostylous (selfing) taxa. We complemented our comparative analysis with a microevolutionary study of P. chungensis, which is polymorphic for mating system and consists of both distylous and homostylous populations. We generated chloroplast and nuclear genomic data sets for distylous, homostylous, and distylous–homostylous species and identified patterns of nonsynonymous to synonymous divergence (dN/dS) and polymorphism (πN/πS) in species or lineages with contrasting mating systems. Our analysis of coding sequence divergence and polymorphism detected strongly reduced genetic diversity and heterozygosity, decreased efficacy of purifying selection, purging of large-effect deleterious mutations, and lower rates of adaptive evolution in samples from homostylous compared with distylous populations, consistent with theoretical expectations of the genomic selfing syndrome. Our results demonstrate that self-fertilization is a major driver of molecular evolutionary processes with genomic signatures of selfing evident in both old and relatively young homostylous populations.
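The polymorphism statistic π that underlies πN/πS can be sketched in a few lines. The Python below is a simplified illustration under stated assumptions, not the authors' pipeline: it takes gap-free, equal-length aligned sequences and returns the mean number of pairwise differences per site. The function name `nucleotide_diversity` is hypothetical. A πN/πS ratio would then divide π computed over nonsynonymous sites by π computed over synonymous sites, which requires a site classification that is not shown here.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site among aligned sequences.

    Assumes gap-free sequences of equal length; every mismatch counts as one
    difference.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total_diffs / (len(pairs) * length)
```

With the toy alignment `["AAAA", "AAAT", "AATT"]`, the three pairs differ at 1, 2, and 1 sites, so π is 4 differences over 3 pairs and 4 sites, or one third.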
Abstract: Sex chromosomes are classically predicted to stop recombining in the heterogametic sex, thereby enforcing linkage between sex-determining (SD) and sex-antagonistic (SA) genes. With the same rationale, a pre-existing sex asymmetry in recombination is expected to affect the evolution of heterogamety; for example, a low rate of male recombination might favor transitions to XY systems by generating immediate linkage between SD and SA genes. Furthermore, the accumulation of deleterious mutations on nonrecombining Y chromosomes should favor XY-to-XY transitions (which discard the decayed Y), but disfavor XY-to-ZW transitions (which fix the decayed Y as an autosome). Like many anuran amphibians, Hyla tree frogs have been shown to display drastic heterochiasmy (males only recombine at chromosome tips) and are typically XY, which seems to fit the above expectations. Instead, here we demonstrate that two species, H. sarda and H. savignyi, have shared a common ZW system since at least 11 Ma. Surprisingly, the typical pattern of restricted male recombination has been maintained since then, despite female heterogamety. Hence, sex chromosomes recombine freely in ZW females, not in ZZ males. This suggests that heterochiasmy does not constrain heterogamety (and vice versa), and that the role of SA genes in the evolution of sex chromosomes might have been overemphasized.
Abstract: The postsynaptic density extends across the postsynaptic dendritic spine with discs large (DLG) as the most abundant scaffolding protein. DLG dynamically alters the structure of the postsynaptic density, thus controlling the function and distribution of specific receptors at the synapse. DLG contains three PDZ domains and one important interaction governing postsynaptic architecture is that between the PDZ3 domain from DLG and a protein called cysteine-rich interactor of PDZ3 (CRIPT). However, little is known regarding functional evolution of the PDZ3:CRIPT interaction. Here, we subjected PDZ3 and CRIPT to ancestral sequence reconstruction, resurrection, and biophysical experiments. We show that the PDZ3:CRIPT interaction is an ancient interaction, which was likely present in the last common ancestor of Eukaryotes, and that high affinity is maintained in most extant animal phyla. However, affinity is low in nematodes and insects, raising questions about the physiological function of the interaction in species from these animal groups. Our findings demonstrate how an apparently established protein–protein interaction involved in cellular scaffolding in bilaterians can suddenly be subject to dynamic evolution including possible loss of function.
Abstract: We studied five chemically distinct but related 1,3,5-triazine antifolates with regard to their effects on growth of a set of mutants in dihydrofolate reductase. The mutants comprise a combinatorially complete data set of all 16 possible combinations of four amino acid replacements associated with resistance to pyrimethamine in the malaria parasite Plasmodium falciparum. Pyrimethamine was a mainstay medication for malaria for many years, and it is still in use in intermittent treatment during pregnancy or as a partner drug in artemisinin combination therapy. Our goal was to investigate the extent to which the alleles yield similar adaptive topographies and patterns of epistasis across chemically related drugs. We find that the adaptive topographies are indeed similar with the same or closely related alleles being fixed in computer simulations of stepwise evolution. For all but one of the drugs the topography features at least one suboptimal fitness peak. Our data are consistent with earlier results indicating that third order and higher epistatic interactions appear to contribute only modestly to the overall adaptive topography, and they are largely conserved. In regard to drug development, our data suggest that higher-order interactions are likely to be of little value as an advisory tool in the choice of lead compounds.
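The notions of a combinatorially complete landscape and a suboptimal fitness peak can be illustrated with a toy example. The Python sketch below uses invented fitness values, not the measured growth data, and a deterministic greedy walk as a simplified stand-in for the stepwise-evolution simulations mentioned in the abstract. The names `neighbors`, `local_peaks`, and `greedy_walk` are hypothetical. Each genotype is a tuple of four 0/1 flags, one per amino acid replacement.

```python
from itertools import product


def neighbors(genotype):
    """All genotypes that differ from `genotype` at exactly one site."""
    return [genotype[:i] + (1 - genotype[i],) + genotype[i + 1:]
            for i in range(len(genotype))]


def local_peaks(fitness):
    """Genotypes whose fitness exceeds that of every one-step neighbor."""
    return [g for g in fitness
            if all(fitness[g] > fitness[n] for n in neighbors(g))]


def greedy_walk(fitness, start):
    """Move to the fittest one-step neighbor until no neighbor is fitter."""
    current = start
    while True:
        best = max(neighbors(current), key=fitness.get)
        if fitness[best] <= fitness[current]:
            return current
        current = best


# Hypothetical landscape over all 2**4 = 16 genotypes: fitness equals the
# number of replacements, except for three invented values that create a
# suboptimal peak at (1, 1, 0, 0).
fitness = {g: float(sum(g)) for g in product((0, 1), repeat=4)}
fitness[(1, 1, 0, 0)] = 2.5
fitness[(1, 1, 1, 0)] = 2.0
fitness[(1, 1, 0, 1)] = 2.0
```

On this invented landscape, `local_peaks(fitness)` returns two peaks, the quadruple mutant and the suboptimal double mutant `(1, 1, 0, 0)`, and a greedy walk from the wild type `(0, 0, 0, 0)` stalls at the suboptimal peak rather than reaching the global optimum.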
Abstract: Phylogenetic dating is one of the most powerful and commonly used methods of drawing epidemiological interpretations from pathogen genomic data. Building such trees requires considering a molecular clock model which represents the rate at which substitutions accumulate on genomes. When the molecular clock rate is constant throughout the tree then the clock is said to be strict, but this is often not an acceptable assumption. Alternatively, relaxed clock models consider variations in the clock rate, often based on a distribution of rates for each branch. However, we show here that the distributions of rates across branches in commonly used relaxed clock models are incompatible with the biological expectation that the sum of the numbers of substitutions on two neighboring branches should be distributed as the substitution number on a single branch of equivalent length. We call this expectation the additivity property. We further show how assumptions of commonly used relaxed clock models can lead to estimates of evolutionary rates and dates with low precision and biased confidence intervals. We therefore propose a new additive relaxed clock model where the additivity property is satisfied. We illustrate the use of our new additive relaxed clock model on a range of simulated and real data sets, and we show that using this new model leads to more accurate estimates of mean evolutionary rates and ancestral dates.
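One way to see why additivity can fail is to compare moments. The Python sketch below is an illustration under explicit assumptions, not the authors' model: each branch draws its rate independently from a distribution with mean `rate_mean` and variance `rate_var` that does not depend on branch length, and substitutions given the rate are Poisson. The function names are hypothetical. Under these assumptions the law of total variance gives the variance of the substitution count directly, with no simulation needed.

```python
def substitution_moments(rate_mean, rate_var, length):
    """Mean and variance of the substitution count on a single branch.

    Assumes the branch rate r has the given mean and variance regardless of
    branch length, and that the count given r is Poisson with mean r * length.
    """
    mean = rate_mean * length
    # Law of total variance: E[Var(N | r)] + Var(E[N | r]).
    var = rate_mean * length + rate_var * length ** 2
    return mean, var


def split_branch_moments(rate_mean, rate_var, length, fraction=0.5):
    """Moments of the summed count when one branch is split into two
    neighboring branches that draw their rates independently."""
    m1, v1 = substitution_moments(rate_mean, rate_var, length * fraction)
    m2, v2 = substitution_moments(rate_mean, rate_var, length * (1 - fraction))
    return m1 + m2, v1 + v2
```

With `rate_mean=2`, `rate_var=1`, and `length=4`, the single branch has variance 24 while the two half-branches sum to variance 16, although both have mean 8. Setting `rate_var=0` recovers the strict clock, for which the two variances agree. The mismatch in the relaxed case is the kind of non-additivity the abstract describes.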
Abstract: Spermatogenesis is an essential process for producing sperm cells. Reproductive strategies evolve to adapt a species to a particular ecological system. However, roles of newly evolved genes in testis autophagy remain unclear. In this study, we found that a newly evolved gene srag (Sox9-regulated autophagy gene) plays an important role in promoting autophagy in testis in the lineage of the teleost Monopterus albus. The gene integrated into an interaction network through a two-way strategy of evolution, via Sox9-binding in its promoter and interaction with Becn1 in the coding region. Its promoter region evolved a cis element for binding of Sox9, a transcription factor for male sex determination. Both in vitro and in vivo analyses demonstrated that transcription factor Sox9 could bind to and activate the srag promoter. Its coding region acquired the ability to interact with the key autophagy initiation factor Becn1 via the conserved C-terminus, indicating that srag integrated into the preexisting autophagy network. Moreover, we determined that Srag enhanced autophagy by interacting with Becn1. Notably, srag transgenic zebrafish revealed that Srag exerted the same function by enhancing autophagy through the Srag–Becn1 pathway. Thus, the new gene srag regulates autophagy in testis by integrating into the preexisting autophagy network.
Abstract: Human herpesvirus 6A and 6B (HHV-6) can integrate into the germline, and as a result, ∼70 million people harbor the genome of one of these viruses in every cell of their body. Until now, it has been largely unknown whether 1) these integrations are ancient, 2) they still occur, and 3) circulating virus strains differ from integrated ones. Here, we used next-generation sequencing and mining of public human genome data sets to generate the largest and most diverse collection of circulating and integrated HHV-6 genomes studied to date. In genomes of geographically dispersed, only distantly related people, we identified clades of integrated viruses that originated from a single ancestral event, confirming this with fluorescence in situ hybridization to directly observe the integration locus. In contrast to HHV-6B, circulating and integrated HHV-6A sequences form distinct clades, arguing against ongoing integration of circulating HHV-6A or “reactivation” of integrated HHV-6A. Taken together, our study provides the first comprehensive picture of the evolution of HHV-6, and reveals that integration of heritable HHV-6 has occurred since the time of, if not before, human migrations out of Africa.
AbstractLarge-scale re-engineering of synonymous sites is a promising strategy to generate vaccines either through synthesis of attenuated viruses or via codon-optimized genes in DNA vaccines. Attenuation typically relies on deoptimization of codon pairs and maximization of CpG dinucleotide frequencies. So as to formulate evolutionarily informed attenuation strategies that aim to force nucleotide usage against the direction favored by selection, here, we examine available whole-genome sequences of SARS-CoV-2 to infer patterns of mutation and selection on synonymous sites. Analysis of mutational profiles indicates a strong mutation bias toward U. In turn, analysis of observed synonymous site composition implicates selection against U. Accounting for dinucleotide effects reinforces this conclusion, observed UU content being a quarter of that expected under neutrality. Possible mechanisms of selection against U mutations include selection for higher expression, for high mRNA stability or lower immunogenicity of viral genes. Consistent with gene-specific selection against CpG dinucleotides, we observe systematic differences of CpG content between SARS-CoV-2 genes. We propose an evolutionarily informed approach to attenuation that, unusually, seeks to increase usage of the already most common synonymous codons. Comparable analysis of H1N1 and Ebola finds that GC3 deviated from neutral equilibrium is not a universal feature, cautioning against generalization of results.
AbstractThe ribosome is an essential cellular machine performing protein biosynthesis. Its structure and composition are highly conserved in all species. However, some bacteria have been reported to have an incomplete set of ribosomal proteins. We have analyzed ribosomal protein composition in 214 small bacterial genomes (<1 Mb) and found that although the ribosome composition is fairly stable, some ribosomal proteins may be absent, especially in bacteria with dramatically reduced genomes. The protein composition of the large subunit is less conserved than that of the small subunit. We have identified the set of frequently lost ribosomal proteins and demonstrated that they tend to be positioned on the ribosome surface and have fewer contacts to other ribosome components. Moreover, some proteins are lost in an evolutionary correlated manner. The reduction of ribosomal RNA is also common, with deletions mostly occurring in free loops. Finally, the loss of the anti-Shine–Dalgarno sequence is associated with the loss of a higher number of ribosomal proteins.
AbstractThe Kingman coalescent and its developments are often considered among the most important advances in population genetics of the last decades. Demographic inference based on coalescent theory has been used to reconstruct the population dynamics and evolutionary history of several species, including Mycobacterium tuberculosis (MTB), an important human pathogen causing tuberculosis. One key assumption of the Kingman coalescent is that the number of descendants of different individuals does not vary strongly, and violating this assumption could lead to severe biases caused by model misspecification. Individual lineages of MTB are expected to vary strongly in reproductive success because 1) MTB is potentially under constant selection due to the pressure of the host immune system and of antibiotic treatment, 2) MTB undergoes repeated population bottlenecks when it transmits from one host to the next, and 3) some hosts show much higher transmission rates compared with the average (superspreaders). Here, we used an approximate Bayesian computation approach to test whether multiple-merger coalescents (MMC), a class of models that allow for large variation in reproductive success among lineages, are more appropriate models to study MTB populations. We considered 11 publicly available whole-genome sequence data sets sampled from local MTB populations and outbreaks and found that MMC had a better fit compared with the Kingman coalescent for 10 of the 11 data sets. These results indicate that the null model for analyzing MTB outbreaks should be reassessed and that past findings based on the Kingman coalescent need to be revisited.
AbstractDirect comparisons between historical and contemporary populations allow for detecting changes in genetic diversity through time and assessment of the impact of habitat fragmentation. Here, we determined the genetic architecture of both historical and modern lions to document changes in genetic diversity over the last century. We surveyed microsatellite and mitochondrial genome variation from 143 high-quality museum specimens of known provenance, allowing us to directly compare this information with data from several recently published nuclear and mitochondrial studies. Our results provide evidence for male-mediated gene flow and recent isolation of local subpopulations, likely due to habitat fragmentation. Nuclear markers showed a significant decrease in genetic diversity from the historical (HE = 0.833) to the modern (HE = 0.796) populations, whereas mitochondrial genetic diversity was maintained (Hd = 0.98 for both). Although the historical population appears to have been panmictic based on nDNA data, hierarchical structure analysis identified four tiers of genetic structure in modern populations and was able to detect most sampling locations. Mitogenome analyses identified four clusters: Southern, Mixed, Eastern, and Western and were consistent between modern and historically sampled haplotypes. Within the last century, habitat fragmentation caused lion subpopulations to become more geographically isolated as human expansion changed the African landscape. This resulted in an increase in fine-scale nuclear genetic structure and loss of genetic diversity as lion subpopulations became more differentiated, whereas mitochondrial structure and diversity were maintained over time.
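As a minimal illustration of the nuclear diversity statistic quoted in the lion abstract above: expected heterozygosity at a locus is conventionally computed as one minus the sum of squared allele frequencies. The sketch below assumes that standard definition, without any small-sample correction, and is not taken from the study itself. The allele frequencies are made up for illustration; only the HE values 0.833 and 0.796 come from the abstract.

```python
def expected_heterozygosity(freqs):
    """Return 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def relative_change(old, new):
    """Fractional change from old to new (negative means a decrease)."""
    return (new - old) / old


# Hypothetical locus with four alleles (illustrative frequencies only).
example_he = expected_heterozygosity([0.4, 0.3, 0.2, 0.1])

# Relative change between the reported historical and modern HE values.
he_change = relative_change(0.833, 0.796)

print(f"example HE = {example_he:.3f}")
print(f"relative change in HE = {he_change * 100:.1f}%")
```

On the reported values, the decrease in nuclear diversity amounts to a relative loss of roughly 4%.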
The past several decades have been hard on Apis mellifera, the Western honey bee. Originally native to Europe, Africa, and the Middle East, Western honey bees have spread worldwide thanks to the nutritional and medicinal value of their honey, pollen, royal jelly, beeswax, propolis, and venom. Even more recently, the rise of the mobile hive and increased demand for pollination services have resulted in an army of bees being unleashed on crops each year, most notably almonds, which require several million bee visits per acre. At the same time, the last 50 years have seen dramatic declines in honey bee populations due to pesticide use, climate change, and habitat destruction. Most notably, the spread of the parasitic mite Varroa destructor from Asia to Western Europe and North America in the 1970s decimated A. mellifera colonies, making it nearly impossible for honey bees to survive without human intervention and resulting in the loss of the vast majority of wild and feral honey bee colonies. Given this decline, scientists have speculated that loss of genetic diversity among honey bees may be contributing to further losses in bee populations. A new study in Genome Biology and Evolution, titled “Digging into the genomic past of Swiss honey bees by whole-genome sequencing museum specimens,” provides evidence that disputes this theory (Parejo et al. 2020), suggesting that loss of genetic diversity may not be among the long list of threats to bee survival.
AbstractEscherichia coli and many other bacterial species, which are incapable of sporulation, can nevertheless survive within resource-exhausted media by entering a state termed long-term stationary phase (LTSP). We have previously shown that E. coli populations adapt genetically under LTSP in an extremely convergent manner. Here, we examine how the dynamics of LTSP genetic adaptation are influenced by varying a single parameter of the experiment: culture volume. We find that culture volume affects survival under LTSP, with viable counts decreasing as volumes increase. Across all volumes, mutations accumulate with time, and the majority of mutations accumulated demonstrate signals of being adaptive. However, positive selection appears to affect mutation accumulation more strongly at higher volumes than at lower volumes. Finally, we find that several similar genes are likely involved in adaptation across volumes. However, the specific mutations within these genes that contribute to adaptation can vary in a consistent manner. Combined, our results demonstrate how varying a single parameter of an evolutionary experiment can substantially influence the dynamics of observed adaptation.
AbstractIn the context of the COVID-19 pandemic, we describe here the singular metabolic background that constrains enveloped RNA viruses to evolve toward likely attenuation in the long term, possibly after a step of increased pathogenicity. Cytidine triphosphate (CTP) is at the crossroad of the processes allowing SARS-CoV-2 to multiply, because CTP is in demand for four essential metabolic steps. It is a building block of the virus genome, it is required for synthesis of the cytosine-based liponucleotide precursors of the viral envelope, it is a critical building block of the host transfer RNAs synthesis and it is required for synthesis of dolichol-phosphate, a precursor of viral protein glycosylation. The CCA 3′-end of all the transfer RNAs required to translate the RNA genome and further transcripts into the proteins used to build active virus copies is not coded in the human genome. It must be synthesized de novo from CTP and ATP. Furthermore, intermediary metabolism is built on compulsory steps of synthesis and salvage of cytosine-based metabolites via uridine triphosphate that keep limiting CTP availability. As a consequence, accidental replication errors tend to replace cytosine by uracil in the genome, unless recombination events allow the sequence to return to its ancestral sequences. We document some of the consequences of this situation in the function of viral proteins. This unique metabolic setup allowed us to highlight and provide a raison d’être to viperin, an enzyme of innate antiviral immunity, which synthesizes 3ʹ-deoxy-3′,4ʹ-didehydro-CTP as an extremely efficient antiviral nucleotide.
AbstractReceptor adenylate cyclases (RACs) on the surface of trypanosomatids are important players in the host–parasite interface. They detect still unidentified environmental signals that affect the parasites’ responses to host immune challenge, coordination of social motility, and regulation of cell division. A lesser known class of oxygen-sensing adenylate cyclases (OACs) related to RACs has been lost in trypanosomes and expanded mostly in Leishmania species and related insect-dwelling trypanosomatids. In this work, we have undertaken a large-scale phylogenetic analysis of both classes of adenylate cyclases (ACs) in trypanosomatids and the free-living Bodo saltans. We observe that the expanded RAC repertoire in trypanosomatids with a two-host life cycle is not only associated with an extracellular lifestyle within the vertebrate host, but also with a complex path through the insect vector involving several life cycle stages. In Trypanosoma brucei, RACs are split into two major clades, which significantly differ in their expression profiles in the mammalian host and the insect vector. RACs of the closely related Trypanosoma congolense are intermingled within these two clades, supporting early RAC diversification. Subspecies of T. brucei that have lost the capacity to infect insects exhibit high numbers of pseudogenized RACs, suggesting many of these proteins have become redundant upon the acquisition of a single-host life cycle. OACs appear to be an innovation occurring after the expansion of RACs in trypanosomatids. Endosymbiont-harboring trypanosomatids exhibit a diversification of OACs, whereas these proteins are pseudogenized in Leishmania subgenus Viannia. This analysis sheds light on how ACs have evolved to allow diverse trypanosomatids to occupy multifarious niches and assume various lifestyles.
AbstractSex chromosomes often differ from autosomes with respect to their gene expression and regulation. In Drosophila melanogaster, X-linked genes are dosage compensated by having their expression upregulated in the male soma, a process mediated by the X-chromosome-specific binding of the dosage compensation complex (DCC). Previous studies of X-linked gene expression found a negative correlation between a gene’s male-to-female expression ratio and its distance to the nearest DCC binding site in somatic tissues, including head and brain, which suggests that dosage compensation influences sex-biased gene expression. A limitation of the previous studies, however, was that they focused on endogenous X-linked genes and, thus, could not disentangle the effects of chromosomal position from those of gene-specific regulation. To overcome this limitation, we examined the expression of an exogenous reporter gene inserted at many locations spanning the X chromosome. We observed a negative correlation between the male-to-female expression ratio of the reporter gene and its distance to the nearest DCC binding site in somatic tissues, but not in gonads. A reporter gene’s location relative to a DCC binding site had greater influence on its expression than the local regulatory elements of neighboring endogenous genes, suggesting that intra-chromosomal variation in the strength of dosage compensation is a major determinant of sex-biased gene expression. Average levels of sex-biased expression did not differ between head and brain, but there was greater positional effect variation in the brain, which may explain the observed excess of endogenous sex-biased genes located on the X chromosome in this tissue.
AbstractIdentification of the role of the MC1R gene has provided major insights into variation in skin pigmentation in several organisms, including humans, but the evolutionary genetics of this variation is less well established. Variation in this gene and its relationship with degree of melanism was analyzed in one of the world’s highest-elevation lizards, Phrynocephalus theobaldi from the Qinghai–Tibetan Plateau. Individuals from the low-elevation group were shown to have darker dorsal pigmentation than individuals from a high-elevation group. The existence of climatic variation across these elevations was quantified, with lower elevations exhibiting higher air pressure, temperatures, and humidity, but less wind and insolation. Analysis of the MC1R gene in 214 individuals revealed amino acid differences at five sites between intraspecific sister lineages from different elevations, with two sites showing distinct fixed residues at low elevations. Three of the four single-nucleotide polymorphisms that underpinned these amino acid differences were highly significant outliers, relative to the generalized MC1R population structuring, suggestive of selection. Transfection of cells with an MC1R allele from a lighter high-elevation population caused a 43% reduction in agonist-induced cyclic AMP accumulation, and hence lowered melanin synthesis, relative to transfection with an allele from a darker low-elevation population. The high-elevation allele led to less efficient integration of the MC1R protein into melanocyte membranes. Our study identifies variation in the degree of melanism that can be explained by four or fewer MC1R substitutions. We establish a functional link between these substitutions and melanin synthesis and demonstrate elevation-associated shifts in their frequencies.
AbstractWhat determines the level of genetic diversity of a species remains one of the enduring problems of population genetics. Because neutral diversity depends upon the product of the effective population size and mutation rate, there is an expectation that diversity should be correlated to measures of census population size. This correlation is often observed for nuclear but not for mitochondrial DNA. Here, we revisit the question of whether mitochondrial DNA sequence diversity is correlated to census population size by compiling the largest data set to date, using 639 mammalian species. In a multiple regression, we find that nucleotide diversity is significantly correlated to both range size and mass-specific metabolic rate, but not a variety of other factors. We also find that a measure of the effective population size, the ratio of nonsynonymous to synonymous diversity, is also significantly negatively correlated to both range size and mass-specific metabolic rate. These results together suggest that species with larger ranges have larger effective population sizes. The slope of the relationship between diversity and range is such that doubling the range increases diversity by 12–20%, providing one of the first quantifications of the relationship between diversity and the census population size.
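To make the quoted slope concrete, the sketch below converts "doubling the range increases diversity by 12–20%" into the exponent of a power-law relationship. It assumes, as an illustration only, that diversity scales as range raised to a constant power b (a straight line on log-log axes); the abstract does not state the exact model, and the function names are hypothetical.

```python
import math


def exponent_from_doubling(fold_increase):
    """Exponent b such that 2**b equals the fold increase per doubling."""
    return math.log2(fold_increase)


def predicted_fold_change(range_ratio, b):
    """Fold change in diversity when range changes by range_ratio."""
    return range_ratio ** b


# A 12% and a 20% increase per doubling correspond to folds of 1.12 and 1.20.
b_low = exponent_from_doubling(1.12)
b_high = exponent_from_doubling(1.20)

print(f"implied exponent b: {b_low:.3f} to {b_high:.3f}")
print(f"tenfold larger range: {predicted_fold_change(10, b_low):.2f}x "
      f"to {predicted_fold_change(10, b_high):.2f}x diversity")
```

Under this assumption the implied exponent lies between roughly 0.16 and 0.26, so diversity rises much more slowly than range.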
AbstractDNA double-strand breaks (DSBs) are a threat to genome stability. In all domains of life, DSBs are faithfully fixed via homologous recombination. Recombination requires the presence of an uncut copy of duplex DNA which is used as a template for repair. Alternatively, in the absence of a template, cells utilize error-prone nonhomologous end joining (NHEJ). Although ubiquitously found in eukaryotes, NHEJ is not universally present in bacteria. It is unclear as to why many prokaryotes lack this pathway. Toward understanding what could have led to the current distribution of bacterial NHEJ, we carried out comparative genomics and phylogenetic analysis across ∼6,000 genomes. Our results show that this pathway is sporadically distributed across the phylogeny. Ancestral reconstruction further suggests that NHEJ was absent in the eubacterial ancestor and can be acquired via specific routes. Integrating NHEJ occurrence data for archaea, we also find evidence for extensive horizontal exchange of NHEJ genes between the two kingdoms as well as across bacterial clades. The pattern of occurrence in bacteria is consistent with correlated evolution of NHEJ with key genome characteristics of genome size and growth rate; NHEJ presence is associated with large genome sizes and/or slow growth rates, with the former being the dominant correlate. Given the central role these traits play in determining the ability to carry out recombination, it is possible that the evolutionary history of bacterial NHEJ may have been shaped by requirement for efficient DSB repair.
AbstractWolbachia are widespread intracellular bacteria that mediate many important biological processes in arthropod species. In this study, we identified 210 conserved single-copy genes in 33 genome-sequenced Wolbachia strains in the A–F supergroups. Phylogenomic analyses with these core genes indicate that all 33 Wolbachia strains maintain the supergroup relationship, which was classified previously based on the multilocus sequence typing (MLST) genes. Using an interclade recombination screening method, 14 inter-supergroup recombination events were discovered in six genes (2.9%) among 210 single-copy orthologs. This finding suggests a relatively low frequency of intergroup recombination. Interestingly, they have occurred not only between A and B supergroups (nine events) but also between A and E supergroups (five events). Maintenance of such transfers suggests possible roles in Wolbachia infection-related functions. Comparisons of strain divergence using the five genes of the MLST system show a high correlation (Pearson correlation coefficient r = 0.98) between MLST and whole-genome divergences, indicating that MLST is a reliable method for identifying related strains when whole-genome data are not available. The phylogenomic analysis and the identified core gene set in our study will serve as a valuable foundation for strain identification and the investigation of recombination and genome evolution in Wolbachia.
AbstractFungi of the genus Botrytis infect >1,400 plant species and cause losses in many crops. Besides the broad host range pathogen Botrytis cinerea, most other species are restricted to a single host. Long-read technology was used to sequence genomes of eight Botrytis species, mostly pathogenic on Allium species, and the related onion white rot fungus, Sclerotium cepivorum. Most assemblies contained <100 contigs, with the Botrytis aclada genome assembled in 16 gapless chromosomes. The core genome and pan-genome of 16 Botrytis species were defined and the secretome, effector, and secondary metabolite repertoires analyzed. Among those genes, none is shared among all Allium pathogens and absent from non-Allium pathogens. The genome of each of the Allium pathogens contains 8–39 predicted effector genes that are unique to that single species, but none stood out as a potential determinant of host specificity. Chromosome configurations of common ancestors of the genus Botrytis and family Sclerotiniaceae were reconstructed. The genomes of B. cinerea and B. aclada were highly syntenic with only 19 rearrangements between them. Genomes of Allium pathogens were compared with ten other Botrytis species (nonpathogenic on Allium) and with 25 Leotiomycetes for their repertoire of secondary metabolite gene clusters. The pattern was complex, with several clusters displaying patchy distribution. Two clusters involved in the synthesis of phytotoxic metabolites are at distinct genomic locations in different Botrytis species. We provide evidence that the clusters for botcinic acid production in B. cinerea and Botrytis sinoallii were acquired by horizontal transfer from taxa within the same genus.
AbstractEach day, as the amount of genomic data and bioinformatics resources grows, researchers are increasingly challenged with selecting the most appropriate approach to analyze their data. In addition, the opportunity to undertake comparative genomic analyses is growing rapidly. This is especially true for fungi due to their small genome sizes (i.e., mean 1C = 44.2 Mb). Given these opportunities and aiming to gain novel insights into the evolution of mutualisms, we focus on comparing the quality of whole genome assemblies for fungus-growing ant cultivars (Hymenoptera: Formicidae: Attini) and a free-living relative. Our analyses reveal that currently available methodologies and pipelines for analyzing whole-genome sequence data need refining. By using different genome assemblers, we show that the genome assembly size depends on what software is used. This, in turn, impacts gene number predictions, with higher gene numbers correlating positively with genome assembly size. Furthermore, the majority of fungal genome size data currently available are based on estimates derived from whole-genome assemblies generated from short-read genome data, rather than from the more accurate technique of flow cytometry. Here, we estimated the haploid genome sizes of three ant fungal symbionts by flow cytometry using the fungus Pleurotus ostreatus (Jacq.) P. Kumm. (1871) as a calibration standard. We found that published genome sizes based on genome assemblies are 2.5- to 3-fold larger than our estimates based on flow cytometry. We, therefore, recommend that flow cytometry be used to precalibrate genome assembly pipelines, to avoid incorrect estimates of genome sizes and ensure robust assemblies.
AbstractRhizobium–legume symbioses serve as paradigmatic examples for the study of mutualism evolution. The genus Ensifer (syn. Sinorhizobium) contains diverse plant-associated bacteria, a subset of which can fix nitrogen in symbiosis with legumes. To gain insights into the evolution of symbiotic nitrogen fixation (SNF), and interkingdom mutualisms more generally, we performed extensive phenotypic, genomic, and phylogenetic analyses of the genus Ensifer. The data suggest that SNF likely emerged several times within the genus Ensifer through independent horizontal gene transfer events. Yet, the majority (105 of 106) of the Ensifer strains with the nodABC and nifHDK nodulation and nitrogen fixation genes were found within a single, monophyletic clade. Comparative genomics highlighted several differences between the “symbiotic” and “nonsymbiotic” clades, including divergences in their pangenome content. Additionally, strains of the symbiotic clade carried 325 fewer genes, on average, and appeared to have fewer rRNA operons than strains of the nonsymbiotic clade. Initial characterization of a subset of ten Ensifer strains identified several putative phenotypic differences between the clades. Tested strains of the nonsymbiotic clade could catabolize 25% more carbon sources, on average, than strains of the symbiotic clade, and they were better able to grow in LB medium and tolerate alkaline conditions. On the other hand, the tested strains of the symbiotic clade were better able to tolerate heat stress and acidic conditions. We suggest that these data support the division of the genus Ensifer into two main subgroups, as well as the hypothesis that pre-existing genetic features are required to facilitate the evolution of SNF in bacteria.
AbstractEpigenetic processes in eukaryotes play important roles through regulation of gene expression, chromatin structure, and genome rearrangements. The roles of chromatin modification (e.g., DNA methylation and histone modification) and non-protein-coding RNAs have been well studied in animals and plants. With the exception of a few model organisms (e.g., Saccharomyces and Plasmodium), much less is known about epigenetic toolkits across the remainder of the eukaryotic tree of life. Even with limited data, previous work suggested the existence of an ancient epigenetic toolkit in the last eukaryotic common ancestor. We use PhyloToL, our taxon-rich phylogenomic pipeline, to detect homologs of epigenetic genes and evaluate their macroevolutionary patterns among eukaryotes. In addition to data from GenBank, we increase taxon sampling from understudied clades of SAR (Stramenopila, Alveolata, and Rhizaria) and Amoebozoa by adding new single-cell transcriptomes from ciliates, foraminifera, and testate amoebae. We focus on 118 gene families, 94 involved in chromatin modification and 24 involved in non-protein-coding RNA processes based on the epigenetics literature. Our results indicate 1) the presence of a large number of epigenetic gene families in the last eukaryotic common ancestor; 2) differential conservation among major eukaryotic clades, with a notable paucity of genes within Excavata; and 3) punctate distribution of epigenetic gene families between species consistent with rapid evolution leading to gene loss. Together these data demonstrate the power of taxon-rich phylogenomic studies for illuminating evolutionary patterns at scales of >1 billion years of evolution and suggest that macroevolutionary phenomena, such as genome conflict, have shaped the evolution of the eukaryotic epigenetic toolkit.
AbstractDendrobium huoshanense is used to treat various diseases in traditional Chinese medicine. Recent studies have identified active components. However, the lack of genomic data limits research on the biosynthesis and application of these therapeutic ingredients. To address this issue, we generated the first chromosome-level genome assembly and annotation of D. huoshanense. We integrated PacBio sequencing data, Illumina paired-end sequencing data, and Hi-C sequencing data to assemble a 1.285 Gb genome, with contig and scaffold N50 lengths of 598 kb and 71.79 Mb, respectively. We annotated 21,070 protein-coding genes and 0.96 Gb transposable elements, constituting 74.92% of the whole assembly. In addition, we identified 252 genes responsible for polysaccharide biosynthesis by Kyoto Encyclopedia of Genes and Genomes functional annotation. Our data provide a basis for further functional studies, particularly those focused on genes related to glycan biosynthesis and metabolism, and have implications for both conservation and medicine.
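For readers unfamiliar with the assembly statistics quoted in the Dendrobium abstract above, the sketch below shows one common way to compute N50: the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the total assembly length. This is a generic illustration, not the authors' pipeline, and the toy contig lengths are invented. The last lines recompute the transposable element fraction from the rounded figures in the abstract.

```python
def n50(lengths):
    """Length L such that sequences of length >= L cover at least half the total."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy contig lengths (hypothetical, arbitrary units).
toy_n50 = n50([2, 3, 4, 5, 6])

# Transposable element share from the rounded values in the abstract:
# 0.96 Gb of transposable elements in a 1.285 Gb assembly.
te_fraction = 0.96 / 1.285

print(f"toy N50 = {toy_n50}")
print(f"TE fraction from rounded values = {te_fraction * 100:.1f}%")
```

The rounded inputs give a fraction close to, but not exactly, the reported 74.92%; the small gap is presumably due to rounding of the published figures.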
AbstractMicroRNAs are important regulators of gene expression in eukaryotes. Previously, we reported that in Phaseolus vulgaris, the precursor for miR2119 is located in the same gene as miR398a, conceiving a dicistronic MIR gene. Both miRNA precursors are transcribed and processed from a single transcript resulting in two mature microRNAs that regulate the mRNAs encoding ALCOHOL DEHYDROGENASE 1 (ADH1) and COPPER-ZINC SUPEROXIDE DISMUTASE 1 (CSD1). Genes for miR398 are distributed throughout the spermatophytes; however, miR2119 is only found in Leguminosae species, indicating its recent emergence. Here, we used public databases to explore the presence of the miR2119 sequence in several plant species. We found that miR2119 is present only in specific clades within the Papilionoideae subfamily, including important crops used for human consumption and forage. Within this subfamily, MIR2119 and MIR398a are found together as a single gene in the genomes of the Millettioids and Hologalegina. In contrast, in the Dalbergioids MIR2119 is located in a different locus from MIR398a, suggesting this as the ancestral genomic organization. To our knowledge, this is a unique example where two separate MIRNA genes have merged to generate a single polycistronic gene. Phylogenetic analysis of ADH1 gene sequences in the Papilionoideae subfamily revealed duplication events resulting in up to four ADH1 genes in certain species. Notably, the presence of MIR2119 correlates with the conservation of target sites in particular ADH1 genes in each clade. Our results suggest that post-transcriptional regulation of ADH1 genes by miR2119 has contributed to shaping the expansion and divergence of this gene family in the Papilionoideae. Future experimental work on ADH1 regulation by miR2119 in more legume species will help to further understand the evolutionary history of the ADH1 gene family and the relevance of miRNA regulation in this process.
AbstractDinoflagellates possess many cellular characteristics with unresolved evolutionary histories. These include nuclei with greatly expanded genomes and chromatin packaged using histone-like proteins and dinoflagellate-viral nucleoproteins instead of histones, highly reduced mitochondrial genomes with extensive RNA editing, a mix of photosynthetic and cryptic secondary plastids, and tertiary plastids. Resolving the evolutionary origin of these traits requires understanding their ancestral states and early intermediates. Several early-branching dinoflagellate lineages are good candidates for such reconstruction; however, these cells tend to be delicate and environmentally sparse, complicating such analyses. Here, we employ transcriptome sequencing from manually isolated and microscopically documented cells to resolve the placement of two cells of one such genus, Abedinium, collected by remotely operated vehicle in deep waters off the coast of Monterey Bay, CA. One cell corresponds to the only described species, Abedinium dasypus, whereas the second cell is distinct and formally described as Abedinium folium, sp. nov. Abedinium has classically been assigned to the early-branching dinoflagellate subgroup Noctilucales, which is weakly supported by phylogenetic analyses of small subunit ribosomal RNA, the single characterized gene from any member of the order. However, an analysis based on 221 proteins from the transcriptome places Abedinium as a distinct lineage, separate from and basal to Noctilucales and the rest of the core dinoflagellates. The transcriptome also contains evidence of a cryptic plastid functioning in the biosynthesis of isoprenoids, iron–sulfur clusters, and heme, a mitochondrial genome with all three expected protein-coding genes (cob, cox1, and cox3), and the presence of some but not all dinoflagellate-specific chromatin packaging proteins.
AbstractPhenotypic plasticity is the ability of a single genotype to produce different phenotypes in response to environmental variation. The importance of phenotypic plasticity in natural populations and its contribution to phenotypic evolution during rapid environmental change is widely debated. Here, we show that thermal plasticity of gene expression in natural populations is a key component of its adaptation: evolution to novel thermal environments increases ancestral plasticity rather than mean genetic expression. We determined the evolution of plasticity in gene expression by conducting laboratory natural selection on a Drosophila simulans population in hot and cold environments. After more than 60 generations in the hot environment, 325 genes evolved a change in plasticity relative to the natural ancestral population. Plasticity increased in 75% of these genes, which were strongly enriched for several well-defined functional categories (e.g., chitin metabolism, glycolysis, and oxidative phosphorylation). Furthermore, we show that plasticity in gene expression of populations exposed to different temperatures is rather similar across species. We conclude that most of the ancestral plasticity can evolve further in more extreme environments.
Abstract: Although numerous studies have found horizontal transposon transfer (HTT) to be widespread across metazoans, few have focused on HTT in marine ecosystems. To investigate potential recent HTTs into marine species, we searched for novel repetitive elements in sea snakes, a group of elapids which transitioned to a marine habitat at most 18 Ma. Our analysis uncovered repeated HTTs into sea snakes following their marine transition. The seven subfamilies of horizontally transferred LINE retrotransposons we identified in the olive sea snake (Aipysurus laevis) are transcribed, and hence are likely still active and expanding across the genome. A search of 600 metazoan genomes found all seven were absent from other amniotes, including terrestrial elapids, with the most similar LINEs present in fish and marine invertebrates. The one exception was a similar LINE found in sea kraits, a lineage of amphibious elapids which independently transitioned to a marine environment 25 Ma. Our finding of repeated horizontal transfer events into marine snakes greatly expands past findings that the marine environment promotes the transfer of transposons. Transposons are drivers of evolution as sources of genomic sequence and hence genomic novelty. We identified 13 candidate genes for HTT-induced adaptive change based on internal or neighboring HTT LINE insertions. One of these, ADCY4, is of particular interest as a part of the KEGG adaptation pathway “Circadian Entrainment.” This provides evidence of the ecological interactions between species influencing evolution of metazoans not only through specific selection pressures, but also by contributing novel genomic material.
Abstract: Members of the predatory Myxococcales (myxobacteria) possess large genomes, undergo multicellular development, and produce diverse secondary metabolites, which are being actively prospected for novel drug discovery. To direct such efforts, it is important to understand the relationships between myxobacterial ecology, evolution, taxonomy, and genomic variation. This study investigated the genomes and pan-genomes of organisms within the Myxococcaceae, including the genera Myxococcus and Corallococcus, the most abundant myxobacteria isolated from soils. Previously, ten species of Corallococcus were known, whereas six species of Myxococcus phylogenetically surrounded a third genus (Pyxidicoccus) composed of a single species. Here, we describe draft genome sequences of five novel species within the Myxococcaceae (Myxococcus eversor, Myxococcus llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogochensis, Myxococcus vastator, Pyxidicoccus caerfyrddinensis, and Pyxidicoccus trucidator) and for the Pyxidicoccus type species strain, Pyxidicoccus fallax DSM 14698T. Genomic and physiological comparisons demonstrated clear differences between the five novel species and every other Myxococcus or Pyxidicoccus spp. type strain. Subsequent analyses of type strain genomes showed that both the Corallococcus pan-genome and the combined Myxococcus and Pyxidicoccus (Myxococcus/Pyxidicoccus) pan-genome are large and open, but with clear differences. Genomes of Corallococcus spp. are generally smaller than those of Myxococcus/Pyxidicoccus spp. but have core genomes three times larger. Myxococcus/Pyxidicoccus spp. genomes are more variable in size, with larger and more unique sets of accessory genes than those of Corallococcus species. In both genera, biosynthetic gene clusters are relatively enriched in the shell pan-genomes, implying they grant a greater evolutionary benefit than other shell genes, presumably by conferring selective advantages during predation.
Abstract: Containing fungal diseases often depends on the application of fungicidal compounds. Fungicides can rapidly lose effectiveness due to the rise of resistant individuals in populations. However, the lack of knowledge about resistance mutations beyond known target genes challenges investigations into pathways to resistance. We used whole-genome sequencing data and association mapping to reveal the multilocus genetic architecture of fungicide resistance in a global panel of 159 isolates of Parastagonospora nodorum, an important fungal pathogen of wheat. We found significant differences in azole resistance among global field populations. The populations evolved distinctive combinations of resistance alleles which can interact when co-occurring in the same genetic background. We identified 34 significantly associated single nucleotide polymorphisms located in close proximity to genes associated with fungicide resistance in other fungi, including a major facilitator superfamily transporter. Using fungal colony growth rates and melanin production at different temperatures as fitness proxies, we found no evidence that resistance was constrained by genetic trade-offs. Our study demonstrates how genome-wide association studies of a global collection of pathogen strains can recapitulate the emergence of fungicide resistance. The distinct complement of resistance mutations found among populations illustrates how the evolutionary trajectory of fungicide adaptation can be complex and challenging to predict.
Abstract: Genome-wide nucleotide composition varies widely among species. Despite extensive research, the source of genome-wide nucleotide composition diversity remains elusive. Yeast mitochondrial genomes (mitogenomes) are highly A + T rich, and they provide a unique opportunity to study the evolution of the AT-biased landscape. In this study, we sequenced ten complete mitogenomes of the Saccharomycodes ludwigii yeast with 8% G + C content, the lowest genome-wide %(G + C) in all published genomes to date. The S. ludwigii mitogenomes have high densities of short tandem repeats but severely underrepresented mononucleotide repeats. Comparative population genomics of these record-setting A + T-rich genomes shows dynamic indel mutations and strong mutation bias toward A/T. Indel mutations play a greater role in genomic variation among very closely related strains than nucleotide substitutions. Indels have resulted in presence–absence polymorphism of tRNAArg (ACG) among S. ludwigii mitogenomes. Interestingly, these mitogenomes have undergone recombination, a genetic process that can increase G + C content by GC-biased gene conversion. Finally, the expected equilibrium G + C content under mutation pressure alone is higher than the observed G + C content, suggesting the existence of mechanisms other than AT-biased mutation operating to increase A/T. Together, our findings shed new light on the mechanisms driving extremely AT-rich genomes.
Abstract: Copy number variation (CNV) can promote phenotypic diversification and adaptive evolution. However, the genomic architecture of CNVs among Macaca species remains scarcely reported, and the roles of CNVs in adaptation and evolution of macaques have not been well addressed. Here, we identified and characterized 1,479 genome-wide hetero-specific CNVs across nine Macaca species with bioinformatic methods, along with 26 CNV-dense regions and dozens of lineage-specific CNVs. The genes intersecting CNVs were overrepresented in nutritional metabolism, xenobiotics/drug metabolism, and immune-related pathways. Population-level transcriptome data showed that nearly 46% of CNV genes were differentially expressed across populations and also mainly consisted of metabolic and immune-related genes, which implied the role of CNVs in environmental adaptation of Macaca. Several CNVs overlapping drug metabolism genes were verified with genomic quantitative polymerase chain reaction, suggesting that these macaques may have different drug metabolism features. The CNV-dense regions, including 15 first reported here, represent unstable genomic segments in macaques where biological innovation may evolve. Twelve gains and 40 losses specific to the Barbary macaque contain genes with essential roles in energy homeostasis and immune defense, suggesting a genetic basis for its unique distribution in North Africa. Our study not only elucidated the genetic diversity across Macaca species from the perspective of structural variation but also provided suggestive evidence for the role of CNVs in adaptation and genome evolution. Additionally, our findings provide new insights into the application of diverse macaques to drug studies.
Abstract: Sex offers advantages even in primarily asexual species. Some ciliates appear to utilize such a reproductive strategy with many mating types. However, the factors determining the composition of mating types in the unicellular ciliate Tetrahymena thermophila are poorly understood, and this is further complicated by non-Mendelian determination of mating type in the offspring. We therefore developed a novel population genetics model to predict how various factors influence the dynamics of mating type composition, including natural selection. The model predicted either the coexistence of all seven mating types or fixation of a single mating type in a population, depending on parameter combinations, irrespective of natural selection. To understand what factor(s) may be more influential and to test the validity of the theoretical predictions, five replicate populations were maintained in the laboratory such that several factors could be controlled or measured. Whole-genome sequencing was used to identify newly arising mutations and determine mating type composition. Strikingly, all populations were found to be driven by strong selection on newly arising beneficial mutations to fixation of their carrying mating types, and the trajectories of speed to fixation agreed well with our theoretical predictions. This study illustrates the evolutionary strategies that T. thermophila can utilize to optimize population fitness.
Abstract: Busseola fusca (Fuller) (Lepidoptera: Noctuidae), the maize stalk borer, is a widespread crop pest in sub-Saharan Africa that has been the focus of biological research and intensive management strategies. Here, we present a comprehensive annotated transcriptome of B. fusca (originally collected in the Western Province of Kenya) based on ten pooled libraries including a wide array of developmental stages, tissue types, and exposures to parasitoid wasps. Parasitoid wasps have been used as a form of biocontrol to try to reduce crop losses with variable success, in part due to differential infectivities and immune responses among wasps and hosts. We identified a number of loci of interest for pest management, including genes potentially involved in chemoreception, immunity, and response to insecticides. The comprehensive sampling design used expands our current understanding of the transcriptome of this species and deepens the list of potential target genes for future crop loss mitigation, in addition to highlighting candidate loci for differential expression and functional genetic analyses in this important pest species.
Abstract: Historical specimens in museum collections provide opportunities to gain insights into the genomic past. For the Western honey bee, Apis mellifera L., this is particularly important because its populations are currently under threat worldwide and have experienced many changes in management and environment over the last century. Using Swiss Apis mellifera mellifera as a case study, our research provides important insights into the genetic diversity of native honey bees prior to the industrial-scale introductions and trade of non-native stocks during the 20th century, a period marked by the onset of intensive commercial breeding and the decline of wild honey bees following the arrival of Varroa destructor. We sequenced whole genomes of 22 honey bees from the Natural History Museum in Bern collected in Switzerland, including the oldest A. mellifera sample ever sequenced. We identify both a historic and a recent migrant, natural or human-mediated, which is consistent with the population history of honey bees in Switzerland. Contrary to what we expected, we find no evidence for a significant genetic bottleneck in Swiss honey bees, and find that genetic diversity is not only maintained, but even slightly increased, most probably due to modern apicultural practices. Finally, we identify signals of selection between historic and modern honey bee populations associated with genes enriched in functions linked to xenobiotics, suggesting a possible selective pressure from the increasing use and diversity of chemicals used in agriculture and apiculture over the last century.
Abstract: Cytoplasmic male sterility (MS) in plants is caused by MS-inducing mitochondria, which have emerged frequently during plant evolution. Nuclear restorer-of-fertility (Rf) genes can suppress their cognate MS-inducing mitochondria. Whereas many Rfs encode a class of RNA-binding protein, the sugar beet (Caryophyllales) Rf encodes a protein resembling Oma1, which is involved in the quality control of mitochondria. In this study, we investigated the molecular evolution of Oma1 homologs in plants. We analyzed 37 plant genomes and concluded that a single copy is the ancestral state in Caryophyllales. Among the sugar beet Oma1 homologs, the orthologous copy is located in a syntenic region that is preserved in Arabidopsis thaliana. The sugar beet Rf is a complex locus consisting of a small Oma1 homolog family (RF-Oma1 family) unique to sugar beet. The gene arrangement in the vicinity of the locus is seen in some but not all Caryophyllalean plants and is absent from Ar. thaliana. This suggests a segmental duplication rather than a whole-genome duplication as the mechanism of RF-Oma1 evolution. Of the 37 positively selected codons in RF-Oma1, 26 are located in predicted transmembrane helices. Phylogenetic network analysis indicated that homologous recombination among the RF-Oma1 members played an important role in generating protein activity related to suppression. Together, our data illustrate how an evolutionarily young Rf has emerged from a lineage-specific paralog. Interestingly, several evolutionary features are shared with the RNA-binding protein type Rfs. Hence, the evolution of the sugar beet Rf is representative of Rf evolution in general.