Abstract: The transition to an aquatic lifestyle in cetaceans (whales and dolphins) resulted in a radical transformation in their sensory systems. Toothed whales acquired specialized high-frequency hearing tied to the evolution of echolocation, whereas baleen whales evolved low-frequency hearing. More generally, all cetaceans show adaptations for hearing and seeing underwater. To determine the extent to which these phenotypic changes have been driven by molecular adaptation, we performed large-scale targeted sequence capture of 179 sensory genes across the Cetacea, incorporating up to 54 cetacean species from all major clades as well as their closest relatives, the hippopotamuses. We screened for positive selection in 167 loci related to vision and hearing and found that the diversification of cetaceans has been accompanied by pervasive molecular adaptations in both sets of genes, including several loci implicated in nonsyndromic hearing loss. Despite these findings, however, we found no direct evidence of positive selection at the base of odontocetes coinciding with the origin of echolocation, as found in studies examining fewer taxa. By using contingency tables incorporating taxon- and gene-based controls, we show that, although numbers of positively selected hearing and nonsyndromic hearing loss genes are disproportionately high in cetaceans, counts of vision genes do not differ significantly from expected values. Alongside these adaptive changes, we find increased evidence of pseudogenization of genes involved in cone-mediated vision in mysticetes and deep-diving odontocetes.
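The contingency-table comparison described in the abstract above can be illustrated with a one-sided test on a 2x2 table of gene counts. The abstract does not say which test was applied, so Fisher's exact test is swapped in here as an assumption; the function name and the counts in the usage lines are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: focal gene set vs. control gene set (hypothetical labels).
    Columns: positively selected vs. not positively selected.
    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` selected genes in the focal set by chance.
    """
    row1 = a + b          # size of the focal gene set
    col1 = a + c          # total number of positively selected genes
    n = a + b + c + d     # all genes tested
    denom = comb(n, col1)
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / denom


# Hypothetical counts: 30 of 80 hearing genes and 200 of 1000 control
# genes show positive selection.
p_value = fisher_exact_greater(30, 50, 200, 800)
print(f"one-sided p = {p_value:.4g}")
```

A small p-value here would indicate that the focal set contains more positively selected genes than expected from the control set; the taxon-based controls mentioned in the abstract would need a second table built in the same way.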
Abstract: pcadapt is a user-friendly R package for performing genome scans for local adaptation. Here, we present version 4 of pcadapt, which substantially improves computational efficiency while providing similar results. This improvement is made possible by using a different format for storing genotypes and a different algorithm for computing principal components of the genotype matrix, which is the most computationally demanding step in the pcadapt method. These changes are seamlessly integrated into the existing pcadapt package, and users will experience a large reduction in computation time (by a factor of 20–60 in our analyses) as compared with previous versions.
Abstract: Goats are one of the most widespread farmed animals across the world; however, their migration route to East Asia and local evolutionary history remain poorly understood. Here, we sequenced 27 ancient Chinese goat genomes dating from the Late Neolithic period to the Iron Age. We found close genetic affinities between ancient and modern Chinese goats, demonstrating their genetic continuity. We found that Chinese goats originated from the eastern regions around the Fertile Crescent, and we estimated that the ancestors of Chinese goats diverged from this population in the Chalcolithic period. Modern Chinese goats were divided into a northern and a southern group, coinciding with the most prominent climatic division in China, and two genes related to hair follicle development, FGF5 and EDA2R, were highly divergent between these populations. We identified a likely causal de novo deletion near FGF5 in northern Chinese goats that increased to high frequency over time, whereas EDA2R harbored standing variation dating to the Neolithic. Our findings add to our understanding of the genetic composition and local evolutionary process of Chinese goats.
Abstract: Although epigenetic factors may influence the expression of defense genes in plants, their role in antiviral responses and the impact of viral adaptation and evolution in shaping these interactions are still poorly explored. We used two isolates of turnip mosaic potyvirus with varying degrees of adaptation to Arabidopsis thaliana to address these issues. One of the isolates was experimentally evolved in the plant and presented increased load and virulence relative to the ancestral isolate. The magnitude of the transcriptomic responses was larger for the evolved isolate and indicated a role of innate immunity systems triggered by molecular patterns and effectors in the infection process. Several transposable elements located in different chromatin contexts and epigenetic-related genes were also affected. Correspondingly, mutant plants having loss or gain of repressive marks were, respectively, more tolerant and susceptible to turnip mosaic potyvirus, with a more efficient response against the ancestral isolate. In wild-type plants, both isolates induced similar levels of cytosine methylation changes, including in and around transposable elements and stress-related genes. Results collectively suggested that apart from RNA silencing and basal immunity systems, DNA methylation and histone modification pathways may also be required for mounting proper antiviral defenses and that the effectiveness of this type of regulation strongly depends on the degree of viral adaptation to the host.
Abstract: Group II (gII) introns are mobile retroelements that can spread to new DNA sites through retrotransposition, which can be influenced by a variety of host factors. To determine if these host factors bear any relationship to the genomic location of gII introns, we developed a bioinformatic pipeline wherein we focused on the genomic neighborhoods of bacterial gII introns within their native contexts and sought to determine global relationships between introns and their surrounding genes. We found that, although gII introns inhabit diverse regions, these neighborhoods are often functionally enriched for genes that could promote gII intron retention or proliferation. On one hand, we observe that gII introns are frequently found hiding in mobile elements or after transcription terminators. On the other hand, gII introns are enriched in locations in which they could hijack host functions for their movement, potentially timing expression of the intron with genes that produce favorable conditions for retrotransposition. Thus, we propose that gII intron distributions have been shaped by relationships with their surrounding genomic neighbors.
Abstract: Toll-like receptors (TLRs) play an important role in the innate immune system by detecting pathogen-associated molecular patterns. TLR5 encodes the major extracellular receptor for bacterial flagellin and frequently evolves under positive selection, consistent with coevolutionary arms races between the host and pathogens. Furthermore, TLR5 is inactivated in several vertebrates and a TLR5 stop codon polymorphism is widespread in human populations. Here, we analyzed the genomes of 120 mammals and discovered that TLR5 is convergently lost in four independent lineages, comprising guinea pigs, Yangtze river dolphin, pinnipeds, and pangolins. Validated inactivating mutations, absence of protein-coding transcript expression, and relaxed selection on the TLR5 remnants confirm these losses. PCR analysis further confirmed the loss of TLR5 in the pinniped stem lineage. Finally, we show that TLR11, encoding a second extracellular flagellin receptor, is also absent in these four lineages. Independent losses of TLR5 and TLR11 suggest that a major pathway for detecting flagellated bacteria is not essential for different mammals and predict an impaired capacity to sense extracellular flagellin.
Abstract: Transient receptor potential melastatins (TRPMs) are best known as cold and menthol sensors, but are in fact broadly critical for life, from ion homeostasis to reproduction. Yet, the evolutionary relationship between TRPM channels remains largely unresolved, particularly with respect to the placement of several highly divergent members. To characterize the evolution of TRPM and TRPM-like channels, we performed a large-scale phylogenetic analysis of >1,300 TRPM-like sequences from 14 phyla (Annelida, Arthropoda, Brachiopoda, Chordata, Cnidaria, Echinodermata, Hemichordata, Mollusca, Nematoda, Nemertea, Phoronida, Priapulida, Tardigrada, and Xenacoelomorpha), including sequences from a variety of recently sequenced genomes that fill what would otherwise be substantial taxonomic gaps. Our findings suggest: 1) the previously recognized TRPM family is in fact two distinct families, including canonical TRPM channels and an eighth major previously undescribed family of animal TRP channel, TRP soromelastatin; 2) two TRPM clades predate the last bilaterian–cnidarian ancestor; and 3) the vertebrate-centric trend of categorizing TRPM channels as 1–8 is inappropriate for most phyla, including other chordates.
Abstract: Phenotypic plasticity, the ability of an organism to alter its phenotype in response to an environmental cue, facilitates rapid adaptation to changing environments. Plastic changes in morphology and behavior are underpinned by widespread gene expression changes. However, it is unknown if, or how, genomes are structured to ensure these robust responses. Here, we use repression of honeybee worker ovaries as a model of plasticity. We show that the honeybee genome is structured with respect to plasticity; genes that respond to an environmental trigger are colocated in the honeybee genome in a series of gene clusters, many of which have been assembled in the last 80 My during the evolution of the Apidae. These clusters are marked by histone modifications that prefigure the gene expression changes that occur as the ovary activates, suggesting that these genomic regions are poised to respond plastically. That the linear sequence of the honeybee genome is organized to coordinate widespread gene expression changes in response to environmental influences and that the chromatin organization in these regions is prefigured to respond to these influences is perhaps unexpected and has implications for other examples of plasticity in physiology, evolution, and human disease.
Abstract: We explore sequence determinants of enzyme activity and specificity in terpene synthases, a major enzyme family. Most enzymes in this family catalyze reactions that produce cyclic terpenes, complex hydrocarbons widely used by plants and insects in diverse biological processes such as defense, communication, and symbiosis. To analyze the molecular mechanisms of emergence of terpene cyclization, we have carried out an in-depth examination of mutational space around (E)-β-farnesene synthase, an Artemisia annua enzyme which catalyzes production of a linear hydrocarbon chain. Each mutant enzyme in our synthetic libraries was characterized biochemically, and the resulting reaction rate data were used as input to the Michaelis–Menten model of enzyme kinetics, in which free energies were represented as sums of one-amino-acid contributions and two-amino-acid couplings. Our model predicts measured reaction rates with high accuracy and yields free energy landscapes characterized by relatively few coupling terms. As a result, the Michaelis–Menten free energy landscapes have simple, interpretable structure and exhibit little epistasis. We have also developed biophysical fitness models based on the assumption that highly fit enzymes have evolved to maximize the output of correct products, such as cyclic products or a specific product of interest, while minimizing the output of byproducts. This approach results in nonlinear fitness landscapes that are considerably more epistatic. Overall, our experimental and computational framework provides a focused characterization of evolutionary emergence of novel enzymatic functions in the context of microevolutionary exploration of sequence space around naturally occurring enzymes.
Abstract: It is regarded as best practice in phylogenetic reconstruction to perform relative model selection to determine an appropriate evolutionary model for the data. This procedure ranks a set of candidate models according to their goodness of fit to the data, commonly using an information theoretic criterion. Users then specify the best-ranking model for inference. Although it is often assumed that better-fitting models translate to increased accuracy, recent studies have shown that the specific model employed may not substantially affect inferences. We examine whether there is a systematic relationship between relative model fit and topological inference accuracy in protein phylogenetics, using simulations and real sequences. Simulations employed site-heterogeneous mechanistic codon models that are distinct from protein-level phylogenetic inference models, allowing us to investigate how protein models perform when they are misspecified to the data, as will be the case for any real sequence analysis. We broadly find that phylogenies inferred across models with vastly different fits to the data produce highly consistent topologies. We additionally find that all models infer similar proportions of false-positive splits, raising the possibility that all available models of protein evolution are similarly misspecified. Moreover, we find that the parameter-rich GTR (general time reversible) model, whose amino acid exchangeabilities are free parameters, performs similarly to models with fixed exchangeabilities, although the inference precision associated with GTR models was not examined. We conclude that, although relative model selection may not hinder phylogenetic analysis on protein data, it may not offer specific predictable improvements and is not a reliable proxy for accuracy.
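The ranking step described in the abstract above can be sketched with the Akaike information criterion, AIC = 2k - 2 ln L, one commonly used information theoretic criterion. The abstract does not name the criterion used in the study, so AIC is an assumption here, and the model labels, log-likelihoods, and parameter counts below are hypothetical placeholders rather than results.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate better relative fit."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(fits):
    """Rank candidate models from best (lowest AIC) to worst.

    `fits` maps a model label to (maximized log-likelihood, number of
    free parameters). Returns a list of (label, AIC) pairs.
    """
    scored = [(name, aic(lnl, k)) for name, (lnl, k) in fits.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical fits for three candidate models of one alignment.
candidate_fits = {
    "model_A": (-1000.0, 10),
    "model_B": (-990.0, 25),
    "model_C": (-1004.0, 5),
}
ranking = rank_models(candidate_fits)
for name, score in ranking:
    print(name, score)
```

The first entry of `ranking` is the model a user would conventionally pass to the inference step; the abstract's conclusion is that this choice may matter less for topology than is often assumed.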
Abstract: Despite its important biological role, the evolution of recombination rates remains relatively poorly characterized. This is due, in part, to the lack of high-quality genomic resources to address this question across diverse species. Humans and our closest evolutionary relatives, anthropoid apes, have remained a major focus of large-scale sequencing efforts, and thus recombination rate variation has been comparatively well studied in this group, with earlier work revealing conservation at the broad but not the fine scale. However, in order to better understand the nature of this variation, and the time scales on which substantial modifications occur, it is necessary to take a broader phylogenetic perspective. I here present the first fine-scale genetic map for vervet monkeys based on whole-genome population genetic data from ten individuals and perform a series of comparative analyses with the great apes. The results reveal a number of striking features. First, owing to strong positive correlations with diversity and weak negative correlations with divergence, analyses suggest a dominant role for purifying and background selection in shaping patterns of variation in this species. Second, results support a generally reduced broad-scale recombination rate compared with the great apes, as well as a narrower fraction of the genome in which the majority of recombination events are observed to occur. Taken together, this data set highlights the great necessity of future research to identify genomic features and quantify evolutionary processes that are driving these rate changes across primates.
Abstract: Divergence in gene expression regulation is common between closely related species and may give rise to incompatibilities in their hybrid progeny. In this study, we investigated the relationship between regulatory evolution within species and reproductive isolation between species. We focused on a well-studied case of hybrid sterility between two closely related yellow monkeyflower species, Mimulus guttatus and Mimulus nasutus, that is caused by two epistatic loci, hybrid male sterility 1 (hms1) and hybrid male sterility 2 (hms2). We compared genome-wide transcript abundance across male and female reproductive tissues (i.e., stamens and carpels) from four genotypes: M. guttatus, M. nasutus, and sterile and fertile progeny from an advanced M. nasutus–M. guttatus introgression line carrying the hms1–hms2 incompatibility. We observed substantial variation in transcript abundance between M. guttatus and M. nasutus, including distinct but overlapping patterns of tissue-biased expression, providing evidence for regulatory divergence between these species. We also found rampant genome-wide misexpression, but only in the affected tissues (i.e., stamens) of sterile introgression hybrids carrying incompatible alleles at hms1 and hms2. Examining patterns of allele-specific expression in sterile and fertile introgression hybrids, we found evidence for interspecific divergence in cis- and trans-regulation, including compensatory cis–trans mutations likely to be driven by stabilizing selection. Nevertheless, species divergence in gene regulatory networks cannot explain the vast majority of the gene misexpression we observe in Mimulus introgression hybrids, which instead likely manifests as a downstream consequence of sterility itself.
Abstract: Sensory systems are tuned by selection to maximize organismal fitness in particular environments. This tuning has implications for intraspecies communication, the maintenance of species boundaries, and speciation. Tuning of color vision largely depends on the sequence of the expressed opsin proteins. To improve tuning of visual sensitivities to shifts in habitat or foraging ecology over the course of development, many organisms change which opsins are expressed. Changes in this developmental sequence (heterochronic shifts) can create differences in visual sensitivity among closely related species. The genetic mechanisms by which these developmental shifts occur are poorly understood. Here, we use quantitative trait locus analyses, genome sequencing, and gene expression studies in African cichlid fishes to identify a role for the transcription factor Tbx2a in driving a switch between long wavelength sensitive (LWS) and Rhodopsin-like (RH2) opsin expression. We identify binding sites for Tbx2a in the LWS promoter and the highly conserved locus control region of RH2 which concurrently promote LWS expression while repressing RH2 expression. We also present evidence that a single change in Tbx2a regulatory sequence has led to a species difference in visual tuning, providing the first mechanistic model for the evolution of rapid switches in sensory tuning. This difference in visual tuning likely has important roles in evolution as it corresponds to differences in diet, microhabitat choice, and male nuptial coloration.
Abstract: Evolutionary changes in gene expression are often driven by gains and losses of cis-regulatory elements (CREs). The dynamics of CRE evolution can be examined using multispecies epigenomic data, but so far such analyses have generally been descriptive and model-free. Here, we introduce a probabilistic modeling framework for the evolution of CREs that operates directly on raw chromatin immunoprecipitation and sequencing (ChIP-seq) data and fully considers the phylogenetic relationships among species. Our framework includes a phylogenetic hidden Markov model, called epiPhyloHMM, for identifying the locations of multiply aligned CREs, and a combined phylogenetic and generalized linear model, called phyloGLM, for accounting for the influence of a rich set of genomic features in describing their evolutionary dynamics. We apply these methods to previously published ChIP-seq data for the H3K4me3 and H3K27ac histone modifications in liver tissue from nine mammals. We find that enhancers are gained and lost during mammalian evolution at about twice the rate of promoters, and that turnover rates are negatively correlated with DNA sequence conservation, expression level, and tissue breadth, and positively correlated with distance from the transcription start site, consistent with previous findings. In addition, we find that the predicted dosage sensitivity of target genes positively correlates with DNA sequence constraint in CREs but not with turnover rates, perhaps owing to differences in the effect sizes of the relevant mutations. Altogether, our probabilistic modeling framework enables a variety of powerful new analyses.
Abstract: Extreme environments offer powerful opportunities to study how different organisms have adapted to similar selection pressures at the molecular level. Arctic plants have adapted to some of the coldest and driest biomes on Earth and typically possess suites of similar morphological and physiological adaptations to extremes in light and temperature. Here, we compare patterns of molecular evolution in three Brassicaceae species that have independently colonized the Arctic and present some of the first genetic evidence for plant adaptations to the Arctic environment. By testing for positive selection and identifying convergent substitutions in orthologous gene alignments for a total of 15 Brassicaceae species, we find that positive selection has been acting on different genes, but similar functional pathways in the three Arctic lineages. The positively selected gene sets identified in the three Arctic species showed convergent functional profiles associated with extreme abiotic stress characteristic of the Arctic. However, there was little evidence for independently fixed mutations at the same sites and for positive selection acting on the same genes. The three species appear to have evolved similar suites of adaptations by modifying different components in similar stress response pathways, implying that there could be many genetic trajectories for adaptation to the Arctic environment. By identifying candidate genes and functional pathways potentially involved in Arctic adaptation, our results provide a framework for future studies aimed at testing for the existence of a functional syndrome of Arctic adaptation in the Brassicaceae and perhaps flowering plants in general.
Abstract: Genetic variation in the enzymes that catalyze posttranslational modification of proteins is a potentially important source of phenotypic variation during evolution. Ubiquitination is one such modification that affects turnover of virtually all of the proteins in the cell in addition to roles in signaling and epigenetic regulation. UBE2D3 is a promiscuous E2 enzyme, which acts as a ubiquitin donor for E3 ligases that catalyze ubiquitination of developmentally important proteins. We have used protein sequence comparison of UBE2D3 orthologs to identify a position in the C-terminal α-helical region of UBE2D3 that is occupied by a conserved serine in amniotes and by alanine in anamniote vertebrate and invertebrate lineages. Acquisition of the serine (S138) in the common ancestor of modern amniotes created a phosphorylation site for Aurora B. Phosphorylation of S138 disrupts the structure of UBE2D3 and reduces the level of the protein in mouse embryonic stem cells (ESCs). Substitution of S138 with the anamniote alanine (S138A) increases the level of UBE2D3 in ESCs and is a gain-of-function, early embryonic lethal mutation in mice. When mutant S138A ESCs were differentiated into extraembryonic primitive endoderm, levels of the PDGFRα and FGFR1 receptor tyrosine kinases were reduced and primitive endoderm differentiation was compromised. Proximity ligation analysis showed increased interaction between UBE2D3 and the E3 ligase CBL and between CBL and the receptor tyrosine kinases. Our results identify a sequence change that altered the ubiquitination landscape at the base of the amniote lineage with potential effects on amniote biology and evolution.
Abstract: Accurate estimates of divergence times are essential to understand the evolutionary history of species. Such estimates allow the evolutionary histories of diverging lineages to be linked with past geological, climatic, and other environmental changes, and shed light on the processes involved in speciation. The pea aphid radiation includes multiple host races adapted to different legume host plants. It is thought that diversification in this system occurred very recently, over the past 8,000–16,000 years. This young age estimate was used to link diversification in pea aphids to the onset of human agriculture, and led to the establishment of the pea aphid radiation as a model system in the study of speciation with gene flow. Here, we re-examine the age of the pea aphid radiation, by combining a mutation accumulation experiment with a genome-wide estimate of divergence between distantly related pea aphid host races. We estimate the spontaneous mutation rate for pea aphids as 2.7 × 10⁻¹⁰ per haploid genome per parthenogenic generation. Using this estimate of mutation rate and the genome-wide genetic differentiation observed between pea aphid host races, we show that the pea aphid radiation is much more ancient than assumed previously, predating Neolithic agriculture by several hundreds of thousands of years. Our results rule out human agriculture as the driver of diversification of the pea aphid radiation, and call for re-assessment of the role of allopatric isolation during Pleistocene climatic oscillations in divergence of the pea aphid complex.
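The dating logic in the abstract above can be sketched with the standard neutral expectation that two lineages separated for t generations differ at a fraction d ≈ 2μt of sites, so t ≈ d / (2μ). This is a minimal sketch and not the authors' estimation procedure: it treats the abstract's rate as a per-site rate, which is an assumption, and the divergence value and the number of generations per year are hypothetical placeholders.

```python
def divergence_time_generations(divergence, mutation_rate):
    """Generations since two lineages split, from t = d / (2 * mu).

    `divergence` is the per-site sequence divergence between the lineages;
    `mutation_rate` is the per-site mutation rate per generation.
    Mutations accumulate along both lineages, hence the factor of 2.
    """
    return divergence / (2.0 * mutation_rate)


def generations_to_years(generations, generations_per_year):
    """Convert generations to years for a given generation turnover."""
    return generations / generations_per_year


MU = 2.7e-10                # rate quoted in the abstract, read here as per site
D_HYPOTHETICAL = 1e-3       # placeholder divergence, not a value from the study
GENS_PER_YEAR = 10          # placeholder, not a value from the study

t_gen = divergence_time_generations(D_HYPOTHETICAL, MU)
t_years = generations_to_years(t_gen, GENS_PER_YEAR)
print(f"{t_gen:.3g} generations, about {t_years:.3g} years")
```

Because t scales inversely with μ, a low measured mutation rate pushes the inferred split further back in time, which is the direction of the revision the abstract reports.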
Abstract: Alternative translation initiation (ATLI) refers to the existence of multiple translation initiation sites per gene and is a widespread phenomenon in eukaryotes. ATLI is commonly assumed to be advantageous through creating proteome diversity or regulating protein synthesis. We here propose an alternative hypothesis that ATLI arises primarily from nonadaptive initiation errors presumably due to the limited ability of ribosomes to distinguish sequence motifs truly signaling translation initiation from similar sequences. Our hypothesis, but not the adaptive hypothesis, predicts a series of global patterns of ATLI, all of which are confirmed at the genomic scale by quantitative translation initiation sequencing in multiple human and mouse cell lines and tissues. Similarly, although many codons differing from AUG by one nucleotide can serve as start codons, our analysis suggests that using non-AUG start codons is mostly disadvantageous. These and other findings strongly suggest that ATLI predominantly results from molecular error, requiring a major revision of our understanding of the precision and regulation of translation initiation.
Abstract: Large (>10 kb), nearly identical (>99% nucleotide identity), palindromic sequences are enriched on mammalian sex chromosomes. Primate Y-palindromes undergo high rates of arm-to-arm gene conversion, a proposed mechanism for maintaining their sequence integrity in the absence of X–Y recombination. It is unclear whether X-palindromes, which can freely recombine in females, undergo arm-to-arm gene conversion and, if so, at what rate. We generated high-quality sequence assemblies of Mus molossinus and M. spretus X-palindromic regions and compared them with orthologous M. musculus X-palindromes. Our evolutionary sequence comparisons find evidence of X-palindrome arm-to-arm gene conversion at rates comparable to autosomal allelic gene conversion rates in mice. Mus X-palindromes also carry more derived than ancestral variants between species, suggesting that their sequence is rapidly diverging. We speculate that in addition to maintaining genes’ sequence integrity via sequence homogenization, palindrome arm-to-arm gene conversion may also facilitate rapid sequence divergence.
Abstract: The FADS locus contains the genes FADS1 and FADS2 that encode enzymes involved in the synthesis of long-chain polyunsaturated fatty acids. This locus appears to have been a repeated target of selection in human evolution, likely because dietary input of long-chain polyunsaturated fatty acids varied over time depending on environment and subsistence strategy. Several recent studies have identified selection at the FADS locus in Native American populations, interpreted as evidence for adaptation during or subsequent to the passage through Beringia. Here, we show that these signals are confounded by independent selection, postdating the split from Native Americans, in the European and, possibly, the East Asian populations used in the population branch statistic test. This is supported by direct evidence from ancient DNA that one of the putatively selected haplotypes was already common in Northern Eurasia at the time of the separation of Native American ancestors. An explanation for the present-day distribution of the haplotype that is more consistent with the data is that Native Americans retain the ancestral state of Paleolithic Eurasians. Another haplotype at the locus may reflect a secondary selection signal, although its functional impact is unknown.
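The population branch statistic (PBS) named in the abstract above is usually computed from the three pairwise F_ST values among a focal population A and two reference populations B and C: each F_ST is transformed to a branch length T = -ln(1 - F_ST), and PBS_A = (T_AB + T_AC - T_BC) / 2. The sketch below implements that standard definition; the function names are illustrative and the F_ST values in the usage lines are hypothetical, not values from the study.

```python
from math import log


def fst_to_branch_length(fst):
    """Transform pairwise F_ST (0 <= fst < 1) into T = -ln(1 - fst)."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("F_ST must lie in [0, 1)")
    return -log(1.0 - fst)


def population_branch_statistic(fst_ab, fst_ac, fst_bc):
    """PBS for focal population A given references B and C.

    Estimates the length of the branch leading to A in the unrooted
    three-population tree; large values at a locus are read as
    frequency change specific to A.
    """
    t_ab = fst_to_branch_length(fst_ab)
    t_ac = fst_to_branch_length(fst_ac)
    t_bc = fst_to_branch_length(fst_bc)
    return (t_ab + t_ac - t_bc) / 2.0


# Hypothetical per-locus F_ST values (A = focal, B and C = references).
pbs_value = population_branch_statistic(0.30, 0.25, 0.10)
print(f"PBS = {pbs_value:.4f}")
```

Because the statistic depends on all three pairwise comparisons, allele-frequency change in the reference populations also enters the calculation, which is why the choice of references matters for the interpretation discussed in the abstract.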
Abstract: Polycyclic triterpenes are members of the terpene family produced by the cyclization of squalene. The most representative polycyclic triterpenes are hopanoids and sterols; the former are mostly found in bacteria, whereas the latter are largely limited to eukaryotes, albeit with a growing number of bacterial exceptions. Given their important role and omnipresence in most eukaryotes, contrasting with their scant representation in bacteria, sterol biosynthesis was long thought to be a eukaryotic innovation. Thus, their presence in some bacteria was deemed to be the result of lateral gene transfer from eukaryotes. Elucidating the origin and evolution of the polycyclic triterpene synthetic pathways is important to understand the role of these compounds in eukaryogenesis and their geobiological value as biomarkers in fossil records. Here, we have revisited the phylogenies of the main enzymes involved in triterpene synthesis, performing gene neighborhood analysis and phylogenetic profiling. Squalene can be biosynthesized by two different pathways containing the HpnCDE or Sqs proteins. Our results suggest that the HpnCDE enzymes are derived from carotenoid biosynthesis enzymes and that they assembled in an ancestral squalene pathway in bacteria, while remaining metabolically versatile. Conversely, the Sqs enzyme is prone to lateral gene transfer, and its emergence is possibly related to the specialization of squalene biosynthesis. The biosynthesis of hopanoids seems to be ancestral in the Bacteria domain. Moreover, no triterpene cyclases are found in Archaea, invoking a potential scenario in which eukaryotic genes for sterol biosynthesis assembled from ancestral bacterial contributions in early eukaryotic lineages.
Abstract: Despite their essential role in chromosome segregation in most eukaryotes, centromeric histones (CenH3s) evolve rapidly and are subject to gene turnover. We previously identified four instances of gene duplication and specialization of Cid, which encodes the CenH3 in Drosophila. We hypothesized that retention of specialized Cid paralogs could be selectively advantageous to resolve the intralocus conflict that occurs on essential genes like Cid, which are subject to divergent selective pressures to perform multiple functions. We proposed that intralocus conflict could be a widespread phenomenon that drives evolutionary innovation in centromeric proteins. If this were the case, we might expect to find other instances of coretention and specialization of centromeric proteins during animal evolution. Consistent with this hypothesis, we find that most mosquito species encode two CenH3 (mosqCid) genes, mosqCid1 and mosqCid2, which have been coretained for over 150 My. In addition, Aedes species encode a third mosqCid3 gene, which arose from an independent gene duplication of mosqCid1. Like Drosophila Cid paralogs, mosqCid paralogs evolve under different selective constraints and show tissue-specific expression patterns. Analysis of mosqCid N-terminal protein motifs further supports the model that mosqCid paralogs have functionally diverged. Extending our survey to other centromeric proteins, we find that all Anopheles mosquitoes encode two CAL1 paralogs, which are the chaperones that deposit CenH3 proteins at centromeres in Diptera, but a single CENP-C paralog. The ancient coretention of paralogs of centromeric proteins adds further support to the hypothesis that intralocus conflict can drive their coretention and functional specialization.
Abstract: During biological invasions, invasive populations can suffer losses of genetic diversity that are predicted to negatively impact their fitness/performance. Despite examples of invasive populations harboring lower diversity than conspecific populations in their native range, few studies have linked this lower diversity to a decrease in fitness. Using genome sequences, we show that invasive populations of the African fig fly, Zaprionus indianus, have less genetic diversity than conspecific populations in their native range and that diversity is proportionally lower in regions of the genome experiencing low recombination rates. This result suggests that selection may have played a role in lowering diversity in the invasive populations. We next use interspecific comparisons to show that genetic diversity remains relatively high in invasive populations of Z. indianus when compared with other closely related species. By comparing genetic diversity in orthologous gene regions, we also show that the genome-wide landscape of genetic diversity differs between invasive and native populations of Z. indianus, indicating that invasion affects not only amounts of genetic diversity but also how that diversity is distributed across the genome. Finally, we use parameter estimates from thermal performance curves for 13 species of Zaprionus to show that Z. indianus has the broadest thermal niche of measured species, and that performance does not differ between invasive and native populations. These results illustrate how aspects of genetic diversity in invasive species can be decoupled from measures of fitness, and that a broad thermal niche may have helped facilitate Z. indianus’s range expansion.
Abstract: Transcriptional silencing of retrotransposons via DNA methylation is paramount for mammalian fertility and reproductive fitness. During germ cell development, most mammalian species utilize the de novo DNA methyltransferases DNMT3A and DNMT3B to establish DNA methylation patterns. However, many rodent species deploy a third enzyme, DNMT3C, to selectively methylate the promoters of young retrotransposon insertions in their germline. The evolutionary forces that shaped DNMT3C’s unique function are unknown. Using a phylogenomic approach, we confirm here that Dnmt3C arose through a single duplication of Dnmt3B that occurred ∼60 Ma in the last common ancestor of muroid rodents. Importantly, we reveal that DNMT3C is composed of two independently evolving segments: the latter two-thirds have undergone recurrent gene conversion with Dnmt3B, whereas the N-terminus has instead evolved under strong diversifying selection. We hypothesize that positive selection of Dnmt3C is the result of an ongoing evolutionary arms race with young retrotransposon lineages in muroid genomes. Interestingly, although primates lack DNMT3C, we find that the N-terminus of DNMT3A has also evolved under diversifying selection. Thus, the N-termini of two independent de novo methylation enzymes have evolved under diversifying selection in rodents and primates. We hypothesize that repression of young retrotransposons might be driving the recurrent innovation of a functional domain in the N-termini of germline DNMT3s in mammals.
Abstract: Demographic inference using the site frequency spectrum (SFS) is a common way to understand historical events affecting genetic variation. However, most methods for estimating demography from the SFS assume random mating within populations, precluding these types of analyses in inbred populations. To address this issue, we developed a model for the expected SFS that includes inbreeding by parameterizing individual genotypes using beta-binomial distributions. We then take the convolution of these genotype probabilities to calculate the expected frequency of biallelic variants in the population. Using simulations, we evaluated the model’s ability to coestimate demography and inbreeding using one- and two-population models across a range of inbreeding levels. We also applied our method to two empirical examples, American pumas (Puma concolor) and domesticated cabbage (Brassica oleracea var. capitata), inferring models both with and without inbreeding to compare parameter estimates and model fit. Our simulations showed that we are able to accurately coestimate demographic parameters and inbreeding even for highly inbred populations (F = 0.9). In contrast, failing to include inbreeding generally resulted in inaccurate parameter estimates in simulated data and led to poor model fit in our empirical analyses. These results show that inbreeding can have a strong effect on demographic inference, a pattern that was especially noticeable for parameters involving changes in population size. Given the importance of these estimates for informing practices in conservation, agriculture, and elsewhere, our method provides an important advancement for accurately estimating the demographic histories of these species.
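The convolution step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one common parameterization, which the abstract does not specify: each diploid genotype is beta-binomial with n = 2, alpha = p(1 - F)/F, and beta = (1 - p)(1 - F)/F, where p is the population allele frequency and F the inbreeding coefficient. Under that assumption the genotype probabilities reduce to the closed forms used below. The function names are hypothetical.

```python
def genotype_probs(p, F):
    """Probabilities of carrying 0, 1, or 2 copies of the derived allele.

    Assumption (not stated in the abstract): beta-binomial with n = 2,
    alpha = p(1-F)/F and beta = (1-p)(1-F)/F. This simplifies to the
    closed forms below and reduces to Hardy-Weinberg proportions at F = 0.
    """
    shared = F * p * (1.0 - p)
    hom_ancestral = (1.0 - p) ** 2 + shared
    heterozygous = 2.0 * p * (1.0 - p) * (1.0 - F)
    hom_derived = p ** 2 + shared
    return [hom_ancestral, heterozygous, hom_derived]


def convolve(a, b):
    """Discrete convolution of two probability vectors."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def allele_count_distribution(p, F, n_individuals):
    """Distribution of the derived-allele count across 2n sampled chromosomes,
    obtained by convolving the per-individual genotype probabilities."""
    per_individual = genotype_probs(p, F)
    dist = [1.0]  # with zero individuals, the count is 0 with certainty
    for _ in range(n_individuals):
        dist = convolve(dist, per_individual)
    return dist


if __name__ == "__main__":
    for F in (0.0, 0.5, 0.9):
        dist = allele_count_distribution(p=0.3, F=F, n_individuals=4)
        print(F, [round(x, 4) for x in dist])
```

This sketch conditions on a single allele frequency p. An expected SFS would additionally average over the distribution of p implied by the demographic model, which is outside the scope of the example.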
A billion years ago, a single-celled eukaryote engulfed a cyanobacterium, an organism capable of converting the sun’s energy into food in the form of carbohydrates. In one of the most pivotal events in the history of life, the bacterium was not digested; instead, an endosymbiosis formed, with the bacterial cell persisting inside the host eukaryote for millennia and giving rise to the first photosynthetic eukaryotes. The descendants of this merger include plants, as well as a large number of eukaryotes that are collectively referred to as algae (e.g., kelp, nori). The remnants of the cyanobacterium eventually evolved into an organelle known as a plastid or chloroplast, which allows photosynthetic eukaryotes to produce their own food and thus to provide food to animals like us. Despite the importance of this event, many aspects of plastid evolution have long remained shrouded in mystery. In a review in Genome Biology and Evolution, Shannon Sibbald and John Archibald highlight emerging genome data in this field and provide new insight into plastid evolution (Sibbald and Archibald 2020).
Abstract: The origin of plastids (chloroplasts) by endosymbiosis stands as one of the most important events in the history of eukaryotic life. The genetic, biochemical, and cell biological integration of a cyanobacterial endosymbiont into a heterotrophic host eukaryote approximately a billion years ago paved the way for the evolution of diverse algal groups in a wide range of aquatic and, eventually, terrestrial environments. Plastids have on multiple occasions also moved horizontally from eukaryote to eukaryote by secondary and tertiary endosymbiotic events. The overall picture of extant photosynthetic diversity can best be described as “patchy”: Plastid-bearing lineages are spread far and wide across the eukaryotic tree of life, nested within heterotrophic groups. The algae do not constitute a monophyletic entity, and understanding how, and how often, plastids have moved from branch to branch on the eukaryotic tree remains one of the most fundamental unsolved problems in the field of cell evolution. In this review, we provide an overview of recent advances in our understanding of the origin and spread of plastids from the perspective of comparative genomics. Recent years have seen significant improvements in genomic sampling from photosynthetic and nonphotosynthetic lineages, both of which have added important pieces to the puzzle of plastid evolution. Comparative genomics has also allowed us to better understand how endosymbionts become organelles.