Molecular Biology and Evolution, Volume 35, Issue 9, 1 September 2018, Pages 2230–2239, https://doi.org/10.1093/molbev/msy123
Abstract: Fungi are evolutionary shape shifters and adapt quickly to new environments. Ectomycorrhizal (EM) symbioses are mutualistic associations between fungi and plants and have evolved repeatedly and independently across the fungal tree of life, suggesting lineages frequently reconfigure genome content to take advantage of open ecological niches. To date, analyses of genomic mechanisms facilitating EM symbioses have involved comparisons of distantly related species, but here we use the genomes of three EM and two asymbiotic (AS) fungi from the genus Amanita, as well as an AS outgroup, to study genome evolution following a single origin of symbiosis. Our aim was to identify the defining features of EM genomes, but our analyses suggest no clear differentiation of genome size, gene repertoire size, or transposable element content between EM and AS species. Phylogenetic inference of gene gains and losses suggests the transition to symbiosis was dominated by the loss of plant cell wall decomposition genes, a confirmation of previous findings. However, the same dynamic defines the AS species A. inopinata, suggesting loss is not strictly associated with the origin of symbiosis. Gene expansions in the common ancestor of EM Amanita were modest, but lineage-specific and large gene family expansions are found in two of the three extant EM species. Even closely related EM genomes appear to share few common features. The genetic toolkit required for symbiosis appears already encoded in the genomes of saprotrophic species, and this dynamic may explain the pervasive, recurrent evolution of ectomycorrhizal associations.
Abstract: Vertebrate estrogen receptors (ERs) perform numerous cell signaling and transcriptional regulatory functions. ERα (Esr1) and ERβ (Esr2) likely evolved from an ancestral receptor that duplicated and diverged at the protein and cis-regulatory levels, but the evolutionary history of ERs, including the timing of proposed duplications, remains unresolved. Here we report the identification of two distinct ERs in cartilaginous fishes and demonstrate their orthology to ERα and ERβ. Phylogenetic analyses place the ERα/ERβ duplication near the base of crown gnathostomes (jawed vertebrates). We find that ERα and ERβ from little skate (Leucoraja erinacea) and mammals share key subtype-specific residues, indicating conserved protein evolution. In contrast, jawless fishes have multiple non-orthologous Esr genes that arose by parallel duplications. Esr1 and Esr2 are expressed in subtype-specific and sexually dimorphic patterns in skate embryos, suggesting that ERs might have functioned in sexually dimorphic development before the divergence of cartilaginous and bony fishes.
Abstract: Enzymes are known to fine-tune their sequences to optimize catalytic function, yet quantitative evolutionary design principles of enzymes remain elusive on the proteomic scale. Recently, it was found that the catalytic site in enzymes induces long-range evolutionary constraint, where even sites distant from the catalytic site are more conserved than expected. Given that protein-fold usage is generally different between enzymes and nonenzymes, it remains an open question to what extent this long-range evolutionary constraint in enzymes is dictated, either directly or indirectly, by the special three-dimensional structure of the enzyme. To investigate this question, we have compared evolutionary properties of enzymes with those of counterpart pseudoenzymes that share the same protein fold but are catalytically inactive. We found that the long-range evolutionary constraint observed in enzymes is significantly reduced in pseudoenzyme counterparts, despite very high structural similarity (∼1.5 Å RMSD on average). Furthermore, this significant reduction in long-range evolutionary constraint is observed even in pseudoenzyme counterparts that retain the ligand-binding ability of enzymes. Finally, the distance between the site that induces the highest gradient of sequence conservation and the pseudocatalytic site in pseudoenzymes is significantly larger than the corresponding distance in enzymes. Taken together, our results suggest that the long-range evolutionary constraint in enzymes is induced mainly by the presence of the catalytic site rather than by the special three-dimensional structure of the enzyme, and that such long-range evolutionary constraint in enzymes depends mainly on the catalytic function of the active site rather than on the ligand-binding ability of the enzyme.
Abstract: A key question in molecular evolutionary biology concerns the relative roles of mutation and selection in shaping genomic data. Moreover, features of mutation and selection are heterogeneous along the genome and over time. Mechanistic codon substitution models based on the mutation–selection framework are promising approaches to separating these effects. In practice, however, several complications arise, since accounting for such heterogeneities often implies handling models of high dimensionality (e.g., amino acid preferences), or leads to across-site dependence (e.g., CpG hypermutability), making the likelihood function intractable. Approximate Bayesian Computation (ABC) could address this latter issue. Here, we propose a new approach, named Conditional ABC (CABC), which combines the sampling efficiency of MCMC and the flexibility of ABC. To illustrate the potential of the CABC approach, we apply it to the study of mammalian CpG hypermutability based on a new mutation-level parameter implying dependence across adjacent sites, combined with site-specific purifying selection on amino acids captured by a Dirichlet process. Our proof-of-concept of the CABC methodology opens new modeling perspectives. Our application of the method reveals a high level of heterogeneity of CpG hypermutability across loci and mild heterogeneity across taxonomic groups; finally, we show that CpG hypermutability is an important evolutionary factor in shaping relative synonymous codon usage. All source code is available as a GitHub repository (https://github.com/Simonll/LikelihoodFreePhylogenetics.git).
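For readers unfamiliar with likelihood-free inference, the following is a minimal sketch of plain rejection ABC, the generic idea that CABC builds on. It is not the CABC method and is unrelated to the code in the linked repository. The toy simulator, the uniform prior, the tolerance, and all function names are assumptions made purely for illustration.

```python
import random


def simulate(rate, n_sites, rng):
    """Toy simulator (an assumption): count sites that mutate,
    each independently with probability `rate`."""
    return sum(1 for _ in range(n_sites) if rng.random() < rate)


def abc_rejection(observed, n_sites, n_draws, tolerance, seed=0):
    """Rejection ABC: draw a parameter from the prior, simulate data,
    and keep the draw only if the simulated summary statistic lies
    within `tolerance` of the observed one. No likelihood is evaluated."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.random()  # uniform prior on [0, 1]
        if abs(simulate(rate, n_sites, rng) - observed) <= tolerance:
            accepted.append(rate)
    return accepted


# Hypothetical data: 30 mutated sites observed out of 100.
posterior = abc_rejection(observed=30, n_sites=100, n_draws=5000, tolerance=2)
posterior_mean = sum(posterior) / len(posterior)
print(f"accepted {len(posterior)} draws, posterior mean ~ {posterior_mean:.2f}")
```

The accepted draws approximate the posterior distribution of the rate. Shrinking the tolerance makes the approximation tighter but lowers the acceptance rate, which is the inefficiency that MCMC-assisted schemes aim to reduce.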
Abstract: For 30 years, it has been clear that angiosperm mitochondrial genomes evolve rapidly in sequence arrangement (i.e., synteny), yet absolute rates of rearrangement have not been measured in any plant group, nor is it known how much these rates vary. To investigate these issues, we sequenced seven mitochondrial genomes in Monsonia (Geraniaceae) and reconstructed their rearrangement history. We show that rearrangements (occurring mostly as inversions) take place at generally high rates in these genomes, and we also uncover significant variation in rearrangement rates. For example, the hyperactive mitochondrial genome of Monsonia ciliata has accumulated at least 30 rearrangements over the last million years, whereas the branch leading to M. ciliata and its sister species has sustained rearrangement at a rate that is at least ten times lower. Furthermore, our analysis of published data shows that rates of mitochondrial genome rearrangement in seed plants vary by at least 600-fold. We find that sites of rearrangement are preferentially located in very close proximity to repeated sequences in Monsonia. This provides strong support for the hypothesis that rearrangement in angiosperm mitochondrial genomes occurs largely through repeat-mediated recombination. Because there is little variation in the amount of repeat sequence among Monsonia genomes, the variable rates of rearrangement in Monsonia probably reflect variable rates of mitochondrial recombination itself. Finally, we show that mitochondrial synonymous substitutions occur in a clock-like manner in Monsonia; rates of mitochondrial substitutions and rearrangements are therefore highly uncoupled in this group.
Abstract: Meiotic recombination is an evolutionary force that generates new genetic diversity upon which selection can act. Whereas multiple studies have assessed genome-wide patterns of recombination and specific cases of intragenic recombination, few studies have assessed intragenic recombination genome-wide in higher eukaryotes. We identified recombination events within or near genes in a population of maize recombinant inbred lines (RILs) using RNA-sequencing data. Our results are consistent with case studies that have shown that intragenic crossovers cluster at the 5′ ends of some genes. Further, we identified cases of intragenic crossovers that generate transgressive transcript accumulation patterns, that is, recombinant alleles displayed higher or lower levels of expression than did nonrecombinant alleles in any of ∼100 RILs, implicating intragenic recombination in the generation of new variants upon which selection can act. Thousands of apparent gene conversion events were identified, allowing us to estimate the genome-wide rate of gene conversion at SNP sites (4.9 × 10⁻⁵). The density of syntenic genes (i.e., those conserved at the same genomic locations since the divergence of maize and sorghum) exhibits a substantial correlation with crossover frequency, whereas the density of nonsyntenic genes (i.e., those which have transposed or been lost subsequent to the divergence of maize and sorghum) shows little correlation, suggesting that crossovers occur at higher rates in syntenic genes than in nonsyntenic genes. Increased rates of crossovers in syntenic genes could be either a consequence of the evolutionary conservation of synteny or a biological process that helps to maintain synteny.
Abstract: The oxymonad Monocercomonoides exilis was recently reported to be the first eukaryote that has completely lost the mitochondrial compartment. It was proposed that an important prerequisite for such a radical evolutionary step was the acquisition of the SUF Fe–S cluster assembly pathway from prokaryotes, making the mitochondrial ISC pathway dispensable. We have investigated genomic and transcriptomic data from six oxymonad species and their relatives, composing the group Preaxostyla (Metamonada, Excavata), for the presence and absence of enzymes involved in Fe–S cluster biosynthesis. None possesses enzymes of the mitochondrial ISC pathway and all apparently possess the SUF pathway, composed of SufB, C, D, S, and U proteins, altogether suggesting that the transition from ISC to SUF preceded their last common ancestor. Interestingly, we observed that SufDSU were fused in all three oxymonad genomes and in the genome of Paratrimastix pyriformis. The donor of the SUF genes is not clear from phylogenetic analyses, but the enzyme composition of the pathway and the presence of the SufDSU fusion suggest Firmicutes, Thermotogae, Spirochaetes, Proteobacteria, or Chloroflexi as donors. The inventory of the downstream CIA pathway enzymes is consistent with that of closely related species that retain ISC, indicating that the switch from ISC to SUF did not markedly affect the downstream process of maturation of cytosolic and nuclear Fe–S proteins.
Abstract: The genomics era has expanded our knowledge about the diversity of the living world, yet harnessing high-throughput sequencing data to investigate alternative evolutionary trajectories, such as hybridization, is still challenging. Here we present sppIDer, a pipeline for the characterization of interspecies hybrids and pure species, that illuminates the complete composition of genomes. sppIDer maps short-read sequencing data to a combination genome built from reference genomes of several species of interest and assesses the genomic contribution and relative ploidy of each parental species, producing a series of colorful graphical outputs ready for publication. As a proof-of-concept, we use the genus Saccharomyces to detect and visualize both interspecies hybrids and pure strains, even with missing parental reference genomes. Through simulation, we show that sppIDer is robust to variable reference genome qualities and performs well with low-coverage data. We further demonstrate the power of this approach in plants, animals, and other fungi. sppIDer is robust to many different inputs and provides visually intuitive insight into genome composition that enables the rapid identification of species and their interspecies hybrids. sppIDer exists as a Docker image, which is a reusable, reproducible, transparent, and simple-to-run package that automates the pipeline and installation of the required dependencies (https://github.com/GLBRC/sppIDer; last accessed September 6, 2018).
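The idea of mapping reads to a combined reference and comparing coverage per species can be illustrated with a small sketch. This is not sppIDer's implementation or interface. The contig-naming convention (a species tag before the first hyphen), the function name, and the depth values are hypothetical and chosen only to show the principle.

```python
from collections import defaultdict
from statistics import mean


def relative_depth_by_species(windows):
    """Summarize read depth per species from a combined reference.

    `windows` is an iterable of (contig_name, mean_depth) pairs, where
    contig names are assumed to look like 'Scer-chrI' (species tag, then
    a hyphen). Returns each species' mean depth divided by the highest
    species mean, a rough proxy for relative genomic contribution.
    """
    depths = defaultdict(list)
    for contig, depth in windows:
        species = contig.split("-", 1)[0]
        depths[species].append(depth)
    species_mean = {sp: mean(values) for sp, values in depths.items()}
    top = max(species_mean.values(), default=0.0)
    if top == 0:
        return {sp: 0.0 for sp in species_mean}
    return {sp: m / top for sp, m in species_mean.items()}


# Made-up depths for a hybrid with two parental genomes and one absent species.
windows = [
    ("Scer-chrI", 20.0), ("Scer-chrII", 22.0),
    ("Seub-chrI", 10.0), ("Seub-chrII", 11.0),
    ("Spar-chrI", 0.0), ("Spar-chrII", 0.0),
]
relative = relative_depth_by_species(windows)
print(relative)
```

In this toy input, one parental genome is covered at about half the depth of the other and the third reference attracts no reads, which is the kind of pattern that would point to a hybrid with unequal parental contributions.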
Abstract: Understanding how microalgae adapt to rapidly changing environments is not only important to science but can help clarify the potential impact of climate change on the biology of primary producers. We sequenced and analyzed the nuclear genomes of multiple Picochlorum isolates (Chlorophyta) to elucidate strategies of environmental adaptation. It was previously found that coordinated gene regulation is involved in adaptation to salinity stress, and here we show that gene gain and loss also play key roles in adaptation. We determined the extent of horizontal gene transfer (HGT) from prokaryotes and its role in the origin of novel functions in the Picochlorum clade. HGT is an ongoing and dynamic process in this algal clade, with adaptation being driven by transfer, divergence, and loss. One HGT candidate that is differentially expressed under salinity stress is indolepyruvate decarboxylase, which is involved in the production of a plant auxin that mediates bacteria–diatom symbiotic interactions. Large differences in levels of heterozygosity were found in diploid haplotypes among Picochlorum isolates. Biallelic divergence was pronounced in P. oklahomensis (salt plains environment) when compared with its closely related sister taxon Picochlorum SENEW3 (brackish water environment), suggesting a role of diverged alleles in response to environmental stress. Our results elucidate how microbial eukaryotes with limited gene inventories expand habitat range from mesophilic to halophilic through allelic diversity, with minor but important contributions made by HGT. We also explore how the nature and quality of genome data may impact inference of nuclear ploidy.
Abstract: Molluscan shells, mainly composed of calcium carbonate, also contain organic components such as proteins and polysaccharides. Shell organic matrices construct frameworks of shell structures and regulate crystallization processes during shell formation. To date, a number of shell matrix proteins (SMPs) have been identified, and their functions in shell formation have been studied. However, previous studies focused only on SMPs extracted from adult shells, secreted after metamorphosis. Using proteomic analyses combined with genomic and transcriptomic analyses, we have identified 31 SMPs from larval shells of the pearl oyster, Pinctada fucata, and 111 from the Pacific oyster, Crassostrea gigas. Larval SMPs are almost entirely different from those of adults in both species. RNA-seq data also confirm that gene expression profiles for larval and adult shell formation are nearly completely different. Therefore, bivalves have two repertoires of SMP genes to construct larval and adult shells. Despite considerable differences in larval and adult SMPs, some functional domains are shared by both SMP repertoires. Conserved domains include von Willebrand factor type A (VWA), chitin-binding (CB), carbonic anhydrase (CA), and acidic domains. These conserved domains are thought to play crucial roles in shell formation. Furthermore, a comprehensive survey of animal genomes revealed that the CA and VWA–CB domain-containing protein families expanded in molluscs after their separation from other lophotrochozoan lineages such as the Brachiopoda. After gene expansion, some family members were co-opted as molluscan SMPs, which may have triggered the development of mineralized shells from ancestral, nonmineralized chitinous exoskeletons.
Abstract: As are most non-European populations, the Han Chinese are relatively understudied in population and medical genetics studies. From low-coverage whole-genome sequencing of 11,670 Han Chinese women we present a catalog of 25,057,223 variants, including 548,401 novel variants that are seen at least 10 times in our data set. Individuals from this data set came from 24 out of 33 administrative divisions across China (including 19 provinces, 4 municipalities, and 1 autonomous region), thus allowing us to study population structure, genetic ancestry, and local adaptation in Han Chinese. We identified previously unrecognized population structure along the East–West axis of China, demonstrated a general pattern of isolation-by-distance among Han Chinese, and reported unique regional signals of admixture, such as European influences among the Northwestern provinces of China. Furthermore, we identified a number of highly differentiated, putatively adaptive, loci (e.g., MTHFR, ADH7, and FADS, among others) that may be driven by immune response, climate, and diet in the Han Chinese. Finally, we have made available allele frequency estimates stratified by administrative divisions across China in the Geography of Genetic Variant browser for the broader community. By leveraging the largest currently available genetic data set for Han Chinese, we have gained insights into the history and population structure of the world’s largest ethnic group.
Abstract: Human populations often exhibit contrasting patterns of genetic diversity in the mtDNA and the nonrecombining portion of the Y-chromosome (NRY), which reflect sex-specific cultural behaviors and population histories. Here, we sequenced 2.3 Mb of the NRY from 284 individuals representing more than 30 Native American groups from Northwestern Amazonia (NWA) and compared these data to previously generated mtDNA genomes from the same groups, to investigate the impact of cultural practices on genetic diversity and gain new insights about NWA population history. Relevant cultural practices in NWA include postmarital residential rules and linguistic exogamy, a marital practice in which men are required to marry women speaking a different language. We identified 2,969 SNPs in the NRY sequences, only 925 of which were previously described. The NRY and mtDNA data showed different sex-specific demographic histories: female effective population size has been larger than that of males through time, which might reflect larger variance in male reproductive success. Both markers show an increase in lineage diversification beginning ∼5,000 years ago, which may reflect the intensification of agriculture, technological innovations, and the expansion of regional trade networks documented in the archaeological evidence. Furthermore, we find similar excesses of NRY versus mtDNA between-population divergence at both the local and continental scale, suggesting long-term stability of female versus male migration. We also find evidence of the impact of sociocultural practices on diversity patterns. Finally, our study highlights the importance of analyzing high-resolution mtDNA and NRY sequences to reconstruct demographic history, since this can differ considerably between sexes.
Abstract: Bacteria regulate genes to survive antibiotic stress, but regulation can be far from perfect. When regulation is not optimal, mutations that change gene expression can contribute to antibiotic resistance. It is not systematically understood to what extent natural gene regulation is or is not optimal for distinct antibiotics, and how changes in expression of specific genes quantitatively affect antibiotic resistance. Here we discover a simple quantitative relation between fitness, gene expression, and antibiotic potency, which rationalizes our observation that a multitude of genes and even innate antibiotic defense mechanisms have expression that is critically nonoptimal under antibiotic treatment. First, we developed a pooled-strain drug-diffusion assay and screened Escherichia coli overexpression and knockout libraries, finding that resistance to a range of 31 antibiotics could result from changing expression of a large and functionally diverse set of genes, in a primarily but not exclusively drug-specific manner. Second, by synthetically controlling the expression of single-drug and multidrug resistance genes, we observed that their fitness–expression functions changed dramatically under antibiotic treatment in accordance with a log-sensitivity relation. Thus, because many genes are nonoptimally expressed under antibiotic treatment, many regulatory mutations can contribute to resistance by altering expression and by activating latent defenses.
Abstract: Under the nearly neutral theory of molecular evolution, the proportion of effectively neutral mutations is expected to depend upon the effective population size (Ne). Here, we investigate whether this is the case across the genome of Drosophila melanogaster using polymorphism data from North American and African lines. We show that the ratio of the number of nonsynonymous and synonymous polymorphisms is negatively correlated with the number of synonymous polymorphisms, even when the nonindependence is accounted for. The relationship is such that the proportion of effectively neutral nonsynonymous mutations increases by ∼45% as Ne is halved. However, we also show that this relationship is steeper than expected from an independent estimate of the distribution of fitness effects from the site frequency spectrum. We investigate a number of potential explanations for this and show, using simulation, that this is consistent with a model of genetic hitchhiking: genetic hitchhiking depresses diversity at neutral and weakly selected sites, but has little effect on the diversity of strongly selected sites.
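To give a sense of scale for the reported ∼45% increase, the short calculation below assumes a power-law relationship in which the proportion of effectively neutral mutations is proportional to Ne raised to the power −β. That functional form and the function name are assumptions made for illustration; the abstract itself reports only the 45% figure.

```python
import math


def implied_exponent(fold_increase, ne_ratio):
    """Solve fold_increase = ne_ratio ** (-beta) for beta, assuming the
    proportion of effectively neutral mutations scales as Ne ** (-beta)."""
    return -math.log(fold_increase) / math.log(ne_ratio)


# A 45% increase (1.45-fold) when Ne is halved (ratio 0.5).
beta = implied_exponent(fold_increase=1.45, ne_ratio=0.5)
print(f"implied exponent beta ~ {beta:.3f}")
```

Under this assumed form, the exponent comes out at roughly 0.54. An exponent of 1 would mean the proportion doubles each time Ne is halved, so the observed dependence is weaker than inverse proportionality.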
Abstract: Phylogeny estimation is difficult for closely related populations and species, especially if they have been exchanging genes. We present a hierarchical Bayesian, Markov-chain Monte Carlo method with a state space that includes all possible phylogenies in a full Isolation-with-Migration model framework. The method is based on a new type of genealogy augmentation called a “hidden genealogy” that enables efficient updating of the phylogeny. This is the first likelihood-based method to fully incorporate directional gene flow and genetic drift for estimation of a species or population phylogeny. Application to human hunter-gatherer populations from Africa revealed a clear phylogenetic history, with strong support for gene exchange with an unsampled ghost population, and relatively ancient divergence between a ghost population and modern human populations, consistent with human/archaic divergence. In contrast, a study of five chimpanzee populations reveals a clear phylogeny with several pairs of populations having exchanged DNA, but does not support a history with an unsampled ghost population.
Abstract: Adaptive divergence between marine and freshwater (FW) environments is important in generating phyletic diversity within fishes, but the genetic basis of this process remains poorly understood. Genome selection scans can identify adaptive loci, but incomplete knowledge of genotype–phenotype connections makes interpreting their significance difficult. In contrast, association mapping (genome-wide association mapping [GWAS], random forest [RF] analyses) links genotype to phenotype, but offers limited insight into the evolutionary forces shaping variation. Here, we combined GWAS, RF, and selection scans to identify loci important in adaptation to FW environments. We utilized FW-native and brackish water (BW)-native populations of Atlantic killifish (Fundulus heteroclitus) as well as a naturally admixed population between the two. We measured morphology and multiple physiological traits that differ between populations and may contribute to osmotic adaptation (salinity tolerance, hypoxia tolerance, metabolic rate, body shape) and used a reduced representation approach for genome-wide genotyping. Our results show patterns of population divergence in physiological capabilities that are consistent with local adaptation. Population genomic scans between BW-native and FW-native populations identified genomic regions evolving by natural selection, whereas association mapping revealed loci that contribute to variation for each trait. There was substantial overlap in the genomic regions putatively under selection and loci associated with phenotypic traits, particularly for salinity tolerance, suggesting that these regions and genes are important for adaptive divergence between BW and FW environments. Together, these data provide insight into the mechanisms that enable diversification of fishes across osmotic boundaries.
Abstract: Cytolytic pore-forming proteins are widespread in living organisms, being mostly involved in both sides of the host–pathogen interaction, either contributing to the innate defense or promoting infection. In venomous organisms, such as spiders, insects, scorpions, and sea anemones, pore-forming proteins are often secreted as key components of the venom. Coluporins are pore-forming proteins recently discovered in the Mediterranean hematophagous snail Cumia reticulata (Colubrariidae), highly expressed in the salivary glands that discharge their secretion at close contact with the host. To understand their putative functional role, we investigated coluporins’ molecular diversity and evolutionary patterns. Coluporins form a well-diversified family including at least 30 proteins, with an overall low sequence similarity but sharing a remarkably conserved actinoporin-like predicted structure. Tracking the evolutionary history of the molluscan porin genes revealed a scattered distribution of this family, which is present in some other lineages of predatory gastropods, including venomous conoidean snails. Comparative transcriptomic analyses highlighted the expansion of porin genes as a lineage-specific feature of colubrariids. Coluporins seem to have evolved from a single ancestral porin gene present in the last common ancestor of all Caenogastropoda, undergoing massive expansion and diversification in this colubrariid lineage through repeated gene duplication events paired with widespread episodic positive selection. As for other parasites, these findings are congruent with a “one-sided arms race,” equipping the parasite with multiple variants in order to broaden its host spectrum. Overall, our results pinpoint a crucial adaptive role for coluporins in the evolution of the peculiar trophic ecology of vampire snails.
Abstract: Variola virus (VARV) is at risk of re-emergence through accidental release, bioterrorism, or synthetic biology. The use of phylogenetics and phylogeography to support epidemic field response is expected to grow as sequencing technology becomes miniaturized, cheap, and ubiquitous. In this study, we aimed to explore the use of the common VARV diagnostic targets hemagglutinin (HA), cytokine response modifier B (CrmB), and A-type inclusion protein (ATI) for phylogenetic characterization, as well as the representativeness of modelling strategies in phylogeography, to support epidemic response should smallpox re-emerge. We applied Bayesian discrete-trait phylogeography to the most complete data set currently available of whole genome (n = 51) and partially sequenced (n = 20) VARV isolates. We show that multilocus models combining HA, ATI, and CrmB genes may represent a useful heuristic to differentiate between VARV Major and subclades of VARV Minor, which have been associated with variable case-fatality rates. Where whole genome sequencing is unavailable, phylogeography models of HA, ATI, and CrmB may provide preliminary but uncertain estimates of transmission, while supplementing whole genome models with additional isolates sequenced only for HA can improve sample representativeness, maintaining similar support for transmission relative to whole genome models. We have also provided empirical evidence delineating historic international VARV transmission using phylogeography. Given the persistent threat of re-emergence, our results contribute to smallpox epidemic preparedness in the posteradication era, as recommended by the World Health Organization.
Abstract: Genes are “born,” and eventually they “die.” These processes shape the phenotypic evolution of organisms and are hence of great biological interest. If genes die in plants, they generally do so quite rapidly. Here, we describe the fate of GOA-like genes that evolve in a dramatically different manner. GOA-like genes belong to the subfamily of Bsister genes of MIKC-type MADS-box genes. Typical MIKC-type genes encode conserved transcription factors controlling plant development. We show that ABS-like genes, a clade of Bsister genes, are indeed highly conserved in crucifers (Brassicaceae) maintaining the ancestral function of Bsister genes in ovule and seed development. In contrast, their closest paralogs, the GOA-like genes, have been undergoing convergent gene death in Brassicaceae. Intriguingly, erosion of GOA-like genes occurred after millions of years of coexistence with ABS-like genes. We thus describe Delayed Convergent Asymmetric Degeneration, a so far neglected but possibly frequent pattern of duplicate gene evolution that does not fit classical scenarios. Delayed Convergent Asymmetric Degeneration of GOA-like genes may have been initiated by a reduction in the expression of an ancestral GOA-like gene in the stem group of Brassicaceae and driven by dosage subfunctionalization. Our findings have profound implications for gene annotations in genomics, interpreting patterns of gene evolution and using genes in phylogeny reconstructions of species.
Nature provides countless examples of evolutionary arms races, in which species develop adaptations and counter-adaptations in a struggle for survival and reproduction. Such arms races are common between predator and prey or between parasite and host. Understanding this coevolutionary process can aid in our ability to develop necessary countermeasures, such as overcoming bacterial resistance to antibiotics.
Abstract: Gene duplication and loss contribute to gene content differences as well as phenotypic divergence across species. However, the extent to which gene content varies among closely related plant species and the factors responsible for such variation remain unclear. Here, using the Solanaceae family as a model and Pfam domain families as a proxy for gene families, we investigated variation in gene family sizes across species and the likely factors contributing to the variation. We found that genes in highly variable families have high turnover rates and tend to be involved in processes that have diverged between Solanaceae species, whereas genes in low-variability families tend to have housekeeping roles. In addition, genes in high- and low-variability gene families tend to be duplicated by tandem and whole genome duplication, respectively. This finding, together with the observation that genes duplicated by different mechanisms experience different selection pressures, suggests that duplication mechanism impacts gene family turnover. We explored using pseudogene number as a proxy for gene loss but discovered that a substantial number of pseudogenes are actually products of pseudogene duplication, contrary to the expectation that most plant pseudogenes are remnants of once-functional duplicates. Our findings reveal complex relationships between variation in gene family size, gene functions, duplication mechanism, and evolutionary rate. The patterns of lineage-specific gene family expansion within the Solanaceae provide the foundation for a better understanding of the genetic basis underlying phenotypic diversity in this economically important family.
AbstractEndogenous viral sequences in eukaryotic genomes, such as those derived from plant pararetroviruses (PRVs), can serve as genomic fossils to study viral macroevolution. Many aspects of viral evolutionary rates are heterogeneous, including substitution rate differences between genes. However, the evolutionary dynamics of this viral gene rate heterogeneity (GRH) have rarely been examined. Characterizing such GRH may help to elucidate viral adaptive evolution. In this study, based on robust phylogenetic analysis, we identified an ancient endogenous PRV group in Oryza genomes, estimated to be 2.41–15.00 Myr old. We subsequently used this ancient endogenous PRV group and three younger groups to estimate the GRH of PRVs. Long-term substitution rates for the most conserved gene and a divergent gene were 2.69 × 10⁻⁸ to 8.07 × 10⁻⁸ and 4.72 × 10⁻⁸ to 1.42 × 10⁻⁷ substitutions/site/year, respectively. On the basis of a direct comparison, a long-term GRH of 1.83-fold was identified between these two genes, which is unexpectedly low and lower than the short-term GRH (>3.40-fold) of PRVs calculated using published data. The lower long-term GRH of PRVs was due to the slightly faster rate decay of divergent genes than of conserved genes during evolution. To the best of our knowledge, this is the first quantification of the long-term GRH of viral genes using paleovirological analyses, and we propose that the GRH of PRVs might vary with time scale (time-dependent GRH). Our findings provide insights into viral gene macroevolution and should encourage a more detailed examination of the viral GRH.
AbstractVitellogenin (Vtg) is a glycolipophosphoprotein produced by oviparous and ovoviviparous species and is the precursor protein of the yolk, an essential nutrient reserve for embryonic development and early larval stages. Vtg is encoded by a family of paralogous genes whose number varies among vertebrate lineages. Its evolution has been the subject of considerable analysis but remains unclear. In this work, microsyntenic and phylogenetic analyses were performed to increase our knowledge of the evolutionary history of this gene family in vertebrates. Our results support the hypothesis that the vitellogenin gene family expanded from two genes, both present at the beginning of vertebrate radiation, through multiple independent duplication events that occurred in the different lineages.
AbstractIt is often unavoidable to combine data from different sequencing centers or sequencing platforms when compiling data sets with a large number of individuals. However, the different data are likely to contain specific systematic errors that will appear as SNPs. Here, we devise a method to detect systematic errors in combined data sets. To measure quality differences between individual genomes, we study pairs of variants that reside on different chromosomes and co-occur in individuals. The abundance of these pairs of variants in different genomes is then used to detect systematic errors due to batch effects. Applying our method to the 1000 Genomes data set, we find that coding regions are enriched for errors, where ∼1% of the higher frequency variants are predicted to be erroneous, whereas errors outside of coding regions are much rarer (<0.001%). As expected, predicted errors are found less often than other variants in a data set that was generated with a different sequencing technology, indicating that many of the candidates are indeed errors. However, predicted 1000 Genomes errors are also found in other large data sets; our observation is thus not specific to the 1000 Genomes data set. Our results show that batch effects can be turned into a virtue by using the resulting variation in large scale data sets to detect systematic errors.
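The idea of counting variant pairs that sit on different chromosomes yet co-occur in the same individuals can be illustrated with a toy sketch. This is not the authors' published method: the function names, the `carriers` data layout, the requirement of identical carrier sets, and the `min_carriers` threshold are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def cooccurring_pairs(carriers, min_carriers=2):
    """Find variant pairs on different chromosomes with identical carrier sets.

    carriers maps a (chromosome, position) key to the set of individual IDs
    carrying that variant (hypothetical layout, chosen for this sketch).
    """
    pairs = []
    for v1, v2 in combinations(sorted(carriers), 2):
        same_carriers = carriers[v1] == carriers[v2]
        different_chromosomes = v1[0] != v2[0]
        if different_chromosomes and same_carriers and len(carriers[v1]) >= min_carriers:
            pairs.append((v1, v2))
    return pairs


def pair_load_per_individual(carriers, pairs):
    """Count how many co-occurring cross-chromosome pairs each individual carries."""
    load = Counter()
    for v1, _ in pairs:
        for individual in carriers[v1]:
            load[individual] += 1
    return load


# Toy data: individuals A and B share three variants spread over three chromosomes.
toy_carriers = {
    ("chr1", 100): {"A", "B"},
    ("chr2", 200): {"A", "B"},
    ("chr2", 300): {"C"},
    ("chr3", 400): {"A", "B"},
    ("chr1", 500): {"A", "B", "C"},
}
toy_pairs = cooccurring_pairs(toy_carriers)
toy_load = pair_load_per_individual(toy_carriers, toy_pairs)
```

In this toy input, individuals with an unusually high pair load would be the ones flagged for a possible batch effect; a real analysis would need a statistical model of how many such pairs are expected by chance.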
AbstractChlamydiae are an example of obligate intracellular bacteria that possess highly reduced, compact genomes (1.0–3.5 Mbp), reflective of their abilities to sequester many essential nutrients from the host that they no longer need to synthesize themselves. The Chlamydiae is a phylum with a very wide host range spanning mammals, birds, fish, invertebrates, and unicellular protists. This ecological and phylogenetic diversity offers ongoing opportunities to study intracellular survival and metabolic pathways and adaptations. Of particular evolutionary significance are Chlamydiae from the recently proposed Ca. Parilichlamydiaceae, the earliest diverging clade in this phylum, species of which are found only in aquatic vertebrates. Gill extracts from three Chlamydiales-positive Australian aquaculture species (Yellowtail kingfish, Striped trumpeter, and Barramundi) were subjected to DNA preparation to deplete host DNA and enrich microbial DNA, prior to metagenome sequencing. We assembled chlamydial genomes corresponding to three Ca. Parilichlamydiaceae species from gill metagenomes, and conducted functional genomics comparisons with diverse members of the phylum. This revealed highly reduced genomes more similar in size to the terrestrial Chlamydiaceae, standing in contrast to members of the Chlamydiae with a demonstrated cosmopolitan host range. We describe a reduction in genes encoding synthesis of nucleotides and amino acids, among other nutrients, and an enrichment of predicted transport proteins. Ca. Parilichlamydiaceae share 342 orthologs with other chlamydial families. We hypothesize that the genome reduction exhibited by Ca. Parilichlamydiaceae and Chlamydiaceae is an example of within-phylum convergent evolution. The factors driving these events remain to be elucidated.
AbstractPrediction of evolutionary trajectories has been an elusive goal, requiring a deep knowledge of underlying mechanisms that relate genotype to phenotype plus understanding how phenotype impacts organismal fitness. We tested our ability to predict molecular regulatory evolution in a bacteriophage (T7) whose RNA polymerase (RNAP) was altered to recognize a heterologous promoter differing by three nucleotides from the wild-type promoter. A mutant of wild-type T7 lacking its RNAP gene was passaged on a bacterial strain providing the novel RNAP in trans. Higher fitness rapidly evolved. Predicting the evolutionary trajectory of this adaptation used measured in vitro transcription rates of the novel RNAP on the six promoter sequences capturing all possible one-step pathways between the wild-type and the heterologous promoter sequences. The predictions captured some of the regulatory evolution but failed both in explaining 1) a set of T7 promoters that consistently failed to evolve and 2) some promoter evolution that fell outside the expected one-step pathways. Had a more comprehensive set of transcription assays been undertaken initially, all promoter evolution would have fallen within predicted bounds, but the lack of evolution in some promoters is unresolved. Overall, this study points toward the increasing feasibility of predicting evolution in well-characterized, simple systems.
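One plausible reading of the count of six is that two sequences differing at three positions have 2³ − 2 = 6 intermediates (three single and three double substitutions), which together cover every one-step pathway between the endpoints. The sketch below enumerates such intermediates. The function name and the placeholder sequences are hypothetical and are not the actual T7 promoter sequences.

```python
from itertools import combinations


def one_step_intermediates(wild_type: str, target: str) -> list[str]:
    """List every sequence strictly between wild_type and target.

    Each intermediate carries a proper, non-empty subset of the
    substitutions that separate the two endpoint sequences.
    """
    if len(wild_type) != len(target):
        raise ValueError("sequences must have equal length")
    # Positions at which the two endpoint sequences differ.
    differing = [i for i, (a, b) in enumerate(zip(wild_type, target)) if a != b]
    intermediates = []
    for k in range(1, len(differing)):
        for subset in combinations(differing, k):
            seq = list(wild_type)
            for i in subset:
                seq[i] = target[i]
            intermediates.append("".join(seq))
    return intermediates


# Placeholder sequences differing at three positions (not real promoters).
example_intermediates = one_step_intermediates("AAAAA", "CAGAT")
```

For endpoints that differ at n positions the same function returns 2ⁿ − 2 sequences, so the number of assays needed grows quickly with the number of differences.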
AbstractThe Lennoaceae, a small monophyletic plant family of root parasites endemic to the Americas, are one of the last remaining independently evolved lineages of parasitic angiosperms lacking a published plastome. In this study, we present the assembled and annotated plastomes of two species spanning the crown node of Lennoaceae, Lennoa madreporoides and Pholisma arenarium, as well as their close autotrophic relative from the sister family Ehretiaceae, Tiquilia plicata. We find that the plastomes of L. madreporoides and P. arenarium are similar in size and gene content, and substantially reduced compared to T. plicata, consistent with trends seen in other holoparasitic lineages. In particular, most plastid genes involved in photosynthesis have been lost, whereas housekeeping genes (ribosomal protein-coding genes, rRNAs, and tRNAs) are retained. One notable exception is the persistence of an rbcL open reading frame in P. arenarium but not L. madreporoides, suggesting a nonphotosynthetic function for this gene. Of the retained coding genes, dN/dS ratios indicate that some remain under purifying selection, whereas others show relaxed selection. Overall, this study supports the mounting evidence for convergent plastome evolution in flowering plants following the shift to heterotrophy.
AbstractFeather diversity is striking in many aspects. Although feather development has been studied for decades, genetic and genomic studies of feather diversity have begun only recently. Many questions remain to be answered by multidisciplinary approaches. In this review, we discuss three levels of feather diversity: Feather morphotypes, intraspecific variations, and interspecific variations. We summarize recent studies of feather evolution in terms of genetics, genomics, and developmental biology and provide perspectives for future research. Specifically, this review includes the following topics: 1) Diversity of feather morphotype; 2) feather diversity among different breeds of domesticated birds, including variations in pigmentation pattern, in feather length or regional identity, in feather orientation, in feather distribution, and in feather structure; and 3) diversity of feathers among avian species, including plumage color and morph differences between species and the regulatory differences in downy feather development between altricial and precocial birds. Finally, we discuss future research directions.
AbstractAphids are a diverse group of taxa that contain agronomically important species, which vary in their host range and ability to infest crop plants. The genome evolution underlying agriculturally important aphid traits is not well understood. We generated draft genome assemblies for two aphid species: Myzus cerasi (black cherry aphid) and the cereal specialist Rhopalosiphum padi. Using a de novo gene prediction pipeline on both these, and three additional aphid genome assemblies (Acyrthosiphon pisum, Diuraphis noxia, and Myzus persicae), we show that aphid genomes consistently encode similar gene numbers. We compare gene content, gene duplication, synteny, and putative effector repertoires between these five species to understand the genome evolution of globally important plant parasites. Aphid genomes show signs of relatively distant gene duplication, and substantial, relatively recent, gene birth. Putative effector repertoires, originating from duplicated and other loci, have an unusual genomic organization and evolutionary history. We identify a highly conserved effector pair that is tightly physically linked in the genomes of all aphid species tested. In R. padi, this effector pair is tightly transcriptionally linked and shares an unknown transcriptional control mechanism with a subset of ∼50 other putative effectors and secretory proteins. This study extends our current knowledge on the evolution of aphid genomes and reveals evidence for an as-of-yet unknown shared control mechanism, which underlies effector expression, and ultimately plant parasitism.
AbstractThe frequency of horizontal transfers of transposable elements (HTTs) varies among the types of elements according to the transposition mode and the geographical and temporal overlap of the species involved in the transfer. The drosophilid species of the genus Zaprionus and those of the melanogaster, obscura, repleta, and virilis groups of the genus Drosophila investigated in this study shared space and time at some point in their evolutionary history. This is particularly true of the subgenus Zaprionus and the melanogaster subgroup, which overlapped both geographically and temporally in Tropical Africa during their period of origin and diversification. Here, we tested the hypothesis that this overlap may have facilitated the transfer of retrotransposons without long terminal repeats (non-LTRs) between these species. We estimated the HTT frequency of the non-LTRs BS and Helena at the genome-wide scale by using a phylogenetic framework and a vertical and horizontal inheritance consistence analysis (VHICA). An excessively low synonymous divergence among distantly related species and incongruities between the transposable element and species phylogenies allowed us to propose at least four relatively recent HTT events of Helena and BS involving ancestors of the melanogaster subgroup and ancestors of the subgenus Zaprionus during their concomitant diversification in Tropical Africa, along with older possible events between species of the subgenera Drosophila and Sophophora. This study provides the first evidence for HTT of non-LTR retrotransposons between Drosophila and Zaprionus, including an in-depth reconstruction of the time frame and geography of these events.
AbstractPlastid genomes display remarkable organizational stability over evolutionary time. From green algae to angiosperms, most plastid genomes are largely collinear, with only a few cases of inversion, gene loss, or, in extremely rare cases, gene addition. These plastome insertions are mostly clade-specific and are typically of nuclear or mitochondrial origin. Here, we expand on these findings and present the first family-level survey of plastome evolution in ferns, revealing a novel suite of dynamic mobile elements. Comparative plastome analyses of the Pteridaceae expose several mobile open reading frames that vary in sequence length, insertion site, and configuration among sampled taxa. Even between close relatives, the presence and location of these elements is widely variable when viewed in a phylogenetic context. We characterize these elements and refer to them collectively as Mobile Open Reading Frames in Fern Organelles (MORFFO). We further note that the presence of MORFFO is not restricted to Pteridaceae, but is found across ferns and other plant clades. MORFFO elements are regularly associated with inversions, intergenic expansions, and changes to the inverted repeats. They likewise appear to be present in mitochondrial and nuclear genomes of ferns, indicating that they can move between genomic compartments with relative ease. The origins and functions of these mobile elements are unknown, but MORFFO appears to be a major driver of structural genome evolution in the plastomes of ferns, and possibly other groups of plants.
AbstractThe transcriptome of the venom duct of the Atlantic piscivorous cone species Chelyconus ermineus (Born, 1778) was determined. The venom repertoire of this species includes at least 378 conotoxin precursors, which could be ascribed to 33 known and 22 new (unassigned) protein superfamilies, respectively. The most abundant superfamilies were T, W, O1, M, O2, and Z, accounting for 57% of all detected diversity. A total of three individuals were sequenced, showing considerable intraspecific variation: each individual had many exclusive conotoxin precursors, and only 20% of all inferred mature peptides were common to all individuals. Three different regions (distal, medium, and proximal with respect to the venom bulb) of the venom duct were analyzed independently. Diversity (in terms of number of distinct members) of conotoxin precursor superfamilies increased toward the distal region, whereas transcripts detected toward the proximal region showed higher expression levels. Only the superfamilies A and I3 showed statistically significant differential expression across regions of the venom duct. Sequences belonging to the alpha (motor cabal) and kappa (lightning-strike cabal) subfamilies of the superfamily A were mainly detected in the proximal region of the venom duct. The mature peptides of the alpha subfamily had the α4/4 cysteine spacing pattern, which has been shown to selectively target muscle nicotinic-acetylcholine receptors, ultimately producing paralysis. This function is performed by mature peptides having an α3/5 cysteine spacing pattern in piscivorous cone species from the Indo-Pacific region, thereby supporting a convergent evolution of piscivory in cones.
AbstractThis work presents a systematic approach to study the conservation of genes between fruit flies and mammals. We have listed 971 Drosophila genes involved in female reproduction at the ovarian level and systematically looked for orthologs in Ciona, zebrafish, coelacanth, lizard, chicken, and mouse. Depending on the species, the percentage of these Drosophila genes with at least one ortholog varies between 69% and 78%. In comparison, only 42% of all the Drosophila genes have an ortholog in the mouse genome (P < 0.0001), suggesting a dramatically higher evolutionary conservation of ovarian genes. The 177 Drosophila genes that have no ortholog in mice and other vertebrates correspond to genes that are involved in mechanisms of oogenesis that are specific to the fruit fly or to insects. Among 759 genes with at least one ortholog in the zebrafish, 73 have an expression enriched in the ovary in this species (RNA-seq data). Among 760 genes that have at least one ortholog in the mouse, 76 and 11 orthologs are reported to be preferentially and exclusively expressed in the mouse ovary, respectively (based on the UniGene expressed sequence tag database). Several of them are already known to play a key role in murine oogenesis and/or to be enriched in the mouse/zebrafish oocyte, whereas others have remained unreported. We have investigated, by RNA-seq and real-time quantitative PCR, the exclusive ovarian expression of 10 genes in fish and mammals. Overall, we have found several novel candidates potentially involved in mammalian oogenesis by an evolutionary approach and using the fruit fly as an animal model.