The article 10.1093/molbev/msy216 has been withdrawn because it is a duplicate of 10.1093/molbev/msy210. The publisher regrets the error.
Abstract: Ggtree is a comprehensive R package for visualizing and annotating phylogenetic trees with associated data. It can also map and visualize associated external data on phylogenies with two general methods. Method 1 allows external data to be mapped on the tree structure and used as visual characteristics in tree and data visualization. Method 2 plots the data with the tree side by side using different geometric functions after reordering the data based on the tree structure. These two methods integrate data with phylogeny for further exploration and comparison in the evolutionary biology context. Ggtree is available from http://www.bioconductor.org/packages/ggtree.
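The reordering step behind Method 2 can be illustrated with a minimal Python sketch. This is not ggtree's R API; the nested-tuple tree, the `tip_order` and `reorder_by_tree` helpers, and the trait values are all hypothetical and serve only to show how external data might be aligned with tip order before side-by-side plotting.

```python
# Toy tree as nested tuples; strings are tip labels (hypothetical example data).
def tip_order(tree):
    """Return tip labels in depth-first order, i.e. the order in which
    a rectangular layout would stack the tips."""
    if isinstance(tree, str):
        return [tree]
    tips = []
    for child in tree:
        tips.extend(tip_order(child))
    return tips


def reorder_by_tree(tree, data):
    """Align external data with the tree's tip order.
    Tips that have no associated value are kept and paired with None."""
    return [(tip, data.get(tip)) for tip in tip_order(tree)]


tree = (("A", "B"), ("C", ("D", "E")))
trait = {"E": 5.0, "A": 1.2, "C": 3.3, "B": 0.7}  # arbitrary input order, "D" missing
aligned = reorder_by_tree(tree, trait)
```

Once the rows follow the tip order, any panel drawn next to the tree (bars, points, a heatmap row per tip) lines up with the corresponding tip.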
Abstract: Adenosine deaminases (ADAs) play a pivotal role in regulating the level of adenosine, an important signaling molecule that controls a variety of cellular responses. Two distinct ADAs, ADA1 and adenosine deaminase growth factor (ADGF, also known as ADA2), are known. Cytoplasmic ADA1 plays a key role in purine metabolism and is widely distributed from prokaryotes to mammals. On the other hand, secreted ADGF/ADA2 is a cell-signaling protein that was thought to be present only in multicellular organisms. Here, we discovered a bacterial homologue of ADGF/ADA2. Bacterial and eukaryotic ADGF/ADA2 possess the dimerization and PRB domains characteristic of the family, have nearly identical catalytic sites, and show similar catalytic characteristics. Most surprisingly, the bacterial enzyme has a signal sequence similar to that of eukaryotic ADGF/ADA2 and is specifically secreted into the extracellular space, where it may potentially control the level of extracellular adenosine. This finding provides the first example of evolution of an extracellular eukaryotic signaling protein from a secreted bacterial analogue with identical activity and suggests a potential role of ADGF/ADA2 in bacterial communication.
Abstract: Mitochondrial genomes of vertebrates are generally thought to evolve under strong selection for size reduction and gene order conservation. Therefore, a growing number of mitogenomes with duplicated regions changes our view of genome evolution. Among Aves, order Psittaciformes (parrots) is especially noteworthy because of its large morphological, ecological, and taxonomical diversity, which offers an opportunity to study genome evolution in various aspects. Former analyses showed that tandem duplications comprising the control region with adjacent genes are restricted to several lineages in which the duplication occurred independently. However, using an appropriate polymerase chain reaction strategy, we demonstrate that early diverged parrot groups contain mitogenomes with the duplicated region. These findings together with mapping duplication data from other mitogenomes onto parrot phylogeny indicate that the duplication was an ancestral state for Psittaciformes. The state was inherited by main parrot groups and was lost several times in some lineages. The duplicated regions were subjected to concerted evolution with a frequency higher than the rate of speciation. The duplicated control regions may provide a selective advantage due to a more efficient initiation of replication or transcription and a larger number of replicating genomes per organelle, which may lead to a more effective energy production by mitochondria. The mitogenomic duplications were associated with phenotypic features: parrots with the duplicated region can live longer and show larger body mass as well as a predisposition to more active flight. The results have wider implications for the presence of duplications and their evolution in mitogenomes of other avian groups.
Abstract: A major goal of population genetics has been to determine the extent to which selection at linked sites influences patterns of neutral nucleotide diversity in the genome. Multiple lines of evidence suggest that diversity is influenced by both positive and negative selection. For example, in many species there are troughs in diversity surrounding functional genomic elements, consistent with the action of either background selection (BGS) or selective sweeps. In this study, we investigated the causes of the diversity troughs that are observed in the wild house mouse genome. Using the unfolded site frequency spectrum, we estimated the strength and frequencies of deleterious and advantageous mutations occurring in different functional elements in the genome. We then used these estimates to parameterize forward-in-time simulations of chromosomes, using realistic distributions of functional elements and recombination rate variation in order to determine whether selection at linked sites can explain the observed patterns of nucleotide diversity. The simulations suggest that BGS alone cannot explain the dips in diversity around either exons or conserved noncoding elements. A combination of BGS and selective sweeps produces deeper dips in diversity than BGS alone, but the inferred parameters of selection cannot fully explain the patterns observed in the genome. Our results provide evidence of sweeps shaping patterns of nucleotide diversity across the mouse genome and also suggest that infrequent, strongly advantageous mutations play an important role in this. The limitations of using the unfolded site frequency spectrum for inferring the frequency and effects of advantageous mutations are discussed.
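For readers unfamiliar with the unfolded site frequency spectrum, the following minimal Python sketch shows how it can be tabulated from per-site derived allele counts. The function name, the input format, and the toy counts are assumptions made for illustration; this is not the authors' pipeline, and it presumes the ancestral state at each site has already been inferred.

```python
from collections import Counter


def unfolded_sfs(derived_counts, n):
    """Tabulate the unfolded SFS for a sample of n chromosomes.

    derived_counts: number of derived alleles observed at each site.
    Returns a list whose entry i-1 is the number of segregating sites
    carrying exactly i derived copies, for i = 1 .. n-1.
    Sites with 0 or n derived copies are not segregating and are skipped.
    """
    counts = Counter(c for c in derived_counts if 0 < c < n)
    return [counts.get(i, 0) for i in range(1, n)]


# Hypothetical toy data: 8 sites scored in a sample of 6 chromosomes.
sfs = unfolded_sfs([1, 1, 2, 5, 0, 6, 3, 1], n=6)
```

An excess of low- or high-frequency derived variants in such a spectrum, relative to neutral expectations, is the kind of signal that selection inference methods exploit.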
Abstract: Despite the genetic resemblance of Canary Islanders to other southern European populations, their geographical isolation and the historical admixture of aborigines (from North Africa) with sub-Saharan Africans and Europeans have shaped a distinctive genetic makeup that likely affects disease susceptibility and health disparities. Based on single nucleotide polymorphism array data and whole genome sequencing (30×), we inferred that the last African admixture took place ∼14 generations ago and estimated that up to 34% of the Canary Islander genome is of recent African descent. The length of regions in homozygosis and the ancestry-related mosaic organization of the Canary Islander genome support the view that isolation has been strongest on the two smallest islands. Furthermore, several genomic regions showed significant and large deviations in African or European ancestry and were significantly enriched in genes involved in prevalent diseases in this community, such as diabetes, asthma, and allergy. The most prominent of these regions were located near LCT and the HLA, two well-known targets of selection, at which 40–50% of the Canarian genome is of recent African descent according to our estimates. Putative selective signals were also identified in these regions near the SLC6A11-SLC6A1, KCNMB2, and PCDH20-PCDH9 genes. Taken together, our findings provide solid evidence of a significant recent African admixture, population isolation, and adaptation in this part of Europe, with the favoring of African alleles in some chromosome regions. These findings may have medical implications for populations of recent African ancestry.
Abstract: Variation at the FADS1/FADS2 gene cluster is functionally associated with differences in lipid metabolism and is often hypothesized to reflect adaptation to an agricultural diet. Here, we test the evidence for this relationship using both modern and ancient DNA data. We show that almost all the inhabitants of Europe carried the ancestral allele until the derived allele was introduced ∼8,500 years ago by Early Neolithic farming populations. However, we also show that it was not under strong selection in these populations. We find that this allele, and other proposed agricultural adaptations at LCT/MCM6 and SLC22A4, were not strongly selected until much later, perhaps as late as the Bronze Age. Similarly, increased copy number variation at the salivary amylase gene AMY1 is not linked to the development of agriculture although, in this case, the putative adaptation precedes the agricultural transition. Our analysis shows that selection at the FADS locus was not tightly linked to the initial introduction of agriculture and the Neolithic transition. Further, it suggests that the strongest signals of recent human adaptation in Europe did not coincide with the Neolithic transition but with more recent changes in environment, diet, or efficiency of selection due to increases in effective population size.
Abstract: Elucidating the genomic determinants of morphological differences between species is key to understanding how morphological diversity evolved. While differences in cis-regulatory elements are an important genetic source for morphological evolution, it remains challenging to identify regulatory elements involved in phenotypic differences. Here, we present Regulatory Element forward genomics (REforge), a computational approach that detects associations between transcription factor binding site divergence in putative regulatory elements and phenotypic differences between species. By simulating regulatory element evolution in silico, we show that this approach has substantial power to detect such associations. To validate REforge on real data, we used known binding motifs for eye-related transcription factors and identified significant binding site divergence in vision-impaired subterranean mammals in 1% of all conserved noncoding elements. We show that these genomic regions are significantly enriched in regulatory elements that are specifically active in mouse eye tissues, and that several of them are located near genes that are required for eye development and photoreceptor function and are implicated in human eye disorders. Thus, our genome-wide screen detects widespread divergence of eye-regulatory elements and highlights regulatory regions that likely contributed to eye degeneration in subterranean mammals. REforge has broad applicability to detect regulatory elements that could be involved in many other phenotypes, which will help to reveal the genomic basis of morphological diversity.
Abstract: We sequenced the genome of the strawberry poison frog, Oophaga pumilio, at a depth of 127.5× using variable insert size libraries. The total genome size is estimated to be 6.76 Gb, of which 4.76 Gb are from high copy number repetitive elements with low differentiation across copies. These repeats encompass DNA transposons, RNA transposons, and LTR retrotransposons, including at least 0.4 and 1.0 Gb of Mariner/Tc1 and Gypsy elements, respectively. Expression data indicate high levels of Gypsy and Mariner/Tc1 expression in ova of O. pumilio compared with Xenopus laevis. We further observe phylogenetic evidence for horizontal transfer (HT) of Mariner elements, possibly between fish and frogs. The elements affected by HT are present in high copy number and are highly expressed, suggesting ongoing proliferation after HT. Our results suggest that the large amphibian genome sizes can be explained, at least partially, by a process of repeated invasion of new transposable elements that are not yet suppressed in the germline. We also find changes in the spliceosome that we hypothesize are related to permissiveness of O. pumilio to increases in intron length due to transposon proliferation. Finally, we identify the complement of ion channels in the first poison frog with a sequenced genome and discuss its relation to the evolution of autoresistance to toxins sequestered in the skin.
Abstract: The competitive endogenous RNA (ceRNA) hypothesis is an attractively simple model to explain the biological role of many putatively functionless noncoding RNAs. Under this model, there exist transcripts in the cell whose role is to titrate out microRNAs such that the expression level of another target sequence is altered. That it is logistically possible for expression of one microRNA recognition element (MRE)-containing transcript to affect another is seen in the multiple examples of pathogenic effects of inappropriate expression of MRE-containing RNAs. However, the role, if any, of ceRNAs in normal biological processes and at physiological levels is disputed. By comparison of parent genes and pseudogenes we show, both for a specific example and genome-wide, that the pseudo-3′ untranslated regions (3′UTRs) of expressed pseudogenes are frequently retained and are under selective constraint in mammalian genomes. We found that the pseudo-3′UTR of BRAFP1, a previously described oncogenic ceRNA, has reduced substitutions relative to its pseudo-coding sequence, and we show sequence constraint on MREs shared between the parent gene, BRAF, and the pseudogene. Investigation of RNA-seq data reveals expression of BRAFP1 in normal somatic tissues in human and in other primates, consistent with biological ceRNA functionality of this pseudogene in nonpathogenic cellular contexts. Furthermore, we find that on a genome-wide scale pseudo-3′UTRs of mammalian pseudogenes (n = 1,629) are under stronger selective constraint than their pseudo-coding sequence counterparts, and are more often retained and expressed. Our results suggest that many human pseudogenes, often considered nonfunctional, may have an evolutionarily constrained role, consistent with the ceRNA hypothesis.
Abstract: Pheromones are crucial for eliciting social and sexual behaviors in diverse animal species. The vomeronasal receptor type-1 (V1R) genes, encoding members of a pheromone receptor family, are highly variable in number and repertoire among mammals due to extensive gene gain and loss. Here, we report a novel pheromone receptor gene belonging to the V1R family, named ancient V1R (ancV1R), which is shared among most Osteichthyes (bony vertebrates) from the basal lineage of ray-finned fishes to mammals. Phylogenetic and syntenic analyses of ancV1R using 115 vertebrate genomes revealed that it represents an orthologous gene conserved for >400 My of vertebrate evolution. Interestingly, the loss of ancV1R in some tetrapods is coincident with the degeneration of the vomeronasal organ in higher primates, cetaceans, and some reptiles including birds and crocodilians. In addition, ancV1R is expressed in most mature vomeronasal sensory neurons in contrast with canonical V1Rs, which are sparsely expressed in a manner that is consistent with the “one neuron–one receptor” rule. Our results imply that a previously undescribed V1R gene inherited from an ancient Silurian ancestor may have played an important functional role in the evolution of the vertebrate vomeronasal organ.
Abstract: The rate of molecular evolution varies widely among species. Life history traits (LHTs) have been proposed as a major driver of these variations. However, the relative contribution of each trait is poorly understood. Here, we test the influence of metabolic rate (MR), longevity, and generation time (GT) on the nuclear and mitochondrial synonymous substitution rates using a group of isopod species that have made multiple independent transitions to subterranean environments. Subterranean species have repeatedly evolved a lower MR, a longer lifespan and a longer GT. We assembled the nuclear transcriptomes and the mitochondrial genomes of 13 pairs of closely related isopods, each pair composed of one surface and one subterranean species. We found that subterranean species have a lower rate of nuclear synonymous substitution than surface species whereas the mitochondrial rate remained unchanged. We propose that this decoupling between nuclear and mitochondrial rates comes from different DNA replication processes in these two compartments. In isopods, the nuclear rate is probably tightly controlled by GT alone. In contrast, mitochondrial genomes appear to replicate and mutate at a rate independent of LHTs. These results are incongruent with previous studies, which were mostly devoted to vertebrates. We suggest that this incongruence can be explained by developmental differences between animal clades, with a quiescent period during female gametogenesis in mammals and birds which imposes a nuclear and mitochondrial rate coupling, as opposed to the continuous gametogenesis observed in most arthropods.
Abstract: The establishment of new interactions between transcriptional regulators increases the regulatory diversity that drives phenotypic novelty. To understand how such interactions evolve, we have studied a regulatory module (DDR) composed of three MYB-like proteins: DIVARICATA (DIV), RADIALIS (RAD), and DIV-and-RAD-Interacting Factor (DRIF). The DIV and DRIF proteins form a transcriptional complex that is disrupted in the presence of RAD, a small interfering peptide, due to the formation of RAD–DRIF dimers. This dynamic interaction results in a molecular switch mechanism responsible for the control of distinct developmental processes in plants. Here, we have determined how the DDR regulatory module was established by analyzing the origin and evolution of the DIV, DRIF, and RAD protein families and the evolutionary history of their interactions. We show that duplications of a pre-existing MYB domain originated the DIV and DRIF protein families in the ancestral lineage of green algae, and, later, the RAD family in seed plants. Intraspecies interactions between the MYB domains of DIV and DRIF proteins are detected in green algae, whereas the earliest evidence of an interaction between DRIF and RAD proteins occurs in the gymnosperms, coincident with the establishment of the RAD family. Therefore, the DDR module evolved in a stepwise progression with the DIV–DRIF transcription complex evolving prior to the antagonistic RAD–DRIF interaction that established the molecular switch mechanism. Our results suggest that the successive rearrangement and divergence of a single protein domain can be an effective evolutionary mechanism driving new protein interactions and the establishment of novel regulatory modules.
Abstract: A driving hypothesis of evolutionary developmental biology is that animal morphological diversity is shaped both by adaptation and by developmental constraints. Here, we have tested Darwin’s “selection opportunity” hypothesis, according to which high evolutionary divergence in late development is due to strong positive selection. We contrasted it to a “developmental constraint” hypothesis, according to which late development is under relaxed negative selection. Indeed, the highest divergence between species, both at the morphological and molecular levels, is observed late in embryogenesis and postembryonically. To distinguish between adaptation and relaxation hypotheses, we investigated the evidence of positive selection on protein-coding genes in relation to their expression over development, in the fly Drosophila melanogaster, the zebrafish Danio rerio, and the mouse Mus musculus. First, we found that genes specifically expressed in late development have stronger signals of positive selection. Second, over the full transcriptome, genes with evidence for positive selection tend to be expressed in late development. Finally, genes involved in pathways with cumulative evidence of positive selection have higher expression in late development. Overall, there is a consistent signal that positive selection mainly affects genes and pathways expressed in late embryonic development and in adults. Our results imply that the evolution of embryogenesis is mostly conservative, with most adaptive evolution affecting some stages of postembryonic gene expression, and thus postembryonic phenotypes. This is consistent with the diversity of environmental challenges to which juveniles and adults are exposed.
Abstract: The origin of novel traits can promote expansion into new niches and drive speciation. Ctenophores (comb jellies) are unified by their possession of a novel cell type: the colloblast, an adhesive cell found only in the tentacles. Although colloblast-laden tentacles are fundamental for prey capture among ctenophores, some species have tentacles lacking colloblasts and others have lost their tentacles completely. We used transcriptomes from 36 ctenophore species to identify gene losses that occurred specifically in lineages lacking colloblasts and tentacles. We cross-referenced these colloblast- and tentacle-specific candidate genes with temporal RNA-Seq during embryogenesis in Mnemiopsis leidyi and found that both sets of candidates are preferentially expressed during tentacle morphogenesis. We also demonstrate significant upregulation of candidates from both data sets in the tentacle bulb of adults. Both sets of candidates were enriched for an N-terminal signal peptide and protein domains associated with secretion; among tentacle candidates we also identified orthologs of cnidarian toxin proteins, presenting tantalizing evidence that ctenophore tentacles may secrete toxins along with their adhesive. Finally, using cell lineage tracing, we demonstrate that colloblasts and neurons share a common progenitor, suggesting the evolution of colloblasts involved co-option of a neurosecretory gene regulatory network. Together these data offer an initial glimpse into the genetic architecture underlying ctenophore cell-type diversity.
Abstract: The oomycetes are a class of microscopic, filamentous eukaryotes within the stramenopiles–alveolates–rhizaria eukaryotic supergroup. They include some of the most destructive pathogens of animals and plants, such as Phytophthora infestans, the causative agent of late potato blight. Despite the threat they pose to worldwide food security and natural ecosystems, there is a lack of tools and databases available to study oomycete genetics and evolution. To this end, we have developed the Oomycete Gene Order Browser (OGOB), a curated database that facilitates comparative genomic and syntenic analyses of oomycete species. OGOB incorporates genomic data for 20 oomycete species including functional annotations and a number of bioinformatics tools. OGOB hosts a robust set of orthologous oomycete genes for evolutionary analyses. Here, we present the structure and function of OGOB as well as a number of comparative genomic analyses we have performed to better understand oomycete genome evolution. We analyze the extent of oomycete gene duplication and identify tandem gene duplication as a driving force of the expansion of secreted oomycete genes. We identify core genes that are present and microsyntenically conserved (termed syntenologs) in oomycete lineages and identify the degree of microsynteny between each pair of the 20 species housed in OGOB. Consistent with previous comparative synteny analyses between a small number of oomycete species, our results reveal an extensive degree of microsyntenic conservation amongst genes with housekeeping functions within the oomycetes. OGOB is available at https://ogob.ie.
Abstract: Plastid genome (ptDNA) data of Glaucophyta have been limited for many years to the genus Cyanophora. Here, we sequenced the ptDNAs of Gloeochaete wittrockiana, Cyanoptyche gloeocystis, Glaucocystis incrassata, and Glaucocystis sp. BBH. The reported sequences are the first genome-scale plastid data available for these three poorly studied glaucophyte genera. Although the Glaucophyta plastids appear morphologically “ancestral,” they actually bear derived genomes not radically different from those of red algae or viridiplants. The glaucophyte plastid coding capacity is highly conserved (112 genes shared) and the architecture of the plastid chromosomes is relatively simple. Phylogenomic analyses recovered Glaucophyta as the earliest diverging Archaeplastida lineage, but the position of viridiplants as the first branching group was not rejected by the approximately unbiased test. Pairwise distances estimated from 19 different plastid genes revealed that the highest sequence divergence between glaucophyte genera is frequently higher than distances between species of different classes within red algae or viridiplants. Gene synteny and sequence similarity in the ptDNAs of the two Glaucocystis species analyzed are conserved. However, the ptDNA of Gla. incrassata contains a 7.9-kb insertion not detected in Glaucocystis sp. BBH. The insertion contains ten open reading frames that include four coding regions similar to bacterial serine recombinases (two open reading frames), DNA primases, and peptidoglycan aminohydrolases. These three enzymes, often encoded in bacterial plasmids and bacteriophage genomes, are known to participate in the mobilization and replication of DNA mobile elements. It is therefore plausible that the insertion in Gla. incrassata ptDNA is derived from a DNA mobile element.
Abstract: Yak is one of the largest native mammalian species in the Himalayas, the highest plateau area in the world with an average elevation of >4,000 m above sea level. Yak is well adapted to the high-altitude environment, with a set of physiological features for a more efficient blood flow for oxygen delivery under hypobaric hypoxia. Yet, the genetic mechanism underlying its adaptation remains elusive. We conducted a cross-tissue, cross-altitude, and cross-species study to characterize the transcriptomic landscape of domestic yaks. The generated multi-tissue transcriptomic data greatly improved the current yak genome annotation by identifying tens of thousands of novel transcripts. We found that among the eight tested tissues (lung, heart, kidney, liver, spleen, muscle, testis, and brain), lung and heart are two key organs showing adaptive transcriptional changes and >90% of the cross-altitude differentially expressed genes in lung display a nonlinear regulation. Pathways related to cell survival and proliferation are enriched, including PI3K-Akt, HIF-1, focal adhesion, and ECM–receptor interaction. These findings, in combination with the comprehensive transcriptome data set, are valuable for understanding the genetic mechanism of hypoxic adaptation in yak.
Abstract: The phylum Apicomplexa is a quintessentially parasitic lineage, whose members infect a broad range of animals. One exception to this may be the apicomplexan genus Nephromyces, which has been described as having a mutualistic relationship with its host. Here we analyze transcriptome data from Nephromyces and its parasitic sister taxon, Cardiosporidium, revealing an ancestral purine degradation pathway thought to have been lost early in apicomplexan evolution. The predicted localization of many of the purine degradation enzymes to peroxisomes, and the in silico identification of a full set of peroxisome proteins, indicates that loss of both features in other apicomplexans occurred multiple times. The degradation of purines is thought to play a key role in the unusual relationship between Nephromyces and its host. Transcriptome data confirm previous biochemical results of a functional pathway for the utilization of uric acid as a primary nitrogen source for this unusual apicomplexan.
Abstract: Haematococcus pluvialis is a freshwater species of Chlorophyta, family Haematococcaceae. It is well known for its capacity to synthesize high amounts of astaxanthin, which is a strong antioxidant that has been utilized in aquaculture and cosmetics. To improve astaxanthin yield and to establish genetic resources for H. pluvialis, we performed whole-genome sequencing, assembly, and annotation of this green microalga. A total of 83.1 Gb of raw reads were sequenced. After filtering the raw reads, we subsequently generated a draft assembly with a genome size of 669.0 Mb, a scaffold N50 of 288.6 kb, and predicted 18,545 genes. We also established a robust phylogenetic tree from 14 representative algae species. With additional transcriptome data, we revealed some novel potential genes that are involved in the synthesis, accumulation, and regulation of astaxanthin production. In addition, we generated an isoform-level reference transcriptome set of 18,483 transcripts with high confidence. Alternative splicing analysis demonstrated that intron retention is the most frequent mode. In summary, we report the first draft genome of H. pluvialis. These genomic resources along with transcriptomic data provide a solid foundation for the discovery of the genetic basis for theoretical and commercial astaxanthin enrichment.
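The scaffold N50 statistic quoted above follows a standard definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal Python sketch is given below; the function name and the toy scaffold lengths are hypothetical and are not taken from the H. pluvialis assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches at least
    half of the summed length of all scaffolds.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly of five scaffolds (arbitrary units).
toy_n50 = n50([100, 80, 60, 40, 20])
```

A larger N50 indicates that more of the assembly sits in long scaffolds, which is why the value is commonly reported alongside total assembly size.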
Abstract: The huge increase in the availability of bacterial genomes has led us to a point at which we can investigate and query pan-genomes, that is, the full set of genes of a given bacterial species or clade. Here, we used a data set of 1,311 high-quality genomes from the human pathogen Pseudomonas aeruginosa, 619 of which were newly sequenced, to show that a pan-genomic approach can greatly refine the population structure of bacterial species, provide new insights to define species boundaries, and generate hypotheses on the evolution of pathogenicity. The 665-gene P. aeruginosa core genome presented here, which constitutes only 1% of the entire pan-genome, is the first to be in the same order of magnitude as the minimal bacterial genome and represents a conservative estimate of the actual core genome. Moreover, the phylogeny based on this core genome provides strong evidence for a five-group population structure that includes two previously undescribed groups of isolates. Comparative genomics focusing on antimicrobial resistance and virulence genes showed that variation among isolates was partly linked to this population structure. Finally, we hypothesized that horizontal gene transfer had an important role in this respect, and found a total of 3,010 putative complete and fragmented plasmids, 5% and 12% of which contained resistance or virulence genes, respectively. This work provides data and strategies to study the evolutionary trajectories of resistance and virulence in P. aeruginosa.
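The distinction between pan-genome and core genome can be made concrete with a small Python sketch. It assumes the strictest reading, in which the pan-genome is the union and the core genome the intersection of per-isolate gene sets; the study's actual criteria (for example, how gene families are clustered or whether a presence threshold is used) are not stated in the abstract, and the isolate and gene labels below are invented.

```python
def pan_and_core(genomes):
    """Compute pan- and core genome from a mapping isolate -> gene labels.

    pan:  every gene seen in at least one isolate (set union).
    core: genes present in every isolate (strict set intersection).
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets) if gene_sets else set()
    return pan, core


# Hypothetical toy data: three isolates and four gene families.
isolates = {
    "iso1": ["g1", "g2", "g3"],
    "iso2": ["g1", "g2", "g4"],
    "iso3": ["g1", "g2"],
}
pan, core = pan_and_core(isolates)
core_fraction = len(core) / len(pan)
```

As more isolates are added, the union can only grow and the strict intersection can only shrink, which is consistent with a core genome that makes up a small fraction of the pan-genome in a large data set.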
Abstract: The propensity of protein sites to be occupied by any of the 20 amino acids is known as site-specific amino acid preferences (SSAP). Under the assumption that SSAP are conserved among homologs, they can be used to parameterize evolutionary models for the reconstruction of accurate phylogenetic trees. However, simulations and experimental studies have not been able to fully assess the relative conservation of SSAP as a function of sequence divergence between protein homologs. Here, we implement a computational procedure to predict the SSAP of proteins based on the effect of changes in thermodynamic stability upon mutation. An advantage of this computational approach is that it allows us to interrogate a large and unbiased sample of homologous proteins, over the entire spectrum of sequence divergence, and under selection for the same molecular trait. We show that computational predictions have reproducibilities that resemble those obtained in experimental replicates, and can largely recapitulate the SSAP observed in a large-scale mutagenesis experiment. Our results support recent experimental reports on the conservation of SSAP of related homologs, with a slowly increasing fraction of up to 15% of different sites at sequence distances lower than 40%. However, even under the sole contribution of thermodynamic stability, our conservative approach identifies up to 30% of significantly different sites between divergent homologs. We show that this relation holds for homologs of diverse sizes and structural classes. Analyses of residue contact networks suggest that an important determinant of these differences is the increasing accumulation of structural deviations that results from sequence divergence.
Abstract: Mutations spawn genetic variation which, in turn, fuels evolution. Hence, experimental investigations into the rate and fitness effects of spontaneous mutations are central to the study of evolution. Mutation accumulation (MA) experiments have served as a cornerstone for furthering our understanding of spontaneous mutations for four decades. In the pregenomic era, phenotypic measurements of fitness-related traits in MA lines were used to indirectly estimate key mutational parameters, such as the genomic mutation rate, new mutational variance per generation, and the average fitness effect of mutations. Rapidly emerging next-generation sequencing technology has supplanted this phenotype-dependent approach, enabling direct empirical estimates of the mutation rate and a more nuanced understanding of the relative contributions of different classes of mutations to the standing genetic variation. Whole-genome sequencing of MA lines bears immense potential to provide a unified account of the evolutionary process at multiple levels: the genetic basis of variation, and the evolutionary dynamics of mutations under the forces of selection and drift. In this review, we have attempted to synthesize key insights into the spontaneous mutation process that are rapidly emerging from the partnering of classical MA experiments with high-throughput sequencing, with particular emphasis on the spontaneous rates and molecular properties of different mutational classes in nuclear and mitochondrial genomes of diverse taxa, the contribution of mutations to the evolution of gene expression, and the rate and stability of transgenerational epigenetic modifications. Future advances in sequencing technologies will enable greater species representation to further refine our understanding of mutational parameters and their functional consequences.
Abstract
The order Lagomorpha unifies pikas (Ochotonidae) and the hares plus rabbits (Leporidae). Phylogenetic reconstructions of the species within Leporidae based on traditional morphological or molecular sequence data provide support for conflicting hypotheses. The retroposon presence/absence patterns analyzed in this study revealed strong support for the broadly accepted splitting of lagomorphs into ochotonids and leporids with Pronolagus as the first divergence in the leporid tree. Furthermore, the retroposon presence/absence patterns nested the rare volcano rabbit, Romerolagus diazi, within an unresolved network of deeper leporid relationships and provided the first homoplasy-free image of incomplete lineage sorting and/or ancestral hybridization/introgression in rapidly radiated Leporidae. At the same time, the strongest retroposon presence/absence signal supports the volcano rabbit as a separate branch between the Pronolagus junction and a unified cluster of the remaining leporids.
Abstract
The evolution of the major histocompatibility complex (MHC) is shaped by frequent gene duplications and deletions, which generate extensive variation in the number of loci (gene copies) between different taxa. Here, we collected estimates of copy number at the MHC for over 250 bird species from 68 families. We found contrasting patterns of copy number evolution between MHC class I and class IIB, which encode receptors for intra- and extracellular pathogens, respectively. Across the avian evolutionary tree, there was evidence of accelerated evolution and stabilizing selection acting on copy number at class I, while copy number at class IIB was primarily influenced by fluctuating selection and drift. Reconstruction of MHC copy number variation showed ancestrally low numbers of MHC loci in nonpasserines and evolution toward larger numbers of loci in passerines. Different passerine lineages had the highest duplication rates for MHC class I (Sylvioidea) and class IIB (Muscicapoidea and Passeroidea). We also found support for the correlated evolution of MHC copy number and life-history traits such as lifespan and migratory behavior. These results suggest that MHC copy number evolution in birds has been driven by life histories and differences in exposure to intra- and extracellular pathogens.
Abstract
The diverse array of codon reassignments demonstrates that the genetic code is not universal in nature. Exploring mechanisms underlying codon reassignment is critical for understanding the evolution of the genetic code during translation. Hemichordata, comprising worm-like Enteropneusta and colonial filter-feeding Pterobranchia, is the sister taxon of echinoderms and is more distantly related to chordates. However, only a few hemichordate mitochondrial genomes have been sequenced, hindering our understanding of mitochondrial genome evolution within Deuterostomia. In this study, we sequenced four mitochondrial genomes and two transcriptomes, including representatives of both major hemichordate lineages, and analyzed them together with publicly available data. Contrary to the current understanding of the mitochondrial genetic code in hemichordates, our comparative analyses suggest that UAA encodes Tyr instead of a “Stop” codon in the pterobranch lineage Cephalodiscidae. We also predict that AAA encodes Lys in pterobranch and enteropneust mitochondrial genomes, contradicting the previous assumption that hemichordates share the same genetic code with echinoderms, for which AAA encodes Asn. Thus, we propose a new mitochondrial genetic code for Cephalodiscus and a revised code for enteropneusts. Moreover, our phylogenetic analyses are largely consistent with previous phylogenomic studies. The only exception is the phylogenetic position of the enteropneust Stereobalanus, which is placed as sister to all other described enteropneusts. With broader taxonomic sampling, we provide evidence that the evolution of mitochondrial gene order and genetic codes in Hemichordata is more dynamic than previously thought, and these findings provide insights into mitochondrial genome evolution within this clade.
Abstract
We employed phylogenomic methods to study molecular evolutionary processes and phylogeny in the geographically widely dispersed New World diploid cottons (Gossypium, subg. Houzingenia). Whole genome resequencing data (average of 33× genomic coverage) were generated to reassess the phylogenetic history of the subgenus and provide a temporal framework for its diversification. Phylogenetic analyses indicate that the subgenus likely originated following transoceanic dispersal from Africa about 6.6 Ma, but that nearly all of the biodiversity evolved following rapid diversification in the mid-Pleistocene (0.5–2.0 Ma), with multiple long-distance dispersals required to account for range expansion to Arizona, the Galapagos Islands, and Peru. Comparative analyses of cpDNA versus nuclear data indicate that this history was accompanied by several clear cases of interspecific introgression. Repetitive DNAs contribute roughly half of the total 880 Mb genome, but most transposable element families are relatively old and stable among species. In the genic fraction, pairwise synonymous mutation rates average 1% per Myr, with nonsynonymous changes being about seven times less frequent. Over 1.1 million indels were detected and phylogenetically polarized, revealing a 2-fold bias toward deletions over small insertions. We suggest that this genome down-sizing bias counteracts genome size growth by TE amplification and insertions, and helps explain the relatively small genomes that are restricted to this subgenus. Compared with the rate of nucleotide substitution, the rate of indel occurrence is much lower, averaging about 17 nucleotide substitutions per indel event.
Abstract
Symbiosis is now recognized as a driving force in evolution, a role that finds its ultimate expression in the variety of associations bonding insects with microbial symbionts. These associations have contributed to the evolutionary success of insects, with the hosts acquiring the capacity to exploit novel ecological niches, and the symbionts passing from facultative associations to obligate, mutualistic symbioses. In bacterial symbionts of insects, the transition from the free-living lifestyle to mutualistic symbiosis often resulted in a reduction in the genome size, with the generation of the smallest bacterial genomes thus far described. Here, we show that the process of genome reduction is still occurring in Asaia, a group of bacterial symbionts associated with a variety of insects. Indeed, comparative genomics of Asaia isolated from different mosquito species revealed a substantial genome size and gene content reduction in Asaia from Anopheles darlingi, a South American malaria vector. We thus propose Asaia as a novel model to study genome reduction dynamics, within a single bacterial taxon, evolving in a common biological niche.
Abstract
Histidine kinases (HKs) are primary sensor proteins that act in cell signaling pathways generically referred to as “two-component systems” (TCSs). TCSs are among the most widely distributed transduction systems used by both prokaryotic and eukaryotic organisms to detect and respond to a broad range of environmental cues. The structure and distribution of HK proteins are now well documented in prokaryotes, but information is still fragmentary for eukaryotes. Here, we have taken advantage of recent genomic resources to explore the structural diversity and the phylogenetic distribution of HKs in the prominent eukaryotic supergroups. Searches of the genomes of 67 eukaryotic species spread evenly throughout the phylogenetic tree of life identified 748 predicted HK proteins. Independent phylogenetic analyses of predicted HK proteins were carried out for each of the major eukaryotic supergroups. This allowed most of the compiled sequences to be categorized into previously described HK groups. Beyond the phylogenetic analysis of eukaryotic HKs, this study revealed some interesting findings: 1) characterization of some previously undescribed eukaryotic HK groups with predicted functions putatively related to physiological traits; 2) discovery of HK groups that were previously believed to be restricted to a single kingdom in additional supergroups; and 3) indications that some evolutionary paths have led to the appearance, transfer, duplication, and loss of HK genes in some phylogenetic lineages. This study provides an unprecedented overview of the structure and distribution of HKs in the Eukaryota and represents a first step toward deciphering the evolution of TCS signaling in living organisms.