Ever since Darwin first set foot on the Galápagos, evolutionary biologists have known that the geographic isolation of archipelagos has helped spur the formation of new species.
Mol. Biol. Evol. doi:10.1093/molbev/msy226
Abstract: The emergence of islands has been linked to spectacular radiations of diverse organisms. Although penguins spend much of their lives at sea, they rely on land for nesting, and a high proportion of extant species are endemic to geologically young islands. Islands may thus have been crucial to the evolutionary diversification of penguins. We test this hypothesis using a fossil-calibrated phylogeny of mitochondrial genomes (mitogenomes) from all extant and recently extinct penguin taxa. Our temporal analysis demonstrates that numerous recent island-endemic penguin taxa diverged following the formation of their islands during the Plio-Pleistocene, including the Galápagos (Galápagos Islands), northern rockhopper (Gough Island), erect-crested (Antipodes Islands), Snares crested (Snares) and royal (Macquarie Island) penguins. Our analysis also reveals two new recently extinct island-endemic penguin taxa from New Zealand’s Chatham Islands: Eudyptes warhami sp. nov. and a dwarf subspecies of the yellow-eyed penguin, Megadyptes antipodes richdalei ssp. nov. Eudyptes warhami diverged from the Antipodes Islands erect-crested penguin between 1.1 and 2.5 Ma, shortly after the emergence of the Chatham Islands (∼3 Ma). This new finding of recently evolved taxa on this young archipelago provides further evidence that the radiation of penguins over the last 5 Ma has been linked to island emergence. Mitogenomic analyses of all penguin species, and the discovery of two new extinct penguin taxa, highlight the importance of island formation in the diversification of penguins, as well as the extent to which anthropogenic extinctions have affected island-endemic taxa across the Southern Hemisphere’s isolated archipelagos.
Abstract: The evolution of HIV-1 protein sequences should be governed by a combination of factors including nucleotide mutational probabilities, the genetic code, and fitness. The impact of these factors on protein sequence evolution is interdependent, making it challenging to infer the individual contribution of each factor from phylogenetic analyses alone. We investigated the protein sequence evolution of HIV-1 by determining an experimental fitness landscape of all individual amino acid changes in protease. We compared our experimental results to the frequency of protease variants in a publicly available data set of 32,163 sequenced isolates from drug-naïve individuals. The most common amino acids in sequenced isolates supported robust experimental fitness, indicating that the experimental fitness landscape captured key features of selection acting on protease during viral infections of hosts. Amino acid changes requiring multiple mutations from the likely ancestor were slightly less likely to support robust experimental fitness than single mutations, consistent with the genetic code favoring chemically conservative amino acid changes. Amino acids that were common in sequenced isolates were predominantly accessible by single mutations from the likely protease ancestor. Multiple mutations commonly observed in isolates were accessible by mutational walks with highly fit single mutation intermediates. Our results indicate that the prevalence of multiple-base mutations in HIV-1 protease is strongly influenced by mutational sampling.
Abstract: The pattern of molecular evolution varies among gene sites and genes in a genome. By taking into account the complex heterogeneity of evolutionary processes among sites in a genome, Bayesian infinite mixture models of genomic evolution enable robust phylogenetic inference. With large modern data sets, however, the computational burden of Markov chain Monte Carlo sampling techniques becomes prohibitive. Here, we have developed a variational Bayesian procedure to speed up the widely used PhyloBayes MPI program, which deals with the heterogeneity of amino acid profiles. Rather than sampling from the posterior distribution, the procedure approximates the (unknown) posterior distribution using a manageable distribution called the variational distribution. The parameters of the variational distribution are estimated by minimizing the Kullback–Leibler divergence. To examine performance, we analyzed three empirical data sets consisting of mitochondrial, plastid-encoded, and nuclear proteins. Our variational method accurately approximated the Bayesian inference of the phylogenetic tree, the mixture proportions, and the amino acid propensity of each component of the mixture while using orders of magnitude less computational time.
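To make the idea of fitting a variational distribution concrete, here is a minimal Python sketch. It is not the PhyloBayes MPI procedure. It assumes a one-dimensional stand-in "posterior" discretized on a grid, a Gaussian variational family, and a brute-force grid search in place of a real optimizer. All function names and numerical values are hypothetical.

```python
import math

def discretized_gaussian(grid, mu, sigma):
    """Return a normalized Gaussian-shaped probability vector on the grid."""
    weights = [math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) for x in grid]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions on the same support."""
    kl = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue  # the term 0 * log(0 / p) is taken as 0
        if pi == 0.0:
            return math.inf  # q puts mass where p has none
        kl += qi * math.log(qi / pi)
    return kl

def fit_variational(target, grid, mus, sigmas):
    """Grid search for the (mu, sigma) whose Gaussian minimizes KL(q || target)."""
    best = None
    for mu in mus:
        for sigma in sigmas:
            q = discretized_gaussian(grid, mu, sigma)
            kl = kl_divergence(q, target)
            if best is None or kl < best[0]:
                best = (kl, mu, sigma)
    return best

# Hypothetical setup: the target is itself Gaussian, so the fit can be exact.
grid = [i / 10.0 for i in range(-50, 51)]        # -5.0 .. 5.0
target = discretized_gaussian(grid, 1.0, 0.5)    # stand-in "posterior"
mus = [i / 4.0 for i in range(-8, 9)]            # -2.0 .. 2.0 in steps of 0.25
sigmas = [0.25, 0.5, 1.0, 2.0]
kl, mu, sigma = fit_variational(target, grid, mus, sigmas)
```

In this toy setting the search recovers the parameters of the target because the target lies inside the variational family. With a real posterior the minimum divergence would be greater than zero, and the quality of the approximation depends on how flexible the chosen family is.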
Abstract: We present version 10 of OrthoMaM, a database of orthologous mammalian markers. OrthoMaM is already 11 years old, and since the outset it has kept improving, providing high-quality alignments and phylogenetic trees computed with state-of-the-art methods on up-to-date data. The main contribution of this version is the increase in the number of taxa: 116 mammalian genomes for 14,509 one-to-one orthologous genes. This has been made possible by combining genomic data deposited in Ensembl with additional good-quality genomes available only in NCBI. Version 10 users will benefit from pipeline improvements and a completely redesigned web interface.
Abstract: Genetic code deviations involving stop codons have been previously reported in mitochondrial genomes of several green plants (Viridiplantae), most notably chlorophyte algae (Chlorophyta). However, as changes in codon recognition from one amino acid to another are more difficult to infer, such changes might have gone unnoticed in particular lineages with high evolutionary rates that are otherwise prone to codon reassignments. To gain further insight into the evolution of the mitochondrial genetic code in green plants, we have conducted an in-depth study across mtDNAs from 51 green plants (32 chlorophytes and 19 streptophytes). Besides confirming known stop-to-sense reassignments, our study documents the first cases of sense-to-sense codon reassignments in Chlorophyta mtDNAs. In several Sphaeropleales, we report the decoding of AGG codons (normally arginine) as alanine, by tRNA(CCU) of various origins that carry the recognition signature for alanine tRNA synthetase. In Chromochloris, we identify tRNA variants decoding AGG as methionine and the synonymous codon CGG as leucine. Finally, we find strong evidence supporting the decoding of AUA codons (normally isoleucine) as methionine in Pycnococcus. Our results rely on a recently developed conceptual framework (CoreTracker) that predicts codon reassignments based on the disparity between DNA sequence (codons) and the derived protein sequence. These predictions are then validated by an evaluation of tRNA phylogeny, to identify the evolution of new tRNAs via gene duplication and loss, and structural modifications that lead to the assignment of new tRNA identities and a change in the genetic code.
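The effect of a sense-to-sense reassignment on a translated sequence can be illustrated with a short Python sketch. It is not CoreTracker and does not predict anything: it only translates a made-up DNA fragment under the standard genetic code and under codes where a single codon has been reassigned, as reported above for AGG and AUA. Using the standard code as the baseline is a simplifying assumption, since the real mitochondrial codes of these algae may differ in other codons too. The function names and the example sequence are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering (TTT, TTC, TTA, ...).
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

def build_code(reassignments=None):
    """Return a codon -> amino acid table, optionally with reassigned codons."""
    codons = [a + b + c for a in BASES for b in BASES for c in BASES]
    table = dict(zip(codons, STANDARD_AAS))
    table.update(reassignments or {})
    return table

def translate(seq, table):
    """Translate a DNA or RNA sequence in frame 1, ignoring a trailing partial codon."""
    dna = seq.upper().replace("U", "T")
    usable = len(dna) - len(dna) % 3
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))

fragment = "ATGAGGGCTATA"  # made-up fragment: ATG AGG GCT ATA

standard = translate(fragment, build_code())
agg_as_ala = translate(fragment, build_code({"AGG": "A"}))  # AGG read as alanine
aua_as_met = translate(fragment, build_code({"ATA": "M"}))  # AUA read as methionine
```

Here `standard` is `"MRAI"`, `agg_as_ala` is `"MAAI"`, and `aua_as_met` is `"MRAM"`. Mismatches of this kind between codons and the residues expected from conserved alignments are the sort of disparity the abstract refers to.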
Abstract: Resolving the relationships of animals (Metazoa) is crucial to our understanding of the origin of key traits such as muscles, guts, and nerves. However, a broadly accepted metazoan consensus phylogeny has yet to emerge. In part, this is because the genomes of deeply diverging and fast-evolving lineages may undergo significant gene turnover, reducing the number of orthologs shared with related phyla. This can limit the usefulness of traditional phylogenetic methods that rely on alignments of orthologous sequences. Phylogenetic analysis of gene content has the potential to circumvent this orthology requirement, with binary presence/absence of homologous gene families representing a source of phylogenetically informative characters. Applying binary substitution models to the gene content of 26 complete animal genomes, we demonstrate that patterns of gene conservation differ markedly depending on whether gene families are defined by orthology or homology, that is, whether paralogs are excluded or included. We conclude that the placement of some deeply diverging lineages may exceed the limit of resolution afforded by the current methods based on comparisons of orthologous protein sequences, and novel approaches are required to fully capture the evolutionary signal from genes within genomes.
Abstract: New species arise from pre-existing species and inherit similar genomes and environments. This predicts greater similarity of the tempo of molecular evolution between direct ancestors and descendants, resulting in autocorrelation of evolutionary rates in the tree of life. Surprisingly, molecular sequence data have not confirmed this expectation, possibly because available methods lack the power to detect autocorrelated rates. Here, we present a machine learning method, CorrTest, to detect the presence of rate autocorrelation in large phylogenies. CorrTest is computationally efficient and performs better than the available state-of-the-art method. Application of CorrTest reveals extensive rate autocorrelation in DNA and amino acid sequence evolution of mammals, birds, insects, metazoans, plants, fungi, parasitic protozoans, and prokaryotes. Therefore, rate autocorrelation is a common phenomenon throughout the tree of life. These findings suggest concordance between molecular and nonmolecular evolutionary patterns, and they will foster unbiased and precise dating of the tree of life.
Abstract: The resolution of the broad-scale tree of eukaryotes is constantly improving, but the evolutionary origin of several major groups remains unknown. Resolving the phylogenetic position of these “orphan” groups is important, especially those that originated early in evolution, because they represent missing evolutionary links between established groups. Telonemia is one such orphan taxon for which little is known. The group is composed of molecularly diverse biflagellated protists, often prevalent although not abundant in aquatic environments. Telonemia has been hypothesized to represent a deeply diverging eukaryotic phylum but no consensus exists as to where it is placed in the tree. Here, we established cultures and report the phylogenomic analyses of three new transcriptome data sets for divergent telonemid lineages. All our phylogenetic reconstructions, based on 248 genes and using site-heterogeneous mixture models, robustly resolve the evolutionary origin of Telonemia as sister to the Sar supergroup. This grouping remains well supported when as few as 60% of the genes are randomly subsampled, thus is not sensitive to the sets of genes used but requires a minimal alignment length to recover enough phylogenetic signal. Telonemia occupies a crucial position in the tree to examine the origin of Sar, one of the most lineage-rich eukaryote supergroups. We propose the moniker “TSAR” to accommodate this new mega-assemblage in the phylogeny of eukaryotes.
Abstract: The terrestrial isopod Armadillidium vulgare is an original model to study the evolution of sex determination and symbiosis in animals. Its sex can be determined by ZW sex chromosomes, or by feminizing Wolbachia bacterial endosymbionts. Here, we report the sequence and analysis of the ZW female genome of A. vulgare. A distinguishing feature of the 1.72 gigabase assembly is the abundance of repeats (68% of the genome). We show that the Z and W sex chromosomes are essentially undifferentiated at the molecular level and the W-specific region is extremely small (at most several hundred kilobases). Our results suggest that recombination suppression has not spread very far from the sex-determining locus, if at all. This is consistent with A. vulgare possessing evolutionarily young sex chromosomes. We characterized multiple Wolbachia nuclear inserts in the A. vulgare genome, none of which is associated with the W-specific region. We also identified several candidate genes that may be involved in the sex determination or sexual differentiation pathways. The A. vulgare genome serves as a resource for studying the biology and evolution of crustaceans, one of the most speciose and emblematic metazoan groups.
Abstract: The mitochondrial intermembrane space evolved from the bacterial periplasm. Presumably as a consequence of their common origin, most proteins of these compartments are stabilized by structural disulfide bonds. The molecular machineries that mediate oxidative protein folding in bacteria and mitochondria, however, appear to share no common ancestry. Here we tested whether the enzymes Erv1 and Mia40 of the yeast mitochondrial disulfide relay could be functionally replaced by corresponding components of other compartments. We found that the sulfhydryl oxidase Erv1 could be replaced by the Ero1 oxidase or the protein disulfide isomerase from the endoplasmic reticulum, albeit at the cost of respiration deficiency. In contrast to Erv1, the mitochondrial oxidoreductase Mia40 proved to be indispensable and could not be replaced by thioredoxin-like enzymes, including the cytoplasmic reductase thioredoxin, the periplasmic dithiol oxidase DsbA, and Pdi1. From our studies we conclude that its profound inertness against glutathione, its slow oxidation kinetics, and its high affinity to substrates render Mia40 a unique and essential component of mitochondrial biogenesis. Evidently, the development of a specific mitochondrial disulfide relay system represented a crucial step in the evolution of the eukaryotic cell.
Abstract: Substitutions between chemically distant amino acids are known to occur less frequently than those between more similar amino acids. This knowledge, however, is not reflected in most codon substitution models, which treat all nonsynonymous changes as if they were equivalent in terms of impact on the protein. A variety of methods for integrating chemical distances into models have been proposed, with a common approach being to divide substitutions into radical or conservative categories. Nevertheless, it remains unclear whether the resulting models describe sequence evolution better than their simpler counterparts. We propose a parametric codon model that distinguishes between radical and conservative substitutions, allowing us to assess if radical substitutions are preferentially removed by selection. Applying our new model to a range of phylogenomic data, we find differentiating between radical and conservative substitutions provides significantly better fit for large populations, but see no equivalent improvement for smaller populations. Comparing codon and amino acid models using these same data shows that alignments from large populations tend to select phylogenetic models containing information about amino acid exchangeabilities, whereas the structure of the genetic code is more important for smaller populations. Our results suggest selection against radical substitutions is, on average, more pronounced in large populations than smaller ones. The reduced observable effect of selection in smaller populations may be due to stronger genetic drift making it more challenging to detect preferences. Our results imply an important connection between the life history of a phylogenetic group and the model that best describes its evolution.
Abstract: Allopolyploidy, combining interspecific hybridization with whole genome duplication, has had a significant impact on plant evolution. Its evolutionary success is related to the rapid and profound genome reorganizations that allow neoallopolyploids to form and adapt. Nevertheless, how neoallopolyploid genomes adapt to regulate their expression remains poorly understood. The hypothesis of a major role for small noncoding RNAs (sRNAs) in mediating the transcriptional response of neoallopolyploid genomes has progressively emerged. Generally, 21-nt sRNAs mediate posttranscriptional gene silencing by mRNA cleavage, whereas 24-nt sRNAs repress transcription (transcriptional gene silencing) through epigenetic modifications. Here, we characterize the global response of sRNAs to allopolyploidy in Brassica, using three independently resynthesized Brassica napus allotetraploids originating from crosses between diploid Brassica oleracea and Brassica rapa accessions, surveyed at two different generations in comparison with their diploid progenitors. Our results suggest an immediate but transient response of specific sRNA populations to allopolyploidy. These sRNA populations mainly target noncoding components of the genome but also target the transcriptional regulation of genes involved in response to stresses and in metabolism; this suggests a broad role in adapting to allopolyploidy. We finally identify the early accumulation of both 21- and 24-nt sRNAs involved in regulating the same targets, supporting a posttranscriptional gene silencing to transcriptional gene silencing shift at the first stages of the neoallopolyploid formation. We propose that reorganization of sRNA production is an early response to allopolyploidy in order to control the transcriptional reactivation of various noncoding elements and stress-related genes, thus ensuring genome stability during the first steps of neoallopolyploid formation.
Abstract: Pre-existing and de novo genetic variants can both drive adaptation to environmental changes, but their relative contributions and interplay remain poorly understood. Here we investigated the evolutionary dynamics in drug-treated yeast populations with different levels of pre-existing variation by experimental evolution coupled with time-resolved sequencing and phenotyping. We found that a doubling of pre-existing variation alone boosts adaptation by 64.1% and 51.5% in hydroxyurea and rapamycin, respectively. The causative pre-existing and de novo variants were selected on shared targets: RNR4 in hydroxyurea, and TOR1 and TOR2 in rapamycin. Interestingly, the pre-existing and de novo TOR variants map to different functional domains and act via distinct mechanisms. The pre-existing TOR variants from two domesticated strains exhibited opposite rapamycin resistance effects, reflecting lineage-specific functional divergence. This study provides a dynamic view on how pre-existing and de novo variants interactively drive adaptation and deepens our understanding of clonally evolving populations.
Abstract: Gene-environment association (GEA) studies are essential to understand the past and ongoing adaptations of organisms to their environment, but those studies are complicated by confounding due to unobserved demographic factors. Although the confounding problem has recently received considerable attention, the proposed approaches do not scale with the high dimensionality of genomic data. Here, we present a new estimation method for latent factor mixed models (LFMMs) implemented in an upgraded version of the corresponding computer program. We developed a least-squares estimation approach for confounder estimation that provides a unique framework for several categories of genomic data, not restricted to genotypes. The new algorithm is several orders of magnitude faster than existing GEA approaches and than our previous version of the LFMM program. In addition, the new method outperforms other fast approaches based on principal component or surrogate variable analysis. We illustrate the program use with analyses of the 1000 Genomes Project data set, leading to new findings on adaptation of humans to their environment, and with analyses of DNA methylation profiles providing insights on how tobacco consumption could affect DNA methylation in patients with rheumatoid arthritis. Software availability: Software is available in the R package lfmm at https://bcm-uga.github.io/lfmm/.
Abstract: MicroRNAs (miRNAs) are important posttranscriptional regulators of gene expression. However, comprehensive expression profiles of miRNAs during mammalian spermatogenesis are lacking. Herein, we sequenced small RNAs in highly purified mouse spermatogenic cells at different stages. We found that a family of X-linked miRNAs named spermatogenesis-related miRNAs (spermiRs) is predominantly expressed in the early meiotic phases and has a conserved testis-specific high expression pattern in different mammals. We identified one spermiR homolog in opossum; this homolog might originate from THER1, a retrotransposon that is active in marsupials but extinct in current placental mammals. SpermiRs have expanded rapidly with mammalian evolution and have diverged into two clades, spermiR-L and spermiR-R, which are likely to have been generated at least in part by tandem duplication mediated by flanking retrotransposable elements. Notably, despite having undergone highly frequent lineage-specific duplication events, the sequences encoding all spermiR family members are strictly located between two protein-coding genes, Slitrk2 and Fmr1. Moreover, spermiR-Ls and spermiR-Rs have evolved different expression patterns during spermatogenesis in different mammals. Intriguingly, the seed sequences of spermiRs, which are critical for the recognition of target genes, are highly divergent within and among mammals, whereas spermiR target genes largely overlap. When miR-741, the most highly expressed spermiR, is knocked out in cultured mouse spermatogonial stem cells (SSCs), another spermiR, miR-465a-5p, is dramatically upregulated and becomes the most abundant miRNA. Notably, miR-741−/− SSCs grow normally, and the genome-wide expression levels of mRNAs remain unchanged. All these observations indicate functional compensation between spermiR family members and strong coevolution between spermiRs and their targets.
Abstract: The modification of adenosine to inosine at the first position of transfer RNA (tRNA) anticodons (I34) is widespread among bacteria and eukaryotes. In bacteria, the modification is found in tRNAArg and is catalyzed by tRNA adenosine deaminase A, a homodimeric enzyme. In eukaryotes, I34 is introduced in up to eight different tRNAs by the heterodimeric adenosine deaminase acting on tRNA. This substrate expansion significantly influenced the evolution of eukaryotic genomes in terms of codon usage and tRNA gene composition. However, the selective advantages driving this process remain unclear. Here, we have studied the evolution of I34, tRNA adenosine deaminase A, adenosine deaminase acting on tRNA, and their relevant codons in a large set of bacterial and eukaryotic species. We show that a functional expansion of I34 to tRNAs other than tRNAArg also occurred within bacteria, in a process likely initiated by the emergence of unmodified A34-containing tRNAs. In eukaryotes, we report on a large variability in the use of I34 in protists, in contrast to a more uniform presence in fungi, plants, and animals. Our data support that the eukaryotic expansion of I34-tRNAs was driven by the improvement brought by these tRNAs to the synthesis of proteins highly enriched in certain amino acids.
Abstract: We present a new phylogenetic approach, selection on amino acids and codons (SelAC), whose substitution rates are based on a nested model linking protein expression to population genetics. Unlike simpler codon models that assume a single substitution matrix for all sites, our model more realistically represents the evolution of protein-coding DNA under the assumption of consistent, stabilizing selection using a cost–benefit approach. This cost–benefit approach allows us to generate a set of 20 optimal amino acid-specific matrix families using just a handful of parameters and naturally links the strength of stabilizing selection to protein synthesis levels, which we can estimate. Using a yeast data set of 100 orthologs for 6 taxa, we find SelAC fits the data much better than popular models by 10^4–10^5 Akaike information criterion units adjusted for small sample bias. Our results also indicated that nested, mechanistic models better predict observed data patterns, highlighting the improvement in biological realism in amino acid sequence evolution that our model provides. Additional parameters estimated by SelAC indicate that a large amount of nonphylogenetic, but biologically meaningful, information can be inferred from existing data. For example, SelAC prediction of gene-specific protein synthesis rates correlates well with both empirical (r=0.33–0.48) and other theoretical predictions (r=0.45–0.64) for multiple yeast species. SelAC also provides estimates of the optimal amino acid at each site. Finally, because SelAC is a nested approach based on clearly stated biological assumptions, future modifications, such as including shifts in the optimal amino acid sequence within or across lineages, are possible.
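The information criterion adjusted for small sample bias is commonly written AICc = 2k − 2 ln L + 2k(k + 1)/(n − k − 1), where k is the number of free parameters, ln L the maximized log-likelihood, and n the sample size. The Python sketch below computes it and compares two models. The function name and all input values are hypothetical and are not taken from the SelAC analysis.

```python
def aicc(log_likelihood, n_params, n_samples):
    """Akaike information criterion with the small-sample correction (AICc)."""
    if n_samples - n_params - 1 <= 0:
        raise ValueError("AICc is undefined unless n_samples > n_params + 1")
    aic = 2.0 * n_params - 2.0 * log_likelihood
    correction = 2.0 * n_params * (n_params + 1) / (n_samples - n_params - 1)
    return aic + correction

# Hypothetical fits of two models to the same data set of 24 observations.
simple_model = aicc(log_likelihood=-100.0, n_params=3, n_samples=24)
richer_model = aicc(log_likelihood=-96.0, n_params=5, n_samples=24)

# A positive delta means the richer model is preferred (lower AICc is better).
delta = simple_model - richer_model
```

Only differences in AICc between models fitted to the same data are meaningful, which is why model comparisons are reported in AICc units rather than as absolute scores.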
Liuqi Gu and James R. Walters
Abstract: The regulation of gene expression and RNA maturation underlies fundamental processes such as cell homeostasis, development, and stress acclimation. The biogenesis and modification of RNA are tightly controlled by an array of regulatory RNAs and nucleic acid-binding proteins. While the role of small RNAs (sRNAs) in gene expression has been studied in depth in select model organisms, little is known about sRNA biology across the eukaryotic tree of life. We used deep sequencing to explore the repertoires of sRNAs encoded by the miniaturized, endosymbiotically derived “nucleomorph” genomes of two single-celled algae, the cryptophyte Guillardia theta and the chlorarachniophyte Bigelowiella natans. A total of 32.3 and 35.3 million reads were generated from G. theta and B. natans, respectively. In G. theta, we identified nucleomorph U1, U2, and U4 spliceosomal small nuclear RNAs (snRNAs) as well as 11 C/D box small nucleolar RNAs (snoRNAs), five of which have potential plant and animal homologs. The snoRNAs are predicted to perform 2′-O methylation of rRNA (but not snRNA). In B. natans, we found the previously undetected 5S rRNA as well as six orphan sRNAs. Analysis of chlorarachniophyte snRNAs shed light on the removal of the miniature 18–21 nt introns found in B. natans nucleomorph genes. Neither of the nucleomorph genomes appears to encode RNA pseudouridylation machinery, and U5 snRNA cannot be found in the cryptophyte G. theta. Considering the central roles of U5 snRNA and RNA modifications in other organisms, cytoplasm-to-nucleomorph RNA shuttling in cryptophyte algae is a distinct possibility.
Abstract: The Chinese forest musk deer (Moschus berezovskii; FMD) is an artiodactyl mammal and is both economically valuable and highly endangered. To investigate the genetic mechanisms of musk secretion and adaptive immunity in FMD, we compared its genome to nine other artiodactyl genomes. Comparative genomics demonstrated that eight positively selected genes (PSGs) in FMD were annotated in three KEGG pathways that were related to metabolic and synthetic activity of musk, similar to previous transcriptome studies. Functional enrichment analysis indicated that many PSGs were involved in the regulation of immune system processes, implying important reorganization of the immune system in FMD. FMD-specific missense mutations were found in two PSGs (MHC class II antigen DRA and ADA) that were classified as deleterious by PolyPhen-2, possibly contributing to immune adaptation to infectious diseases. Functional assessment showed that the FMD-specific mutation enhanced the ADA activity, which was likely to strengthen the immune defense against pathogenic invasion. Single nucleotide polymorphism-based inference showed the recent demographic trajectory for FMD. Our data and findings provide valuable genomic resources not only for studying the genetic mechanisms of musk secretion and adaptive immunity, but also for facilitating more effective management of the captive breeding programs for this endangered species.
Abstract: South Asia has a complex history of migrations and is characterized by substantial pigmentary and genetic diversity. For this reason, it is an ideal region to study the genetic architecture of normal pigmentation variation. Here, we present a meta-analysis of two genome-wide association studies (GWASs) of skin pigmentation using skin reflectance (M-index) as a quantitative phenotype. The meta-analysis includes a sample of individuals of South Asian descent living in Canada (N = 348), and a sample of individuals from two caste and four tribal groups from West Maharashtra, India (N = 480). We also present the first GWAS of iris color in South Asian populations. This GWAS was based on quantitative measures of iris color obtained from high-resolution iris pictures. We identified genome-wide significant associations of variants within the well-known gene SLC24A5, including the nonsynonymous rs1426654 polymorphism, with both skin pigmentation and iris color, highlighting the pleiotropic effects of this gene on pigmentation. Variants in the HERC2 gene (e.g., rs12913832) were also associated with iris color and iris heterochromia. Our study emphasizes the usefulness of quantitative methods to study iris color variation. We also identified novel genome-wide significant associations with skin pigmentation and iris color, but we could not replicate these associations due to the lack of independent samples. It will be critical to expand the number of studies in South Asian populations in order to better understand the genetic variation driving the diversity of skin pigmentation and iris color observed in this region.
Abstract: PIWI proteins and their guiding Piwi-interacting (pi-) RNAs direct the silencing of target nucleic acids in the animal germline and soma. Although in mammal testes fetal piRNAs are involved in extensive silencing of transposons, pachytene piRNAs have additionally been shown to act in post-transcriptional gene regulation. The bulk of pachytene piRNAs is produced from large genomic loci, named piRNA clusters. Recently, the presence of reversed pseudogenes within piRNA clusters prompted the idea that piRNAs derived from such sequences might direct regulation of their parent genes. Here, we examine primate piRNA clusters and integrated pseudogenes in a comparative approach to gain a deeper understanding of mammalian piRNA cluster evolution and the presumed gene-regulatory role of pseudogene-derived piRNAs. Initially, we provide a broad analysis of the evolutionary relationships of piRNA clusters and their differential activity among six primate species. Subsequently, we show that pseudogenes in reverse orientation relative to piRNA cluster transcription direction generally do not exhibit signs of selection pressure and cause weakly conserved targeting of homologous genes among species, suggesting a lack of functional constraints and thus only a minor significance for gene regulation in most cases. Finally, we report that piRNA-producing loci generally tend to be located in active genomic regions with elevated gene and pseudogene density. Thus, we conclude that the presence of most pseudogenes in piRNA clusters might be regarded as a byproduct of piRNA cluster generation, whereas this does not exclude that some pseudogenes nevertheless play critical roles in individual cases.
Abstract: Males and females of Artemia franciscana, a crustacean commonly used in the aquarium trade, are highly dimorphic. Sex is determined by a pair of ZW chromosomes, but the nature and extent of differentiation of these chromosomes is unknown. Here, we characterize the Z chromosome by detecting genomic regions that show lower genomic coverage in female than in male samples, and regions that harbor an excess of female-specific SNPs. We detect many Z-specific genes, which no longer have homologs on the W, but also Z-linked genes that appear to have diverged very recently from their existing W-linked homolog. We assess patterns of male and female expression in two tissues with extensive morphological dimorphism, gonads and heads. In agreement with their morphology, sex-biased expression is common in both tissues. Interestingly, the Z chromosome is not enriched for sex-biased genes, and seems to in fact have a mechanism of dosage compensation that leads to equal expression in males and in females. Both of these patterns are contrary to most ZW systems studied so far, making A. franciscana an excellent model for investigating the interplay between the evolution of sexual dimorphism and dosage compensation, as well as Z chromosome evolution in general.
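The coverage-based logic can be sketched in a few lines of Python. In a ZW system, females carry one Z and males two, so a Z-specific region is expected to show roughly half the read depth in females, that is, a log2 female-to-male ratio near −1, whereas autosomal regions sit near 0. The sketch below is not the authors' pipeline: the function names, the scaffold names, the coverage values, and the −0.5 cutoff are all hypothetical, and the coverages are assumed to be already normalized for sequencing depth.

```python
import math

def log2_female_male_ratio(female_cov, male_cov):
    """log2 of female over male coverage; about -1 for Z-specific regions."""
    return math.log2(female_cov / male_cov)

def candidate_z_scaffolds(coverage, threshold=-0.5):
    """Return scaffolds whose log2(F/M) coverage ratio falls below the threshold.

    `coverage` maps a scaffold name to a (female_cov, male_cov) pair.
    Scaffolds with zero coverage in either sex are skipped.
    """
    hits = []
    for name, (female_cov, male_cov) in coverage.items():
        if female_cov <= 0 or male_cov <= 0:
            continue
        if log2_female_male_ratio(female_cov, male_cov) < threshold:
            hits.append(name)
    return sorted(hits)

# Made-up normalized coverages for three scaffolds.
coverage = {
    "scaffold_a": (30.0, 30.5),  # similar in both sexes: autosomal-like
    "scaffold_b": (15.0, 31.0),  # about half in females: Z-specific candidate
    "scaffold_c": (29.0, 28.0),
}
z_candidates = candidate_z_scaffolds(coverage)
```

With these values only `scaffold_b` is flagged. Recently diverged Z- and W-linked homologs would not show this coverage drop, because reads from the W copy still map to the Z, which is why the abstract pairs the coverage approach with a search for female-specific SNPs.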
AbstractCodon usage patterns are affected by both mutational biases and translational selection. The frequency at which each codon is used in the genome is directly linked to the cellular concentrations of their corresponding tRNAs. Transfer RNA abundances—as well as the abundances of other potentially relevant factors, such as RNA-binding proteins—may vary across different tissues, making it possible that genes expressed in different tissues are subject to different translational selection regimes, and thus differ in their patterns of codon usage. These differences, however, are poorly understood, having been studied only in Arabidopsis, rice and human, with controversial results in human. Drosophila melanogaster is a suitable model organism to study tissue-specific codon adaptation given its large effective population size. Here, we compare 2,046 genes, each expressed specifically in one tissue of D. melanogaster. We show that genes expressed in different tissues exhibit significant differences in their patterns of codon usage, and that these differences are only partially due to differences in GC content, expression levels, or protein lengths. Remarkably, these differences are stronger when analyses are restricted to highly expressed genes. Our results strongly suggest that genes expressed in different tissues are subject to different regimes of translational selection.
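As a rough illustration of how codon usage patterns can be summarised per gene, the sketch below computes relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. RSCU is one common summary and is an assumption here; the abstract does not state which measure the authors used. The function name `rscu` is hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter, defaultdict

# Standard genetic code in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TO_AA = dict(zip(CODONS, AMINO))


def rscu(cds):
    """Return {codon: RSCU} for an in-frame DNA coding sequence.

    Amino acids absent from the sequence are skipped; stop codons and
    codons containing non-ACGT characters are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group sense codons into synonymous families by amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TO_AA.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)  # count under equal usage
        for c in family:
            values[c] = counts[c] / expected
    return values


if __name__ == "__main__":
    print(rscu("GCTGCTGCCGCA"))
```

Values above 1 indicate a codon used more often than expected under equal usage. Comparing such profiles between tissue-specific gene sets would additionally require controlling for GC content, expression level, and protein length, as the abstract notes.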
AbstractChlorarachniophyte and cryptophyte algae are unique among plastid-containing species in that they have a nucleomorph genome: a compact, highly reduced nuclear genome from a photosynthetic eukaryotic endosymbiont. Despite their independent origins, the nucleomorph genomes of these two lineages have similar genomic architectures, but little is known about the evolutionary pressures impacting nucleomorph DNA, particularly how their rates of evolution compare to those of the neighboring genetic compartments (the mitochondrion, plastid, and nucleus). Here, we use synonymous substitution rates to estimate relative mutation rates in the four genomes of nucleomorph-bearing algae. We show that the relative mutation rates of the host versus endosymbiont nuclear genomes are similar in both chlorarachniophytes and cryptophytes, despite the fact that nucleomorph gene sequences are notoriously highly divergent. There is some evidence, however, for slightly elevated mutation rates in the nucleomorph DNA of chlorarachniophytes—a feature not observed in that of cryptophytes. For both lineages, relative mutation rates in the plastid appear to be lower than those in the nucleus and nucleomorph (and, in one case, the mitochondrion), which is consistent with studies of other plastid-bearing protists. Given the divergent nature of nucleomorph genes, our finding of relatively low evolutionary rates in these genomes suggests that for both lineages a burst of evolutionary change and/or decreased selection pressures likely occurred early in the integration of the secondary endosymbiont.
AbstractTaxonomic and phylogenetic relationships of Streptococcus mitis and Streptococcus oralis have been difficult to establish biochemically and genetically. We used core-genome analyses of S. mitis and S. oralis, as well as the closely related species Streptococcus pneumoniae and Streptococcus parasanguinis, to clarify the phylogenetic relationships between S. mitis and S. oralis, as well as within subclades of S. oralis. All S. mitis (n = 67), S. oralis (n = 89), S. parasanguinis (n = 27), and 27 S. pneumoniae genome assemblies were downloaded from NCBI and reannotated. All genes were delineated into homologous clusters and maximum-likelihood phylogenies built from putatively nonrecombinant core gene sets. Population structure was determined using Bayesian genome clustering, and patristic distance was calculated between populations. Population-specific gene content was assessed using a phylogenetic-based genome-wide association approach. Streptococcus mitis and S. oralis formed distinct clades, but species mixing suggests taxonomic misassignment. Patristic distance between populations suggests that S. oralis subsp. dentisani is a distinct species, whereas S. oralis subsp. tigurinus and subsp. oralis are supported as subspecies, and that S. mitis comprises two subspecies. None of the genes within the pan-genomes of S. mitis and S. oralis could be statistically correlated with either, and the dispensable genomes showed extensive variation among isolates. These are likely important factors contributing to established overlap in biochemical characteristics for these taxa. Based on core-genome analysis, the substructure of S. oralis and S. mitis should be redefined, and species assignments within S. oralis and S. mitis should be made based on whole-genome analysis to be robust to misassignment.
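Patristic distance, used in the abstract above to compare populations, is the sum of branch lengths along the path connecting two tips of a tree. The sketch below shows the calculation on a toy rooted tree stored as a child-to-parent map. The data structure, the function name `patristic_distance`, and the example branch lengths are hypothetical and are not taken from the study.

```python
def patristic_distance(tree, tip_a, tip_b):
    """Sum of branch lengths on the path between two tips.

    tree: dict mapping each non-root node to (parent, branch_length),
    with non-negative branch lengths.
    Raises ValueError if the two tips share no ancestor.
    """
    def distances_to_ancestors(tip):
        # Cumulative distance from the tip to itself and each ancestor.
        dist = {tip: 0.0}
        node, total = tip, 0.0
        while node in tree:
            parent, length = tree[node]
            total += length
            dist[parent] = total
            node = parent
        return dist

    from_a = distances_to_ancestors(tip_a)
    from_b = distances_to_ancestors(tip_b)
    shared = set(from_a) & set(from_b)
    if not shared:
        raise ValueError("tips are not in the same tree")
    # The most recent common ancestor minimises the summed distance.
    return min(from_a[n] + from_b[n] for n in shared)


if __name__ == "__main__":
    toy_tree = {
        "A": ("n1", 0.1),
        "B": ("n1", 0.2),
        "n1": ("root", 0.3),
        "C": ("root", 0.5),
    }
    print(patristic_distance(toy_tree, "A", "B"))
    print(patristic_distance(toy_tree, "A", "C"))
```

Distances between populations, as opposed to individual tips, would need a further summary such as the mean over all between-population tip pairs. That choice is left open here.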
AbstractElucidating the molecular basis of adaptation to different environmental conditions is important because adaptive ability of a species can shape its distribution, influence speciation, and also drive a variety of evolutionary processes. For crustaceans, colonization of freshwater habitats has significantly impacted diversity, but the molecular basis of this process is poorly understood. In the current study, we examined three prawn species from the genus Macrobrachium (M. australiense, M. tolmerum, and M. novaehollandiae) to better understand the molecular basis of freshwater adaptation using a comparative transcriptomics approach. Each of these species naturally inhabits environments with different salinity levels; here, we exposed them to the same experimental salinity conditions (0‰ and 15‰), to compare expression patterns of candidate genes that previously have been shown to influence phenotypic traits associated with freshwater adaptation (e.g., genes associated with osmoregulation). Differential gene expression analysis revealed 876, 861, and 925 differentially expressed transcripts under the two salinities for M. australiense, M. tolmerum, and M. novaehollandiae, respectively. Of these, 16 were found to be unannotated novel transcripts and may be taxonomically restricted or orphan genes. Functional enrichment and molecular pathway mapping revealed 13 functionally enriched categories and 11 enriched molecular pathways that were common to the three Macrobrachium species. Pattern of selection analysis revealed 26 genes with signatures of positive selection among pairwise species comparisons. Overall, our results indicate that the same key genes and similar molecular pathways are likely to be involved with freshwater adaptation widely across this decapod group, with nonoverlapping sets of genes showing differential expression (mainly osmoregulatory genes) and signatures of positive selection (genes involved with different life history traits).