The Society for Molecular Biology and Evolution is now accepting abstracts for the 2016 annual meeting, taking place on the Gold Coast, Queensland, Australia, July 3-7, 2016.
Mammals have adapted to live in the darkest of caves and the deepest oceans, and from the highest mountains to the plains. Along the way, mammals have also evolved a remarkable capacity in their sense of hearing, from the high-frequency echolocation calls of bats to low-frequency whale songs. Even dogs, our closest companion animals, have developed a hearing range twice as wide as that of people.
New findings from a team of scientists have shown that sea otters have low genetic diversity, which could endanger their health as a species.
Abstract: There is a growing body of evidence demonstrating the impacts of human arrival in new “pristine” environments, including terrestrial habitat alterations and species extinctions. However, the effects of marine resource utilization prior to industrialized whaling, sealing, and fishing have largely remained understudied. The expansion of the Norse across the North Atlantic offers a rare opportunity to study the effects of human arrival and early exploitation of marine resources. Today, there is no local population of walruses on Iceland; however, skeletal remains, place names, and written sources suggest that walruses existed and were hunted by the Norse during the Settlement and Commonwealth periods (870–1262 AD). This study investigates the timing, geographic distribution, and genetic identity of walruses in Iceland by combining historical information, place names, radiocarbon dating, and genomic analyses. The results support a genetically distinct, local population of walruses that went extinct shortly after Norse settlement. The high value of walrus products such as ivory on international markets likely led to intense hunting pressure, which, potentially exacerbated by a warming climate and volcanism, resulted in the extinction of walrus on Iceland. We show that commercial hunting, economic incentives, and trade networks as early as the Viking Age were of sufficient scale and intensity to result in significant, irreversible ecological impacts on the marine environment. This is one of the earliest examples of local extinction of a marine species following human arrival, during the very beginning of commercial marine exploitation.
Abstract: Chloroplasts originated from an ancient cyanobacterium and still harbor a bacterial-like genome. However, the centrality of Shine–Dalgarno ribosome binding, which predominantly regulates proteobacterial translation initiation, is significantly decreased in chloroplasts. As plastid ribosomal RNA anti-Shine–Dalgarno elements are similar to their bacterial counterparts, these sites alone cannot explain this decline. By computational simulation we show that upstream point mutations modulate the local structure of ribosomal RNA in chloroplasts, creating significantly tighter structures around the anti-Shine–Dalgarno locus, which in turn reduce the probability of ribosome binding. To validate our model, we expressed two reporter genes (mCherry, hydrogenase) harboring a Shine–Dalgarno motif in the Chlamydomonas reinhardtii chloroplast. Coexpressing them with a 16S ribosomal RNA, modified according to our model, significantly enhances mCherry and hydrogenase expression compared with coexpression with an endogenous 16S gene.
Abstract: Reverse gyrase (RG) is the only protein found ubiquitously in hyperthermophilic organisms, but absent from mesophiles. As such, its simple presence or absence allows us to deduce information about the optimal growth temperature of long-extinct organisms, even as far as the last universal common ancestor of extant life (LUCA). The growth environment and gene content of the LUCA have long been a source of debate in which RG often features. In an attempt to settle this debate, we carried out an exhaustive search for RG proteins, generating the largest RG data set to date. Comprising 376 sequences, our data set allows for phylogenetic reconstructions of RG with unprecedented size and detail. These RG phylogenies are strikingly different from those of universal proteins inferred to be present in the LUCA, even when using the same set of species. Unlike such proteins, RG does not form monophyletic archaeal and bacterial clades, suggesting RG emergence after the formation of these domains, and/or significant horizontal gene transfer. Additionally, the branch lengths separating archaeal and bacterial groups are very short, inconsistent with the tempo of evolution from the time of the LUCA. Despite this, phylogenies limited to archaeal RG resolve most archaeal phyla, suggesting predominantly vertical evolution since the time of the last archaeal ancestor. In contrast, bacterial RG indicates emergence after the last bacterial ancestor followed by significant horizontal transfer. Taken together, these results suggest a nonhyperthermophilic LUCA and bacterial ancestor, with hyperthermophily emerging early in the evolution of the archaeal and bacterial domains.
Abstract: Some genes have repeatedly been found to control diverse adaptations in a wide variety of organisms. Such gene reuse reveals not only the diversity of phenotypes these unique genes control but also the composition of developmental gene networks and the genetic routes available to and taken by organisms during adaptation. However, the causes of gene reuse remain unclear. A small number of large-effect Mendelian loci control a huge diversity of mimetic butterfly wing color patterns, but reasons for their reuse are difficult to identify because the genetic basis of mimicry has primarily been studied in two systems with correlated factors: female-limited Batesian mimicry in Papilio swallowtails (Papilionidae) and non-sex-limited Müllerian mimicry in Heliconius longwings (Nymphalidae). Here, we break the correlation between phylogenetic relationship and sex-limited mimicry by identifying loci controlling female-limited mimicry polymorphism in Hypolimnas misippus (Nymphalidae) and non-sex-limited mimicry polymorphism in Papilio clytia (Papilionidae). The Papilio clytia polymorphism is controlled by the genome region containing the gene cortex, the classic P supergene in Heliconius numata, and loci controlling color pattern variation across Lepidoptera. In contrast, female-limited mimicry polymorphism in Hypolimnas misippus is associated with a locus not previously implicated in color patterning. Thus, although many species repeatedly converged on cortex and its neighboring genes over 120 My of evolution of diverse color patterns, female-limited mimicry polymorphisms each evolved using a different gene. Our results support conclusions that gene reuse occurs mainly within ∼10 My and highlight the puzzling diversity of genes controlling seemingly complex female-limited mimicry polymorphisms.
Abstract: Many biomolecular machines need to be both fast and efficient. How has evolution optimized these machines along the tradeoff between speed and efficiency? We explore this question using optimizable dynamical models along coordinates that are plausible evolutionary degrees of freedom. Data on 11 motors and ion pumps are consistent with the hypothesis that evolution seeks an optimal balance of speed and efficiency, where any further small increase in one of these quantities would come at great expense to the other. For FoF1-ATPases in different species, we also find apparent optimization of the number of subunits in the c-ring, which determines the number of protons pumped per ATP synthesized. Interestingly, these ATPases appear to be more optimized for efficiency than for speed, which can be rationalized through their key role as energy transducers in biology. The present modeling shows how the dynamical performance properties of biomolecular motors and pumps may have evolved to suit their corresponding biological actions.
Abstract: The relationship between enzymes and substrates does not perfectly match the “lock and key” model, because enzymes act on molecules other than their true substrate in different catalytic reactions. Such biologically nonfunctional reactions are called “promiscuous activities.” Promiscuous activities are apparently useless, but they can be an important starting point for enzyme evolution. It has been hypothesized that enzymes with low promiscuous activity will show enhanced promiscuous activity under selection pressure and become new specialists through gene duplication. Although this is the prevailing scenario, there are two major problems: 1) it would not apply to prokaryotes because horizontal gene transfer is more significant than gene duplication and 2) there is no direct evidence that promiscuous activity is low without selection pressure. We propose a new scenario including various levels of promiscuous activity throughout a clade and horizontal gene transfer. STAY-GREEN (SGR), a chlorophyll a Mg-dechelating enzyme, has homologous genes in bacteria lacking chlorophyll. We found that some bacterial SGR homologs have much higher Mg-dechelating activities than those of green plant SGRs, while others have no activity, indicating that the level of promiscuous activity varies. A phylogenetic analysis suggests that a bacterial SGR homolog with high dechelating activity was horizontally transferred to a photosynthetic eukaryote. Some SGR homologs acted on various chlorophyll molecules that are not used as substrates by green plant SGRs, indicating that SGR acquired substrate specificity after transfer to eukaryotes. We propose that horizontal transfer of high promiscuous activity is one process of new enzyme acquisition.
Abstract: Most eukaryotes inherit their mitochondria from only one of their parents. When there are different sexes, it is almost always the maternal mitochondria that are transmitted. Indeed, maternal uniparental inheritance has been reported for the brown alga Ectocarpus, but we show in this study that different strains of Ectocarpus can exhibit different patterns of inheritance: Ectocarpus siliculosus strains showed maternal uniparental inheritance, as expected, but crosses using different Ectocarpus species 7 strains exhibited either paternal uniparental inheritance or an unusual pattern of transmission where progeny inherited either maternal or paternal mitochondria, but not both. A possible correlation between the pattern of mitochondrial inheritance and male gamete parthenogenesis was investigated. Moreover, in contrast to observations in the green lineage, we did not detect any change in the pattern of mitochondrial inheritance in mutant strains affected in life cycle progression. Finally, an analysis of field-isolated strains provided evidence of mitochondrial genome recombination in both Ectocarpus species.
Abstract: Mastomys are the most widespread African rodents and carriers of various diseases such as the plague or Lassa virus. In addition, mastomys have rapidly gained a large number of mammary glands. Here, we generated a genome, variome, and transcriptomes for Mastomys coucha. As mastomys diverged at similar times from mouse and rat, we demonstrate their utility as a comparative genomic tool for these commonly used animal models. Furthermore, we identified over 500 mastomys accelerated regions, often residing near important mammary developmental genes or within their exons leading to protein sequence changes. Functional characterization of a noncoding mastomys accelerated region, located in the HoxD locus, showed enhancer activity in mouse developing mammary glands. Combined, our results provide genomic resources for mastomys and highlight their potential both as a comparative genomic tool and for the identification of mammary gland number determining factors.
Abstract: Longitudinal next-generation sequencing of cancer patient samples has enhanced our understanding of the evolution and progression of various cancers. As a result, and due to our increasing knowledge of heterogeneity, such sampling is becoming increasingly common in research and clinical trial sample collections. Traditionally, the evolutionary analysis of these cohorts involves the use of an aligner followed by subsequent stringent downstream analyses. However, this can lead to large levels of information loss due to the vast mutational landscape that characterizes tumor samples. Here, we propose an alignment-free approach for sequence comparison, a well-established approach in a range of biological applications including typical phylogenetic classification. Such methods could be used to compare information collated in raw sequence files to allow an unsupervised assessment of the evolutionary trajectory of patient genomic profiles. In order to highlight this utility in cancer research we have applied our alignment-free approach using a previously established metric, Jensen–Shannon divergence, and a metric novel to this area, Hellinger distance, to two longitudinal cancer patient cohorts in glioma and clear cell renal cell carcinoma using our software, NUQA. We hypothesize that this approach has the potential to reveal novel information about the heterogeneity and evolutionary trajectory of spatiotemporal tumor samples, potentially revealing early events in tumorigenesis and the origins of metastases and recurrences.
Key words: alignment-free, Hellinger distance, exome-seq, evolution, phylogenetics, longitudinal.
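The two metrics named in this abstract can be illustrated on k-mer frequency profiles. The following is a minimal sketch only: the function names, the choice of k, and the use of plain k-mer frequencies are assumptions made for illustration, and nothing here is taken from the NUQA software itself.

```python
from collections import Counter
from math import log2, sqrt


def kmer_distribution(seq, k=3):
    """Relative frequency of each overlapping k-mer in seq (hypothetical helper)."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logarithms, so values lie in [0, 1]."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the union of observed k-mers.
    m = {x: 0.5 * (p.get(x, 0.0) + q.get(x, 0.0)) for x in keys}

    def kl(a):
        # Kullback-Leibler divergence of a from the mixture; zero-probability
        # terms contribute nothing and never appear as keys of a.
        return sum(a[x] * log2(a[x] / m[x]) for x in a)

    return 0.5 * kl(p) + 0.5 * kl(q)


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions; lies in [0, 1]."""
    keys = set(p) | set(q)
    s = sum((sqrt(p.get(x, 0.0)) - sqrt(q.get(x, 0.0))) ** 2 for x in keys)
    return sqrt(s / 2)
```

Both functions return 0 for identical profiles and 1 for profiles that share no k-mers, so a matrix of pairwise values across samples could in principle feed a standard distance-based clustering or tree-building step.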
Abstract: Reconstructing species’ demographic histories is a central focus of molecular ecology and evolution. Recently, an expanding suite of methods leveraging either the sequentially Markovian coalescent (SMC) or the site-frequency spectrum has been developed to reconstruct population size histories from genomic sequence data. However, few studies have investigated the robustness of these methods to genome assemblies of varying quality. In this study, we first present an improved genome assembly for the Tasmanian devil using the Chicago library method. Compared with the original reference genome, our new assembly reduces the number of scaffolds (from 35,975 to 10,010) and increases the scaffold N90 (from 0.101 to 2.164 Mb). Second, we assess the performance of four contemporary genomic methods for inferring population size history (PSMC, MSMC, SMC++, Stairway Plot), using the two devil genome assemblies as well as simulated, artificially fragmented genomes that approximate the hypothesized demographic history of Tasmanian devils. We demonstrate that each method is robust to assembly quality, producing similar estimates of Ne when simulated genomes were fragmented into up to 5,000 scaffolds. Overall, methods reliant on the SMC are most reliable between ∼300 generations before present (gbp) and 100 kgbp, whereas methods exclusively reliant on the site-frequency spectrum are most reliable between the present and 30 gbp. Our results suggest that when used in concert, genomic methods for reconstructing species’ effective population size histories 1) can be applied to nonmodel organisms without highly contiguous reference genomes, and 2) are capable of detecting independently documented effects of historical geological events.
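The scaffold N90 quoted in this abstract is a contiguity statistic: the scaffold length at which the longest scaffolds, taken in descending order, first cover at least 90% of the assembly. A minimal sketch of that calculation, under the usual definition, is given below; the function name and the toy scaffold lengths are assumptions for illustration and are not taken from the study.

```python
def scaffold_nx(lengths, x=90):
    """Return the Nx statistic for a list of scaffold lengths.

    Nx is the length of the scaffold at which the running total, taken over
    scaffolds sorted from longest to shortest, first reaches x% of the
    total assembly length.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length


# Toy example with made-up scaffold lengths (not data from the abstract).
toy = [10, 8, 6, 4, 2]
n50 = scaffold_nx(toy, x=50)
n90 = scaffold_nx(toy, x=90)
```

A higher N90 at a fixed assembly size means that fewer, longer scaffolds account for most of the genome, which is the sense in which the new assembly is described as improved.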
Abstract: Retinoblastoma proteins are eukaryotic transcriptional corepressors that play central roles in cell cycle control, among other functions. Although most metazoan genomes encode a single retinoblastoma protein, gene duplications have occurred at least twice: in the vertebrate lineage, leading to Rb, p107, and p130, and in Drosophila, an ancestral Rbf1 gene and a derived Rbf2 gene. Structurally, Rbf1 resembles p107 and p130, and mutation of the gene is lethal. Rbf2 is more divergent and mutation does not lead to lethality. However, the retention of Rbf2 >60 My in Drosophila points to essential functions, which prior cell-based assays have been unable to elucidate. Here, using genomic approaches, we provide new insights into the function of Rbf2. Strikingly, we show that Rbf2 regulates a set of cell growth-related genes and can antagonize Rbf1 on specific genes. These unique properties have important implications for the fly; Rbf2 mutants show reduced egg laying, and lifespan is reduced in females and males. Structural alterations in conserved regions of the Rbf2 gene suggest that it was sub- or neofunctionalized to develop specific regulatory specificity and activity. We define cis-regulatory features of Rbf2 target genes that allow preferential repression by this protein, indicating that it is not a weaker version of Rbf1 as previously thought. The specialization of retinoblastoma function in Drosophila may reflect a parallel evolution found in vertebrates, and raises the possibility that cell growth control is equally important to cell cycle function for this conserved family of transcriptional corepressors.
Abstract: Comparing newly obtained and previously known nucleotide and amino-acid sequences underpins modern biological research. BLAST is a well-established tool for such comparisons but is challenging to use on new data sets. We combined a user-centric design philosophy with sustainable software development approaches to create Sequenceserver, a tool for running BLAST and visually inspecting BLAST results for biological interpretation. Sequenceserver uses simple algorithms to prevent potential analysis errors and provides flexible text-based and visual outputs to support researcher productivity. Our software can be rapidly installed for use by individuals or on shared servers.
Abstract: The extent to which selection has shaped present-day human populations has attracted intense scrutiny, and examples of local adaptations abound. However, the evolutionary trajectory of alleles that, today, are deleterious has received much less attention. To address this question, the genomes of 2,062 individuals, including 1,179 ancient humans, were reanalyzed to assess how frequencies of risk alleles and their homozygosity changed through space and time in Europe over the past 45,000 years. Although the overall deleterious homozygosity has consistently decreased, risk alleles have steadily increased in frequency over that period of time. Those that increased most are associated with diseases such as asthma, Crohn disease, diabetes, and obesity, which are highly prevalent in present-day populations. These findings may not run against the existence of local adaptations but highlight the limitations imposed by drift and population dynamics on the strength of selection in purging deleterious mutations from human populations.
Abstract: Evolve and resequence (E&R) studies are frequently used to dissect the genetic basis of quantitative traits. By subjecting a population to truncating selection for several generations and estimating the allele frequency differences between selected and nonselected populations using next-generation sequencing (NGS), the loci contributing to the selected trait may be identified. The role of different parameters, such as the population size or the number of replicate populations, has been examined in previous works. However, the influence of the selection regime, that is, the strength of truncating selection during the experiment, remains little explored. Using whole-genome, individual-based forward simulations of E&R studies, we found that the power to identify the causative alleles may be maximized by gradually increasing the strength of truncating selection during the experiment. Notably, such an optimal selection regime comes at no or little additional cost in terms of sequencing effort and experimental time. Interestingly, we also found that a selection regime which optimizes the power to identify the causative loci is not necessarily identical to a regime that maximizes the phenotypic response. Finally, our simulations suggest that an E&R study with an optimized selection regime may have a higher power to identify the genetic basis of quantitative traits than a genome-wide association study, highlighting that E&R is a powerful approach for finding the loci underlying complex traits.
Abstract: It is incompletely understood how biophysical properties like protein stability impact molecular evolution and epistasis. Epistasis is defined as specific when a mutation exclusively influences the phenotypic effect of another mutation, often at physically interacting residues. In contrast, nonspecific epistasis results when a mutation is influenced by a large number of nonlocal mutations. As most mutations are pleiotropic, the in vivo folding probability (governed by basal protein stability) is thought to determine activity-enhancing mutational tolerance, implying that nonspecific epistasis is dominant. However, evidence exists for both specific and nonspecific epistasis as the prevalent factor, with limited comprehensive data sets to support either claim. Here, we use deep mutational scanning to probe how in vivo enzyme folding probability impacts local fitness landscapes. We computationally designed two different variants of the amidase AmiE with statistically indistinguishable catalytic efficiencies but lower probabilities of folding in vivo compared with wild-type. Local fitness landscapes show slight alterations among variants, with essentially the same global distribution of fitness effects. However, specific epistasis was predominant for the subset of mutations exhibiting positive sign epistasis. These mutations mapped to spatially distinct locations on AmiE near the initial mutation or proximal to the active site. Intriguingly, the majority of specific epistatic mutations were codon dependent, with different synonymous codons resulting in fitness sign reversals. Together, these results offer a nuanced view of how protein folding probability impacts local fitness landscapes and suggest that transcriptional–translational effects are as important as stability in determining evolutionary outcomes.
Abstract: Mutations, recombinations, and genome duplications may promote genetic diversity and trigger evolutionary processes. However, quantifying these events in diploid hybrid genomes is challenging. Here, we present an integrated experimental and computational workflow to accurately track the mutational landscape of yeast diploid hybrids (MuLoYDH) in terms of single-nucleotide variants, small insertions/deletions, copy-number variants, aneuploidies, and loss-of-heterozygosity. Pairs of haploid Saccharomyces parents were combined to generate ancestor hybrids with phased genomes and varying levels of heterozygosity. These diploids were evolved under different laboratory protocols, in particular mutation accumulation experiments. Variant simulations enabled the efficient integration of competitive and standard mapping of short reads, depending on local levels of heterozygosity. Experimental validations proved the high accuracy and resolution of our computational approach. Finally, applying MuLoYDH to four different diploids revealed striking genetic background effects. Homozygous Saccharomyces cerevisiae showed a ∼4-fold higher mutation rate compared with its closely related species S. paradoxus. Intraspecies hybrids unveiled that a substantial fraction of the genome (∼250 bp per generation) was shaped by loss-of-heterozygosity, a process strongly inhibited in interspecies hybrids by high levels of sequence divergence between homologous chromosomes. In contrast, interspecies hybrids exhibited higher single-nucleotide mutation rates compared with intraspecies hybrids. MuLoYDH provided an unprecedented quantitative insight into the evolutionary processes that mold diploid yeast genomes and can be generalized to other genetic systems.
Abstract: Centipedes are among the most ancient groups of venomous predatory arthropods. Extant species belong to five orders, but our understanding of the composition and evolution of centipede venoms is based almost exclusively on one order, Scolopendromorpha. To gain a broader and less biased understanding we performed a comparative proteotranscriptomic analysis of centipede venoms from all five orders, including the first venom profiles for the orders Lithobiomorpha, Craterostigmomorpha, and Geophilomorpha. Our results reveal an astonishing structural diversity of venom components, with 93 phylogenetically distinct protein and peptide families. Proteomically-annotated gene trees of these putative toxin families show that centipede venom composition is highly dynamic across macroevolutionary timescales, with numerous gene duplications as well as functional recruitments and losses of toxin gene families. Strikingly, not a single family is found in the venoms of representatives of all five orders, with 67 families being unique for single orders. Ancestral state reconstructions reveal that centipede venom originated as a simple cocktail comprising just four toxin families, with very little compositional evolution happening during the approximately 50 My before the living orders had diverged. Venom complexity then increased in parallel within the orders, with scolopendromorphs evolving particularly complex venoms. Our results show that even venoms composed of toxins evolving under the strong constraint of negative selection can have striking evolutionary plasticity on the compositional level. We show that the functional recruitments and losses of toxin families that shape centipede venom arsenals are not concentrated early in their evolutionary history, but happen frequently throughout.
Abstract: Many methods exist for detecting introgression between nonsister species, but the most commonly used require either a single sequence from four or more taxa or multiple sequences from each of three taxa. Here, we present a test for introgression that uses only a single sequence from three taxa. This test, denoted D3, uses logic similar to that of the standard D-test for introgression, but by using pairwise distances instead of site patterns it is able to detect the same signal of introgression with fewer species. We use simulations to show that D3 has statistical power almost equal to D, and we demonstrate its use on a data set of wild bananas (Musa). The new test is easy to apply and easy to interpret, and should find wide use among currently available data sets.
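The idea of a three-taxon test built on pairwise distances can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes A and B are sister taxa, C is the third taxon, distances are simple proportions of differing aligned sites, and the statistic takes the form (d_BC - d_AC) / (d_BC + d_AC). That form is not given in the abstract and should be checked against the paper. Significance testing is omitted entirely.

```python
def p_distance(s1, s2):
    """Proportion of aligned sites at which two sequences differ (hypothetical helper)."""
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    if not s1:
        raise ValueError("sequences are empty")
    return sum(a != b for a, b in zip(s1, s2)) / len(s1)


def d3(seq_a, seq_b, seq_c):
    """Assumed D3-style statistic for sister taxa A and B and a third taxon C.

    Without introgression, A and B are expected to be equally distant from C,
    so the value should be near 0. A positive value means A is closer to C
    than B is; a negative value means B is closer to C.
    """
    d_ac = p_distance(seq_a, seq_c)
    d_bc = p_distance(seq_b, seq_c)
    if d_ac + d_bc == 0:
        raise ValueError("both distances are zero; statistic is undefined")
    return (d_bc - d_ac) / (d_bc + d_ac)


# Toy aligned sequences, invented for illustration only.
a = "ACGTACGT"
b = "ACGTACCC"
c = "ACGTACGA"
value = d3(a, b, c)  # A differs from C at 1 of 8 sites, B at 2 of 8
```

In practice a single value like this means little on its own; some resampling scheme over the alignment would be needed to judge whether it differs from zero.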
Abstract: Bacteria of the Firmicutes phylum are able to enter a developmental pathway that culminates with the formation of highly resistant, dormant endospores. Endospores allow environmental persistence and dissemination and, for pathogens, are also infection vehicles. In both the model Bacillus subtilis, an aerobic organism, and in the intestinal pathogen Clostridioides difficile, an obligate anaerobe, sporulation mobilizes hundreds of genes. Their expression is coordinated between the forespore and the mother cell, the two cells that participate in the process, and is kept in close register with the course of morphogenesis. The evolutionary mechanisms by which sporulation emerged and evolved in these two species, and more broadly across Firmicutes, remain largely unknown. Here, we trace the origin and evolution of sporulation using the genes known to be involved in the process in B. subtilis and C. difficile, and estimating their gain-loss dynamics in a comprehensive bacterial macroevolutionary framework. We show that sporulation evolution was driven by two major gene gain events, the first at the base of the Firmicutes and the second at the base of the B. subtilis group and within the Peptostreptococcaceae family, which includes C. difficile. We also show that early and late sporulation regulons have been coevolving and that sporulation genes entail greater innovation in B. subtilis with many Bacilli lineage-restricted genes. In contrast, C. difficile more often recruits new sporulation genes by horizontal gene transfer, which reflects its highly mobile genome, the complexity of the gut microbiota, and an adjustment of sporulation to the gut ecosystem.
Abstract: Studies of Native South American genetic diversity have helped to shed light on the peopling and differentiation of the continent, but available data are sparse for the major ecogeographic domains. These include the Pacific Coast, a potential early migration route; the Andes, home to the most expansive complex societies and to one of the most widely spoken indigenous language families of the continent (Quechua); and Amazonia, with its understudied population structure and rich cultural diversity. Here, we explore the genetic structure of 176 individuals from these three domains, genotyped with the Affymetrix Human Origins array. We infer multiple sources of ancestry within the Native American ancestry component; one with clear predominance on the Coast and in the Andes, and at least two distinct substrates in neighboring Amazonia, including a previously undetected ancestry characteristic of northern Ecuador and Colombia. Amazonian populations are also involved in recent gene-flow with each other and across ecogeographic domains, which does not accord with the traditional view of small, isolated groups. Long-distance genetic connections between speakers of the same language family suggest that indigenous languages here were spread not by cultural contact alone. Finally, Native American populations admixed with post-Columbian European and African sources at different times, with few cases of prolonged isolation. With our results we emphasize the importance of including understudied regions of the continent in high-resolution genetic studies, and we illustrate the potential of SNP chip arrays for informative regional-scale analysis.
Abstract: Howea palms are viewed as one of the most clear-cut cases of speciation in sympatry. The sister species Howea belmoreana and H. forsteriana are endemic to the oceanic Lord Howe Island, Australia, where they have overlapping distributions and are reproductively isolated mainly by flowering time differences. However, the potential role of introgression from Australian mainland relatives had not previously been investigated, a process that has recently put other examples of sympatric speciation into question. Furthermore, the drivers of flowering time-based reproductive isolation remain unclear. We sequenced an RNA-seq data set that comprehensively sampled Howea and their closest mainland relatives (Linospadix, Laccospadix), and collected detailed soil chemistry data on Lord Howe Island to evaluate whether secondary gene flow had taken place and to examine the role of soil preference in speciation. D-statistics analyses strongly support a scenario whereby ancestral Howea hybridized frequently with its mainland relatives, but this only occurred prior to speciation. Expression analysis and population genetic and phylogenetic tests of selection identified several flowering time genes with evidence of adaptive divergence between the Howea species. We found expression plasticity in flowering time genes in response to soil chemistry as well as adaptive expression and sequence divergence in genes pleiotropically linked to soil adaptation and flowering time. Ancestral hybridization may have provided the genetic diversity that promoted their subsequent adaptive divergence and speciation, a process that may be common for rapid ecological speciation.
Abstract: The recent emergence and spread of X-linked segregation distorters (the “Paris” system) in the worldwide species Drosophila simulans has elicited the selection of drive-resistant Y chromosomes. Here, we investigate the evolutionary history of 386 Y chromosomes originating from 29 population samples collected over a period of 20 years, showing a wide continuum of phenotypes when tested against the Paris distorters, from high sensitivity to complete resistance (males sire ∼95% to ∼40% female progeny). Analyzing around 13 kb of Y-linked gene sequences in a representative subset of nine Y chromosomes, we identified only three polymorphic sites resulting in three haplotypes. Remarkably, one of the haplotypes is associated with resistance. This haplotype is fixed in all samples from Sub-Saharan Africa, the region of origin of the drivers. Exceptionally, with the spread of the drivers in Egypt and Morocco, we were able to record the replacement of the sensitive lineage by the resistant haplotype in real time, within only a few years. In addition, we performed in situ hybridization, using satellite DNA probes, on a subset of 21 Y chromosomes from six locations. In contrast to the low molecular polymorphism, this revealed extensive structural variation suggestive of rapid evolution, either neutral or adaptive. Moreover, our results show that intragenomic conflicts can drive astonishingly rapid replacement of Y chromosomes and suggest that the emergence of Paris segregation distorters in East Africa occurred less than half a century ago.
Abstract: Despite its recent invasion into the marine realm, the sea otter (Enhydra lutris) has evolved a suite of adaptations for life in cold coastal waters, including limb modifications and dense insulating fur. This uniquely dense coat led to the near-extinction of sea otters during the 18th–20th century fur trade and an extreme population bottleneck. We used the de novo genome of the southern sea otter (E. l. nereis) to reconstruct its evolutionary history, identify genes influencing aquatic adaptation, and detect signals of population bottlenecks. We compared the genome of the southern sea otter with the tropical freshwater-living giant otter (Pteronura brasiliensis) to assess common and divergent genomic trends between otter species, and with the closely related northern sea otter (E. l. kenyoni) to uncover population-level trends. We found signals of positive selection in genes related to aquatic adaptations, particularly limb development and polygenic selection on genes related to hair follicle development. We found extensive pseudogenization of olfactory receptor genes in both the sea otter and giant otter lineages, consistent with patterns of sensory gene loss in other aquatic mammals. At the population level, the southern sea otter and the northern sea otter showed extremely low genomic diversity, signals of recent inbreeding, and demographic histories marked by population declines. These declines may predate the fur trade and appear to have resulted in an increase in putatively deleterious variants that could impact the future recovery of the sea otter.
Abstract: Transposable elements (TEs) are parasitic DNA bits capable of mobilization and mutagenesis, typically suppressed by the host’s epigenetic silencing. Since the selfish DNA concept was proposed, it has been appreciated that genomes are also molded by arms races against their resident TEs. However, our understanding of the evolutionary processes shaping adaptive TE populations is limited. Here, we review the recombination events associated with reverse transcription in LTR retrotransposons, a process that shuffles their genetic variants during replicative mobilization. Current evidence suggests that recombinogenic retrotransposons could beneficially exploit host suppression, where clan behavior facilitates their speciation and diversification. Novel refinements to models of the retrotransposon life cycle and evolution thus emerge.
Abstract: The chloroplast genome usually has a quadripartite structure consisting of a large single copy region and a small single copy region separated by two long inverted repeats. It has been known for some time that a single cell may contain at least two structural haplotypes of this structure, which differ in the relative orientation of the single copy regions. However, the methods required to detect and measure the abundance of the structural haplotypes are labor-intensive, and this phenomenon remains understudied. Here, we develop a new method, Cp-hap, to detect all possible structural haplotypes of chloroplast genomes of quadripartite structure using long-read sequencing data. We use this method to conduct a systematic analysis and quantification of chloroplast structural haplotypes in 61 land plant species across 19 orders of Angiosperms, Gymnosperms, and Pteridophytes. Our results show that there are two chloroplast structural haplotypes which occur with equal frequency in most land plant individuals. Nevertheless, species whose chloroplast genomes lack inverted repeats or have short inverted repeats have just a single structural haplotype. We also show that the relative abundance of the two structural haplotypes remains constant across multiple samples from a single individual plant, suggesting that the process which maintains equal frequency of the two haplotypes operates rapidly, consistent with the hypothesis that flip-flop recombination mediates chloroplast structural heteroplasmy. Our results suggest that previous claims of differences in chloroplast genome structure between species may need to be revisited.
Abstract: The genus Rhododendron (Ericaceae), which includes horticulturally important plants such as azaleas, is a highly diverse and widely distributed genus of >1,000 species. Here, we report the chromosome-scale de novo assembly and genome annotation of Rhododendron williamsianum as a basis for continued study of this large genus. We created multiple short fragment genomic libraries, which were assembled using ALLPATHS-LG. This was followed by contiguity preserving transposase sequencing (CPT-seq) and fragScaff scaffolding of a large fragment library, which improved the assembly by decreasing the number of scaffolds and increasing scaffold length. Chromosome-scale scaffolding was performed by proximity-guided assembly (LACHESIS) using chromatin conformation capture (Hi-C) data. Chromosome-scale scaffolding was further refined and linkage groups defined by restriction-site associated DNA (RAD) sequencing of the parents and progeny of a genetic cross. The resulting linkage map confirmed the LACHESIS clustering and ordering of scaffolds onto chromosomes and rectified large-scale inversions. Assessments of the R. williamsianum genome assembly and gene annotation estimate them to be 89% and 79% complete, respectively. Predicted coding sequences from genome annotation were used in syntenic analyses and for generating age distributions of synonymous substitutions per site between paralogous gene pairs, which identified whole-genome duplications (WGDs) in R. williamsianum. We then analyzed other publicly available Ericaceae genomes for shared WGDs. Based on our spatial and temporal analyses of paralogous gene pairs, we find evidence for two shared, ancient WGDs in Rhododendron and Vaccinium (cranberry/blueberry) members that predate the Ericaceae family and, in one case, the Ericales order.
Abstract: The black-necked crane (Grus nigricollis), which inhabits high-altitude areas, has the largest body size of the world’s 15 crane species, and is classified as threatened by the IUCN. To support further studies on population genetics and genomics, we present a high-quality genome assembly based on both Illumina and nanopore sequencing. In total, 54.59 Gb Illumina short reads and 116.5 Gb nanopore long reads were generated. The 1.23 Gb assembled genome has a high contig N50 of 17.89 Mb, and has a longest contig of 87.83 Mb. The completeness (97.7%) of the draft genome was evaluated with single-copy orthologous genes using BUSCO. We identified 17,789 genes and found that 8.11% of the genome is composed of repetitive elements. In total, 84 of the 2,272 one-to-one orthologous genes were under positive selection in the black-necked crane lineage. SNP-based inference indicated two bottlenecks in the recent demographic trajectories of the black-necked crane. The genome information will contribute to future study of crane evolutionary history and provide new insights into the potential adaptation mechanisms of the black-necked crane to its high-altitude habitat.
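The contig N50 reported in this abstract is a standard contiguity metric: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The sketch below shows how it is conventionally computed from a list of contig lengths. The function name `n50` and the example lengths are illustrative only and are not drawn from the crane assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half the assembly size.
    """
    lengths = list(lengths)
    if not lengths:
        raise ValueError("at least one contig length is required")

    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Total is 54; the three longest contigs (10 + 9 + 8 = 27) reach half of it.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because N50 depends only on the length distribution, it says nothing about correctness, which is why completeness checks such as BUSCO are reported alongside it.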
Abstract: The common pheasant (Phasianus colchicus), in the order Galliformes and the family Phasianidae, has 30 subspecies distributed across its native range in the Palearctic realm and has been introduced to Europe, North America, and Australia. It is an important game bird often subjected to wildlife management as well as a model species to study speciation, biogeography, and local adaptation. However, the genomic resources for the common pheasant are generally lacking. We sequenced a male individual of the subspecies torquatus of the common pheasant with the Illumina HiSeq platform. We obtained 94.88 Gb of usable sequences by filtering out low-quality reads of the raw data generated. This resulted in a 1.02 Gb final assembly, which equals the estimated genome size. BUSCO analysis using chicken as a model showed that 93.3% of genes were complete. The contig N50 and scaffold N50 sizes were 178 kb and 10.2 Mb, respectively. Together, these metrics indicate that we obtained a high-quality genome assembly. We annotated 16,485 protein-coding genes and 123.3 Mb (12.05% of the genome) of repetitive sequences by ab initio and homology-based prediction. Furthermore, we applied a RAD-sequencing approach for another 45 individuals of seven representative subspecies in China and identified 4,376,351 novel single nucleotide polymorphism (SNP) markers. Using this unprecedented data set, we uncovered the geographic population structure and genetic introgression among common pheasants in China. Our results provide the first high-quality reference genome for the common pheasant and a valuable genome-wide SNP database for studying population genomics and demographic history.
Abstract: Ascomycota is the largest phylogenetic group of fungi that includes species important to human health and wellbeing. DNA repair is important for fungal survival and genome evolution. Here, we describe a detailed comparative genomic analysis of DNA repair genes in Ascomycota. We determined the DNA repair gene repertoire in Taphrinomycotina, Saccharomycotina, Leotiomycetes, Sordariomycetes, Dothideomycetes, and Eurotiomycetes. The subphyla of yeasts, Saccharomycotina and Taphrinomycotina, have a smaller DNA repair gene repertoire compared to Pezizomycotina. Some genes were absent from most, if not all, yeast species. To study the conservation of these genes in Pezizomycotina, we used the Gain Loss Mapping Engine algorithm that provides the expectations of gain or loss of genes given the tree topology. Genes that were absent from most of the species of Taphrinomycotina or Saccharomycotina showed lower conservation in Pezizomycotina. This suggests that the absence of some DNA repair genes in yeasts is not random; genes with a tendency to be lost in other classes are missing. We ranked the conservation of DNA repair genes in Ascomycota. We found that Rad51 and its paralogs were less conserved than other recombinational proteins, suggesting that there is a redundancy between Rad51 and its paralogs, at least in some species. Finally, based on the repertoire of UV repair genes, we found conditions that differentially kill the wine pathogen Brettanomyces bruxellensis and not Saccharomyces cerevisiae. In summary, our analysis provides testable hypotheses about the role of DNA repair proteins in the genome evolution of Ascomycota.
Abstract: Ancient duplication events and retained gene duplicates have contributed to the evolution of many novel plant traits and, consequently, to the diversity and complexity within and across plant lineages. Although mounting evidence highlights the importance of whole-genome duplication (WGD; polyploidy) and its key role as an evolutionary driver, gene duplication dynamics and mechanisms, both of which are fundamental to our understanding of evolutionary process and patterns of plant diversity, remain poorly characterized in many clades. We use newly available transcriptomic data and a robust phylogeny to investigate the prevalence, occurrence, and timing of gene duplications in Lamiaceae (mints), a species-rich and chemically diverse clade with many ecologically, economically, and culturally important species. We also infer putative WGDs, an extreme mechanism of gene duplication, using large-scale data sets from synonymous divergence (KS), phylotranscriptomic, and divergence time analyses. We find evidence for widespread but asymmetrical levels of gene duplication and ancient polyploidy in Lamiaceae that correlate with species richness, including pronounced levels of gene duplication and putative ancient WGDs (7–18 events) within the large subclade Nepetoideae and up to 10 additional WGD events in other subclades. Our results help disentangle WGD-derived gene duplicates from those produced by other mechanisms and illustrate the nonuniformity of duplication dynamics in mints, setting the stage for future investigations that explore their impacts on trait diversity and species diversification. Our results also provide a practical context for evaluating the benefits and limitations of transcriptome-based approaches to inferring WGD, and we offer recommendations for researchers interested in investigating ancient WGDs in other plant groups.
Abstract: In phylogenetic inference, we commonly use models of substitution which assume that sequence evolution is stationary, reversible, and homogeneous (SRH). Although the use of such models is often criticized, the extent of SRH violations and their effects on phylogenetic inference of tree topologies and edge lengths are not well understood. Here, we introduce and apply the maximal matched-pairs tests of homogeneity to assess the scale and impact of SRH model violations on 3,572 partitions from 35 published phylogenetic data sets. We show that roughly one-quarter of all the partitions we analyzed (23.5%) reject the SRH assumptions, and that for 25% of data sets, tree topologies inferred from all partitions differ significantly from topologies inferred using the subset of partitions that do not reject the SRH assumptions. This proportion increases when comparing trees inferred using the subset of partitions that rejects the SRH assumptions to those inferred from partitions that do not reject the SRH assumptions. These results suggest that the extent and effects of model violation in phylogenetics may be substantial. They highlight the importance of testing for model violations and possibly excluding partitions that violate models prior to tree reconstruction. Our results also suggest that further effort in developing models that do not require SRH assumptions could lead to large improvements in the accuracy of phylogenomic inference. The scripts necessary to perform the analysis are available at https://github.com/roblanf/SRHtests, and the new tests we describe are available as a new option in IQ-TREE (http://www.iqtree.org).
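As background for the matched-pairs tests named in this abstract, the sketch below computes Bowker's matched-pairs test of symmetry for a single pair of aligned sequences. It is offered as an illustration of the general technique, on the assumption that this classical test is the building block involved. It is not the code from the linked repository or from IQ-TREE, and the function name `bowker_symmetry` is hypothetical. It returns the test statistic and its degrees of freedom; under the null hypothesis of symmetry the statistic is approximately chi-square distributed, and the p-value lookup is left out to keep the example dependency-free.

```python
from collections import Counter
from itertools import combinations

NUCLEOTIDES = "ACGT"


def bowker_symmetry(seq1, seq2):
    """Return (statistic, degrees_of_freedom) for Bowker's test of symmetry.

    The divergence matrix counts sites where seq1 has nucleotide i and seq2
    has nucleotide j. The statistic sums (n_ij - n_ji)^2 / (n_ij + n_ji)
    over all unordered pairs i < j with at least one observation.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    # Build the divergence matrix, ignoring gaps and ambiguity codes.
    counts = Counter(
        (a, b)
        for a, b in zip(seq1.upper(), seq2.upper())
        if a in NUCLEOTIDES and b in NUCLEOTIDES
    )

    statistic = 0.0
    degrees_of_freedom = 0
    for i, j in combinations(NUCLEOTIDES, 2):
        n_ij = counts[(i, j)]
        n_ji = counts[(j, i)]
        if n_ij + n_ji > 0:
            statistic += (n_ij - n_ji) ** 2 / (n_ij + n_ji)
            degrees_of_freedom += 1
    return statistic, degrees_of_freedom


if __name__ == "__main__":
    # Three A->C sites and one C->A site: (3 - 1)^2 / (3 + 1) = 1.0 with 1 df.
    print(bowker_symmetry("AAAACCGT", "CCCAACGT"))
```

A large statistic relative to the chi-square distribution with the returned degrees of freedom indicates that substitutions between the two sequences are not symmetric, which is evidence against the SRH assumptions for that pair.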