Allan Wilson Junior Award for Independent Research

This award is intended for outstanding members of the SMBE community who are in the early stages of an independent research career. The primary signal of research excellence is a trajectory of innovative, creative research that is moving the field of Molecular Biology and Evolution forward. The prize includes recognition at the annual SMBE banquet, a cash prize of $2000 and a travel award to attend the annual meeting. This award will be made annually.

2019 SMBE Allan Wilson Junior Award for Independent Research Winner: Claudia Bank

Dr. Claudia Bank heads the Evolutionary Dynamics group at the Gulbenkian Science Institute in Oeiras, Portugal. Her group studies the population genetics of adaptation and speciation using a combination of mathematical modeling, statistical method development and data analysis, and experimental evolution. After an undergraduate degree in Mathematics from the University of Bielefeld, Germany, Dr. Bank earned her PhD in Population Genetics from the University of Veterinary Medicine in Vienna, Austria, supervised by Joachim Hermisson, followed by a postdoc with Jeffrey Jensen at the Ecole Polytechnique Fédérale de Lausanne in Switzerland. During her PhD and postdoc, she undertook two research stays in the group of Mark Kirkpatrick at UT Austin, and at the Simons Institute for the Theory of Computing at UC Berkeley. Dr. Bank is currently supported by grants from the Portuguese Science Foundation, the European Research Council, and the European Molecular Biology Organization to expand her studies of fitness landscapes across environments and biological levels of organization.

2018 SMBE Allan Wilson Junior Award for Independent Research Winner: Melissa Wilson Sayres, Arizona State University

Dr. Melissa Wilson Sayres is an Assistant Professor in the School of Life Sciences and Center for Evolution and Medicine at Arizona State University. Broadly, her laboratory analyzes large-scale genomic and transcriptomic datasets to study sex-specific processes. The Wilson Sayres laboratory studies how sex chromosomes arise and evolve, utilizes sex chromosomes to understand population history, and is working to incorporate genetic and phenotypic sex as a biological variable in health and disease research. She received her B.S. in Medical Mathematics from Creighton University in Omaha, Nebraska, her Ph.D. in Integrative Biology: Bioinformatics & Genomics from The Pennsylvania State University working with Dr. Kateryna Makova, and studied as a Miller postdoctoral fellow at the University of California, Berkeley with Rasmus Nielsen. Her laboratory and research are currently supported by an NIH NIGMS R35 Maximizing Investigators’ Research Award, the Leakey Foundation, and a Heritage grant from Arizona Game and Fish. 

2017 SMBE Allan Wilson Junior Award for Independent Research Winner: Mia Levine, University of Pennsylvania

Dr. Mia Levine is an Assistant Professor in the Department of Biology and the Epigenetics Institute at the University of Pennsylvania. The Levine Lab investigates how intra-genomic conflict shapes the evolution of DNA packaging proteins. Together with her trainees, Mia combines evolutionary genetics with transgenics, genomics, and cell biology to identify selfish genetic elements that drive host protein adaptation and to uncover the functional consequences for chromosome integrity and transmission. Mia graduated magna cum laude with a BA in Biology from the University of Pennsylvania, where she is now faculty. She earned an MSc in Ecology from the University of Illinois, Urbana-Champaign under Dr. Ken Paige and an NSF GRFP-supported PhD in population genetics from the Center of Population Biology at the University of California, Davis under Dr. David Begun. Mia joined the Fred Hutchinson Cancer Research Center to work with Dr. Harmit Malik as a postdoctoral fellow supported by an NIH NIGMS Ruth L. Kirschstein NRSA and an NIH NIGMS K99 Pathway to Independence Award. Mia is currently a Forbeck Foundation Scholar and recipient of an NIH NIGMS R35 Maximizing Investigators’ Research Award.

2016 SMBE Allan Wilson Junior Award for Independent Research Winner: Joanna Kelley, Washington State University

Dr. Joanna Kelley is an Assistant Professor in the School of Biological Sciences at Washington State University. She runs an evolutionary genomics laboratory that focuses on high-throughput genome sequencing and computational approaches to analyzing big data in genomics. Her research focuses on understanding the genomic basis for adaptation to extreme environments. She received her B.A. in mathematics and biology with honors from Brown University, working with Johanna Schmitt. She earned her Ph.D. in Genome Sciences from the University of Washington under Willie Swanson. As a postdoctoral researcher at the University of Chicago in Human Genetics with Molly Przeworski, she received a National Institutes of Health Ruth L. Kirschstein National Research Service Award. Dr. Kelley was also a postdoctoral researcher in the Department of Genetics at Stanford University with Carlos Bustamante.

Award Information

Eligibility: Applicants should be three to seven years post-Ph.D.* on the nomination deadline and in a senior postdoc, assistant professor, or equivalent position prior to tenure.

Nomination:  Nomination will be an open process that begins with a call to SMBE members, typically early in the calendar year.

All nominations will include:

  • A nomination letter that includes a recommendation for the candidate.
  • A one-page statement summarizing the candidate’s work and its fit to the award.
  • A CV of the candidate.  
  • A second recommendation letter.

Process:  The President will convene an awards committee who will choose among those nominated.  It may also choose not to award the prize if no suitable candidates are nominated.
The materials should be compiled into a single PDF file, and should be emailed to before 25 January 2019.

*Years post-Ph.D. may be modified in the case of extenuating circumstances, such as childbirth. Extenuating circumstances will be considered by the awards committee on a case-by-case basis.

MBE | Most Read

Molecular Biology and Evolution

Scientists Explore Diversity of Han Chinese

Thu, 12 Sep 2019 00:00:00 GMT

The Han Chinese are the world's largest ethnic group, making up 91.6% of modern-day China. As DNA sequencing tools and statistical analysis software have advanced, scientists have been exploring the forces that helped shape the current genetic landscape of Han Chinese.

Scientists Crack Origin of the Persian Walnut

Thu, 12 Sep 2019 00:00:00 GMT

Prized worldwide for its high-quality wood and the rich flavor of its nuts, the Persian walnut (Juglans regia) is an important economic crop. The Persian walnut is one of 22 species in the genus Juglans, which includes black and white walnuts and butternuts, grown across Europe, the Americas, and Asia.

GEMME: A Simple and Fast Global Epistatic Model Predicting Mutational Effects

Mon, 12 Aug 2019 00:00:00 GMT

The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering, and medicine. Recent progress has been achieved by leveraging the increasing wealth of genomic data and by modeling intersite dependencies within biological sequences. However, state-of-the-art methods remain time consuming. Here, we present Global Epistatic Model for predicting Mutational Effects (GEMME), an original and fast method that predicts mutational outcomes by explicitly modeling the evolutionary history of natural sequences. This allows accounting for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it overall performs similarly or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, of much conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at

Evolution of the Cholesterol Biosynthesis Pathway in Animals

Fri, 09 Aug 2019 00:00:00 GMT

Cholesterol plays essential roles in animal development and disease progression. Here, we characterize the evolutionary pattern of the canonical cholesterol biosynthesis pathway (CBP) in the animal kingdom using both genome-wide analyses and functional experiments. CBP genes in the basal metazoans were inherited from their last common eukaryotic ancestor and evolutionarily conserved for cholesterol biosynthesis. The genomes of both the basal metazoans and deuterostomes retain almost the full set of CBP genes, while Cnidaria and many protostomes have independently experienced multiple massive losses of CBP genes that might be due to the geologic events during the Ediacaran period, such as the appearance of an exogenous sterol supply and the frequent perturbation of ocean oxygenation. Meanwhile, the indispensable utilization processes of cholesterol potentially strengthened the maintenance of the complete set of CBP genes in vertebrates. These results strengthen both biotic and abiotic roles in the macroevolution of a biosynthesis pathway in animals.

MicroRNA Gene Regulation in Extremely Young and Parallel Adaptive Radiations of Crater Lake Cichlid Fish

Fri, 09 Aug 2019 00:00:00 GMT

Cichlid fishes provide textbook examples of explosive phenotypic diversification and sympatric speciation, thereby making them ideal systems for studying the molecular mechanisms underlying rapid lineage divergence. Despite the fact that gene regulation provides a critical link between diversification in gene function and speciation, many genomic regulatory mechanisms such as microRNAs (miRNAs) have received little attention in these rapidly diversifying groups. Therefore, we investigated the posttranscriptional regulatory role of miRNAs in the repeated sympatric divergence of Midas cichlids (Amphilophus spp.) from Nicaraguan crater lakes. Using miRNA and mRNA sequencing of embryos from five Midas species, we first identified miRNA binding sites in mRNAs and highlighted the presence of a surprising number of novel miRNAs in these adaptively radiating species. Then, through analyses of expression levels, we identified putative miRNA/gene target pairs with negatively correlated expression levels that were consistent with the role of miRNA in downregulating mRNA. Furthermore, we determined that several miRNA/gene pairs show convergent expression patterns associated with the repeated benthic/limnetic sympatric species divergence, implicating these miRNAs as potential molecular mechanisms underlying replicated sympatric divergence. Finally, as these candidate miRNA/gene pairs may play a central role in phenotypic diversification in these cichlids, we characterized the expression domains of selected miRNAs and their target genes via in situ hybridization, providing further evidence that miRNA regulation likely plays a role in the Midas cichlid adaptive radiation. These results provide support for the hypothesis that extremely quickly evolving miRNA regulation can contribute to rapid evolutionary divergence even in the presence of gene flow.

Bayesian Estimation of Past Population Dynamics in BEAST 1.10 Using the Skygrid Coalescent Model

Wed, 31 Jul 2019 00:00:00 GMT

Inferring past population dynamics over time from heterochronous molecular sequence data is often achieved using the Bayesian Skygrid model, a nonparametric coalescent model that estimates the effective population size over time. Available in BEAST, a cross-platform program for Bayesian analysis of molecular sequences using Markov chain Monte Carlo, this coalescent model is often estimated in conjunction with a molecular clock model to produce time-stamped phylogenetic trees. We here provide a practical guide to using BEAST and its accompanying applications for the purpose of drawing inference under these models. We focus on best practices, potential pitfalls, and recommendations that can be generalized to other software packages for Bayesian inference. This protocol shows how to use TempEst, BEAUti, and BEAST 1.10 (; last accessed July 29, 2019), LogCombiner as well as Tracer in a complete workflow.

Multiple Plasticity Regulators Reveal Targets Specifying an Induced Predatory Form in Nematodes

Wed, 31 Jul 2019 00:00:00 GMT

The ability to translate a single genome into multiple phenotypes, or developmental plasticity, defines how phenotype derives from more than just genes. However, to study the evolutionary targets of plasticity and their evolutionary fates, we need to understand how genetic regulators of plasticity control downstream gene expression. Here, we have identified a transcriptional response specific to polyphenism (i.e., discrete plasticity) in the nematode Pristionchus pacificus. This species produces alternative resource-use morphs—microbivorous and predatory forms, differing in the form of their teeth, a morphological novelty—as influenced by resource availability. Transcriptional profiles common to multiple polyphenism-controlling genes in P. pacificus reveal a suite of environmentally sensitive loci, or ultimate target genes, that make up an induced developmental response. Additionally, in vitro assays show that one polyphenism regulator, the nuclear receptor NHR-40, physically binds to promoters with putative HNF4α (the nuclear receptor class including NHR-40) binding sites, suggesting this receptor may directly regulate genes that describe alternative morphs. Among differentially expressed genes were morph-limited genes, highlighting factors with putative “on–off” function in plasticity regulation. Further, predatory morph-biased genes included candidates—namely, all four P. pacificus homologs of Hsp70, which have HNF4α motifs—whose natural variation in expression matches phenotypic differences among P. pacificus wild isolates. In summary, our study links polyphenism regulatory loci to the transcription producing alternative forms of a morphological novelty. Consequently, our findings establish a platform for determining how specific regulators of morph-biased genes may influence selection on plastic phenotypes.

Transcriptional Enhancers in the FOXP2 Locus Underwent Accelerated Evolution in the Human Lineage

Mon, 29 Jul 2019 00:00:00 GMT

Unique human features, such as complex language, are the result of molecular evolutionary changes that modified developmental programs of our brain. The human-specific evolution of the forkhead box P2 (FOXP2) gene-coding region has been linked to the emergence of speech and language in humankind. However, little is known about how the expression of FOXP2 is regulated and whether its regulatory machinery evolved in a lineage-specific manner in humans. In order to identify FOXP2 regulatory regions containing human-specific changes, we used databases of human-accelerated noncoding sequences or HARs. We found that the topologically associating domain determined using developing human cerebral cortex containing the FOXP2 locus includes two clusters of 12 HARs, placing the locus occupied by FOXP2 among the top regions showing fast acceleration rates in noncoding regions in the human genome. Using in vivo enhancer assays in zebrafish, we found that at least five FOXP2-HARs behave as transcriptional enhancers throughout different developmental stages. In addition, we found that at least two FOXP2-HARs direct the expression of the reporter gene EGFP to foxP2-expressing regions and cells. Moreover, we uncovered two FOXP2-HARs showing reporter expression gain of function in the nervous system when compared with the chimpanzee ortholog sequences. Our results indicate that regulatory sequences in the FOXP2 locus underwent a human-specific evolutionary process, suggesting that the transcriptional machinery controlling this gene could have also evolved differentially in the human lineage.

“Ghost Introgression” As a Cause of Deep Mitochondrial Divergence in a Bird Species Complex

Sat, 27 Jul 2019 00:00:00 GMT

In the absence of nuclear-genomic differentiation between two populations, deep mitochondrial divergence (DMD) is a form of mito-nuclear discordance. Such instances of DMD are rare and might variably be explained by unusual cases of female-linked selection, by male-biased dispersal, by “speciation reversal” or by mitochondrial capture through genetic introgression. Here, we analyze DMD in an Asian Phylloscopus leaf warbler (Aves: Phylloscopidae) complex. Bioacoustic, morphological, and genomic data demonstrate close similarity between the taxa affinis and occisinensis, even though DMD previously led to their classification as two distinct species. Using population genomic and comparative genomic methods on 45 whole genomes, including historical reconstructions of effective population size, genomic peaks of differentiation and genomic linkage, we infer that the form affinis is likely the product of a westward expansion in which it replaced a now-extinct congener that was the donor of its mtDNA and small portions of its nuclear genome. This study provides strong evidence of “ghost introgression” as the cause of DMD, and we suggest that “ghost introgression” may be a widely overlooked phenomenon in nature.

Population Gene Introgression and High Genome Plasticity for the Zoonotic Pathogen Streptococcus agalactiae

Thu, 25 Jul 2019 00:00:00 GMT

The influence that bacterial adaptation (or niche partitioning) within species has on gene spillover and transmission among bacterial populations occupying different niches is not well understood. Streptococcus agalactiae is an important bacterial pathogen that has a taxonomically diverse host range making it an excellent model system to study these processes. Here, we analyze a global set of 901 genome sequences from nine diverse host species to advance our understanding of these processes. Bayesian clustering analysis delineated 12 major populations that closely aligned with niches. Comparative genomics revealed extensive gene gain/loss among populations and a large pan genome of 9,527 genes, which remained open and was strongly partitioned among niches. As a result, the biochemical characteristics of 11 populations were highly distinctive (significantly enriched). Positive selection was detected and biochemical characteristics of the dispensable genes under selection were enriched in ten populations. Despite the strong gene partitioning, phylogenomics detected gene spillover. In particular, tetracycline resistance (which likely evolved in the human-associated population) spread from humans to bovines, canines, seals, and fish, demonstrating how a gene selected in one host can ultimately be transmitted into another, and biased transmission from humans to bovines was confirmed with a Bayesian migration analysis. Our findings show high bacterial genome plasticity acting in balance with selection pressure from distinct functional requirements of niches that is associated with an extensive and highly partitioned dispensable genome, likely facilitating continued and expansive adaptation.

Admixture between Ancient Lineages, Selection, and the Formation of Sympatric Stickleback Species-Pairs

Tue, 16 Jul 2019 00:00:00 GMT

Ecological speciation has become a popular model for the development and maintenance of reproductive isolation in closely related sympatric pairs of species or ecotypes. An implicit assumption has been that such pairs originate (possibly with gene flow) from a recent, genetically homogeneous ancestor. However, recent genomic data have revealed that currently sympatric taxa are often a result of secondary contact between ancestrally allopatric lineages. This has sparked an interest in the importance of initial hybridization upon secondary contact, with genomic reanalysis of classic examples of ecological speciation often implicating admixture in speciation. We describe a novel occurrence of unusually well-developed reproductive isolation in a model system for ecological speciation: the three-spined stickleback (Gasterosteus aculeatus), breeding sympatrically in multiple lagoons on the Scottish island of North Uist. Using morphological data, targeted genotyping, and genome-wide single-nucleotide polymorphism data, we show that lagoon resident and anadromous ecotypes are strongly reproductively isolated with an estimated hybridization rate of only ∼1%. We use palaeoecological and genetic data to test three hypotheses to explain the existence of these species-pairs. Our results suggest that recent, purely ecological speciation from a genetically homogeneous ancestor is probably not solely responsible for the evolution of species-pairs. Instead, we reveal a complex colonization history with multiple ancestral lineages contributing to the genetic composition of species-pairs, alongside strong disruptive selection. Our results imply a role for admixture upon secondary contact and are consistent with the recent suggestion that the genomic underpinning of ecological speciation often has an older, allopatric origin.

De Novo Mutation Rate Estimation in Wolves of Known Pedigree

Fri, 12 Jul 2019 00:00:00 GMT

Knowledge of mutation rates is crucial for calibrating population genetics models of demographic history in units of years. However, mutation rates remain challenging to estimate because of the need to identify extremely rare events. We estimated the nuclear mutation rate in wolves by identifying de novo mutations in a pedigree of seven wolves. Putative de novo mutations were discovered by whole-genome sequencing and were verified by Sanger sequencing of parents and offspring. Using stringent filters and an estimate of the false negative rate in the remaining observable genome, we obtain an estimate of ∼4.5 × 10⁻⁹ per base pair per generation and provide conservative bounds between 2.6 × 10⁻⁹ and 7.1 × 10⁻⁹. Although our estimate is consistent with recent mutation rate estimates from ancient DNA (4.0 × 10⁻⁹ and 3.0–4.5 × 10⁻⁹), it suggests a wider possible range. We also examined the consequences of our rate and the accompanying interval for dating several critical events in canid demographic history. For example, applying our full range of rates to coalescent models of dog and wolf demographic history implies a wide set of possible divergence times between the ancestral populations of dogs and extant Eurasian wolves (16,000–64,000 years ago) although our point estimate indicates a date between 25,000 and 33,000 years ago. Aside from one study in mice, ours provides the only direct mammalian mutation rate outside of primates and is likely to be vital to future investigations of mutation rate evolution.

EPAS1 Gain-of-Function Mutation Contributes to High-Altitude Adaptation in Tibetan Horses

Thu, 04 Jul 2019 00:00:00 GMT

High altitude represents some of the most extreme environments worldwide. The genetic changes underlying adaptation to such environments have been recently identified in multiple animals but remain unknown in horses. Here, we sequence the complete genome of 138 domestic horses encompassing a whole altitudinal range across China to uncover the genetic basis for adaptation to high-altitude hypoxia. Our genome data set includes 65 lowland animals across ten Chinese native breeds, 61 horses living at least 3,300 m above sea level across seven locations along Qinghai-Tibetan Plateau, as well as 7 Thoroughbred and 5 Przewalski’s horses added for comparison. We find that Tibetan horses do not descend from Przewalski’s horses but were most likely introduced from a distinct horse lineage, following the emergence of pastoral nomadism in Northwestern China ∼3,700 years ago. We identify that the endothelial PAS domain protein 1 gene (EPAS1, also HIF2A) shows the strongest signature for positive selection in the Tibetan horse genome. Two missense mutations at this locus appear strongly associated with blood physiological parameters facilitating blood circulation as well as oxygen transportation and consumption in hypoxic conditions. Functional validation through protein mutagenesis shows that these mutations increase EPAS1 stability and its heterodimerization affinity to ARNT (HIF1B). Our study demonstrates that missense mutations in the EPAS1 gene provided key evolutionary molecular adaptation to Tibetan horses living in high-altitude hypoxic environments. It reveals possible targets for genomic selection programs aimed at increasing hypoxia tolerance in livestock and provides a textbook example of evolutionary convergence across independent mammal lineages.

Parallel Evolution of HIV-1 in a Long-Term Experiment

Thu, 04 Jul 2019 00:00:00 GMT

One of the most intriguing puzzles in biology is the degree to which evolution is repeatable. The repeatability of evolution, or parallel evolution, has been studied in a variety of model systems, but has rarely been investigated with clinically relevant viruses. To investigate parallel evolution of HIV-1, we passaged two replicate HIV-1 populations for almost 1 year in each of two human T-cell lines. For each of the four evolution lines, we determined the genetic composition of the viral population at nine time points by deep sequencing the entire genome. Mutations that were carried by the majority of the viral population accumulated continuously over 1 year in each evolution line. Many majority mutations appeared in more than one evolution line, that is, our experiments showed an extreme degree of parallel evolution. In one of the evolution lines, 62% of the majority mutations also occur in another line. The parallelism impairs our ability to reconstruct the evolutionary history by phylogenetic methods. We show that one can infer the correct phylogenetic topology by including minority mutations in our analysis. We also find that mutation diversity at the beginning of the experiment is predictive of the frequency of majority mutations at the end of the experiment.

Coevolution of Sites under Immune Selection Shapes Epstein–Barr Virus Population Structure

Tue, 02 Jul 2019 00:00:00 GMT

Epstein–Barr virus (EBV) is one of the most common viral infections in humans and persists within its host for life. EBV therefore represents an extremely successful virus that has evolved complex strategies to evade the host’s innate and adaptive immune response during both initial and persistent stages of infection. Here, we conducted a comparative genomics analysis on 223 whole genome sequences of worldwide EBV strains. We recover extensive genome-wide linkage disequilibrium (LD) despite pervasive genetic recombination. This pattern is explained by the global EBV population being subdivided into three main subpopulations, one primarily found in East Asia, one in Southeast Asia and Oceania, and the third including most of the other globally distributed genomes we analyzed. Additionally, sites in LD were overrepresented in immunogenic genes. Taken together, our results suggest that host immune selection and local adaptation to different human host populations has shaped the genome-wide patterns of genetic diversity in EBV.

High Satellite Repeat Turnover in Great Apes Studied with Short- and Long-Read Technologies

Tue, 02 Jul 2019 00:00:00 GMT

Satellite repeats are a structural component of centromeres and telomeres, and in some instances, their divergence is known to drive speciation. Due to their highly repetitive nature, satellite sequences have been understudied and underrepresented in genome assemblies. To investigate their turnover in great apes, we studied satellite repeats of unit sizes up to 50 bp in human, chimpanzee, bonobo, gorilla, and Sumatran and Bornean orangutans, using unassembled short and long sequencing reads. The density of satellite repeats, as identified from accurate short reads (Illumina), varied greatly among great ape genomes. These were dominated by a handful of abundant repeated motifs, frequently shared among species, which formed two groups: 1) the (AATGG)n repeat (critical for heat shock response) and its derivatives; and 2) subtelomeric 32-mers involved in telomeric metabolism. Using the densities of abundant repeats, individuals could be classified into species. However, clustering did not reproduce the accepted species phylogeny, suggesting rapid repeat evolution. Several abundant repeats were enriched in males versus females; using Y chromosome assemblies or Fluorescent In Situ Hybridization, we validated their location on the Y. Finally, applying a novel computational tool, we identified many satellite repeats completely embedded within long Oxford Nanopore and Pacific Biosciences reads. Such repeats were up to 59 kb in length and consisted of perfect repeats interspersed with other similar sequences. Our results based on sequencing reads generated with three different technologies provide the first detailed characterization of great ape satellite repeats, and open new avenues for exploring their functions.

Meta-Omics Reveals Genetic Flexibility of Diatom Nitrogen Transporters in Response to Environmental Changes

Mon, 01 Jul 2019 00:00:00 GMT

Diatoms (Bacillariophyta), one of the most abundant and diverse groups of marine phytoplankton, respond rapidly to the supply of new nutrients, often out-competing other phytoplankton. Herein, we integrated analyses of the evolution, distribution, and expression modulation of two gene families involved in diatom nitrogen uptake (DiAMT1 and DiNRT2), in order to infer the main drivers of divergence in a key functional trait of phytoplankton. Our results suggest that major steps in the evolution of the two gene families reflected key events triggering diatom radiation and diversification. Their expression is modulated in the contemporary ocean by seawater temperature, nitrate, and iron concentrations. Moreover, the differences in diversity and expression of these gene families throughout the water column hint at a possible link with bacterial activity. This study represents a proof-of-concept of how a holistic approach may shed light on the functional biology of organisms in their natural environment.

Symbiosis, Selection, and Novelty: Freshwater Adaptation in the Unique Sponges of Lake Baikal

Thu, 27 Jun 2019 00:00:00 GMT

Freshwater sponges (Spongillida) are a unique lineage of demosponges that secondarily colonized lakes and rivers and are now found ubiquitously in these ecosystems. They developed specific adaptations to freshwater systems, including the ability to survive extreme thermal ranges, long-lasting desiccation, anoxia, and resistance to a variety of pollutants. Although spongillids have colonized all freshwater systems, the family Lubomirskiidae is endemic to Lake Baikal and plays a range of key roles in this ecosystem. Our work compares the genomic content and microbiome of individuals of three species of the Lubomirskiidae, providing hypotheses for how molecular evolution has allowed them to adapt to their unique environments. We have sequenced deep (>92% of the metazoan “Benchmarking Universal Single-Copy Orthologs” [BUSCO] set) transcriptomes from three species of Lubomirskiidae and a draft genome resource for Lubomirskia baikalensis. We note Baikal sponges contain unicellular algal and bacterial symbionts, as well as the dinoflagellate Gyrodinium. We investigated molecular evolution, gene duplication, and novelty in freshwater sponges compared with marine lineages. Sixty-one orthogroups have consilient evidence of positive selection. Transporters (e.g., zinc transporter-2), transcription factors (aristaless-related homeobox), and structural proteins (e.g., actin-3), alongside other genes, are under strong evolutionary pressure in freshwater, with duplication driving novelty across the Spongillida, but especially in the Lubomirskiidae. This addition to knowledge of freshwater sponge genetics provides a range of tools for understanding the molecular biology and, in the future, the ecology (e.g., colonization and migration patterns) of these key species.

Genomic Patterns of Local Adaptation under Gene Flow in Arabidopsis lyrata

Tue, 25 Jun 2019 00:00:00 GMT

Short-scale local adaptation is a complex process involving selection, migration, and drift. The expected effects on the genome are well grounded in theory but examining these on an empirical level has proven difficult, as it requires information about local selection, demographic history, and recombination rate variation. Here, we use locally adapted and phenotypically differentiated Arabidopsis lyrata populations from two altitudinal gradients in Norway to test these expectations at the whole-genome level. Demography modeling indicates that populations within the gradients diverged <2 kya and that the sites are connected by gene flow. The gene flow estimates are, however, highly asymmetric with migration from high to low altitudes being several times more frequent than vice versa. To detect signatures of selection for local adaptation, we estimate patterns of lineage-specific differentiation among these populations. Theory predicts that gene flow leads to concentration of adaptive loci in areas of low recombination, a pattern we observe in both lowland-alpine comparisons. Although most selected loci display patterns of conditional neutrality, we found indications of genetic trade-offs, with one locus particularly showing high differentiation and signs of selection in both populations. Our results further suggest that resistance to solar radiation is an important adaptation to alpine environments, while vegetative growth and bacterial defense are indicated as selected traits in the lowland habitats. These results provide insights into genetic architectures and evolutionary processes driving local adaptation under gene flow. We also contribute to understanding of traits and biological processes underlying alpine adaptation in northern latitudes.

Phylogenomics Reveals an Ancient Hybrid Origin of the Persian Walnut

Tue, 04 Jun 2019 00:00:00 GMT

Persian walnut (Juglans regia) is cultivated worldwide for its high-quality wood and nuts, but its origin has remained mysterious because in phylogenies it occupies an unresolved position between American black walnuts and Asian butternuts. Equally unclear is the origin of the only American butternut, J. cinerea. We resequenced the whole genome of 80 individuals from 19 of the 22 species of Juglans and assembled the genome of its relatives Pterocarya stenoptera and Platycarya strobilacea. Using phylogenetic-network analysis of single-copy nuclear genes, genome-wide site pattern probabilities, and Approximate Bayesian Computation, we discovered that J. regia (and its landrace J. sigillata) arose as a hybrid between the American and the Asian lineages and that J. cinerea resulted from massive introgression from an immigrating Asian butternut into the genome of an American black walnut. Approximate Bayesian Computation modeling placed the hybrid origin in the late Pliocene, ∼3.45 My, with both parental lineages since having gone extinct in Europe.

GBE | Most Read

Genome Biology & Evolution

Highlight: New Solutions and Open Questions in Computational Evolutionary Biology

Thu, 07 Nov 2019 00:00:00 GMT

The dawn of the computer and information age in the last century left virtually no field untouched. In biology, computational advances enabled scientists to generate, store, and analyze large-scale data sets that could scarcely have been imagined decades earlier. These advances ultimately led to the publication of the first bacterial genome sequence in 1995 (Fleischmann et al. 1995), and with it, the birth of the genomics era. The advent of high-throughput sequencing further accelerated the pace of data generation to an unprecedented rate. Now, less than a quarter of a century later, genomic data for almost 220,000 individual organisms and another 25,000 metagenomes are currently available through the National Center for Biotechnology Information (NCBI) website, and Genome Biology and Evolution has played a role in publishing numerous articles in the field of computational evolutionary biology.

Corrigendum to: “An Upper Limit on the Functional Fraction of the Human Genome”

Wed, 06 Nov 2019 00:00:00 GMT

Genome Biol. Evol. 9(7):1880–1885; doi:10.1093/gbe/evx121

Draft Genome of the Rice Coral Montipora capitata Obtained from Linked-Read Sequencing

Mon, 04 Nov 2019 00:00:00 GMT

Genome Biol. Evol. 11(7):2045–2054; doi:10.1093/gbe/evz135

Genomic Basis of Convergent Island Phenotypes in Boa Constrictors

Wed, 23 Oct 2019 00:00:00 GMT

Convergent evolution is often documented in organisms inhabiting isolated environments with distinct ecological conditions and similar selective regimes. Several Central American islands harbor dwarf Boa populations that are characterized by distinct differences in growth, mass, and craniofacial morphology, which are linked to the shared arboreal and feast-famine ecology of these island populations. Using high-density RADseq data, we inferred three dwarf island populations with independent origins and demonstrate that selection, along with genetic drift, has produced both divergent and convergent molecular evolution across island populations. Leveraging whole-genome resequencing data for 20 individuals and a newly annotated Boa genome, we identify four genes with evidence of phenotypically relevant protein-coding variation that differentiate island and mainland populations. The known roles of these genes in body growth (PTPRS, DMGDH, and ARSB), circulating fat and cholesterol levels (MYLIP), and craniofacial development (DMGDH and ARSB) in mammals link patterns of molecular evolution with the unique phenotypes of these island forms. Our results provide an important genome-wide example for quantifying expectations of selection and convergence in closely related populations. We also find evidence at several genomic loci that selection may be a prominent force of evolutionary change, even for small island populations for which drift is predicted to dominate. Overall, while phenotypically convergent island populations show relatively few loci under strong selection, infrequent patterns of molecular convergence are still apparent and implicate genes with strong connections to convergent phenotypes.

A Nearly Complete Genome of Ciona intestinalis Type A (C. robusta) Reveals the Contribution of Inversion to Chromosomal Evolution in the Genus Ciona

Tue, 22 Oct 2019 00:00:00 GMT

Since its initial publication in 2002, the genome of Ciona intestinalis type A (Ciona robusta), the first genome sequence of an invertebrate chordate, has provided a valuable resource for a wide range of biological studies, including developmental biology, evolutionary biology, and neuroscience. The genome assembly was updated in 2008, and it included 68% of the sequence information in 14 pairs of chromosomes. However, a more contiguous genome is required for analyses of higher order genomic structure and of chromosomal evolution. Here, we provide a new genome assembly for an inbred line of this animal, constructed with short and long sequencing reads and Hi-C data. In this latest assembly, over 95% of the 123 Mb of sequence data was included in the chromosomes. Short sequencing reads predicted a genome size of 114–120 Mb; therefore, it is likely that the current assembly contains almost the entire genome, although this estimate of genome size was smaller than previous estimates. Remapping of the Hi-C data onto the new assembly revealed a large inversion in the genome of the inbred line. Moreover, a comparison of this genome assembly with that of Ciona savignyi, a different species in the same genus, revealed many chromosomal inversions between these two Ciona species, suggesting that such inversions have occurred frequently and have contributed to chromosomal evolution of Ciona species. Thus, the present assembly greatly improves an essential resource for genome-wide studies of ascidians.

GingerRoot: A Novel DNA Transposon Encoding Integrase-Related Transposase in Plants and Animals

Mon, 21 Oct 2019 00:00:00 GMT

Transposable elements represent the largest components of many eukaryotic genomes and different genomes harbor different combinations of elements. Here, we discovered a novel DNA transposon in the genome of the clubmoss Selaginella lepidophylla. Further searching for related sequences to the conserved DDE region uncovered the presence of this superfamily of elements in fish, coral, sea anemone, and other animal species. However, this element appears restricted to Bryophytes and Lycophytes in plants. This transposon, named GingerRoot, is associated with a 6 bp (base pair) target site duplication, and 100–150 bp terminal inverted repeats. Analysis of transposase sequences identified the DDE motif, a catalytic domain, which shows similarity to the integrase of Gypsy-like long terminal repeat retrotransposons, the most abundant component in plant genomes. A total of 77 intact and several hundred truncated copies of GingerRoot elements were identified in S. lepidophylla. Like Gypsy retrotransposons, GingerRoots show a lack of insertion preference near genes, which contrasts with the compact genome size of about 100 Mb. Nevertheless, a considerable portion of GingerRoot elements was found to carry gene fragments, suggesting that the capacity to duplicate gene sequences is unlikely to be attributable to proximity to genes. Elements carrying gene fragments appear to be less methylated, more diverged, and more distal to genes than those without gene fragments, indicating they are preferentially retained in gene-poor regions. This study has identified a broadly dispersed, novel DNA transposon, and the first plant DNA transposon with an integrase-related transposase, suggesting the possibility of de novo formation of Gypsy-like elements in plants.

Complex Evolutionary Origins of Specialized Metabolite Gene Cluster Diversity among the Plant Pathogenic Fungi of the Fusarium graminearum Species Complex

Mon, 14 Oct 2019 00:00:00 GMT

Fungal genomes encode highly organized gene clusters that underlie the production of specialized (or secondary) metabolites. Gene clusters encode key functions to exploit plant hosts or environmental niches. Promiscuous exchange among species and frequent reconfigurations make gene clusters some of the most dynamic elements of fungal genomes. Despite evidence for high diversity in gene cluster content among closely related strains, the microevolutionary processes driving gene cluster gain, loss, and neofunctionalization are largely unknown. We analyzed the Fusarium graminearum species complex (FGSC) composed of plant pathogens producing potent mycotoxins and causing Fusarium head blight on cereals. We de novo assembled genomes of previously uncharacterized FGSC members (two strains of F. austroamericanum, F. cortaderiae, and F. meridionale). Our analyses of 8 species of the FGSC in addition to 15 other Fusarium species identified a pangenome of 54 gene clusters within FGSC. We found that multiple independent losses were a key factor generating extant cluster diversity within the FGSC and the Fusarium genus. We identified a modular gene cluster conserved among distantly related fungi, which was likely reconfigured to encode different functions. We also found strong evidence that a rare cluster in FGSC was gained through an ancient horizontal transfer between bacteria and fungi. Chromosomal rearrangements underlying cluster loss were often complex and were likely facilitated by an enrichment in specific transposable elements. Our findings identify important transitory stages in the birth and death process of specialized metabolism gene clusters among very closely related species.

Deciphering Ancestral Sex Chromosome Turnovers Based on Analysis of Male Mutation Bias

Sat, 12 Oct 2019 00:00:00 GMT

The age of sex chromosomes is commonly obtained by comparing the substitution rates of XY gametologs. Coupled with phylogenetic reconstructions, one can refine the origin of a sex chromosome system relative to specific speciation events. However, these approaches are insufficient to determine the presence and duration of ancestral sex chromosome systems that were lost in some species. In this study, we worked with genomic and transcriptomic data from mammals and squamates and analyzed the effect of male mutation bias on X-linked sequences in these groups. We searched for signatures indicating whether monotremes shared the same sex chromosomes with placental mammals or whether pleurodonts and acrodonts had a common ancestral sex chromosome system. Our analyses indicate that platypus did not share the XY chromosomes with placental mammals, in agreement with previous work. In contrast, analyses of agamids showed that this lineage maintained the pleurodont XY chromosomes for several million years. We performed multiple simulations using different strengths of male mutation bias to confirm the results. Overall, our work shows that variations in substitution rates due to male mutation bias could be applied to uncover signatures of ancestral sex chromosome systems.

Interspecific Gene Exchange Introduces High Genetic Variability in Crop Pathogen

Fri, 11 Oct 2019 00:00:00 GMT

Genome analyses have revealed a profound role of hybridization and introgression in the evolution of many eukaryote lineages, including fungi. The impact of recurrent introgression on fungal evolution, however, remains elusive. Here, we analyzed signatures of introgression along the genome of the fungal wheat pathogen Zymoseptoria tritici. We applied a comparative population genomics approach, including genome data from five Zymoseptoria species, to characterize the distribution and composition of introgressed regions representing segments with an exceptional haplotype pattern. These regions are found throughout the genome, comprising 5% of the total genome and overlapping with >1,000 predicted genes. We performed window-based phylogenetic analyses along the genome to distinguish regions that have a monophyletic or nonmonophyletic origin with Z. tritici sequences. A majority of nonmonophyletic windows overlap with the highly variable regions, suggesting that these originate from introgression. We verified that incongruent gene genealogies do not result from incomplete lineage sorting by comparing the observed and expected length distribution of haplotype blocks resulting from incomplete lineage sorting. Although protein-coding genes are not enriched in these regions, we identify 18 that encode putative virulence determinants. Moreover, we find an enrichment of transposable elements in these regions, implying that hybridization may contribute to the horizontal spread of transposable elements. We detected a similar pattern in the closely related species Zymoseptoria ardabiliae, suggesting that hybridization is widespread among these closely related grass pathogens. Overall, our results demonstrate a significant impact of recurrent hybridization on overall genome evolution of this important wheat pathogen.

Compound Dynamics and Combinatorial Patterns of Amino Acid Repeats Encode a System of Evolutionary and Developmental Markers

Mon, 07 Oct 2019 00:00:00 GMT

Homopolymeric amino acid repeats (AARs) like polyalanine (polyA) and polyglutamine (polyQ) in some developmental proteins (DPs) regulate certain aspects of organismal morphology and behavior, suggesting an evolutionary role for AARs as developmental “tuning knobs.” It is still unclear, however, whether these are occasional protein-specific phenomena or hints at the existence of a whole AAR-based regulatory system in DPs. Using novel approaches to trace their functional and evolutionary history, we find quantitative evidence supporting a generalized, combinatorial role of AARs in developmental processes with evolutionary implications. We observe nonrandom AAR distributions and combinations in HOX and other DPs, as well as in their interactomes, defining elements of a proteome-wide combinatorial functional code whereby different AARs and their combinations appear preferentially in proteins involved in the development of specific organs/systems. Such functional associations can be either static or display detectable evolutionary dynamics. These findings suggest that progressive changes in AAR occurrence/combination, by altering embryonic development, may have contributed to taxonomic divergence, leaving detectable traces in the evolutionary history of proteomes. Consistent with this hypothesis, we find that the evolutionary trajectories of the 20 AARs in eukaryotic proteomes are highly interrelated and their individual or compound dynamics can sharply mark taxonomic boundaries, or display clock-like trends, carrying overall a strong phylogenetic signal. These findings provide quantitative evidence and an interpretive framework outlining a combinatorial system of AARs whose compound dynamics mark at the same time DP functions and evolutionary transitions.

New Non-Bilaterian Transcriptomes Provide Novel Insights into the Evolution of Coral Skeletomes

Fri, 13 Sep 2019 00:00:00 GMT

A general trend observed in animal skeletomes (the proteins occluded in animal skeletons) is the copresence of taxonomically widespread and lineage-specific proteins that actively regulate the biomineralization process. Among cnidarians, the skeletomes of scleractinian corals have been shown to follow this trend. However, distributions and phylogenetic analyses of biomineralization-related genes are often based on only a few species, with other anthozoan calcifiers such as octocorals (soft corals), not being fully considered. We de novo assembled the transcriptomes of four soft-coral species characterized by different calcification strategies (aragonite skeleton vs. calcitic sclerites) and data-mined published nonbilaterian transcriptome resources to construct a taxonomically comprehensive sequence database to map the distribution of scleractinian and octocoral skeletome components. Cnidaria shared no skeletome proteins with Placozoa or Ctenophora, but did share some skeletome proteins with Porifera, such as galaxin-related proteins. Within Scleractinia and Octocorallia, we expanded the distribution for several taxonomically restricted genes such as secreted acidic proteins, scleritin, and carbonic anhydrases, and propose an early, single biomineralization-recruitment event for galaxin sensu stricto. Additionally, we show that the enrichment of acidic residues within skeletogenic proteins did not occur at the Corallimorpharia–Scleractinia transition, but appears to be associated with protein secretion into the organic matrix. Finally, the distribution of octocoral calcification-related proteins appears independent of skeleton mineralogy (i.e., aragonite/calcite) with no differences in the proportion of shared skeletogenic proteins between scleractinians and aragonitic or calcitic octocorals. This points to skeletome homogeneity within but not between groups of calcifying cnidarians, although some proteins such as galaxins and SCRiP-3a could represent instances of commonality.

Phylogenomic Analysis of a Putative Missing Link Sparks Reinterpretation of Leech Evolution

Wed, 19 Jun 2019 00:00:00 GMT

Leeches (Hirudinida) comprise a charismatic, yet often maligned group of worms. Despite their ecological, economic, and medical importance, a general consensus on the phylogenetic relationships of major hirudinidan lineages is lacking. This absence of a consistent, robust phylogeny of early-diverging lineages has hindered our understanding of the underlying processes that enabled evolutionary diversification of this clade. Here, we used an anchored hybrid enrichment-based phylogenomic approach, capturing hundreds of loci to investigate phylogenetic relationships among major hirudinidan lineages and their closest living relatives. We recovered Branchiobdellida as sister to a clade that includes all major lineages of hirudinidans and Acanthobdella, casting doubt on the utility of Acanthobdella as a “missing link” between hirudinidans and the clitellate group formerly known as Oligochaeta. Further, our results corroborate the reciprocal monophyly of jawed and proboscis-bearing leeches. Our phylogenomic resolution of early-diverging leeches provides a useful framework for illuminating the evolution of key adaptations and host–symbiont associations that have allowed leeches to colonize a wide diversity of habitats worldwide.