Joseph Felsenstein is Professor in the Departments of Genome Sciences and Biology and Adjunct Professor in the Departments of Computer Science and Statistics at the University of Washington in Seattle. He is best known for his work on phylogenetic inference. He is the author of Inferring Phylogenies and the principal author and distributor of PHYLIP, a package of phylogenetic inference programs. He is currently serving as the President of the Society for Molecular Biology and Evolution.

You can reach Joe at

James McInerney is the principal investigator of the Bioinformatics and Molecular Evolution Laboratories at NUI Maynooth. He was one of the founding directors of the Irish Centre for High End Computing, an Associate Editor of Molecular Biology and Evolution, Biology Direct, and Journal of Experimental Zoology, and is currently serving as the Secretary for the Society for Molecular Biology and Evolution.

You can reach James at

Juliette de Meaux is interested in the molecular basis of Darwinian adaptation in natural plant systems. Her work combines the approaches of population, quantitative and molecular genetics to dissect the underpinnings of adaptive changes. She completed her PhD at AgroParisTech under the supervision of Prof. Claire Neema, studying the molecular basis of host-pathogen coevolution in natural populations of common bean. She then did her postdoctoral research in the lab of Prof. Tom Mitchell-Olds at the Max Planck Institute of Chemical Ecology in Jena, where she worked on the evolution of cis-regulatory DNA. Since 2005, she has run her own lab, first at the Max Planck Institute of Plant Breeding in Cologne and then at the University of Münster. In January 2015, she relocated her lab to the University of Cologne. She is currently serving as the Treasurer for the Society for Molecular Biology and Evolution.

You can reach Juliette at


Registration and Membership

Non-Members: You must Register for an account to purchase a membership and conduct other transactions. Future visits to the website will only require login.

After login or registration: You may conduct online transactions such as joining or renewing a membership, registering for an annual meeting and making donations.

Events Calendar

Check out our Events Calendar for upcoming meetings.

If you have an event you wish to add, just email it to


The Society for Molecular Biology and Evolution is an international organization whose goals are to provide facilities for association and communication among molecular evolutionists and to further the goals of molecular evolution, as well as its practitioners and teachers. In order to accomplish these goals, the Society publishes two peer-reviewed journals, Molecular Biology and Evolution and Genome Biology and Evolution. The Society sponsors an annual meeting, as well as smaller satellite meetings or workshops on important, focused, and timely topics. It also confers honors and awards to students and researchers.

SMBE 2019

We are delighted to announce that the SMBE 2019 Meeting will take place in Manchester, United Kingdom. The Meeting will be held at the state-of-the-art Manchester Central venue.

The programme will provide plenty of opportunities for you to submit your work for consideration as a symposium, oral or poster presentation.

Full details on registration fees, accommodation and exhibition opportunities will be made available in due course. Please do make a note of the key dates included below.

More information can be found HERE


SMBE is a member of the Scientific Society Publisher Alliance

Featured News and Updates

Graduate student positions (MS and PhD) in environmental microbiology and molecular biology at Bowling Green State University (BGSU)

The Environmental Microbiology and Molecular Research Group at BGSU comprises researchers working on issues of serious societal concern, such as harmful algal blooms in the Great Lakes (Bullerjahn Lab), antibiotic discovery against human and plant pathogens (H. Wildschutte Lab), metagenomic studies of Antarctica’s Lake Vostok microbial community (Rogers Lab), and ice nucleation by microbes in freshwater lakes (McKay Lab).

  • Wednesday, October 28, 2015

SMBE Childcare Awards - an update by Aoife McLysaght

What do you do if you are part of an academic society and you want to do your bit so that everyone who has the ambition to enjoy this stimulating career gets the opportunity to do so?

  • Tuesday, August 04, 2015

SMBE 2015 - Participant Limit and News

As a friendly reminder, we're about to reach our capacity limit for SMBE 2015 (July 12-16) in Vienna, Austria. Register NOW to secure your participation and join us in one of the most beautiful cities at the spectacular Imperial Palace (Hofburg) in the heart of Vienna:

  • Tuesday, May 19, 2015

2015 Walter M. Fitch Prize Finalists

SMBE would like to congratulate the 2015 Walter M. Fitch Prize Finalists. These 8 young investigators will be presenting their abstracts in the Walter M. Fitch Symposium at the 2015 annual SMBE meeting in Vienna, Austria.

  • Thursday, April 30, 2015

SMBE - Call for satellite meeting proposals

SMBE is now calling for proposals for small satellite meetings and/or workshops to be held between Fall 2015 and Fall 2016. Funds will be awarded on a competitive basis to members of the molecular evolution research community to run a small meeting or workshop on an important, focused and timely topic of their choice.

Guidelines and submission details for proposals can be found HERE.
The deadline for submission of proposals is Monday, 15 June, 2015.

  • Tuesday, April 21, 2015

SMBE 2015 Vienna Registration Now Open

The deadline for submission of requests for oral presentations, and submission of abstracts for them, for the 2015 annual meeting of the SMBE has been extended to the 15th of February. This extension does not extend the deadlines for nominations for any of the SMBE awards.

Registration is now open for the SMBE 2015 annual meeting, taking place in Vienna, July 12th-16th, 2015!

Early bird registrants will benefit from up to a 30% reduced registration fee and full consideration of submitted abstracts (early bird registration closes March 1, 2015). 

  • Thursday, December 11, 2014

MBE | Most Read

Molecular Biology and Evolution

Withdrawn as Duplicate: The many nuanced evolutionary consequences of duplicated genes

Fri, 30 Nov 2018 00:00:00 GMT

The article 10.1093/molbev/msy216 has been withdrawn because it is a duplicate of 10.1093/molbev/msy210. The publisher regrets the error.

Two Methods for Mapping and Visualizing Associated Data on Phylogeny Using Ggtree

Tue, 23 Oct 2018 00:00:00 GMT

Ggtree is a comprehensive R package for visualizing and annotating phylogenetic trees with associated data. It can also map and visualize associated external data on phylogenies with two general methods. Method 1 allows external data to be mapped on the tree structure and used as visual characteristics in tree and data visualization. Method 2 plots the data with the tree side by side using different geometric functions after reordering the data based on the tree structure. These two methods integrate data with phylogeny for further exploration and comparison in the evolutionary biology context. Ggtree is available from

Secreted Bacterial Adenosine Deaminase Is an Evolutionary Precursor of Adenosine Deaminase Growth Factor

Tue, 16 Oct 2018 00:00:00 GMT

Adenosine deaminases (ADAs) play a pivotal role in regulating the level of adenosine, an important signaling molecule that controls a variety of cellular responses. Two distinct ADAs, ADA1 and adenosine deaminase growth factor (ADGF aka ADA2), are known. Cytoplasmic ADA1 plays a key role in purine metabolism and is widely distributed from prokaryotes to mammals. On the other hand, secreted ADGF/ADA2 is a cell-signaling protein that was thought to be present only in multicellular organisms. Here, we discovered a bacterial homologue of ADGF/ADA2. Bacterial and eukaryotic ADGF/ADA2 possess the dimerization and PRB domains characteristic for the family, have nearly identical catalytic sites, and show similar catalytic characteristics. Most surprisingly, the bacterial enzyme has a signal sequence similar to that of eukaryotic ADGF/ADA2 and is specifically secreted into the extracellular space, where it may potentially control the level of extracellular adenosine. This finding provides the first example of evolution of an extracellular eukaryotic signaling protein from a secreted bacterial analogue with identical activity and suggests a potential role of ADGF/ADA2 in bacterial communication.

New Insight into Parrots’ Mitogenomes Indicates That Their Ancestor Contained a Duplicated Region

Wed, 10 Oct 2018 00:00:00 GMT

Mitochondrial genomes of vertebrates are generally thought to evolve under strong selection for size reduction and gene order conservation. Therefore, a growing number of mitogenomes with duplicated regions changes our view on the genome evolution. Among Aves, order Psittaciformes (parrots) is especially noteworthy because of its large morphological, ecological, and taxonomical diversity, which offers an opportunity to study genome evolution in various aspects. Former analyses showed that tandem duplications comprising the control region with adjacent genes are restricted to several lineages in which the duplication occurred independently. However, using an appropriate polymerase chain reaction strategy, we demonstrate that early diverged parrot groups contain mitogenomes with the duplicated region. These findings together with mapping duplication data from other mitogenomes onto parrot phylogeny indicate that the duplication was an ancestral state for Psittaciformes. The state was inherited by main parrot groups and was lost several times in some lineages. The duplicated regions were subjected to concerted evolution with a frequency higher than the rate of speciation. The duplicated control regions may provide a selective advantage due to a more efficient initiation of replication or transcription and a larger number of replicating genomes per organelle, which may lead to a more effective energy production by mitochondria. The mitogenomic duplications were associated with phenotypic features and parrots with the duplicated region can live longer, show larger body mass as well as predispositions to a more active flight. The results have wider implications on the presence of duplications and their evolution in mitogenomes of other avian groups.

Understanding the Factors That Shape Patterns of Nucleotide Diversity in the House Mouse Genome

Mon, 08 Oct 2018 00:00:00 GMT

A major goal of population genetics has been to determine the extent to which selection at linked sites influences patterns of neutral nucleotide diversity in the genome. Multiple lines of evidence suggest that diversity is influenced by both positive and negative selection. For example, in many species there are troughs in diversity surrounding functional genomic elements, consistent with the action of either background selection (BGS) or selective sweeps. In this study, we investigated the causes of the diversity troughs that are observed in the wild house mouse genome. Using the unfolded site frequency spectrum, we estimated the strength and frequencies of deleterious and advantageous mutations occurring in different functional elements in the genome. We then used these estimates to parameterize forward-in-time simulations of chromosomes, using realistic distributions of functional elements and recombination rate variation in order to determine whether selection at linked sites can explain the observed patterns of nucleotide diversity. The simulations suggest that BGS alone cannot explain the dips in diversity around either exons or conserved noncoding elements. A combination of BGS and selective sweeps produces deeper dips in diversity than BGS alone, but the inferred parameters of selection cannot fully explain the patterns observed in the genome. Our results provide evidence of sweeps shaping patterns of nucleotide diversity across the mouse genome and also suggest that infrequent, strongly advantageous mutations play an important role in this. The limitations of using the unfolded site frequency spectrum for inferring the frequency and effects of advantageous mutations are discussed.

Genomic Analyses of Human European Diversity at the Southwestern Edge: Isolation, African Influence and Disease Associations in the Canary Islands

Fri, 05 Oct 2018 00:00:00 GMT

Despite the genetic resemblance of Canary Islanders to other southern European populations, their geographical isolation and the historical admixture of aborigines (from North Africa) with sub-Saharan Africans and Europeans have shaped a distinctive genetic makeup that likely affects disease susceptibility and health disparities. Based on single nucleotide polymorphism array data and whole genome sequencing (30×), we inferred that the last African admixture took place ∼14 generations ago and estimated that up to 34% of the Canary Islander genome is of recent African descent. The length of regions in homozygosis and the ancestry-related mosaic organization of the Canary Islander genome support the view that isolation has been strongest on the two smallest islands. Furthermore, several genomic regions showed significant and large deviations in African or European ancestry and were significantly enriched in genes involved in prevalent diseases in this community, such as diabetes, asthma, and allergy. The most prominent of these regions were located near LCT and the HLA, two well-known targets of selection, at which 40‒50% of the Canarian genome is of recent African descent according to our estimates. Putative selective signals were also identified in these regions near the SLC6A11-SLC6A1, KCNMB2, and PCDH20-PCDH9 genes. Taken together, our findings provide solid evidence of a significant recent African admixture, population isolation, and adaptation in this part of Europe, with the favoring of African alleles in some chromosome regions. These findings may have medical implications for populations of recent African ancestry.

FADS1 and the Timing of Human Adaptation to Agriculture

Mon, 01 Oct 2018 00:00:00 GMT

Variation at the FADS1/FADS2 gene cluster is functionally associated with differences in lipid metabolism and is often hypothesized to reflect adaptation to an agricultural diet. Here, we test the evidence for this relationship using both modern and ancient DNA data. We show that almost all the inhabitants of Europe carried the ancestral allele until the derived allele was introduced ∼8,500 years ago by Early Neolithic farming populations. However, we also show that it was not under strong selection in these populations. We find that this allele, and other proposed agricultural adaptations at LCT/MCM6 and SLC22A4, were not strongly selected until much later, perhaps as late as the Bronze Age. Similarly, increased copy number variation at the salivary amylase gene AMY1 is not linked to the development of agriculture although, in this case, the putative adaptation precedes the agricultural transition. Our analysis shows that selection at the FADS locus was not tightly linked to the initial introduction of agriculture and the Neolithic transition. Further, it suggests that the strongest signals of recent human adaptation in Europe did not coincide with the Neolithic transition but with more recent changes in environment, diet, or efficiency of selection due to increases in effective population size.

REforge Associates Transcription Factor Binding Site Divergence in Regulatory Elements with Phenotypic Differences between Species

Wed, 26 Sep 2018 00:00:00 GMT

Elucidating the genomic determinants of morphological differences between species is key to understanding how morphological diversity evolved. While differences in cis-regulatory elements are an important genetic source for morphological evolution, it remains challenging to identify regulatory elements involved in phenotypic differences. Here, we present Regulatory Element forward genomics (REforge), a computational approach that detects associations between transcription factor binding site divergence in putative regulatory elements and phenotypic differences between species. By simulating regulatory element evolution in silico, we show that this approach has substantial power to detect such associations. To validate REforge on real data, we used known binding motifs for eye-related transcription factors and identified significant binding site divergence in vision-impaired subterranean mammals in 1% of all conserved noncoding elements. We show that these genomic regions are significantly enriched in regulatory elements that are specifically active in mouse eye tissues, and that several of them are located near genes, which are required for eye development and photoreceptor function and are implicated in human eye disorders. Thus, our genome-wide screen detects widespread divergence of eye-regulatory elements and highlights regulatory regions that likely contributed to eye degeneration in subterranean mammals. REforge has broad applicability to detect regulatory elements that could be involved in many other phenotypes, which will help to reveal the genomic basis of morphological diversity.

Genomic Takeover by Transposable Elements in the Strawberry Poison Frog

Tue, 25 Sep 2018 00:00:00 GMT

We sequenced the genome of the strawberry poison frog, Oophaga pumilio, at a depth of 127.5× using variable insert size libraries. The total genome size is estimated to be 6.76 Gb, of which 4.76 Gb are from high copy number repetitive elements with low differentiation across copies. These repeats encompass DNA transposons, RNA transposons, and LTR retrotransposons, including at least 0.4 and 1.0 Gb of Mariner/Tc1 and Gypsy elements, respectively. Expression data indicate high levels of gypsy and Mariner/Tc1 expression in ova of O. pumilio compared with Xenopus laevis. We further observe phylogenetic evidence for horizontal transfer (HT) of Mariner elements, possibly between fish and frogs. The elements affected by HT are present in high copy number and are highly expressed, suggesting ongoing proliferation after HT. Our results suggest that the large amphibian genome sizes, at least partially, can be explained by a process of repeated invasion of new transposable elements that are not yet suppressed in the germline. We also find changes in the spliceosome that we hypothesize are related to permissiveness of O. pumilio to increases in intron length due to transposon proliferation. Finally, we identify the complement of ion channels in the first genomic sequenced poison frog and discuss its relation to the evolution of autoresistance to toxins sequestered in the skin.

Pseudogenes Provide Evolutionary Evidence for the Competitive Endogenous RNA Hypothesis

Tue, 25 Sep 2018 00:00:00 GMT

The competitive endogenous RNA (ceRNA) hypothesis is an attractively simple model to explain the biological role of many putatively functionless noncoding RNAs. Under this model, there exist transcripts in the cell whose role is to titrate out microRNAs such that the expression level of another target sequence is altered. That it is logistically possible for expression of one microRNA recognition element (MRE)-containing transcript to affect another is seen in the multiple examples of pathogenic effects of inappropriate expression of MRE-containing RNAs. However, the role, if any, of ceRNAs in normal biological processes and at physiological levels is disputed. By comparison of parent genes and pseudogenes we show, both for a specific example and genome-wide, that the pseudo-3′ untranslated regions (3′UTRs) of expressed pseudogenes are frequently retained and are under selective constraint in mammalian genomes. We found that the pseudo-3′UTR of BRAFP1, a previously described oncogenic ceRNA, has reduced substitutions relative to its pseudo-coding sequence, and we show sequence constraint on MREs shared between the parent gene, BRAF, and the pseudogene. Investigation of RNA-seq data reveals expression of BRAFP1 in normal somatic tissues in human and in other primates, consistent with biological ceRNA functionality of this pseudogene in nonpathogenic cellular contexts. Furthermore, we find that on a genome-wide scale pseudo-3′UTRs of mammalian pseudogenes (n = 1,629) are under stronger selective constraint than their pseudo-coding sequence counterparts, and are more often retained and expressed. Our results suggest that many human pseudogenes, often considered nonfunctional, may have an evolutionarily constrained role, consistent with the ceRNA hypothesis.

A Single Pheromone Receptor Gene Conserved across 400 My of Vertebrate Evolution

Mon, 24 Sep 2018 00:00:00 GMT

Pheromones are crucial for eliciting social and sexual behaviors in diverse animal species. The vomeronasal receptor type-1 (V1R) genes, encoding members of a pheromone receptor family, are highly variable in number and repertoire among mammals due to extensive gene gain and loss. Here, we report a novel pheromone receptor gene belonging to the V1R family, named ancient V1R (ancV1R), which is shared among most Osteichthyes (bony vertebrates) from the basal lineage of ray-finned fishes to mammals. Phylogenetic and syntenic analyses of ancV1R using 115 vertebrate genomes revealed that it represents an orthologous gene conserved for >400 My of vertebrate evolution. Interestingly, the loss of ancV1R in some tetrapods is coincident with the degeneration of the vomeronasal organ in higher primates, cetaceans, and some reptiles including birds and crocodilians. In addition, ancV1R is expressed in most mature vomeronasal sensory neurons in contrast with canonical V1Rs, which are sparsely expressed in a manner that is consistent with the “one neuron–one receptor” rule. Our results imply that a previously undescribed V1R gene inherited from an ancient Silurian ancestor may have played an important functional role in the evolution of the vertebrate vomeronasal organ.

Life History Traits Impact the Nuclear Rate of Substitution but Not the Mitochondrial Rate in Isopods

Mon, 24 Sep 2018 00:00:00 GMT

The rate of molecular evolution varies widely among species. Life history traits (LHTs) have been proposed as a major driver of these variations. However, the relative contribution of each trait is poorly understood. Here, we test the influence of metabolic rate (MR), longevity, and generation time (GT) on the nuclear and mitochondrial synonymous substitution rates using a group of isopod species that have made multiple independent transitions to subterranean environments. Subterranean species have repeatedly evolved a lower MR, a longer lifespan and a longer GT. We assembled the nuclear transcriptomes and the mitochondrial genomes of 13 pairs of closely related isopods, each pair composed of one surface and one subterranean species. We found that subterranean species have a lower rate of nuclear synonymous substitution than surface species whereas the mitochondrial rate remained unchanged. We propose that this decoupling between nuclear and mitochondrial rates comes from different DNA replication processes in these two compartments. In isopods, the nuclear rate is probably tightly controlled by GT alone. In contrast, mitochondrial genomes appear to replicate and mutate at a rate independent of LHTs. These results are incongruent with previous studies, which were mostly devoted to vertebrates. We suggest that this incongruence can be explained by developmental differences between animal clades, with a quiescent period during female gametogenesis in mammals and birds which imposes a nuclear and mitochondrial rate coupling, as opposed to the continuous gametogenesis observed in most arthropods.

Successive Domain Rearrangements Underlie the Evolution of a Regulatory Module Controlled by a Small Interfering Peptide

Fri, 07 Sep 2018 00:00:00 GMT

The establishment of new interactions between transcriptional regulators increases the regulatory diversity that drives phenotypic novelty. To understand how such interactions evolve, we have studied a regulatory module (DDR) composed of three MYB-like proteins: DIVARICATA (DIV), RADIALIS (RAD), and DIV-and-RAD-Interacting Factor (DRIF). The DIV and DRIF proteins form a transcriptional complex that is disrupted in the presence of RAD, a small interfering peptide, due to the formation of RAD–DRIF dimers. This dynamic interaction results in a molecular switch mechanism responsible for the control of distinct developmental processes in plants. Here, we have determined how the DDR regulatory module was established by analyzing the origin and evolution of the DIV, DRIF, and RAD protein families and the evolutionary history of their interactions. We show that duplications of a pre-existing MYB domain originated the DIV and DRIF protein families in the ancestral lineage of green algae, and, later, the RAD family in seed plants. Intraspecies interactions between the MYB domains of DIV and DRIF proteins are detected in green algae, whereas the earliest evidence of an interaction between DRIF and RAD proteins occurs in the gymnosperms, coincident with the establishment of the RAD family. Therefore, the DDR module evolved in a stepwise progression with the DIV–DRIF transcription complex evolving prior to the antagonistic RAD–DRIF interaction that established the molecular switch mechanism. Our results suggest that the successive rearrangement and divergence of a single protein domain can be an effective evolutionary mechanism driving new protein interactions and the establishment of novel regulatory modules.

Adaptive Evolution of Animal Proteins over Development: Support for the Darwin Selection Opportunity Hypothesis of Evo-Devo

Sat, 01 Sep 2018 00:00:00 GMT

A driving hypothesis of evolutionary developmental biology is that animal morphological diversity is shaped both by adaptation and by developmental constraints. Here, we have tested Darwin’s “selection opportunity” hypothesis, according to which high evolutionary divergence in late development is due to strong positive selection. We contrasted it to a “developmental constraint” hypothesis, according to which late development is under relaxed negative selection. Indeed, the highest divergence between species, both at the morphological and molecular levels, is observed late in embryogenesis and postembryonically. To distinguish between adaptation and relaxation hypotheses, we investigated the evidence of positive selection on protein-coding genes in relation to their expression over development, in fly Drosophila melanogaster, zebrafish Danio rerio, and mouse Mus musculus. First, we found that genes specifically expressed in late development have stronger signals of positive selection. Second, over the full transcriptome, genes with evidence for positive selection tend to be expressed in late development. Finally, genes involved in pathways with cumulative evidence of positive selection have higher expression in late development. Overall, there is a consistent signal that positive selection mainly affects genes and pathways expressed in late embryonic development and in adults. Our results imply that the evolution of embryogenesis is mostly conservative, with most adaptive evolution affecting some stages of postembryonic gene expression, and thus postembryonic phenotypes. This is consistent with the diversity of environmental challenges to which juveniles and adults are exposed.

Integrating Embryonic Development and Evolutionary History to Characterize Tentacle-Specific Cell Types in a Ctenophore

Thu, 30 Aug 2018 00:00:00 GMT

The origin of novel traits can promote expansion into new niches and drive speciation. Ctenophores (comb jellies) are unified by their possession of a novel cell type: the colloblast, an adhesive cell found only in the tentacles. Although colloblast-laden tentacles are fundamental for prey capture among ctenophores, some species have tentacles lacking colloblasts and others have lost their tentacles completely. We used transcriptomes from 36 ctenophore species to identify gene losses that occurred specifically in lineages lacking colloblasts and tentacles. We cross-referenced these colloblast- and tentacle-specific candidate genes with temporal RNA-Seq during embryogenesis in Mnemiopsis leidyi and found that both sets of candidates are preferentially expressed during tentacle morphogenesis. We also demonstrate significant upregulation of candidates from both data sets in the tentacle bulb of adults. Both sets of candidates were enriched for an N-terminal signal peptide and protein domains associated with secretion; among tentacle candidates we also identified orthologs of cnidarian toxin proteins, presenting tantalizing evidence that ctenophore tentacles may secrete toxins along with their adhesive. Finally, using cell lineage tracing, we demonstrate that colloblasts and neurons share a common progenitor, suggesting the evolution of colloblasts involved co-option of a neurosecretory gene regulatory network. Together these data offer an initial glimpse into the genetic architecture underlying ctenophore cell-type diversity.

GBE | Most Read

Genome Biology & Evolution

IMPUTOR: Phylogenetically Aware Software for Imputation of Errors in Next-Generation Sequencing

Mon, 19 Nov 2018 00:00:00 GMT

Matthew Jobin, Haiko Schurz, and Brenna M. Henn

Effect of Collapsed Duplications on Diversity Estimates: What to Expect

Fri, 26 Oct 2018 00:00:00 GMT

The study of segmental duplications (SDs) and copy-number variants (CNVs) is of great importance in the fields of genomics and evolution. However, SDs and CNVs are usually excluded from genome-wide scans for natural selection. Because of high identity between copies, SDs and CNVs that are not included in reference genomes are prone to be collapsed—that is, mistakenly aligned to the same region—when aligning sequence data from single individuals to the reference. Such collapsed duplications are additionally challenging because concerted evolution between duplications alters their site frequency spectrum and linkage disequilibrium patterns. To investigate the potential effect of collapsed duplications upon natural selection scans we obtained expectations for four summary statistics from simulations of duplications evolving under a range of interlocus gene conversion and crossover rates. We confirm that summary statistics traditionally used to detect the action of natural selection on DNA sequences cannot be applied to SDs and CNVs since in some cases values for known duplications mimic selective signatures. As a proof of concept of the pervasiveness of collapsed duplications, we analyzed data from the 1,000 Genomes Project. We find that, within regions identified as variable in copy number, diversity between individuals with the duplication is consistently higher than between individuals without the duplication. Furthermore, the frequency of single nucleotide variants (SNVs) deviating from Hardy–Weinberg Equilibrium is higher in individuals with the duplication, which strongly suggests that higher diversity is a consequence of collapsed duplications and incorrect evaluation of SNVs within these CNV regions.

From De Novo to “De Nono”: The Majority of Novel Protein-Coding Genes Identified with Phylostratigraphy Are Old Genes or Recent Duplicates

Mon, 22 Oct 2018 00:00:00 GMT

The evolution of novel protein-coding genes from noncoding regions of the genome is one of the most compelling pieces of evidence for genetic innovations in nature. One popular approach to identify de novo genes is phylostratigraphy, which consists of determining the approximate time of origin (age) of a gene based on its distribution along a species phylogeny. Several studies have revealed significant flaws in determining the age of genes, including de novo genes, using phylostratigraphy alone. However, the rate of false positives in de novo gene surveys, based on phylostratigraphy, remains unknown. Here, I reanalyze the findings from three studies, two of which identified tens to hundreds of rodent-specific de novo genes adopting a phylostratigraphy-centered approach. Most putative de novo genes discovered in these investigations are no longer included in recently updated mouse gene sets. Using a combination of synteny information and sequence similarity searches, I show that ∼60% of the remaining 381 putative de novo genes share homology with genes from other vertebrates, originated through gene duplication, and/or share no synteny information with nonrodent mammals. These results led to an estimated rate of ∼12 de novo genes per million years in mouse. Contrary to a previous study (Wilson BA, Foy SG, Neme R, Masel J. 2017. Young genes are highly disordered as predicted by the preadaptation hypothesis of de novo gene birth. Nat Ecol Evol. 1:0146), I found no evidence supporting the preadaptation hypothesis of de novo gene formation. Nearly half of the de novo genes confirmed in this study are within older genes, indicating that co-option of preexisting regulatory regions and a higher GC content may facilitate the origin of novel genes.

Evidence of Polygenic Adaptation to High Altitude from Tibetan and Sherpa Genomes

Thu, 18 Oct 2018 00:00:00 GMT

Although Tibetans and Sherpa present several physiological adjustments evolved to cope with selective pressures imposed by the high-altitude environment, especially hypobaric hypoxia, few selective sweeps at a limited number of hypoxia-related genes were confirmed by multiple genomic studies. Nevertheless, variants at these loci were found to be associated only with downregulation of the erythropoietic cascade, which represents an indirect aspect of the considered adaptive phenotype. Accordingly, the genetic basis of Tibetan/Sherpa adaptive traits remains to be fully elucidated, in part due to limitations of the selection scans implemented so far, which mostly rely on the hard sweep model. In order to overcome this issue, we used whole-genome sequence data and several selection statistics as input for gene network analyses aimed at testing for the occurrence of polygenic adaptation in these high-altitude Himalayan populations. Because it can also detect subtle genomic signatures ascribable to weak positive selection at multiple genes of the same functional subnetwork, this approach allowed us to infer adaptive evolution at loci individually showing small effect sizes, but belonging to highly interconnected biological pathways overall involved in angiogenetic processes. Therefore, these findings pinpointed a series of selective events neglected so far, which likely contributed to the augmented tissue blood perfusion observed in Tibetans and Sherpa, thus uncovering the genetic determinants of a key biological mechanism that underlies their adaptation to high altitude.

Moraxella catarrhalis Restriction–Modification Systems Are Associated with Phylogenetic Lineage and Disease

Thu, 18 Oct 2018 00:00:00 GMT

Moraxella catarrhalis is a human-adapted pathogen, and a major cause of otitis media (OM) and exacerbations of chronic obstructive pulmonary disease. The species comprises two main phylogenetic lineages, RB1 and RB2/3. Restriction–modification (R-M) systems are among the few lineage-associated genes identified in other bacterial genera and have multiple functions including defense against foreign invading DNA, maintenance of speciation, and epigenetic regulation of gene expression. Here, we define the repertoire of R-M systems in 51 publicly available M. catarrhalis genomes and report their distribution among M. catarrhalis phylogenetic lineages. An association with phylogenetic lineage (RB1 or RB2/3) was observed for six R-M systems, which may contribute to the evolution of the lineages by restricting DNA transformation. In addition, we observed a relationship between a mutually exclusive Type I R-M system and a Type III R-M system at a single locus conserved throughout a geographically and clinically diverse set of M. catarrhalis isolates. The Type III R-M system at this locus contains the phase-variable Type III DNA methyltransferase, modM, which controls a phasevarion (phase-variable regulon). We observed an association between modM presence and OM-associated middle ear isolates, indicating a potential role for ModM-mediated epigenetic regulation in OM pathobiology.

Assessing the Performance of Ks Plots for Detecting Ancient Whole Genome Duplications

Tue, 18 Sep 2018 00:00:00 GMT

Genomic data have provided evidence of previously unknown ancient whole genome duplications (WGDs) and highlighted the role of WGDs in the evolution of many eukaryotic lineages. Ancient WGDs often are detected by examining distributions of synonymous substitutions per site (Ks) within a genome, or “Ks plots.” For example, WGDs can be detected from Ks plots by using univariate mixture models to identify peaks in Ks distributions. We performed gene family simulation experiments to evaluate the effects of different Ks estimation methods and mixture models on our ability to detect ancient WGDs from Ks plots. The simulation experiments, which accounted for variation in substitution rates and gene duplication and loss rates across gene families, tested the effects of WGD age and gene retention rates following WGD on inferring WGDs from Ks plots. Our simulations reveal limitations of Ks plot analyses. Strict interpretations of mixture model analyses often overestimate the number of WGD events, and Ks plot analyses typically fail to detect WGDs when ≤10% of the duplicated genes are retained following the WGD. However, WGDs can accurately be characterized over an intermediate range of Ks. The simulation results are supported by empirical analyses of transcriptomic data, which also suggest that biases in gene retention likely affect our ability to detect ancient WGDs. Although our results indicate mixture model results should be interpreted with great caution, using node-averaged Ks estimates and applying more appropriate mixture models can improve the accuracy of detecting WGDs.

Thermosipho spp. Immune System Differences Affect Variation in Genome Size and Geographical Distributions

Sat, 15 Sep 2018 00:00:00 GMT

Thermosipho species inhabit thermal environments such as marine hydrothermal vents, petroleum reservoirs, and terrestrial hot springs. A 16S rRNA phylogeny of available Thermosipho spp. sequences suggested habitat specialists adapted to living in hydrothermal vents only, and habitat generalists inhabiting oil reservoirs, hydrothermal vents, and hot springs. Comparative genomics of 15 Thermosipho genomes separated them into three distinct species with different habitat distributions: the widely distributed T. africanus and the more specialized T. melanesiensis and T. affectus. Moreover, the species can be differentiated on the basis of genome size (GS), genome content, and immune system composition. For instance, the T. africanus genomes are the largest and contain the most carbohydrate metabolism genes, which could explain why these isolates were obtained from ecologically more divergent habitats. Nonetheless, all the Thermosipho genomes, like other Thermotogae genomes, show evidence of genome streamlining. GS differences between the species could further be correlated to differences in defense capacities against foreign DNA, which influence recombination via HGT. The smallest genomes are found in T. affectus; they contain both CRISPR-cas Type I and III systems, but no RM system genes. We suggest that this has caused these genomes to be almost devoid of mobile elements, in contrast to the genomes of the two other species, which contain a higher abundance of mobile elements combined with different immune system configurations. Taken together, the comparative genomic analyses of Thermosipho spp. revealed genetic variation allowing habitat differentiation within the genus as well as differentiation with respect to invading mobile DNA.

Concordant Changes in Gene Expression and Nucleotides Underlie Independent Adaptation to Hydrogen-Sulfide-Rich Environments

Wed, 12 Sep 2018 00:00:00 GMT

The colonization of novel environments often involves changes in gene expression, protein coding sequence, or both. Studies of how populations adapt to novel conditions, however, often focus on only one of these two processes, potentially missing out on the relative importance of different parts of the evolutionary process. In this study, our objectives were 1) to better understand the qualitative concordance between conclusions drawn from analyses of differential expression and changes in genic sequence and 2) to quantitatively test whether differentially expressed genes were enriched for sites putatively under positive selection within gene regions. To achieve this, we compared populations of fish (Poecilia mexicana) that have independently adapted to hydrogen-sulfide-rich environments in southern Mexico to adjacent populations residing in nonsulfidic waters. Specifically, we used RNA-sequencing data to compare both gene expression and DNA sequence differences between populations. Analyzing these two different data types led to similar conclusions about which biochemical pathways (sulfide detoxification and cellular respiration) were involved in adaptation to sulfidic environments. Additionally, we found a greater overlap between genes putatively under selection and differentially expressed genes than expected by chance. We conclude that considering both differential expression and changes in DNA sequence led to a more comprehensive understanding of how these populations adapted to extreme environmental conditions. Our results imply that changes in both gene expression and DNA sequence—sometimes at the same loci—may be involved in adaptation.
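The comparison of observed overlap with chance expectation described above can be sketched with a one-sided hypergeometric test. This is only an illustration under assumptions: the abstract does not state which test the authors applied, and the function name `overlap_p_value` and all gene counts below are hypothetical.

```python
from math import comb


def overlap_p_value(total_genes, n_selected, n_de, n_overlap):
    """Probability of seeing at least `n_overlap` shared genes by chance.

    Null model: `n_de` differentially expressed genes are drawn at random,
    without replacement, from `total_genes`, of which `n_selected` are
    putatively under selection. Hypothetical helper, not from the study.
    """
    max_overlap = min(n_selected, n_de)
    denominator = comb(total_genes, n_de)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(n_selected, k) * comb(total_genes - n_selected, n_de - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / denominator


def expected_overlap(total_genes, n_selected, n_de):
    """Mean overlap under the same random-draw null model."""
    return n_selected * n_de / total_genes


# Illustrative numbers only: 10 genes, 5 under selection, 5 differentially
# expressed, and all 5 shared between the two sets.
p_all_shared = overlap_p_value(10, 5, 5, 5)
mean_shared = expected_overlap(10, 5, 5)
```

A small tail probability together with an observed overlap above `expected_overlap` would correspond to "greater overlap than expected by chance"; a permutation test would be a reasonable alternative if genes differ in their chance of being detected.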