Poster authors and title: Joel M Alves*, Miguel Carneiro, Jade Y Cheng, Ana Lemos de Matos, Masmudur M Rahman, Liisa Loog, Anders Eriksson, Grant McFadden, Rasmus Nielsen, Thomas P Gilbert, Pedro J Esteves, Nuno Ferrand, Francis M Jiggins
Poster authors and title: Kentaro M. Tanaka*, Yoshitaka Kamimura, Aya Takahashi
Poster authors and title: Maria A. Spyrou*, Marcel Keller, Rezeda I. Tukhbatova, Elisabeth Nelson, Don Walker, Sacha Kacki, Dominique Castex, Sandra Loesch, Michaela Harbeck, Alexander Herbig, Kirsten I Bos, Johannes Krause
Poster authors and title: Geno Guerra*, Rasmus Nielsen
Poster authors and title: Xueying C. Li*, David Peris, Chris Todd Hittinger, Elaine A. Sia, Justin C. Fay
Poster authors and title: Hassan Sibroe Abdulla Daanaa*, Ali Mostafa Anwar
Poster authors and title: Sara El-Shawa*, Rob Ness
Poster authors and title: Kara Boyer*, Nicole Creanza, Maanasa Raghavan
In 2005, the SMBE Council decided to present at each meeting one or more Best Poster Prizes for Postdoctoral Fellows and Graduate Students. In 2010, a Best Poster Prize for undergraduate students was added.
Grant L. Filowitz, Rajendhran Rajakumar, Katherine L. O'Shaughnessy, and Martin J. Cohn
Abstract: Natural selection works best when the two alleles in a diploid organism are transmitted to offspring at equal frequencies. Despite this, selfish loci known as meiotic drivers that bias their own transmission into gametes are found throughout eukaryotes. Drive is thought to be a powerful evolutionary force, but empirical evolutionary analyses of drive systems are limited by low numbers of identified meiotic drive genes. Here, we analyze the evolution of the wtf gene family of Schizosaccharomyces pombe that contains both killer meiotic drive genes and suppressors of drive. We completed assemblies of all wtf genes for two S. pombe isolates, as well as a subset of wtf genes from over 50 isolates. We find that wtf copy number can vary greatly between isolates and that amino acid substitutions, expansions and contractions of DNA sequence repeats, and nonallelic gene conversion between family members all contribute to dynamic wtf gene evolution. This work demonstrates the power of meiotic drive to foster rapid evolution and identifies a recombination mechanism through which transposons can indirectly mobilize meiotic drivers.
Mol. Biol. Evol. 35(3):549–563, doi:10.1093/molbev/msx247
Abstract: Genomic imprinting is an epigenetic phenomenon where autosomal genes display uniparental expression depending on whether they are maternally or paternally inherited. Genomic imprinting can arise from parental conflicts over resource allocation to the offspring, which could drive imprinted loci to evolve by positive selection. We investigate whether positive selection is associated with genomic imprinting in the inbreeding species Arabidopsis thaliana. Our analysis of 140 genes regulated by genomic imprinting in the A. thaliana seed endosperm demonstrates they are evolving more rapidly than expected. To investigate whether positive selection drives this evolutionary acceleration, we identified orthologs of each imprinted gene across 34 plant species and elucidated their evolutionary trajectories. Increased positive selection was sought by comparing its incidence among imprinted genes with nonimprinted controls. Strikingly, we find a statistically significant enrichment of imprinted paternally expressed genes (iPEGs) evolving under positive selection, 50.6% of the total, but no such enrichment for positive selection among imprinted maternally expressed genes (iMEGs). This suggests that maternally and paternally expressed imprinted genes are subject to different selective pressures. Almost all positively selected amino acids were fixed across 80 sequenced A. thaliana accessions, suggestive of selective sweeps in the A. thaliana lineage. The imprinted genes under positive selection are involved in processes important for seed development including auxin biosynthesis and epigenetic regulation. Our findings support a genomic imprinting model for plants where positive selection can affect paternally expressed genes due to continued conflict with maternal sporophyte tissues, even when parental conflict is reduced in predominantly inbreeding species.
Abstract: In species with chromosomal sex determination, X chromosomes are predicted to evolve faster than autosomes because of positive selection on recessive alleles or weak purifying selection. We investigated X chromosome evolution in Stegodyphus spiders that differ in mating system, sex ratio, and population dynamics. We assigned scaffolds to X chromosomes and autosomes using a novel method based on flow cytometry of sperm cells and reduced representation sequencing. We estimated coding substitution patterns (dN/dS) in a subsocial outcrossing species (S. africanus) and its social inbreeding and female-biased sister species (S. mimosarum), and found evidence for faster-X evolution in both species. X chromosome-to-autosome diversity (piX/piA) ratios were estimated in multiple populations. The average piX/piA estimate of S. africanus (0.57 [95% CI: 0.55–0.60]) was lower than the neutral expectation of 0.75, consistent with more hitchhiking events on X-linked loci and/or a lower X chromosome mutation rate, and we provide evidence in support of both. The social species S. mimosarum has a significantly higher piX/piA ratio (0.72 [95% CI: 0.65–0.79]) in agreement with its female-biased sex ratio. Stegodyphus mimosarum also has different piX/piA estimates among populations, which we interpret as evidence for recurrent founder events. Simulations show that recurrent founder events are expected to decrease the piX/piA estimates in S. mimosarum, thus underestimating the true effect of female-biased sex ratios. Finally, we found lower synonymous divergence on X chromosomes in both species, and the male-to-female substitution ratio to be higher than 1, indicating a higher mutation rate in males.
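The neutral expectation of 0.75 quoted in this abstract, and the reason a female-biased sex ratio pushes piX/piA upward, can be illustrated with the standard textbook expressions for effective population size on autosomes and on the X. The sketch below is not taken from the paper. It assumes random mating, Poisson-distributed offspring number, equal mutation rates on X and autosomes, and neutral diversity proportional to effective size. The function name and the 9:1 sex ratio in the example are hypothetical.

```python
def x_to_autosome_ne_ratio(n_females: float, n_males: float) -> float:
    """Neutral expectation of N_eX / N_eA for given numbers of breeding females and males.

    Uses the classical formulas (assumed here, not quoted from the abstract):
      N_eA = 4 * Nm * Nf / (Nm + Nf)
      N_eX = 9 * Nm * Nf / (4 * Nm + 2 * Nf)
    """
    if n_females <= 0 or n_males <= 0:
        raise ValueError("both sexes must be present")
    ne_autosome = 4 * n_males * n_females / (n_males + n_females)
    ne_x = 9 * n_males * n_females / (4 * n_males + 2 * n_females)
    return ne_x / ne_autosome


# Equal sex ratio recovers the neutral expectation of 0.75.
equal = x_to_autosome_ne_ratio(100, 100)

# A hypothetical 9:1 female-biased population gives a ratio above 0.75.
biased = x_to_autosome_ne_ratio(900, 100)

print(equal, biased)
```

Under these assumptions the ratio depends only on the proportion of females, rising from 0.75 at an even sex ratio toward 9/8 as females come to dominate the breeding population.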
Abstract: Increasingly, large phylogenomic data sets include transcriptomic data from nonmodel organisms. This not only has allowed controversial and unexplored evolutionary relationships in the tree of life to be addressed but also increases the risk of inadvertent inclusion of paralogs in the analysis. Although this may be expected to result in decreased phylogenetic support, it is not clear if it could also drive highly supported artifactual relationships. Many groups, including the hyperdiverse Lissamphibia, are especially susceptible to these issues due to ancient gene duplication events and small numbers of sequenced genomes and because transcriptomes are increasingly applied to resolve historically conflicting taxonomic hypotheses. We tested the potential impact of paralog inclusion on the topologies and timetree estimates of the Lissamphibia using published and de novo sequencing data including 18 amphibian species, from which 2,656 single-copy gene families were identified. A novel paralog filtering approach resulted in four differently curated data sets, which were used for phylogenetic reconstructions using Bayesian inference, maximum likelihood, and quartet-based supertrees. We found that paralogs drive strongly supported conflicting hypotheses within the Lissamphibia (Batrachia and Procera) and older divergence time estimates even within groups where no variation in topology was observed. All investigated methods, except Bayesian inference with the CAT-GTR model, were found to be sensitive to paralogs, but with filtering, convergence to the same answer (Batrachia) was observed. This is the first large-scale study to address the impact of orthology selection using transcriptomic data and emphasizes the importance of quality over quantity, particularly for understanding relationships of poorly sampled taxa.
Abstract: Genomes are dynamic biological units, with processes of gene duplication and loss triggering evolutionary novelty. The mammalian skin provides a remarkable case study on the occurrence of adaptive morphological innovations. Skin sebaceous glands (SGs), for instance, emerged in the ancestor of mammals serving pivotal roles, such as lubrication, waterproofing, immunity, and thermoregulation, through the secretion of sebum, a complex mixture of various neutral lipids such as triacylglycerol, free fatty acids, wax esters, cholesterol, and squalene. Remarkably, SGs are absent in a few mammalian lineages, including the iconic Cetacea. We investigated the evolution of the key molecular components responsible for skin sebum production: Dgat2l6, Awat1, Awat2, Elovl3, Mogat3, and Fabp9. We show that all analyzed genes have been rendered nonfunctional in Cetacea species (toothed and baleen whales). Transcriptomic analysis, including a novel skin transcriptome from blue whale, supports gene inactivation. The conserved mutational pattern found in most analyzed genes indicates that pseudogenization events took place prior to the diversification of modern Cetacea lineages. Genome and skin transcriptome analysis of the common hippopotamus highlighted the convergent loss of a subset of sebum-producing genes, notably Awat1 and Mogat3. Partial loss profiles were also detected in non-Cetacea aquatic mammals, such as the Florida manatee, and in terrestrial mammals displaying specialized skin phenotypes such as the African elephant, white rhinoceros and pig. Our findings reveal a unique landscape of “gene vestiges” in the Cetacea sebum-producing compartment, with limited gene loss observed in other mammalian lineages, suggestive of specific adaptations or specializations of skin lipids.
Abstract: Extensive European and African admixture coupled with loss of Amerindian lineages makes the reconstruction of pre-Columbian history of Native Americans based on present-day genomes extremely challenging. Open questions remain about the dispersals that occurred throughout the continent after the initial peopling from Beringia, especially concerning the number and dynamics of diffusions into South America. Indeed, if environmental and historical factors contributed to shape distinct gene pools in the Andes and Amazonia, the origins of this East-West genetic structure and the extension of further interactions between populations residing along this divide are still not well understood. To this end, we generated new high-resolution genome-wide data for 229 individuals representative of one Central and ten South Amerindian ethnic groups from Mexico, Peru, Bolivia, and Argentina. Low levels of European and African admixture in the sampled individuals allowed the application of fine-scale haplotype-based methods and demographic modeling approaches. These analyses revealed highly specific Native American genetic ancestries and great intragroup homogeneity, along with limited traces of gene flow mainly from the Andes into Peruvian Amazonians. A substantial amount of genetic drift differentially experienced by the considered populations underlined distinct patterns of recent inbreeding or prolonged isolation. Overall, our results support the hypothesis that all non-Andean South Americans are compatible with descending from a common lineage, while we found low support for common Mesoamerican ancestors of both Andeans and other South American groups. These findings suggest extensive back-migrations into Central America from non-Andean sources or conceal distinct peopling events into the Southern Continent.
Abstract: One approach to the reconstruction of infectious disease transmission trees from pathogen genomic data has been to use a phylogenetic tree, reconstructed from pathogen sequences, and annotate its internal nodes to provide a reconstruction of which host each lineage was in at each point in time. If only one pathogen lineage can be transmitted to a new host (i.e., the transmission bottleneck is complete), this corresponds to partitioning the nodes of the phylogeny into connected regions, each of which represents evolution in an individual host. These partitions define the possible transmission trees that are consistent with a given phylogenetic tree. However, the mathematical properties of the transmission trees given a phylogeny remain largely unexplored. Here, we describe a procedure to calculate the number of possible transmission trees for a given phylogeny, and we then show how to uniformly sample from these transmission trees. The procedure is outlined for situations where one sample is available from each host and trees do not have branch lengths, and we also provide extensions for incomplete sampling, multiple sampling, and the application to time trees in a situation where limits on the period during which each host could have been infected and infectious are known. The sampling algorithm is available as an R package (STraTUS).
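The counting problem described in this abstract can be made concrete with a small dynamic program for the simplest case it mentions: one sample per host, all hosts sampled, and no branch lengths. The sketch below is an illustration derived directly from the partition definition in the abstract, not the authors' published procedure and not the STraTUS API. It assumes every connected region must contain exactly one tip. The nested-tuple tree encoding and both function names are hypothetical.

```python
from math import prod


def count_partitions(tree):
    """Return (with_tip, without_tip) for the subtree rooted at `tree`.

    `tree` is either a tip label (str) or a tuple of child subtrees.
    with_tip    : partitions in which the root's region already holds its one tip
    without_tip : partitions in which the root's region holds no tip yet and must
                  be joined upward to a region that obtains its tip elsewhere
    """
    if isinstance(tree, str):
        # A tip is its own host's sample: its region always contains a tip.
        return 1, 0

    child_counts = [count_partitions(child) for child in tree]
    # Each child either forms a separate region (with_tip) or joins the
    # parent's region without contributing a tip (without_tip).
    totals = [with_c + without_c for with_c, without_c in child_counts]

    # No child passes a tip up to this node.
    without_tip = prod(totals)

    # Exactly one child passes its tip-bearing region up to this node.
    with_tip = 0
    for i, (with_c, _) in enumerate(child_counts):
        others = prod(t for j, t in enumerate(totals) if j != i)
        with_tip += with_c * others

    return with_tip, without_tip


def count_transmission_trees(tree):
    """Number of valid partitions of the whole phylogeny under the stated assumptions."""
    return count_partitions(tree)[0]


# Two sampled hosts: the root lineage was in host a or in host b.
print(count_transmission_trees(("a", "b")))
```

For the two-tip tree the count is 2, matching the two possible directions of transmission. The recursion visits each node once, so the count is obtained in time linear in the number of nodes for binary trees.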
Abstract: Extracellular matrix (ECM) is considered central to the evolution of metazoan multicellularity; however, the repertoire of ECM proteins in nonbilaterians remains unclear. Thrombospondins (TSPs) are known to be well conserved from cnidarians to vertebrates, yet to date have been considered a unique family, principally studied for matricellular functions in vertebrates. Through searches utilizing the highly conserved C-terminal region of TSPs, we identify undisclosed new families of TSP-related proteins in metazoans, designated mega-TSP, sushi-TSP, and poriferan-TSP, each with a distinctive phylogenetic distribution. These proteins share the TSP C-terminal region domain architecture, as determined by domain composition and analysis of molecular models against known structures. Mega-TSPs, the only form identified in ctenophores, are typically >2,700 aa and are also characterized by N-terminal leucine-rich repeats and central cadherin/immunoglobulin domains. In cnidarians, which have a well-defined ECM, Mega-TSP was expressed throughout embryogenesis in Nematostella vectensis, with dynamic endodermal expression in larvae and primary polyps and widespread ectodermal expression in adult Nematostella vectensis and Hydra magnipapillata polyps. Hydra Mega-TSP was also expressed during regeneration and siRNA-silencing of Mega-TSP in Hydra caused specific blockade of head regeneration. Molecular phylogenetic analyses based on the conserved TSP C-terminal region identified each of the TSP-related groups to form clades distinct from the canonical TSPs. We discuss models for the evolution of the newly defined TSP superfamily by gene duplications, radiation, and gene losses from a debut in the last metazoan common ancestor. Together, the data provide new insight into the evolution of ECM and tissue organization in metazoans.
Abstract: The importance of climate in determining biodiversity patterns has been well documented. However, the relationship between climate and rates of genetic evolution remains controversial. Latitude and elevation have been associated with rates of change in genetic markers such as cytochrome b. What is not known, however, is the strength of such associations and whether patterns found among these genes apply across entire genomes. Here, using bumblebee genetic data from seven subgenera of Bombus, we demonstrate that all species occupying warmer elevations have undergone faster genome-wide evolution than those in the same subgenera occupying cooler elevations. Our findings point to a critical biogeographic role in the relative rates of whole species evolution, potentially influencing global biodiversity patterns.
Abstract: Seasonal influenza viruses undergo frequent mutations on their surface hemagglutinin (HA) proteins to escape the host immune response. In these mutations, a few key amino acid sites are associated with significant antigenic cluster transitions. To recognize the cluster-transition determining sites of seasonal influenza A/H3N2 and A/H1N1 viruses systematically and quickly, we developed a computational model named RECDS (recognition of cluster-transition determining sites) to evaluate the contribution of a specific amino acid site on the HA protein in the whole history of antigenic evolution. In RECDS, we ranked all of the HA sites by calculating the contribution scores derived from the forest of gradient boosting classifiers trained by various sequence- and structure-based features. With the RECDS model, we found that the sites determining influenza antigenicity were mostly around the receptor-binding domain for both the influenza A/H3N2 and A/H1N1 viruses. Specifically, half of the cluster-transition determining sites of the influenza A/H1N1 virus were located in the vestigial esterase domain and basic path area on the HA, which indicated that the driving force of antigenic evolution in the A/H1N1 virus differs from that in the A/H3N2 virus. Beyond that, the footprints of substitutions responsible for antigenic evolution were inferred according to the phylogenetic trees for the cluster-transition determining sites. The monitoring of genetic variation occurring at these cluster-transition determining sites in circulating influenza viruses on a large scale will potentially reduce current assay workloads in influenza surveillance and the selection of new influenza vaccine strains.
Abstract: The mass application of whole mitogenome (MG) sequencing has great potential for resolving complex phylogeographic patterns that cannot be resolved by partial mitogenomic sequences or nuclear markers. North American periodical cicadas (Magicicada) are well known for their periodical mass emergence at 17- and 13-year intervals in the north and south, respectively. Magicicada comprises three species groups, each containing one 17-year species and one or two 13-year species. Within each life cycle, single-aged cohorts, called broods, of periodical cicadas emerge in different years, and most broods contain members of all three species groups. There are 12 and three extant broods of 17- and 13-year cicadas, respectively. The phylogeographic relationships among the populations and broods within the species groups have not been clearly resolved. We analyzed 125 whole MG sequences from all broods and seven species within three species groups to ascertain the divergence history of the geographic and allochronic populations and their life cycles. Our mitogenomic phylogeny analysis clearly revealed that each of the three species groups had largely similar phylogeographic subdivisions (east, middle, and west) and demographic histories (rapid population expansion after the last glacial period). The mitogenomic phylogeny also partly resolved the brood diversification process, which could be explained by hypothetical temporary life cycle shifts, and showed that none of the 13- and 17-year species within the species groups was monophyletic, possibly due to gene flow between them. Our findings clearly reveal phylogeographic structures in the three Magicicada species groups, demonstrating the advantage of whole MG sequence data in phylogeographic studies.
Abstract: There are numerous sources of variation in the rate of synonymous substitutions inside genes, such as direct selection on the nucleotide sequence, or mutation rate variation. Yet scans for positive selection rely on codon models which incorporate an assumption of effectively neutral synonymous substitution rate, constant between sites of each gene. Here we perform a large-scale comparison of approaches which incorporate codon substitution rate variation and propose our own simple yet effective modification of existing models. We find strong effects of substitution rate variation on positive selection inference. More than 70% of the genes detected by the classical branch-site model are presumably false positives caused by the incorrect assumption of uniform synonymous substitution rate. We propose a new model which is strongly favored by the data while remaining computationally tractable. With the new model we can capture signatures of nucleotide level selection acting on translation initiation and on splicing sites within the coding region. Finally, we show that rate variation is highest in the highly recombining regions, and we propose that recombination and mutation rate variation, such as high CpG mutation rate, are the two main sources of nucleotide rate variation. Although we detect fewer genes under positive selection in Drosophila than without rate variation, the genes which we detect contain a stronger signal of adaptation of dynein, which could be associated with Wolbachia infection. We provide software to perform positive selection analysis using the new model.
Abstract: We present a method that jointly analyzes the polymorphism and divergence sites in genomic sequences of multiple species to identify the genes under natural selection and pinpoint the occurrence time of selection to a specific lineage of the species phylogeny. This method integrates population genetics models using a Bayesian Poisson random field framework and combines information over all gene loci to boost the power for detecting selection. The method provides posterior distributions of the fitness effects of each gene along with parameters associated with the evolutionary history, including the species divergence time and effective population size of external species. The results of simulations demonstrate that our method achieves a high power to identify genes under positive selection for a wide range of selection intensity and provides reasonably accurate estimates of the population genetic parameters. The proposed method is applied to genomic sequences of humans, chimpanzees, gorillas, and orangutans and identifies a list of lineage-specific targets of positive selection. The positively selected genes in the human lineage are enriched in pathways of gene expression regulation, immune system and metabolism, etc. Our analysis provides insights into natural evolution in the coding regions of humans and great apes and thus serves as a basis for further molecular and functional studies.
Abstract: The Ashkenazi Jews (AJ) are a population isolate sharing ancestry with both European and Middle Eastern populations that has likely resided in Central Europe since at least the tenth century. Between the 11th and 16th centuries, the AJ population expanded eastward leading to two culturally distinct communities in Western/Central and Eastern Europe. Our aim was to determine whether the western and eastern groups are genetically distinct, and if so, what demographic processes contributed to population differentiation. We used Approximate Bayesian Computation to choose among models of AJ history and to infer demographic parameter values, including divergence times, effective population sizes, and levels of gene flow. For the ABC analysis, we used allele frequency spectrum and identical by descent-based statistics to capture information on a wide timescale. We also mitigated the effects of ascertainment bias when performing ABC on SNP array data by jointly modeling and inferring SNP discovery. We found that the most likely model was population differentiation between Eastern and Western AJ ∼400 years ago. The differentiation between the Eastern and Western AJ could be attributed to more extreme population growth in the Eastern AJ (0.250 per generation) than the Western AJ (0.069 per generation).
Abstract: Pyricularia is a fungal genus comprising several pathogenic species causing the blast disease in monocots. Pyricularia oryzae, the best-known species, infects rice, wheat, finger millet, and other crops. As past comparative and population genomics studies mainly focused on isolates of P. oryzae, the genomes of the other Pyricularia species have not been well explored. In this study, we obtained a chromosomal-level genome assembly of the finger millet isolate P. oryzae MZ5-1-6 and also highly contiguous assemblies of Pyricularia sp. LS, P. grisea, and P. pennisetigena. The differences in the genomic content of repetitive DNA sequences could largely explain the variation in genome size among these new genomes. Moreover, we found extensive gene gains and losses and structural changes among Pyricularia genomes, including a large interchromosomal translocation. We searched for homologs of known blast effectors across fungal taxa and found that most avirulence effectors are specific to Pyricularia, whereas many other effectors share homologs with distant fungal taxa. In particular, we discovered a novel effector family with metalloprotease activity, distinct from the well-known AVR-Pita family. We predicted 751 gene families containing putative effectors in 7 Pyricularia genomes and found that 60 of them showed differential expression in the P. oryzae MZ5-1-6 transcriptomes obtained under experimental conditions mimicking the pathogen infection process. In summary, this study increased our understanding of the structural, functional, and evolutionary genomics of the blast pathogen and identified new potential effector genes, providing useful data for developing crops with durable resistance.
Abstract: As limits on O2 availability during submergence impose severe constraints on aerobic respiration, the oxygen binding globin proteins of marine mammals are expected to have evolved under strong evolutionary pressures during their land-to-sea transition. Here, we address this question for the order Sirenia by retrieving, annotating, and performing detailed selection analyses on the globin repertoire of the extinct Steller’s sea cow (Hydrodamalis gigas), dugong (Dugong dugon), and Florida manatee (Trichechus manatus latirostris) in relation to their closest living terrestrial relatives (elephants and hyraxes). These analyses indicate most loci experienced elevated nucleotide substitution rates during their transition to a fully aquatic lifestyle. While most of these genes evolved under neutrality or strong purifying selection, the rate of nonsynonymous/synonymous replacements increased in two genes (Hbz-T1 and Hba-T1) that encode the α-type chains of hemoglobin (Hb) during each stage of life. Notably, the relaxed evolution of Hba-T1 is temporally coupled with the emergence of a chimeric pseudogene (Hba-T2/Hbq-ps) that contributed to the tandemly linked Hba-T1 of stem sirenians via interparalog gene conversion. Functional tests on recombinant Hb proteins from extant and ancestral sirenians further revealed that the molecular remodeling of Hba-T1 coincided with increased Hb–O2 affinity in early sirenians. Available evidence suggests that this trait evolved to maximize O2 extraction from finite lung stores and suppress tissue O2 offloading, thereby facilitating the low metabolic intensities of extant sirenians. In contrast, the derived reduction in Hb–O2 affinity in (sub)Arctic Steller’s sea cows is consistent with fueling increased thermogenesis by these once colossal marine herbivores.
Abstract: Molecular phylogenetics has neglected polymorphisms within present and ancestral populations for a long time. Recently, multispecies coalescent based methods have increased in popularity; however, their application is limited to a small number of species and individuals. We introduced a polymorphism-aware phylogenetic model (PoMo), which overcomes this limitation and scales well with the increasing amount of sequence data while accounting for present and ancestral polymorphisms. PoMo circumvents handling of gene trees and directly infers species trees from allele frequency data. Here, we extend the PoMo implementation in IQ-TREE and integrate search for the statistically best-fit mutation model, the ability to infer mutation rate variation across sites, and assessment of branch support values. We exemplify an analysis of a hundred species with ten haploid individuals each, showing that PoMo can perform inference on large data sets. While PoMo is more accurate than standard substitution models applied to concatenated alignments, it is almost as fast. We also provide bmm-simulate, a software package that allows simulation of sequences evolving under PoMo. The new options consolidate the value of PoMo for phylogenetic analyses with population data.
Abstract: Transcription regulatory networks (TRNs) are of central importance for both short-term phenotypic adaptation in response to environmental fluctuations and long-term evolutionary adaptation, with global regulatory genes often being targets of natural selection in laboratory experiments. Here, we combined evolution experiments, whole-genome resequencing, and molecular genetics to investigate the driving forces, genetic constraints, and molecular mechanisms that dictate how bacteria can cope with a drastic perturbation of their TRNs. The crp gene, encoding a major global regulator in Escherichia coli, was deleted in four different genetic backgrounds, all derived from the Long-Term Evolution Experiment (LTEE) but with different TRN architectures. We confirmed that crp deletion had a more deleterious effect on growth rate in the LTEE-adapted genotypes; and we showed that the ptsG gene, which encodes the major glucose-PTS transporter, gained CRP (cyclic AMP receptor protein) dependence over time in the LTEE. We then further evolved the four crp-deleted genotypes in glucose minimal medium, and we found that they all quickly recovered from their growth defects by increasing glucose uptake. We showed that this recovery was specific to the selective environment and consistently relied on mutations in the cis-regulatory region of ptsG, regardless of the initial genotype. These mutations affected the interplay of transcription factors acting at the promoters, changed the intrinsic properties of the existing promoters, or produced new transcription initiation sites. Therefore, the plasticity of even a single promoter region can compensate by three different mechanisms for the loss of a key regulatory hub in the E. coli TRN.
Abstract: Sex determination in varanids, Gila monsters, beaded lizards, and other anguimorphan lizards is still poorly understood. Sex chromosomes were reported only in a few species based solely on cytogenetics, which precluded assessment of their homology. We uncovered Z-chromosome-specific genes in varanids from their transcriptomes. Comparison of differences in gene copy numbers between sexes across anguimorphan lizards and outgroups revealed that homologous differentiated ZZ/ZW sex chromosomes are present in Gila monsters, beaded lizards, alligator lizards, and a wide phylogenetic spectrum of varanids. However, these sex chromosomes are not homologous to those known in other amniotes. We conclude that differentiated sex chromosomes were already present in the common ancestor of Anguimorpha living in the early Cretaceous or even in the Jurassic Period, 115–180 Ma, placing anguimorphan sex chromosomes among the oldest known in vertebrates. The analysis of transcriptomes of Komodo dragon (Varanus komodoensis) showed that the expression levels of genes linked to anguimorphan sex chromosomes are not balanced between sexes. Besides expanding our knowledge on vertebrate sex chromosome evolution, our study has important practical relevance for breeding and ecological studies. We introduce the first, widely applicable technique of molecular sexing in varanids, Gila monsters, and beaded lizards, where reliable determination of sex based on external morphology is dubious even in adults.
Long-chain polyunsaturated fatty acids (PUFAs) are essential for human brain development and immunity. Although they can be obtained by consuming certain foods, especially seafood, the largest dietary source of PUFAs is plant oils. However, plant-derived PUFAs are shorter and must be converted into longer chains by enzymes known as fatty acid desaturases (FADS). Research over the last decade has revealed that humans vary in the efficiency of their FADS enzymes, and this can be attributed in large part to variation in the genes encoding these enzymes. In a new study in the current issue of Genome Biology and Evolution entitled “Evolution of hominin polyunsaturated fatty acid metabolism: from Africa to the New World,” a group of researchers led by Timothy O’Connor at the University of Maryland provides an in-depth look at the history of these genes in human populations, with particular relevance for the health of present-day Native Americans (Harris et al. 2019).
In the last five years, scientists have discovered a staggering number of ultra-small microbes, doubling the number of known lineages. Because of their extremely small size, these nanoorganisms are thought to have reduced genomes and to lack the proteins needed to carry out more complex metabolic processes. As reported in this issue of Genome Biology and Evolution (Lannes et al. 2019), however, researchers at Sorbonne University and The Open University show that some ultra-small microbes do indeed participate in complex metabolisms and make greater contributions to global carbon cycles than previously realized.
Abstract: During the last two decades, severe invasive infections caused by Group A Streptococcus (GAS) of the emm1 genotype have been a public health concern. This study investigated the dynamics of emm1 GAS during 1994–2013 in Belgium. emm1 GAS isolated from blood, tissue, and wounds of patients with invasive infections (n = 23, S1–S23), and from patients with uncomplicated pharyngitis (n = 15, NS1–NS15) were subjected to whole-genome mapping (WGM; kpn) (Opgen). Whole-genome sequencing was performed on 25 strains (WGS; S1–S23 and NS6–NS7) (Illumina Inc.). Belgian GAS belonged to the M1T1 clone typified by the 36-kb chromosomal region encoding extracellular toxins, NAD+-glycohydrolase and streptolysin O. Strains from 1994–1999 clustered together with published strains (MGAS5005 and M1476). From 2001 onward, invasive GAS showed higher genomic divergence in the accessory genome and harbored on average 7% prophage content. A low evolutionary rate (2.49E-008; P > 0.05) was observed in this study, indicating a highly stable genome. The studied invasive and pharyngitis isolates were not genetically distinct populations based on the WGM and core genome phylogeny analyses. Two copies of the speJ superantigen were present in the 1999 and 2010 study strains (n = 3), one being chromosomal and one being truncated and associated with phage remnants. This study showed that emm1 GAS in Belgium, compared with Canada and UK M1 strains, were highly conserved, showing remarkable genome stability over a 19-year period, with variations observed in the accessory genome.
Abstract: Mechanisms of genome evolution are fundamental to our understanding of adaptation and the generation and maintenance of biodiversity, yet genome dynamics are still poorly characterized in many clades. Strong correlations between variation in genomic attributes and species diversity across the plant tree of life suggest that polyploidy or other mechanisms of genome size change confer selective advantages due to the introduction of genomic novelty. Palms (order Arecales, family Arecaceae) are diverse, widespread, and dominant in tropical ecosystems, yet little is known about genome evolution in this ecologically and economically important clade. Here, we take a phylogenetic comparative approach to investigate palm genome dynamics using genomic and transcriptomic data in combination with a recent, densely sampled, phylogenetic tree. We find conclusive evidence of a paleopolyploid event shared by the ancestor of palms but not with the sister clade, Dasypogonales. We find evidence of incremental chromosome number change in the palms rather than recurrent polyploidy. We find strong phylogenetic signal in chromosome number, but no signal in genome size, and further no correlation between the two when correcting for phylogenetic relationships. Palms thus add to a growing number of diverse, ecologically successful clades with evidence of whole-genome duplication, sister to a species-poor clade with no evidence of such an event. Disentangling the causes of genome size variation in palms moves us closer to understanding the genomic conditions facilitating adaptive radiation and ecological dominance in an evolutionarily successful, emblematic tropical clade.
Acropora millepora, Acropora digitifera, genome, WGS
Abstract: Phytopathogen genomes are under constant pressure to change, as pathogens are locked in an evolutionary arms race with their hosts, where pathogens evolve effector genes to manipulate their hosts, whereas the hosts evolve immune components to recognize the products of these genes. Colletotrichum higginsianum (Ch), a fungal pathogen with no known sexual morph, infects Brassicaceae plants including Arabidopsis thaliana. Previous studies revealed that Ch differs in its virulence toward various Arabidopsis thaliana ecotypes, indicating the existence of coevolutionary selective pressures. However, between-strain genomic variations in Ch have not been studied. Here, we sequenced and assembled the genome of a Ch strain, resulting in a highly contiguous genome assembly, which was compared with the chromosome-level genome assembly of another strain to identify genomic variations between strains. We found that the two closely related strains vary in terms of large-scale rearrangements, the existence of strain-specific regions, and effector candidate gene sets and that these variations are frequently associated with transposable elements (TEs). Ch has a compartmentalized genome consisting of gene-sparse, TE-dense regions with more effector candidate genes and gene-dense, TE-sparse regions harboring conserved genes. Additionally, analysis of the conservation patterns and syntenic regions of effector candidate genes indicated that the two strains vary in their effector candidate gene sets because of de novo evolution, horizontal gene transfer, or gene loss after divergence. Our results reveal mechanisms for generating genomic diversity in this asexual pathogen, which are important for understanding its adaptation to hosts.
Abstract: Genome assemblies from next-generation sequencing technologies are now an integral part of biological research, but many sequencing and assembly processes are still error-prone. Unfortunately, these errors can propagate to downstream analyses and wreak havoc on results and conclusions. Although such errors are recognized when dealing with diploid genotype data, modern reference assemblies (which are represented as haploid sequences) lack any type of succinct quality assessment for every position. Here we present Referee, a program that uses diploid genotype quality information in order to annotate a haploid assembly with a quality score for every position. Referee aims to provide an assembly with concise quality information on a Phred-like scale in FASTQ format for easy filtering of low-quality sites. Referee also provides output of quality scores in BED format that can be easily visualized as tracks on most genome browsers. Referee is freely available at https://gwct.github.io/referee/.
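To illustrate what per-position quality on a Phred-like scale in FASTQ format can look like, here is a minimal Python sketch. It is not Referee's implementation: it assumes the standard Phred definition Q = -10·log10(p) and the common Phred+33 ASCII encoding, and the function names (`phred_from_error`, `encode_quality`, `fastq_record`, `mask_low_quality`) and the cutoff of 20 are hypothetical choices made for this example.

```python
import math

PHRED_OFFSET = 33  # assumption: Phred+33 ASCII encoding, as in standard FASTQ
MAX_Q = 93         # highest score that stays in printable ASCII with offset 33


def phred_from_error(p_error: float) -> int:
    """Convert an error probability into a rounded Phred score in [0, MAX_Q]."""
    if p_error <= 0.0:
        return MAX_Q
    q = -10.0 * math.log10(p_error)
    return max(0, min(MAX_Q, round(q)))


def encode_quality(scores) -> str:
    """Encode integer Phred scores as a FASTQ quality string."""
    return "".join(chr(q + PHRED_OFFSET) for q in scores)


def fastq_record(name: str, sequence: str, error_probs) -> str:
    """Build one FASTQ record from a sequence and per-base error probabilities."""
    error_probs = list(error_probs)
    if len(sequence) != len(error_probs):
        raise ValueError("need exactly one error probability per base")
    quals = encode_quality(phred_from_error(p) for p in error_probs)
    return f"@{name}\n{sequence}\n+\n{quals}\n"


def mask_low_quality(sequence: str, quality_string: str, min_q: int = 20) -> str:
    """Replace bases whose Phred score is below min_q with 'N'."""
    return "".join(
        base if ord(q) - PHRED_OFFSET >= min_q else "N"
        for base, q in zip(sequence, quality_string)
    )


if __name__ == "__main__":
    record = fastq_record("scaffold_1", "ACGT", [0.001, 0.1, 0.001, 0.001])
    print(record, end="")
    print(mask_low_quality("ACGT", record.splitlines()[3]))
```

Filtering then reduces to a per-character comparison against the chosen threshold, which is the kind of easy low-quality site filtering a FASTQ-style annotation makes possible.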
Abstract: Previous studies of the evolution of genes expressed at different life-cycle stages of Drosophila melanogaster have not been able to disentangle adaptive from nonadaptive substitutions when using nonsynonymous sites. Here, we overcome this limitation by combining whole-genome polymorphism data from D. melanogaster and divergence data between D. melanogaster and Drosophila yakuba. For the set of genes expressed at different life-cycle stages of D. melanogaster, as reported in modENCODE, we estimate the ratio of substitutions relative to polymorphism between nonsynonymous and synonymous sites (α) and then decompose α into the ratio of adaptive (ωa) and nonadaptive (ωna) substitutions to synonymous substitutions. We find that the genes expressed in mid- and late-embryonic development are the most conserved, whereas those expressed in early development and postembryonic stages are the least conserved. Importantly, we found that low conservation in early development is due to high rates of nonadaptive substitutions (high ωna), whereas in postembryonic stages it is due, instead, to high rates of adaptive substitutions (high ωa). By using estimates of different genomic features (codon bias, average intron length, exon number, recombination rate, among others), we also find that genes expressed in mid- and late-embryonic development show the most complex architecture: they are larger, have more exons, more transcripts, and longer introns. In addition, these genes are broadly expressed among all stages. We suggest that all these genomic features are related to the conservation of mid- and late-embryonic development. Globally, our study supports the hourglass pattern of conservation and adaptation over the life-cycle.
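The decomposition of α into ωa and ωna described in the Drosophila abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the classic McDonald-Kreitman point estimate α = 1 − (Ds·Pn)/(Dn·Ps) together with ωa = α·ω and ωna = (1 − α)·ω, where ω = dN/dS. The abstract does not say which estimator the authors used, so theirs may differ, and the function names and counts below are hypothetical.

```python
def mk_alpha(dn: int, ds: int, pn: int, ps: int) -> float:
    """Classic McDonald-Kreitman estimate of the adaptive fraction alpha.

    dn, ds: nonsynonymous and synonymous substitutions (divergence counts)
    pn, ps: nonsynonymous and synonymous polymorphisms
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for this estimator")
    return 1.0 - (ds * pn) / (dn * ps)


def decompose_omega(dn_rate: float, ds_rate: float, alpha: float):
    """Split omega = dN/dS into adaptive and nonadaptive components.

    Returns (omega_a, omega_na), which sum to omega by construction.
    """
    if ds_rate == 0:
        raise ValueError("ds_rate must be non-zero")
    omega = dn_rate / ds_rate
    omega_a = alpha * omega            # adaptive substitutions per synonymous substitution
    omega_na = (1.0 - alpha) * omega   # nonadaptive substitutions per synonymous substitution
    return omega_a, omega_na


if __name__ == "__main__":
    # Made-up counts for illustration only, not data from the study.
    alpha = mk_alpha(dn=40, ds=100, pn=20, ps=100)
    omega_a, omega_na = decompose_omega(dn_rate=0.02, ds_rate=0.1, alpha=alpha)
    print(alpha, omega_a, omega_na)
```

Under this framing, two gene sets with the same ω can differ sharply in ωa and ωna, which is the kind of contrast the abstract draws between early development and postembryonic stages.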
Abstract: To compare overall genome structure and transcription activator-like effector content, we completely sequenced Xanthomonas axonopodis pv. glycines strain 12-2, isolated in 1992 in Thailand, and strain EB08, isolated in 2008 in the United States (Iowa) using PacBio technology. We reassembled the genome sequence for a second US strain, 8ra, derived from a 1980 Iowa isolate, from existing PacBio reads. Despite geographic and temporal separation, the three genomes are highly syntenous, and their transcription activator-like effector repertoires are highly conserved.
Abstract: Modern rice cultivars are adapted to a range of environmental conditions and human preferences. At the root of this diversity is a marked genetic structure, owing to multiple foundation events. Admixture and recurrent introgression from wild sources have played upon this base to produce the myriad adaptations existing today. Genome-wide studies bring support to this idea, but understanding the history and nature of particular genetic adaptations requires the identification of specific patterns of genetic exchange. In this study, we explore the patterns of haplotype similarity along the genomes of a subset of rice cultivars available in the 3,000 Rice Genomes data set. We begin by establishing a custom method of classification based on a combination of dimensionality reduction and kernel density estimation. Through simulations, the behavior of this classifier is studied under scenarios of varying genetic divergence, admixture, and alien introgression. Finally, the method is applied to local haplotypes along the genome of a Core set of Asian Landraces. Taking the Japonica, Indica, and cAus groups as references, we find evidence of reciprocal introgressions covering 2.6% of reference genomes on average. Structured signals of introgression among reference accessions are discussed. We extend the analysis to elucidate the genetic structure of the group circum-Basmati: we delimit regions of Japonica, cAus, and Indica origin, as well as regions outlier to these groups (13% on average). Finally, the approach used highlights regions of partial to complete loss of structure that can be attributed to selective pressures during domestication.
Abstract: Bacterial pathogens evolve during the course of infection as they adapt to the selective pressures that confront them inside the host. Identification of adaptive mutations and their contributions to pathogen fitness remains a central challenge. Although mutations can target either intergenic or coding regions in the pathogen genome, studies of host adaptation have focused predominantly on molecular evolution within coding regions, whereas the role of intergenic mutations remains unclear. Here, we address this issue and investigate the extent to which intergenic mutations contribute to the evolutionary response of a clinically important bacterial pathogen, Pseudomonas aeruginosa, to the host environment, and whether intergenic mutations have distinct roles in host adaptation. We characterize intergenic evolution in 44 clonal lineages of P. aeruginosa and identify 77 intergenic regions in which parallel evolution occurs. At the genetic level, we find that mutations in regions under selection are located primarily within regulatory elements upstream of transcriptional start sites. At the functional level, we show that some of these mutations either increase or decrease transcription of genes and are directly responsible for evolution of important pathogenic phenotypes including antibiotic sensitivity. Importantly, we find that intergenic mutations enable essential genes to become targets of evolution. In summary, our results highlight the evolutionary significance of intergenic mutations in creating host-adapted strains and show that intergenic and coding regions have different qualitative contributions to this process.
Abstract: The evolution of mitochondrial genomes and their population-genetic environment among unicellular eukaryotes are understudied. Ciliate mitochondrial genomes exhibit a unique combination of characteristics, including a linear organization and the presence of multiple genes with no known function or detectable homologs in other eukaryotes. Here we study the variation of ciliate mitochondrial genomes both within and across 13 highly diverged Paramecium species, including multiple species from the P. aurelia species complex, with four outgroup species: P. caudatum, P. multimicronucleatum, and two strains that may represent novel related species. We observe extraordinary conservation of gene order and protein-coding content in Paramecium mitochondria across species. In contrast, significant differences are observed in tRNA content and copy number, which is highly conserved in species belonging to the P. aurelia complex but variable among and even within the other Paramecium species. There is an increase in GC content from ∼20% to ∼40% on the branch leading to the P. aurelia complex. Patterns of polymorphism in population-genomic data and mutation-accumulation experiments suggest that the increase in GC content is primarily due to changes in the mutation spectra in the P. aurelia species. Finally, we find no evidence of recombination in Paramecium mitochondria and find that the mitochondrial genome appears to experience either similar or stronger efficacy of purifying selection than the nucleus.
Abstract: Signaling through ligand/receptor interactions is a widespread mechanism across all living taxa. During evolution, however, there has been a diversification in multigene families and changes in their interaction patterns. Among the events that led to the creation of new genes is the whole-genome duplication, which made possible some major innovations. Teleost fishes descended from a common ancestor which underwent one such whole-genome duplication. In our study, we investigated the effect of complete genome duplication on the evolution of ligand–receptor pairs in teleosts. We selected ten teleost species and used bioinformatics programs and phylogenetic tools in order to study the evolution of the human ligands and receptors that have orthologous genes in fishes, as well as the rest of the fish genomes. We established that since the complete duplication of the fish genomes, the retention of ligand and receptor genes in duplicate copies is higher than expected. However, the ligand/receptor pair partners did not necessarily evolve in the same way, and in many cases one of the partners returned to a single copy while the other was maintained in duplicate. This suggests that changes in interaction partners may have taken place during the evolution of teleosts. Moreover, the fate of the ligand and receptor coding genes is partly congruent with the phylogeny of teleosts. However, some incongruences can be observed. We suggest that these incongruences are correlated with the environment.
Abstract: The metabolic conversion of dietary omega-3 and omega-6 18 carbon (18C) to long chain (>20 carbon) polyunsaturated fatty acids (LC-PUFAs) is vital for human life. The rate-limiting steps of this process are catalyzed by fatty acid desaturase (FADS) 1 and 2. Therefore, understanding the evolutionary history of the FADS genes is essential to our understanding of hominin evolution. The FADS genes have two haplogroups, ancestral and derived, with the derived haplogroup being associated with more efficient LC-PUFA biosynthesis than the ancestral haplogroup. In addition, there is a complex global distribution of these haplogroups that is suggestive of Neanderthal introgression. We confirm that Native American ancestry is nearly fixed for the ancestral haplogroup, and replicate a positive selection signal in Native Americans. This positive selection potentially continued after the founding of the Americas, although simulations suggest that the timing is dependent on the allele frequency of the ancestral Beringian population. We also find that the Neanderthal FADS haplotype is more closely related to the derived haplogroup and the Denisovan clusters closer to the ancestral haplogroup. Furthermore, the derived haplogroup has a time to the most recent common ancestor of 688,474 years before present. These results support an ancient polymorphism, as opposed to Neanderthal introgression, forming in the FADS region during the Pleistocene with possibly differential selection pressures on both haplogroups. The near fixation of the ancestral haplogroup in Native American ancestry calls for future studies to explore the potential health risk of associated low LC-PUFA levels in these populations.
Abstract: Members of the crustacean subclass Copepoda are likely the most abundant metazoans worldwide. Pelagic marine species are critical in converting planktonic microalgae to animal biomass, supporting oceanic food webs. Despite their abundance and ecological importance, only six copepod genomes are publicly available, owing to a number of factors including large genome size, repetitiveness, GC-content, and small animal size. Here, we report the seventh representative copepod genome and the first genome and the first transcriptome from the calanoid copepod species Acartia tonsa Dana, which is among the most numerous mesozooplankton in boreal coastal and estuarine waters. The ecology, physiology, and behavior of A. tonsa have been studied extensively. The genetic resources contributed in this work will allow researchers to link experimental results to molecular mechanisms. From PCR-free whole genome sequence and mRNA Illumina data, we assemble the largest copepod genome to date. We estimate that A. tonsa has a total genome size of 2.5 Gb including repetitive elements we could not resolve. The nonrepetitive fraction of the genome assembly is estimated to be 566 Mb. Our DNA sequencing-based analyses suggest there is a 14-fold difference in genome size between the six members of Copepoda with available genomic information. This finding complements nucleus staining genome size estimations, where a 100-fold difference has been reported within 70 species. We briefly analyze the repeat structure in the existing copepod whole genome sequence data sets. The information presented here confirms the evolution of genome size in Copepoda and expands the scope for evolutionary inferences in Copepoda by providing several levels of genetic information from a key planktonic crustacean species.