Abstract: Evolutionary genetic studies have shown a positive correlation between levels of nucleotide diversity and either rates of recombination or genetic distance to genes. Both positive-directional and purifying selection have been offered as the source of these correlations via genetic hitchhiking and background selection, respectively. Phylogenetically conserved elements (CEs) are short (∼100 bp), widely distributed (comprising ∼5% of the genome) sequences that are often found far from genes. While the function of many CEs is unknown, CEs are also associated with reduced diversity at linked sites. Using high coverage (>80×) whole genome data from two human populations, the Yoruba and the CEU, we perform fine scale evaluations of diversity, rates of recombination, and linkage to genes. We find that the local rate of recombination has a stronger effect on levels of diversity than linkage to genes, and that these effects of recombination persist even in regions far from genes. Our whole genome modeling demonstrates that, rather than recombination or GC-biased gene conversion, selection on sites within or linked to CEs better explains the observed genomic diversity patterns. A major implication is that very few sites in the human genome are predicted to be free of the effects of selection. These sites, which we refer to as the human “neutralome,” comprise only 1.2% of the autosomes and 5.1% of the X chromosome. Demographic analysis of the neutralome reveals larger population sizes and lower rates of growth for ancestral human populations than inferred by previous analyses.
Ana Catalán, Aide Macias-Muñoz, and Adriana D. Briscoe
Abstract: Thermal tolerance is a key determinant of species distribution. Despite much study, the genetic basis of adaptive evolution of thermal tolerance, including the relative contributions of transcriptional regulation versus protein evolution, remains unclear. Populations of the intertidal copepod Tigriopus californicus are adapted to local thermal regimes across their broad geographic range. Upon thermal stress, adults from a heat tolerant southern population, San Diego (SD), upregulate several heat shock proteins (HSPs) to higher levels than those from a less tolerant northern population, Santa Cruz (SC). Suppression of a specific HSP, HSPB1, significantly reduces T. californicus survival following acute heat stress. Sequencing of HSPB1 revealed population specific nucleotide substitutions in both promoter and coding regions of the gene. HSPB1 promoters from heat tolerant populations contain two canonical heat shock elements (HSEs), the binding sites for heat shock transcription factor (HSF), whereas less tolerant populations have mutations in these conserved motifs. Allele specific expression of HSPB1 in F1 hybrids between tolerant and less tolerant populations showed significantly biased expression favoring alleles from tolerant populations and supporting the adaptive divergence in these cis-regulatory variants. The functional impact of population-specific nonsynonymous substitutions in HSPB1 coding sequences was tested by assessing the thermal stabilization properties of SD versus SC HSPB1 protein variants. Recombinant HSPB1 from the southern SD population showed greater capacity for protecting protein structure under elevated temperature. Our results indicate that both regulatory and protein coding sequence evolution within a single gene appear to contribute to thermal tolerance phenotypes and local adaptation among conspecific populations.
Abstract: In the history of life, some phenotypes have been acquired several times independently, through convergent evolution. Recently, many genome-scale studies have sought to identify nucleotides or amino acids that changed in a convergent manner when the convergent phenotypes evolved. These efforts have had mixed results, probably because of differences in the detection methods, and because of conceptual differences about the definition of a convergent substitution. Some methods contend that substitutions are convergent only if they occur on all branches where the phenotype changed toward the exact same state at a given nucleotide or amino acid position. Others are much looser in their requirements and define a convergent substitution as one that leads the site at which it occurs to prefer a phylogeny in which species with the convergent phenotype group together. Here, we suggest looking for convergent shifts in amino acid preferences instead of convergent substitutions to the exact same amino acid. We define convergent shifts as substitutions that occur on all branches where the phenotype changed and that correspond to a change in the type of amino acid preferred at this position. We implement the corresponding model in a method named PCOC. We show in simulations that PCOC better recovers convergent shifts than existing methods in terms of sensitivity and specificity. We test it on a plant protein alignment where convergent evolution has been studied in detail and find that our method recovers several previously identified convergent substitutions and proposes credible new candidates.
Abstract: Human skin color diversity is considered an adaptation to environmental conditions such as UV radiation. Investigations into the genetic bases of such adaptation have identified a group of pigmentation genes contributing to skin color diversity in African and non-African populations. Here, we present a population analysis of the pigmentation gene KITLG, with a previously reported signal of Darwinian positive selection in both European and East Asian populations. We demonstrated that there had been recurrent selective events in the upstream and the downstream regions of KITLG in Eurasian populations. More importantly, besides the expected selection on the KITLG variants favoring light skin in coping with the weak UV radiation at high latitude, we observed a KITLG variant showing adaptation to winter temperature. In particular, compared with UV radiation, winter temperature showed a much stronger correlation with the prevalence of the presumably adaptive KITLG allele in Asian populations. This observation was further supported by the in vitro functional test at low temperature. Consequently, the pleiotropic effects of KITLG, that is, pigmentation and thermogenesis, were both targeted by natural selection that acted on different KITLG sequence variants, contributing to the adaptation of Eurasians to both UV radiation and winter temperature in high-latitude areas.
Abstract: The main outcome of molecular dating, the timetree, provides crucial information for understanding the evolutionary history of lineages and is a requirement of several evolutionary analyses. Although essential, the estimation of divergence times from molecular data is frequently regarded as a complicated task. However, establishing biological timescales can be performed in a straightforward manner, even with large, genome-wide data sets. This protocol presents all the necessary steps to estimate a timetree in the program MEGA X. It also illustrates how the TimeTree resource can be a useful tool to obtain chronological information based on previous studies, therefore yielding calibration boundaries.
Abstract: Admixture between populations provides an opportunity to study biological adaptation and phenotypic variation. Admixture studies rely on local ancestry inference for admixed individuals, which consists of computing at each locus the number of copies that originate from ancestral source populations. Existing software packages for local ancestry inference are tuned to provide accurate results on human data and recent admixture events. Here, we introduce Loter, an open-source software package that does not require any biological parameter besides haplotype data in order to make local ancestry inference available for a wide range of species. Using simulations, we compare the performance of Loter to HAPMIX, LAMP-LD, and RFMix. HAPMIX is the only package severely impacted by imperfect haplotype reconstruction. Loter is the package least affected by increasing admixture time when considering simulated and admixed human genotypes. For simulations of admixed Populus genotypes, Loter and LAMP-LD are robust to increasing admixture times, in contrast to RFMix. When comparing length of reconstructed and true ancestry tracts, Loter and LAMP-LD provide results whose accuracy is again more robust than RFMix to increasing admixture times. We apply Loter to individuals resulting from admixture between Populus trichocarpa and Populus balsamifera, and lengths of ancestry tracts indicate that admixture took place ∼100 generations ago. We expect that providing rapid and parameter-free software for local ancestry inference will make genomic studies of admixture processes more accessible.
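The quantity that local ancestry inference reports, the number of copies at each locus that originate from a given source population, can be illustrated with a minimal sketch. It assumes the two haplotypes of an individual have already been labeled per locus; the labels ("T", "B"), function names, and toy data are hypothetical, and this is not Loter's algorithm or API.

```python
from itertools import groupby

def ancestry_dosage(hap1, hap2, source):
    """Per-locus count (0, 1, or 2) of haplotype copies assigned to `source`."""
    return [int(a == source) + int(b == source) for a, b in zip(hap1, hap2)]

def tract_lengths(hap):
    """Lengths, in loci, of consecutive runs sharing the same ancestry label."""
    return [(label, sum(1 for _ in run)) for label, run in groupby(hap)]

# Hypothetical per-locus ancestry labels for the two haplotypes of one individual.
hap1 = "TTTBBBTT"
hap2 = "TBBBBBBT"

print(ancestry_dosage(hap1, hap2, "T"))
print(tract_lengths(hap1))
```

Tract lengths are counted here in loci; relating them to an admixture time would additionally require genetic distances, which this sketch does not model.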
Abstract: The mechanisms by which organisms adapt to variable environments are a fundamental question in evolutionary biology and are important for protecting key species in response to a changing climate. An interesting candidate to study this question is the honey bee Apis cerana, a keystone pollinator with a wide distribution throughout a large variety of climates, that exhibits rapid dispersal. Here, we resequenced the genome of 180 A. cerana individuals from 18 populations throughout China. Using a population genomics approach, we observed considerable genetic variation in A. cerana. Patterns of genetic differentiation indicate high divergence at the subspecies level, and physical barriers rather than distance are the driving force for population divergence. Estimations of divergence time suggested that the main branches diverged between 300 and 500 Ka. Analyses of the population history revealed a substantial influence of the Earth’s climate on the effective population size of A. cerana, as increased population sizes were observed during warmer periods. Further analyses identified candidate genes under natural selection that are potentially related to honey bee cognition, temperature adaptation, and olfaction. Based on our results, A. cerana may have great potential to respond to climate change. Our study provides fundamental knowledge of the evolution and adaptation of A. cerana.
Abstract: The relative evolutionary rates at individual sites in proteins are informative measures of conservation or adaptation. Often used as evolutionarily aware conservation scores, relative rates reveal key functional or strongly selected residues. Estimating rates in a phylogenetic context requires specifying a protein substitution model, which is typically a phenomenological model trained on a large empirical data set. A strong emphasis has traditionally been placed on selecting the “best-fit” model, with the implicit understanding that suboptimal or otherwise ill-fitting models might bias inferences. However, the pervasiveness and degree of such bias have not been systematically examined. We investigated how model choice impacts site-wise relative rates in a large set of empirical protein alignments. We compared models designed for use on any general protein, models designed for specific domains of life, and the simple equal-rates Jukes-Cantor-style model (JC). As expected, information theoretic measures showed overwhelming evidence that some models fit the data decidedly better than others. By contrast, estimates of site-specific evolutionary rates were impressively insensitive to the substitution model used, revealing an unexpected degree of robustness to potential model misspecification. A deeper examination of the fewer than 5% of sites for which model inferences differed in a meaningful way showed that the JC model could uniquely identify rapidly evolving sites that models with empirically derived exchangeabilities failed to detect. We conclude that relative protein rates appear robust to the applied substitution model, and any sensible model of protein evolution, regardless of its fit to the data, should produce broadly consistent evolutionary rates.
Abstract: Endosymbiosis has been common all along eukaryotic evolution, providing opportunities for genomic and organellar innovation. Plastids are a prominent example. After the primary endosymbiosis of the cyanobacterial plastid ancestor, photosynthesis spread in many eukaryotic lineages via secondary endosymbioses involving red or green algal endosymbionts and diverse heterotrophic hosts. However, the number of secondary endosymbioses and how they occurred remain poorly understood. In particular, contrasting patterns of endosymbiotic gene transfer have been detected and subjected to various interpretations. In this context, accurate detection of endosymbiotic gene transfers is essential to avoid wrong evolutionary conclusions. We have assembled a strictly selected set of markers that provides robust phylogenomic evidence suggesting that nuclear genes involved in the function and maintenance of green secondary plastids in chlorarachniophytes and euglenids have unexpected mixed red and green algal origins. This mixed ancestry contrasts with the clear red algal origin of most nuclear genes carrying similar functions in secondary algae with red plastids.
Abstract: The hypothesis that eusociality originated once in Vespidae has shaped interpretation of social evolution for decades and has driven the supposition that preimaginal morphophysiological differences between castes were absent at the outset of eusociality. Many researchers also consider casteless nest-sharing an antecedent to eusociality. Together, these ideas endorse a stepwise progression of social evolution in wasps (solitary → casteless nest-sharing → eusociality with rudimentary behavioral castes → eusociality with preimaginal caste-biasing (PCB) → morphologically differentiated castes). Here, we infer the phylogeny of Vespidae using sequence data generated via anchored hybrid enrichment from 378 loci across 136 vespid species and perform ancestral state reconstructions to test whether rudimentary and monomorphic castes characterized the initial stages of eusocial evolution. Our results reject the single origin of eusociality hypothesis, contest the supposition that eusociality emerged from a casteless nest-sharing ancestor, and suggest that eusociality in Polistinae + Vespinae began with castes having morphological differences. An abrupt appearance of castes with ontogenetically established morphophysiological differences conflicts with the current conception of stepwise social evolution and suggests that the climb up the ladder of sociality does not occur through sequential mutation. Phenotypic plasticity and standing genetic variation could explain how cooperative brood care evolved in concert with nest-sharing and how morphologically dissimilar castes arose without a rudimentary intermediate. Furthermore, PCB at the outset of eusociality implicates a subsocial route to eusociality in Polistinae + Vespinae, emphasizing the role of mother–daughter interactions and subfertility (i.e., the cost component of kin selection) in the origin of workers.
Abstract: Homeobox genes are key toolkit genes that regulate the development of metazoans, and changes in their regulation and copy number have contributed to the evolution of phenotypic diversity. We recently identified a whole genome duplication (WGD) event that occurred in an ancestor of spiders and scorpions (Arachnopulmonata), and found that many homeobox genes, including two Hox clusters, appear to have been retained in arachnopulmonates. To better understand the consequences of this ancient WGD and the evolution of arachnid homeobox genes, we have characterized and compared the homeobox repertoires in a range of arachnids. We found that many families and clusters of these genes are duplicated in all studied arachnopulmonates (Parasteatoda tepidariorum, Pholcus phalangioides, Centruroides sculpturatus, and Mesobuthus martensii) compared with nonarachnopulmonate arachnids (Phalangium opilio, Neobisium carcinoides, Hesperochernes sp., and Ixodes scapularis). To assess divergence in the roles of homeobox ohnologs, we analyzed the expression of P. tepidariorum homeobox genes during embryogenesis and found pervasive changes in the level and timing of their expression. Furthermore, we compared the spatial expression of a subset of P. tepidariorum ohnologs with their single copy orthologs in P. opilio embryos. We found evidence for likely subfunctionalization and neofunctionalization of these genes in the spider. Overall, our results show a high level of retention of homeobox genes in spiders and scorpions post-WGD, which is likely to have made a major contribution to their developmental evolution and diversification through pervasive subfunctionalization and neofunctionalization, paralleling the outcomes of WGD in vertebrates.
Abstract: The origin of hepadnaviruses (Hepadnaviridae), a group of reverse-transcribing DNA viruses that infect vertebrates, remains mysterious. All the known retrotransposons are only distantly related to hepadnaviruses. Here, we report the discovery of two novel lineages of retroelements, which we designate hepadnavirus-like retroelements (HEART1 and HEART2), within insect genomes through screening 1,095 eukaryotic genomes. Both phylogenetic and similarity analyses suggest that the HEART retroelements represent the closest nonviral relatives of hepadnaviruses so far. The discovery of HEART retroelements narrows down the evolutionary gap between hepadnaviruses and retrotransposons and might thus provide unique insights into the origin and evolution of hepadnaviruses.
Abstract: Differences in behavior and life history traits between females and males are the basis of divergent selective pressures between sexes. It has been suggested that a way for the two sexes to deal with different life history requirements is through sex-biased gene expression. In this study, we performed a comparative sex-biased gene expression analysis of the combined eye and brain transcriptome from five Heliconius species, H. charithonia, H. sara, H. erato, H. melpomene and H. doris, representing five of the main clades from the Heliconius phylogeny. We found that the degree of sexual dimorphism in gene expression is not conserved across Heliconius. Most of the sex-biased genes identified in each species are not sex-biased in any other, suggesting that sexual selection might have driven sexually dimorphic gene expression. Only three genes shared sex-biased expression across multiple species: ultraviolet opsin UVRh1 and orthologs of Drosophila Krüppel-homolog 1 and CG9492. We also observed that in some species female-biased genes have higher evolutionary rates, but in others, male-biased genes show the fastest rates when compared with unbiased genes, suggesting that selective forces driving sex-biased gene evolution in Heliconius act in a sex- and species-specific manner. Furthermore, we found dosage compensation in all the Heliconius tested, providing additional evidence for the conservation of dosage compensation across Lepidoptera. Finally, sex-biased genes are significantly enriched on the Z, a pattern that could be a result of sexually antagonistic selection.
Abstract: Zic family genes encode C2H2-type zinc finger proteins that act as critical toolkit proteins in the establishment of the metazoan body plan. In this study, we searched for evolutionarily conserved domains (CDs) among 121 Zic protein sequences from 22 animal phyla and 40 classes, and addressed their evolutionary significance. The collected sequences included those from poriferans and orthonectids. We discovered seven new CDs, CD0–CD6 (in order from the N- to C-terminus), using the most conserved Zic protein sequences from Deuterostomia (Hemichordata and Cephalochordata), Lophotrochozoa (Cephalopoda and Brachiopoda), and Ecdysozoa (Chelicerata and Priapulida). Subsequently, we analyzed the evolutionary history of Zic CDs including the known CDs (ZOC, ZFD, ZFNC, and ZFCC). All Zic CDs are predicted to have existed in a bilaterian ancestor. During evolution, they have degenerated in a taxa-selective manner with significant correlations among CDs. The N terminal CD (CD0) was largely lost, but was observed in Brachiopoda, Priapulida, Hemichordata, Echinodermata, and Cephalochordata, and the C terminal CD (CD6) was highly conserved in conserved-type-Zic possessing taxa, but was truncated in vertebrate Zic gene paralogues (Zic1/2/3), generating a vertebrate-specific C-terminus critical for transcriptional regulation. ZOC was preferentially conserved in insects and in an anthozoan paralogue, and it was bound to the homeodomain transcription factor Msx in a phylogenetically conserved manner. Accordingly, the extent of divergence of Msx and Zic CDs from their respective bilaterian ancestors is strongly correlated. These results suggest that coordinated divergence among the toolkit CDs and among toolkit proteins is involved in the divergence of metazoan body plans.
Abstract: Isoprenoids and their derivatives represent the largest group of organic compounds in nature and are distributed universally in the three domains of life. Isoprenoids are biosynthesized from isoprenyl diphosphate units, generated by two distinctive biosynthetic pathways: the mevalonate pathway and the methylerythritol 4-phosphate pathway. Archaea and eukaryotes exclusively have the former pathway, while most bacteria have the latter. Some bacteria, however, are known to possess the mevalonate pathway genes. Understanding the evolutionary history of these two isoprenoid biosynthesis pathways in each domain of life is critical since isoprenoids are so interwoven in the architecture of life that they would have had indispensable roles in the early evolution of life. Our study provides a detailed phylogenetic analysis of enzymes involved in the mevalonate pathway and sheds new light on its evolutionary history. The results suggest that a potential mevalonate pathway is present in the recently discovered superphylum Candidate Phyla Radiation (CPR), and further suggest a strong evolutionary relationship exists between archaea and CPR. Interestingly, CPR harbors the characteristics of both the bacterial-type and archaeal-type mevalonate pathways and may retain signatures regarding the ancestral isoprenoid biosynthesis pathway in the last universal common ancestor. Our study supports the ancient origin of the mevalonate pathway in the three domains of life as previously inferred, but concludes that the evolution of the mevalonate pathway was more complex.
Abstract: Self-transmissible mobile genetic elements drive horizontal gene transfer between prokaryotes. Some of these elements integrate in the chromosome, whereas others replicate autonomously as plasmids. Recent work showed that there are few differences, and occasional interconversion, between the two types of elements. Here, we asked why evolutionary processes have maintained the two types of mobile genetic elements by comparing integrative and conjugative elements (ICEs) with extrachromosomal ones (conjugative plasmids) of the highly abundant MPFT conjugative type. We observed that plasmids encode more replicases, partition systems, and antibiotic resistance genes, whereas ICEs encode more integrases and metabolism-associated genes. ICEs and plasmids have similar average sizes, but plasmids are much more variable, have more DNA repeats, and exchange genes more frequently. On the other hand, we found that ICEs are more frequently transferred between distant taxa. We propose a model where the different genetic plasticity and amplitude of host range between elements explain the co-occurrence of integrative and extrachromosomal elements in microbial populations. In particular, the conversion from ICE to plasmid allows ICEs to be more plastic, while the conversion from plasmid to ICE allows the expansion of the element’s host range.
Abstract: Toll-like receptors (TLRs) are key sensor molecules in vertebrates triggering initial phases of immune responses to pathogens. The avian TLR family typically consists of ten receptors, each adapted to distinct ligands. To understand the complex evolutionary history of each avian TLR, we analyzed all members of the TLR family in the whole genome assemblies and target sequence data of 63 bird species covering all major avian clades. Our results indicate that gene duplication events most probably occurred in TLR1 before synapsids diversified from sauropsids. Unlike mammals, ssRNA-recognizing TLR7 has duplicated independently in several avian taxa, while flagellin-sensing TLR5 has pseudogenized multiple times in bird phylogeny. Our analysis revealed stronger positive, diversifying selection acting in TLR5 and the three-domain TLRs (TLR10 [TLR1A], TLR1 [TLR1B], TLR2A, TLR2B, TLR4) that face the extracellular space and bind complex ligands than in single-domain TLR15 and endosomal TLRs (TLR3, TLR7, TLR21). In total, 84 out of 306 positively selected sites were predicted to harbor substitutions dramatically changing the amino acid physicochemical properties. Furthermore, 105 positively selected sites were located in the known functionally relevant TLR regions. We found evidence for convergent evolution acting between birds and mammals at 54 of these sites. Our comparative study provides a comprehensive insight into the evolution of avian TLR genetic variability. Besides describing the history of avian TLR gene gain and gene loss, we also identified candidate positions in the receptors that have been likely shaped by direct molecular host–pathogen coevolutionary interactions and most probably play key functional roles in birds.
Abstract: The highly polymorphic genes of the major histocompatibility complex (MHC) play a key role in adaptive immunity. Divergent allele advantage, a mechanism of balancing selection, is proposed to contribute to their exceptional polymorphism. It assumes that MHC genotypes with more divergent alleles allow for broader antigen-presentation to immune effector cells, thereby increasing immunocompetence. However, the direct correlation between pairwise sequence divergence and the corresponding repertoire of bound peptides has not been studied systematically across different MHC genes. Here, we investigated this relationship for five key classical human MHC genes (human leukocyte antigen; HLA-A, -B, -C, -DRB1, and -DQB1), using allele-specific computational binding prediction to 118,097 peptides derived from a broad range of human pathogens. For all five human MHC genes, the genetic distance between two alleles of a heterozygous genotype was positively correlated with the total number of peptides bound by these two alleles. In accordance with the major antigen-presentation pathway of MHC class I molecules, HLA-B and HLA-C alleles showed particularly strong correlations for peptides derived from intracellular pathogens. Intriguingly, this bias coincides with distinct protein compositions between intra- and extracellular pathogens, possibly suggesting adaptation of MHC I molecules to present specifically intracellular peptides. Finally, we observed significant positive correlations between an allele’s average divergence and its population frequency. Overall, our results support the divergent allele advantage as a meaningful quantitative mechanism through which pathogen-mediated selection leads to the evolution of MHC diversity.
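The central quantity in the abstract above, the total number of peptides bound by the two alleles of a heterozygous genotype, can be illustrated as the size of a set union. This is only a toy sketch: the allele names, peptide identifiers, and the `combined_repertoire` helper are hypothetical, and in the study the bound sets come from computational binding prediction, which is not reproduced here.

```python
def combined_repertoire(bound, allele_a, allele_b):
    """Number of distinct peptides bound by at least one allele of the genotype."""
    return len(bound[allele_a] | bound[allele_b])

# Hypothetical predicted-binder sets; allele_1 and allele_2 overlap heavily,
# whereas allele_1 and allele_3 bind disjoint peptides.
bound = {
    "allele_1": {"p1", "p2", "p3"},
    "allele_2": {"p2", "p3", "p4"},
    "allele_3": {"p5", "p6", "p7"},
}

similar = combined_repertoire(bound, "allele_1", "allele_2")
divergent = combined_repertoire(bound, "allele_1", "allele_3")
print(similar, divergent)
```

Under divergent allele advantage, genotypes whose alleles share fewer binders (as with the second pair) present a larger combined repertoire.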
Abstract: Most phylogenetic tree-generating programs produce a fully dichotomous phylogenetic tree. However, as different markers may produce distinct topologies for the same set of organisms, topological tests are used to estimate the statistical reliability of the clades. In this protocol, we provide step-by-step instructions on how to perform the widely used bootstrap test using MEGA. However, a single unstable lineage, also known as a rogue lineage, may decrease the bootstrap proportions in many branches of the tree. This occurs because rogue taxa tend to bounce between clades from one pseudo-replicate to the next, lowering bootstrap proportions for many correct clades. Thus, it is important to identify and exclude rogue taxa before initiating a final phylogenetic analysis; here, we provide this protocol using the RogueNaRok platform.
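The idea behind bootstrap pseudo-replicates and bootstrap proportions can be sketched in a few lines. This is a generic illustration, not the MEGA or RogueNaRok implementation: the function names and the toy alignment are hypothetical, and tree inference on each pseudo-replicate is assumed to happen elsewhere.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by resampling alignment columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols) for taxon, seq in alignment.items()}

def clade_support(replicate_clades, clade):
    """Bootstrap proportion: fraction of pseudo-replicate trees containing `clade`."""
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return hits / len(replicate_clades)

# Toy alignment (hypothetical sequences).
alignment = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "ACGAACGA"}
replicate = bootstrap_alignment(alignment, random.Random(0))

# Clades recovered from four pseudo-replicate trees (hypothetical results).
replicate_clades = [
    {frozenset({"A", "B"})},
    {frozenset({"A", "C"})},
    {frozenset({"A", "B"})},
    {frozenset({"A", "B"})},
]
print(clade_support(replicate_clades, frozenset({"A", "B"})))
```

A rogue taxon that joins a different clade in each pseudo-replicate lowers the count of trees containing any clade it disrupts, which is why it depresses the proportions of otherwise stable groupings.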
Abstract: The predictability of evolution, or whether lineages repeatedly follow the same evolutionary trajectories during phenotypic convergence, remains an open question in evolutionary biology. In this study, we investigate evolutionary convergence at the biochemical pathway level and test the predictability of evolution using floral anthocyanin pigmentation, a trait with a well-understood genetic and regulatory basis. We reconstructed the evolution of floral anthocyanin content across 28 species of the Andean clade Iochrominae (Solanaceae) and investigated how shifts in pigmentation are related to changes in expression of seven key anthocyanin pathway genes. We used phylogenetic multivariate analysis of gene expression to test for phenotypic and developmental convergence at a macroevolutionary scale. Our results show that the four independent losses of the ancestral pigment delphinidin involved convergent losses of expression of the three late pathway genes (F3′5′h, Dfr, and Ans). Transitions between pigment types affecting floral hue (e.g., blue to red) involve changes to the expression of branching genes F3′h and F3′5′h, while the expression levels of early steps of the pathway are strongly conserved in all species. These patterns support the idea that the macroevolution of floral pigmentation follows predictable evolutionary trajectories to reach convergent phenotype space, repeatedly involving regulatory changes. This is likely driven by constraints at the pathway level, such as pleiotropy and regulatory structure.
Abstract: Genetic diversity plays a central role in tumor progression, metastasis, and resistance to treatment. Experiments are shedding light on this diversity at ever finer scales, but interpretation is challenging. Using recent progress in numerical models, we simulate macroscopic tumors to investigate the interplay between growth dynamics, microscopic composition, and circulating tumor cell cluster diversity. We find that modest differences in growth parameters can profoundly change microscopic diversity. Simple outwards expansion leads to spatially segregated clones and low diversity, as expected. However, a modest cell turnover can result in an increased number of divisions and mixing among clones resulting in increased microscopic diversity in the tumor core. Using simulations to estimate power to detect such spatial trends, we find that multi-region sequencing data from contemporary studies is marginally powered to detect the predicted effects. Slightly larger samples, improved detection of rare variants, or sequencing of smaller biopsies or circulating tumor cell clusters would allow one to distinguish between leading models of tumor evolution. The genetic composition of circulating tumor cell clusters, which can be obtained from noninvasive blood draws, is therefore informative about tumor evolution and its metastatic potential.
Hiromu C. Suzuki, Katsuhisa Ozaki, Takashi Makino, Hironobu Uchiyama, Shunsuke Yajima, and Masakado Kawata
Abstract: The giant panda (Ailuropoda melanoleuca) is popular around the world and is widely recognized as a symbol of nature conservation. A draft genome of the giant panda is now available, but its Y chromosome has not been sequenced. Y chromosome data are necessary for study of sex chromosome evolution, male development, and spermatogenesis. Thus, in the present study, we sequenced two parts of the giant panda Y chromosome utilizing a male giant panda fosmid library. The sequencing data were assembled into two contigs, each ∼100 kb in length with no gaps, providing high-quality resources for studying the giant panda Y chromosome. Annotation and transposable element comparison indicates varied evolutionary pressure in different regions of the Y chromosome. Two genes, zinc finger protein, Y-linked (ZFY) and lysine demethylase 5D (KDM5D), were annotated and gene conversion was observed for ZFY exon 7. Phylogenetic analysis also revealed that this gene conversion event happened independently in multiple mammalian lineages, indicating a putative mechanism to maintain the function of this particular gene on the Y chromosome. Furthermore, a transposition event, discovered through comparative alignment with the giant panda X chromosome sequence, may be involved in the process of gaining new genes on the Y chromosome. Thus, these newly obtained Y chromosome sequences provide valuable insights into the genomic patterns of the giant panda.
AbstractGenome reduction is pervasive among maternally inherited bacterial endosymbionts. This genome reduction can eventually lead to serious deterioration of essential metabolic pathways, thus rendering an obligate endosymbiont unable to provide essential nutrients to its host. This loss of essential pathways can lead to either symbiont complementation (sharing of the nutrient production with a novel co-obligate symbiont) or symbiont replacement (complete takeover of nutrient production by the novel symbiont). However, the process by which these two evolutionary events happen remains somewhat enigmatic because examples of its intermediate stages are lacking. Cinara aphids (Hemiptera: Aphididae) typically harbor two obligate bacterial symbionts: Buchnera and Serratia symbiotica. However, the latter has been replaced by different bacterial taxa in specific lineages, and thus species within this aphid lineage could provide important clues to the process of symbiont replacement. In the present study, using 16S rRNA high-throughput amplicon sequencing, we determined that the aphid Cinara strobi harbors not two, but three fixed bacterial symbionts: Buchnera aphidicola, a Sodalis sp., and S. symbiotica. Through genome assembly and genome-based metabolic inference, we have found that only the first two symbionts (Buchnera and Sodalis) actually contribute to the host’s supply of essential nutrients while S. symbiotica has become unable to contribute towards this task. We found that S. symbiotica has a rather large and highly eroded genome which codes for only a few proteins and displays extensive pseudogenization. Thus, we propose an ongoing symbiont replacement within C. strobi, in which a once “competent” S. symbiotica no longer contributes towards the beneficial association. These results suggest that in dual symbiotic systems, when a substitute cosymbiont is available, genome deterioration can precede genome reduction and a symbiont can be maintained despite the apparent lack of benefit to its host.
AbstractAppreciation is growing for how chromosomes are organized in three-dimensional space at interphase. Microscopic and high throughput sequence-based studies have established that the mammalian inactive X chromosome (Xi) adopts an alternate conformation relative to the active X chromosome. The Xi is organized into several multi-megabase chromatin loops called superloops. At the base of these loops are superloop anchors, and in humans three of these anchors are composed of large tandem repeat DNAs that include DXZ4, Functional Intergenic Repeating RNA Element, and Inactive-X CTCF-binding Contact Element (ICCE). Each repeat contains a high density of binding sites for the architectural organization protein CCCTC-binding factor (CTCF), which exclusively associates with the Xi allele in normal cells. Removal of DXZ4 from the Xi compromises proper folding of the chromosome. In this study, we report the characterization of the ICCE tandem repeat, about which very little is known. ICCE is embedded within an intron of the Nobody (NBDY) gene locus at Xp11.21. We find that primary DNA sequence conservation of ICCE is only retained in higher primates, but that ICCE orthologs exist beyond the primate lineage. As with DXZ4, what is conserved is the organization of the underlying DNA into a large tandem repeat, its physical location within the NBDY locus, and short DNA sequences corresponding to specific CTCF and Yin Yang 1 binding motifs that correlate with female-specific DNA hypomethylation. Unlike DXZ4, ICCE is not common to all eutherian mammals. Analysis of certain ICCE CTCF motifs reveals striking similarity with the DXZ4 motif and supports an evolutionary relationship between DXZ4 and ICCE.
AbstractHomeodomain transcription factors are involved in many developmental processes across animals and have been linked to body plan evolution. Detailed classifications of these proteins identified 11 distinct classes of homeodomain proteins in animal genomes, each harboring specific sequence composition and protein domains. Although humans contain the full set of classes, Drosophila melanogaster and Caenorhabditis elegans each lack one specific class. Furthermore, representative previous analyses in sponges, ctenophores, and cnidarians could not identify several classes in those nonbilaterian metazoan taxa. Consequently, it is currently unknown when certain homeodomain protein classes first evolved during animal evolution. Here, we investigate representatives of the sister group to all remaining bilaterians, the Xenacoelomorpha. We analyzed transcriptomes from three acoels, one nemertodermatid, and one Xenoturbella and identified their expressed homeodomain protein content. We report the identification of representatives of all 11 classes of animal homeodomain transcription factors in Xenacoelomorpha and we describe and classify their homeobox genes relative to the established animal homeodomain protein families. Our findings suggest that the genome of the last common ancestor of Bilateria contained the full set of these gene classes, supporting the subsequent diversification of bilaterians.
AbstractThe diversity of mechanisms and capacity for regeneration across the Metazoa present an intriguing challenge in evolutionary biology, impacting on the burgeoning field of regenerative medicine. Broad taxonomic sampling is essential to improve our understanding of regeneration, and studies outside of the traditional model organisms have proved extremely informative. Within the historically understudied Spiralia, the Annelida have an impressive variety of tractable regenerative systems. The biomineralizing, blastema-less regeneration of the head appendage (operculum) of the serpulid polychaete keelworm Spirobranchus (formerly Pomatoceros) lamarcki is one such system. To profile potential regulatory mechanisms, we classified the homeobox gene content of opercular regeneration transcriptomes. As a result of retrieving several difficult-to-classify homeobox sequences, we performed an extensive search and phylogenetic analysis of the TALE and PRD-class homeobox gene content of a broad selection of lophotrochozoan genomes. These analyses contribute to our increasing understanding of the diversity, taxonomic extent, rapid evolution, and radical flexibility of these recently discovered homeobox gene radiations. Our expansion and integration of previous nomenclature systems help to clarify their cryptic orthology. We also describe an unusual divergent S. lamarcki Antp gene, a previously unclassified lophotrochozoan orphan gene family (Lopx), and a number of novel Nk class orphan genes. The expression and potential involvement of many of these lineage- and clade-restricted homeobox genes in S. lamarcki operculum regeneration provide an example of diversity in regenerative mechanisms, as well as significantly improving our understanding of homeobox gene evolution.
AbstractMitochondrial genomes of animals have long been considered to evolve under the action of purifying selection. Nevertheless, there is increasing evidence that they can also undergo episodes of positive selection in response to shifts in physiological or environmental demands. Vampire bats experienced such a shift, as they are the only mammals feeding exclusively on blood and possessing anatomical adaptations to deal with the associated physiological requirements (e.g., ingestion of high amounts of liquid water and iron). We sequenced eight new chiropteran mitogenomes including two species of vampire bats, five representatives of other lineages of phyllostomids and one close outgroup. Conducting detailed comparative mitogenomic analyses, we found evidence for accelerated evolutionary rates at the nucleotide and amino acid levels in vampires. Moreover, the mitogenomes of vampire bats are characterized by an increased cytosine (C) content mirrored by a decrease in thymine (T) compared with other chiropterans. Proteins encoded by the vampire bat mitogenomes also exhibit a significant increase in threonine (Thr) and slight reductions in frequency of the hydrophobic residues isoleucine (Ile), valine (Val), methionine (Met), and phenylalanine (Phe). We show that these peculiar substitution patterns can be explained by the co-occurrence of both neutral (mutational bias) and adaptive (positive selection) processes. We propose that vampire bat mitogenomes may have been impacted by selection on mitochondrial proteins to accommodate the metabolism and nutritional qualities of blood meals.