This is a correction to: Brian Charlesworth, Jeffrey D Jensen, Population Genetic Considerations Regarding Evidence for Biased Mutation Rates in Arabidopsis thaliana, Molecular Biology and Evolution, Volume 40, Issue 2, February 2023, msac275, https://doi.org/10.1093/molbev/msac275
Abstract: Mitochondrial carriers (MCs) belong to a eukaryotic protein family of transporters that in higher organisms is called the solute carrier family 25 (SLC25). All MCs have characteristic triplicated sequence repeats forming a 3-fold symmetrical structure of a six-transmembrane α-helix bundle with a centrally located substrate-binding site. Biochemical characterization has shown that MCs altogether transport a wide variety of substrates but can be divided into subfamilies, each transporting a few specific substrates. We have investigated the intron positions in the human MC genes and their orthologs of highly diversified organisms. The results demonstrate that several intron positions are present in numerous MC sequences at the same specific points, of which some are 3-fold symmetry related. Many of these frequent intron positions are also conserved in subfamilies or in groups of subfamilies transporting similar substrates. The analyses of the frequent and conserved intron positions in MCs suggest phylogenetic relationships not only between close but also distant homologs as well as a possible involvement of the intron positions in the evolution of the substrate specificity diversification of the MC family members.
Abstract: Increasing numbers of horizontal transfers (HT) of genes and transposable elements are reported in insects. Yet the mechanisms underlying these transfers remain unknown. Here we first quantify and characterize the patterns of chromosomal integration of the polydnavirus (PDV) encoded by the Campopleginae Hyposoter didymator parasitoid wasp (HdIV) in somatic cells of parasitized fall armyworm (Spodoptera frugiperda). PDVs are domesticated viruses injected by wasps together with their eggs into their hosts in order to facilitate the development of wasp larvae. We found that six HdIV DNA circles integrate into the genome of host somatic cells. Each host haploid genome suffers between 23 and 40 integration events (IEs) on average 72 h post-parasitism. Almost all IEs are mediated by DNA double-strand breaks occurring in the host integration motif (HIM) of HdIV circles. We show that despite their independent evolutionary origins, PDVs from both Campopleginae and Braconidae wasps use remarkably similar mechanisms for chromosomal integration. Next, our similarity search performed on 775 genomes reveals that PDVs of both Campopleginae and Braconidae wasps have recurrently colonized the germline of dozens of lepidopteran species through the same mechanisms they use to integrate into somatic host chromosomes during parasitism. We found evidence of HIM-mediated HT of PDV DNA circles in no fewer than 124 species belonging to 15 lepidopteran families. Thus, this mechanism underlies a major route of HT of genetic material from wasps to lepidopterans with likely important consequences on lepidopterans.
A new study in Genome Biology and Evolution reveals that egg yolk proteins may have been co-opted to provide maternal nutrition in live-bearing sharks and their relatives.
Abstract: The yellow nutsedge (Cyperus esculentus L. 1753) is an unconventional oil plant with oil-rich tubers, and a potential alternative for traditional oil crops. Here, we report the first high-quality and chromosome-level genome assembly of the yellow nutsedge generated by combining PacBio HiFi long reads, Novaseq short reads, and Hi-C data. The final genome size is 225.6 Mb with an N50 of 4.3 Mb. More than 222.9 Mb scaffolds were anchored to 54 pseudochromosomes with a BUSCO score of 96.0%. We identified 76.5 Mb (33.9%) repetitive sequences across the genome. A total of 23,613 protein-coding genes were predicted in this genome, of which 22,847 (96.8%) were functionally annotated. A whole-genome duplication event was found after the divergence of Carex littledalei and Rhynchospora breviuscula, indicating the rich genetic resources of this species for adaptive evolution. Several significantly enriched GO terms were related to invasiveness of the yellow nutsedge, which may explain its plastic adaptability. In addition, several enriched Kyoto Encyclopedia of Genes and Genomes pathways and expanded gene families were closely related to substances in tubers, partially explaining the genomic basis of characteristics of this oil-rich tuber.
Abstract: The degree of divergence between the sex chromosomes is not always proportional to their age. In poeciliids, four closely related species all exhibit a male heterogametic sex chromosome system on the same linkage group, yet show a remarkable diversity in X and Y divergence. In Poecilia reticulata and P. wingei, the sex chromosomes remain homomorphic, yet P. picta and P. parae have a highly degraded Y chromosome. To test alternative theories about the origin of their sex chromosomes, we used a combination of pedigrees and RNA-seq data from P. picta families in conjunction with DNA-seq data collected from P. reticulata, P. wingei, P. parae, and P. picta. Phylogenetic clustering analysis of X and Y orthologs, identified through segregation patterns, and their orthologous sequences in closely related species demonstrates a similar time of origin for both the P. picta and P. reticulata sex chromosomes. We next used k-mer analysis to identify shared ancestral Y sequence across all four species, suggesting a single origin to the sex chromosome system in this group. Together, our results provide key insights into the origin and evolution of the poeciliid Y chromosome and illustrate that the rate of sex chromosome divergence is often highly heterogeneous, even over relatively short evolutionary time frames.
Abstract: Reproductive modes of vertebrates are classified into two major embryonic nutritional types: yolk deposits (i.e., lecithotrophy) and maternal investment (i.e., matrotrophy). Vitellogenin (VTG), a major egg yolk protein synthesized in the female liver, is one of the molecules relevant to the lecithotrophy-to-matrotrophy shift in bony vertebrates. In mammals, all VTG genes are lost following the lecithotrophy-to-matrotrophy shift, and it remains to be elucidated whether the lecithotrophy-to-matrotrophy shift in nonmammalians is also associated with VTG repertoire modification. In this study, we focused on chondrichthyans (cartilaginous fishes), a vertebrate clade that underwent multiple lecithotrophy-to-matrotrophy shifts. For an exhaustive search of homologs, we performed tissue-by-tissue transcriptome sequencing for two viviparous chondrichthyans, the frilled shark Chlamydoselachus anguineus and the spotless smooth-hound Mustelus griseus, and inferred the molecular phylogeny of VTG and its receptor, very low-density lipoprotein receptor (VLDLR), across diverse vertebrates. As a result, we identified either three or four VTG orthologs in chondrichthyans including viviparous species. We also showed that chondrichthyans had two additional VLDLR orthologs previously unrecognized in their unique lineage (designated as VLDLRc2 and VLDLRc3). Notably, VTG gene expression patterns differed in the species studied depending on their reproductive mode; in the two viviparous sharks, VTGs are broadly expressed in multiple tissues, including the uterus, in addition to the liver. This finding suggests that chondrichthyan VTGs function not only as the yolk nutrient but also as a matrotrophic factor. Altogether, our study indicates that the lecithotrophy-to-matrotrophy shift in chondrichthyans was achieved through a distinct evolutionary process from mammals.
Abstract: We present a chromosome-length genome assembly and annotation of the Black Petaltail dragonfly (Tanypteryx hageni). This habitat specialist diverged from its sister species over 70 million years ago, and separated from the most closely related Odonata with a reference genome 150 million years ago. Using PacBio HiFi reads and Hi-C data for scaffolding, we produce one of the highest-quality Odonata genomes to date. A scaffold N50 of 206.6 Mb and a single copy BUSCO score of 96.2% indicate high contiguity and completeness.
Abstract: There have been many population-based genomic studies on human-managed honeybees (Apis mellifera and Apis cerana), but there has been a notable lack of analysis with regard to wild honeybees, particularly in relation to their evolutionary history. Nevertheless, giant honeybees have been found to occupy distinct habitats and display remarkable characteristics, which are attracting an increased amount of attention. In this study, we de novo sequenced and then assembled the draft genome sequence of the Himalayan giant honeybee, Apis laboriosa. Phylogenetic analysis based on genomic information indicated that A. laboriosa and its tropical sister species Apis dorsata diverged ∼2.61 Ma, which supports the speciation hypothesis that links A. laboriosa to geological changes throughout history. Furthermore, we re-sequenced A. laboriosa and A. dorsata samples from five and six regions, respectively, across their population ranges in China. These analyses highlighted major genetic differences for Tibetan A. laboriosa as well as the Hainan Island A. dorsata. The demographic history of most giant honeybee populations has mirrored glacial cycles. More importantly, contrary to what has occurred among human-managed honeybees, the demographic history of these two wild honeybee species indicates a rapid decline in effective population size in the recent past, reflecting their differences in evolutionary histories. Several genes were found to be subject to selection, which may help giant honeybees to adapt to specific local conditions. In summary, our study sheds light on the evolutionary and adaptational characteristics of two wild giant honeybee species, which is useful for giant honeybee conservation.
Abstract: Species phylogenetic trees represent the evolutionary processes of organisms, and they are fundamental in evolutionary research. Therefore, new methods have been developed to obtain more reliable species phylogenetic trees. A highly reliable method is the construction of an ortholog data set based on sequence information of genes, which is then used to infer the species phylogenetic tree. However, although methods for constructing an ortholog data set for species phylogenetic analysis have been developed, they cannot remove some paralogs, which is necessary for reliable species phylogenetic inference. To address the limitations of current methods, we developed OrthoPhy, a program that excludes paralogs and constructs highly accurate ortholog data sets using taxonomic information dividing analyzed species into monophyletic groups. OrthoPhy can remove paralogs, detecting inconsistencies between taxonomic information and phylogenetic trees of candidate ortholog groups clustered by sequence similarity. Performance tests using evolutionary simulated sequences and real sequences of 40 bacteria revealed that the precision of ortholog inference by OrthoPhy is higher than that of existing programs. Additionally, the phylogenetic analysis of species was more accurate when performed using ortholog data sets constructed by OrthoPhy than that performed using data sets constructed by existing programs. Furthermore, we performed a benchmark test of the Quest for Orthologs using real sequence data and found that the concordance rate between the phylogenetic trees of orthologs inferred by OrthoPhy and those of species was higher than the rates obtained by other ortholog inference programs. Therefore, ortholog data sets constructed using OrthoPhy enabled a more accurate phylogenetic analysis of species than those constructed using the existing programs, and OrthoPhy can be used for the phylogenetic analysis of species even for distantly related species that have experienced many evolutionary events.
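The idea of flagging a candidate ortholog group when its gene tree disagrees with known taxonomic groups can be illustrated with a simple monophyly check. The sketch below is not OrthoPhy's algorithm; it is a minimal illustration that assumes a rooted tree written as nested tuples of leaf names, and the function names (`leaves`, `clades`, `is_monophyletic`) and the toy species labels are hypothetical.

```python
def leaves(tree):
    """Collect leaf names from a rooted tree written as nested tuples."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree):
    """Yield the leaf set of every subtree, including single leaves."""
    yield leaves(tree)
    if not isinstance(tree, str):
        for child in tree:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if the members of `group` present in the tree form one clade."""
    present = set(group) & leaves(tree)
    return not present or any(clade == present for clade in clades(tree))


# Hypothetical taxonomic groups and gene trees (assumptions for illustration).
group_a = {"A1", "A2", "A3"}
consistent_tree = ((("A1", "A2"), "A3"), ("B1", "B2"))
conflicting_tree = (("A1", "A2"), ("B1", ("B2", "A3")))

print(is_monophyletic(consistent_tree, group_a))   # group A forms one clade
print(is_monophyletic(conflicting_tree, group_a))  # A3 nests inside group B
```

In this toy setting, a tree in which a taxonomic group fails the check would be a candidate for containing a paralog or another source of conflict; a real tool would also need to handle unrooted trees and weakly supported branches.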
Abstract: Long-read sequencing has revolutionized genome assembly, yielding highly contiguous, chromosome-level contigs. However, assemblies from some third-generation long-read technologies, such as Pacific Biosciences (PacBio) continuous long reads (CLR), have a high error rate. Such errors can be corrected with short reads through a process called polishing. Although best practices for polishing non-model de novo genome assemblies were recently described by the Vertebrate Genome Project (VGP) Assembly community, there is a need for a publicly available, reproducible workflow that can be easily implemented and run on a conventional high-performance computing environment. Here, we describe polishCLR (https://github.com/isugifNF/polishCLR), a reproducible Nextflow workflow that implements best practices for polishing assemblies made from CLR data. PolishCLR can be initiated from several input options that extend best practices to suboptimal cases. It also provides re-entry points throughout several key processes, including identifying duplicate haplotypes in purge_dups, allowing a break for scaffolding if data are available, and throughout multiple rounds of polishing and evaluation with Arrow and FreeBayes. PolishCLR is containerized and publicly available for the greater assembly community as a tool to complete assemblies from existing, error-prone long-read data.
Abstract: Taxonomically restricted genes (TRGs) are unique for a defined group of organisms and may act as potential genetic determinants of lineage-specific, biological properties. Here, we explore the TRGs of highly diverse and economically important Bacillus bacteria by examining commonly used TRG identification parameters and data sources. We show the significant effects of sequence similarity thresholds, composition, and the size of the reference database in the identification process. Subsequently, we applied stringent TRG search parameters and expanded the identification procedure by incorporating an analysis of noncoding and non-syntenic regions of non-Bacillus genomes. A multiplex annotation procedure minimized the number of false-positive TRG predictions and showed nearly one-third of the alleged TRGs could be mapped to genes missed in genome annotations. We traced the putative origin of TRGs by identifying homologous, noncoding genomic regions in non-Bacillus species and detected sequence changes that could transform these regions into protein-coding genes. In addition, our analysis indicated that Bacillus TRGs represent a specific group of genes mostly showing intermediate sequence properties between genes that are conserved across multiple taxa and nonannotated peptides encoded by open reading frames.
Abstract: Recent studies have highlighted variation in the mutational spectra among human populations as well as closely related hominoids, yet little remains known about the genetic and nongenetic factors driving these rate changes across the genome. Pinpointing the root causes of these differences is an important endeavor that requires careful comparative analyses of population-specific mutational landscapes at both broad and fine genomic scales. However, several factors can confound such analyses. Although previous studies have shown that technical artifacts, such as sequencing errors and batch effects, can contribute to observed mutational shifts, other potentially confounding parameters have received less attention thus far. Using population genetic simulations of human and chimpanzee populations as an illustrative example, we here show that the sample size required for robust inference of mutational spectra depends on the population-specific demographic history. As a consequence, the power to detect rate changes is high in certain hominoid populations while, for others, currently available sample sizes preclude analyses at fine genomic scales.
Abstract: Kinetochores connect chromosomes to spindle microtubules to ensure their correct segregation during cell division. Kinetochores of human and yeasts are largely homologous; their ability to track depolymerizing microtubules, however, is carried out by the nonhomologous complexes Ska1-C and Dam1-C, respectively. We previously reported the unique anti-correlating phylogenetic profiles of Dam1-C and Ska-C found among a wide variety of eukaryotes. Based on these profiles and the limited presence of Dam1-C, we speculated that horizontal gene transfer could have played a role in the evolutionary history of Dam1-C. Here, we present an expanded analysis of Dam1-C evolution, using additional genome as well as transcriptome sequences and recently published 3D structures. This analysis revealed a wider and more complete presence of Dam1-C in Cryptista, Rhizaria, Ichthyosporea, CRuMs, and Colponemidia. The fungal Dam1-C cryo-EM structure supports earlier hypothesized intracomplex homologies, which enables the reconstruction of rooted and unrooted phylogenies. The rooted tree of concatenated Dam1-C subunits is statistically consistent with the species tree of eukaryotes, suggesting that Dam1-C is ancient, that the present-day phylogenetic distribution is best explained by multiple, independent losses, and that no horizontal gene transfer was involved. Furthermore, we investigated the ancient origin of Dam1-C via profile-versus-profile searches. Homology among 8 out of the 10 Dam1-C subunits suggests that the complex largely evolved from a single multimerizing subunit that diversified into a hetero-octameric core via stepwise subunit duplication and subfunctionalization of the subunits before the origin of the last eukaryotic common ancestor.
Abstract: Genome assemblies are growing at an exponential rate and have proved indispensable for studying evolution, but the effort has been biased toward vertebrates and arthropods with a particular focus on insects. Onychophora or velvet worms are an ancient group of cryptic, soil-dwelling worms noted for their unique mode of prey capture, biogeographic patterns, and diversity of reproductive strategies. They constitute a poorly understood phylum of exclusively terrestrial animals that is sister group to arthropods. Due to this phylogenetic position, they are crucial in understanding the origin of the largest phylum of animals. Despite their significance, there is a paucity of genomic resources for the phylum with only one highly fragmented and incomplete genome publicly available. Initial attempts at sequencing an onychophoran genome proved difficult due to its large genome size and high repeat content. However, leveraging recent advances in long-read sequencing technology, we present here the first annotated draft genome for the phylum. With a total size of 5.6 Gb, the gigantism of the Epiperipatus broadwayi genome arises from having high repeat content, intron size inflation, and extensive gene family expansion. Additionally, we report a previously unknown diversity of onychophoran hemocyanins that suggests the diversification of copper-mediated oxygen carriers occurred independently in Onychophora after its split from Arthropoda, parallel to the independent diversification of hemocyanins in each of the main arthropod lineages.
Abstract: Ascetosporea are endoparasites of marine invertebrates that include economically important pathogens of aquaculture species. Owing to their often-minuscule cell sizes, strict intracellular lifestyle, lack of cultured representatives and minimal availability of molecular data, these unicellular parasites remain poorly studied. Here, we sequenced and assembled the genome and transcriptome of Paramikrocytos canceri, an endoparasite isolated from the European edible crab Cancer pagurus. Using bioinformatic predictions, we show that P. canceri likely possesses a mitochondrion-related organelle (MRO) with highly reduced metabolism, resembling the mitosomes of other parasites but with key differences. Like other mitosomes, this MRO is predicted to have reduced metabolic capacity, to lack an organellar genome, and to function in iron–sulfur cluster (ISC) pathway-mediated Fe–S cluster biosynthesis. However, the MRO in P. canceri is uniquely predicted to produce ATP via a partial glycolytic pathway and synthesize phospholipids de novo through the CDP-DAG pathway. Heterologous gene expression confirmed that proteins from the ISC and CDP-DAG pathways retain mitochondrial targeting sequences that are recognized by yeast mitochondria. This represents a unique combination of metabolic pathways in an MRO, including the first reported case of a mitosome-like organelle able to synthesize phospholipids de novo. Some of these phospholipids, such as phosphatidylserine, are vital in other protist endoparasites that invade their host through apoptotic mimicry.
Abstract: All eukaryotes have linear chromosomes that are distributed to daughter nuclei during mitotic division, but the ancestral state of nuclear division in the last eukaryotic common ancestor (LECA) is so far unresolved. To address this issue, we have employed ancestral state reconstructions for mitotic states that can be found across the eukaryotic tree concerning the intactness of the nuclear envelope during mitosis (open or closed), the position of spindles (intranuclear or extranuclear), and the symmetry of spindles being either axial (orthomitosis) or bilateral (pleuromitosis). The data indicate that the LECA possessed closed orthomitosis with intranuclear spindles. Our reconstruction is compatible with recent findings indicating a syncytial state of the LECA, because it decouples three main processes: chromosome division, chromosome partitioning, and cell division (cytokinesis). The possession of closed mitosis using intranuclear spindles adds to the number of cellular traits that can now be attributed to LECA, providing insights into the lifestyle of this otherwise elusive biological entity at the origin of eukaryotic cells. Closed mitosis in a syncytial eukaryotic common ancestor would buffer mutations arising at the origin of mitotic division by allowing nuclei with viable chromosome sets to complement defective nuclei via mRNA in the cytosol.
Abstract: Why do some genomes stay small and simple, while others become huge, and why are some genomes more stable? In contrast to angiosperms and gymnosperms, liverworts are characterized by small genomes with low variation in size and conserved chromosome numbers. We quantified genome evolution among five Marchantiophyta (liverworts), measuring gene characteristics, transposable element (TE) landscape, collinearity, and sex chromosome evolution that might explain the small size and limited variability of liverwort genomes. No genome duplications were identified among examined liverworts and levels of duplicated genes are low. Among the liverwort species, Lunularia cruciata stands out with a genome size almost twice that of the other liverwort species investigated here, and most of this increased size is due to bursts of Ty3/Gypsy retrotransposons. Intrachromosomal rearrangements between examined liverworts are abundant but occur at a slower rate compared with angiosperms. Most genes on L. cruciata scaffolds have their orthologs on homologous Marchantia polymorpha chromosomes, indicating a low degree of rearrangements between chromosomes. Still, translocation of a fragment of the female U chromosome to an autosome was predicted from our data, which might explain the uniquely small U chromosome in L. cruciata. Low levels of gene duplication, TE activity, and chromosomal rearrangements might contribute to the apparent slow rate of morphological evolution in liverworts.
Abstract: Ion channels are highly diverse in the cnidarian model organism Nematostella vectensis (Anthozoa), but little is known about the evolutionary origins of this channel diversity and its conservation across Cnidaria. Here, we examined the evolution of voltage-gated K+ channels in Cnidaria by comparing genomes and transcriptomes of diverse cnidarian species from Anthozoa and Medusozoa. We found an average of over 40 voltage-gated K+ channel genes per species, and a phylogenetic reconstruction of the Kv, KCNQ, and Ether-a-go-go (EAG) gene families identified 28 voltage-gated K+ channels present in the last common ancestor of Anthozoa and Medusozoa (23 Kv, 1 KCNQ, and 4 EAG). Thus, much of the diversification of these channels took place in the stem cnidarian lineage prior to the emergence of modern cnidarian classes. In contrast, the stem bilaterian lineage, from which humans evolved, contained no more than nine voltage-gated K+ channels. These results hint at a complexity to electrical signaling in all cnidarians that contrasts with the perceived anatomical simplicity of their neuromuscular systems. These data provide a foundation from which the function of these cnidarian channels can be investigated, which will undoubtedly provide important insights into cnidarian physiology.
Abstract: Helicoverpa zea (Lepidoptera: Noctuidae) is an insect pest of major cultivated crops in North and South America. The species has adapted to different host plants and developed resistance to several insecticidal agents, including Bacillus thuringiensis (Bt) insecticidal proteins in transgenic cotton and maize. Helicoverpa zea populations persist year-round in tropical and subtropical regions, but seasonal migrations into temperate zones increase the geographic range of associated crop damage. To better understand the genetic basis of these physiological and ecological characteristics, we generated a high-quality chromosome-level assembly for a single H. zea male from the Bt-resistant strain, HzStark_Cry1AcR. Hi-C data were used to scaffold an initial 375.2 Mb contig assembly into 30 autosomes and the Z sex chromosome (scaffold N50 = 12.8 Mb and L50 = 14). The scaffolded assembly was error-corrected with a novel pipeline, polishCLR. The mitochondrial genome was assembled through an improved pipeline and annotated. Assessment of this genome assembly indicated 98.8% of the Lepidopteran Benchmark Universal Single-Copy Ortholog set were complete (98.5% as complete single copy). Repetitive elements comprised approximately 29.5% of the assembly with the plurality (11.2%) classified as retroelements. This chromosome-scale reference assembly for H. zea, ilHelZeax1.1, will facilitate future research to evaluate and enhance sustainable crop production practices.
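Several of the assembly abstracts above report contiguity as scaffold N50 and L50. As a reminder of what these statistics measure, here is a minimal sketch under the usual definition: N50 is the length of the sequence at which the cumulative length, counted from the longest sequence down, first reaches half of the total assembly length, and L50 is the number of sequences needed to get there. The function name `n50_l50` and the toy lengths are hypothetical and are not taken from any of the assemblies described.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # `length` is the N50; `count` sequences were needed (the L50).
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy scaffold lengths in Mb (made up for illustration).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_scaffolds))  # (70, 2): 80 + 70 reaches half of 300
```

A large N50 combined with a small L50 means that a few long sequences cover most of the assembly, which is why both numbers are commonly quoted together with a completeness measure such as BUSCO.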