Reference: Vakirlis N, et al. (2024) Ancestral Sequence Reconstruction as a Tool to Detect and Study De Novo Gene Emergence. Genome Biol Evol 16(8)


Abstract


New protein-coding genes can evolve from previously noncoding genomic regions through a process known as de novo gene emergence. Evidence suggests that this process has likely occurred throughout evolution and across the tree of life. Yet, confidently identifying de novo emerged genes remains challenging. Ancestral sequence reconstruction is a promising approach for inferring whether a gene has emerged de novo or not, as it allows us to inspect whether a given genomic locus ancestrally harbored protein-coding capacity. However, the use of ancestral sequence reconstruction in the context of de novo emergence is still in its infancy and its capabilities, limitations, and overall potential are largely unknown. Notably, it is difficult to formally evaluate the protein-coding capacity of ancestral sequences, particularly when new gene candidates are short. How well-suited is ancestral sequence reconstruction as a tool for the detection and study of de novo genes? Here, we address this question by designing an ancestral sequence reconstruction workflow incorporating different tools and sets of parameters and by introducing a formal criterion that allows us to estimate, within a desired level of confidence, when protein-coding capacity originated at a particular locus. Applying this workflow to ∼2,600 short, annotated budding yeast genes (<1,000 nucleotides), we found that ancestral sequence reconstruction robustly predicts an ancient origin for the most widely conserved genes, which constitute "easy" cases. For less robust cases, we calculated a randomization-based empirical P-value estimating whether the observed conservation between the extant and ancestral reading frame could be attributed to chance. This formal criterion allowed us to pinpoint a branch of origin for most of the less robust cases, identifying 49 genes that can unequivocally be considered de novo originated since the split of the Saccharomyces genus, including 37 Saccharomyces cerevisiae-specific genes. We find that for the remaining equivocal cases we cannot rule out different evolutionary scenarios including rapid evolution, multiple gene losses, or a recent de novo origin. Overall, our findings suggest that ancestral sequence reconstruction is a valuable tool to study de novo gene emergence but should be applied with caution and awareness of its limitations.
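
The randomization-based empirical P-value mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the authors' actual randomization scheme or conservation score, so everything below is an assumption made for illustration: the function names (`frame_identity`, `shuffled_null`, `empirical_p_value`) are hypothetical, conservation is approximated as simple per-position identity between two equal-length translated frames, and the null distribution is built by shuffling the residues of the ancestral frame.

```python
import random


def frame_identity(a, b):
    """Fraction of matching positions between two equal-length sequences.

    Assumption: a stand-in for whatever conservation score the paper uses.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def shuffled_null(extant, ancestral, n=1000, seed=0):
    """Null scores from shuffling the ancestral residues n times.

    Assumption: residue shuffling is one possible randomization, not
    necessarily the one used in the paper.
    """
    rng = random.Random(seed)
    residues = list(ancestral)
    scores = []
    for _ in range(n):
        rng.shuffle(residues)
        scores.append(frame_identity(extant, "".join(residues)))
    return scores


def empirical_p_value(observed, null_scores):
    """Empirical P-value with the standard +1 correction.

    Counts null scores at least as large as the observed score, so the
    result is never exactly zero for a finite number of randomizations.
    """
    exceed = sum(1 for s in null_scores if s >= observed)
    return (1 + exceed) / (1 + len(null_scores))
```

With these pieces, one would compute the observed score with `frame_identity(extant, ancestral)`, build a null with `shuffled_null(extant, ancestral)`, and pass both to `empirical_p_value`. A small P-value would suggest that the similarity between the extant and reconstructed ancestral frames is unlikely to arise by chance under this particular null.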

Reference Type
Journal Article
Authors
Vakirlis N, Acar O, Cherupally V, Carvunis AR
