Yeast “adopt a proto-gene” project

What is the adopt a proto-gene project?

In a synergistic educational activity designed to promote literacy in evolutionary biology, we are developing a novel “adopt a proto-gene” initiative in which students, and educators working with undergraduates, can characterize individual proto-genes at their home institutions. The project will provide bioinformatic and wet-lab modules for undergraduate students to explore proto-genes in the model eukaryote Saccharomyces cerevisiae. Virtual workshops will be offered to help faculty use the modules with undergraduates at their home institutions.

What is a proto-gene?

It has become increasingly clear that eukaryotic genomes are pervasively transcribed and translated.
Thousands of small, evolutionarily novel polypeptides expand the coding potential of fungal, plant and
animal genomes beyond established protein-coding genes. Genomic scientists have proposed that pervasive
translation generates a reservoir of “proto-genes” that promote de novo gene birth by exposing genetic
variation to natural selection in the form of novel polypeptides. Some proto-genes are occasionally
retained by selection and become de novo genes, but most eventually return to a non-genic state. Aside
from their evolutionary potential, how do proto-genes impact cell biology? The physiological significance
of proto-genes has not yet been systematically explored in any species. As a result of this gap in
knowledge, current models of cellular systems are missing thousands of genetic elements that are
potentially critical for understanding genotype-phenotype relationships. This missing biology is likely to
explain key molecular differences between species, to unveil novel mechanisms of evolutionary
adaptation, and to shed light on the first steps of de novo gene emergence.

The “adopt a proto-gene” initiative is supported by an NSF-CAREER award to Anne-Ruxandra Carvunis, Associate Professor in the Department of Computational and Systems Biology at the University of Pittsburgh School of Medicine.

Lab Modules

These modules allow students to use cutting-edge bioinformatics algorithms to explore proto-genes via web-based tools, without requiring any coding experience. Linked here are proto-genes we pre-curated to serve as good illustrations for each module.

Module 1. Genome Browser
This module introduces the SGD site and the JBrowse genome browser. Participants will learn how to use a genome browser to view the position of genes relative to one another and how to integrate different data types at the genome scale. A video walkthrough of this module can be viewed at

Genome Browser Guide (google doc) (PDF)

Genome Browser Worksheet (google doc) (PDF)

Module 2. Cellular Localization
This module provides instructions for predicting protein localization from amino acid sequence and for identifying the sequence or structure motifs that drive the prediction. A video walkthrough of this module can be viewed at

Cellular Localization Guide (google doc) (PDF)

Cellular Localization Worksheet (google doc) (PDF)

Module 3. Structure Prediction
This module provides instructions for predicting protein structure from amino acid sequence and for searching for proteins with similar structures, using cutting-edge machine learning algorithms such as ESMFold and Foldseek. Coming soon: a video walkthrough of this module.

Structure Prediction Guide (google doc) (PDF)

Structure Prediction Worksheet (google doc) (PDF)

Module 4. Coexpression
This module provides instructions for querying a large coexpression network to identify genes and other proto-genes that have similar transcriptional patterns. Coming soon: a video walkthrough of this module.

Coexpression Guide (google doc) (PDF)

Coexpression Worksheet (google doc) (PDF)
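The module itself requires no coding. For readers who would like to see the idea behind coexpression, here is a minimal sketch, assuming that similarity between transcriptional patterns is scored with a Pearson correlation (the network used in the module may score it differently). The gene names and expression values are made up for illustration and are not real data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / sqrt(var_x * var_y)


def top_coexpressed(query, profiles, n=2):
    """Rank every other profile by its correlation with the query, highest first."""
    scores = [
        (name, pearson(profiles[query], values))
        for name, values in profiles.items()
        if name != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:n]


# Hypothetical expression values across five conditions (not real data).
toy_profiles = {
    "proto_gene_X": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene_A": [2.0, 4.1, 6.0, 8.2, 10.0],
    "gene_B": [5.0, 4.0, 3.0, 2.0, 1.0],
    "gene_C": [3.0, 1.0, 4.0, 1.0, 5.0],
}

for name, r in top_coexpressed("proto_gene_X", toy_profiles):
    print(f"{name}\t{r:.2f}")
```

In this toy example, gene_A rises and falls with proto_gene_X and ranks first, while gene_B follows the opposite pattern and is not among the top partners.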

Module 5. Ancestral Reconstruction
This module provides instructions for reconstructing the ancestral sequences that gave rise to a given extant sequence and for using alignment tools to compare sequence similarity across yeast species. Coming soon: a video walkthrough of this module.

Ancestral Reconstruction Guide (google doc) (PDF)

Ancestral Reconstruction Worksheet (google doc) (PDF)
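No coding is needed for the module, but the notion of sequence similarity that alignment tools report can be sketched in a few lines. The example below assumes two sequences that have already been aligned (gaps written as "-") and computes percent identity over the columns where neither sequence has a gap. Tools differ in how they define the denominator, so this is only one possible convention, and the sequences are made up.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned peptides from two yeast species (not real sequences).
species_1 = "MKT-LLV"
species_2 = "MKTALIV"

print(f"{percent_identity(species_1, species_2):.1f}% identity")
```

Here six columns are compared and five match; the gapped column is ignored rather than counted as a mismatch.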

Other Links

2023 workshop welcome & introduction presentation (PDF)

Student presentation template (google slides)

Faculty presentation template (google slides)


How to Use a Genome Browser: JBROWSE: GUIDE

How to Use a Genome Browser: JBROWSE: WORKSHEET

NOTE: If the worksheet (docx) links don’t work, right-click the link, copy the link address, and paste it into a new tab to download the file.


We envision that the Teaching Yeast Slack will become a community resource for everyone who uses yeast as a teaching tool in the classroom and in the lab. We can use it to exchange slides, protocols, online resources, or physical resources such as strains and plasmids. We can use it to brainstorm new experiments or new exam questions. We can use it to chat about anything, e.g. conferences we plan to attend, our newest papers, or weird technical artifacts we can’t seem to troubleshoot on our own. In sum, Teaching Yeast can bring us together and serve any purpose we need it to. Hope to see you there!

How to join “Teaching Yeast” Slack Team:

  1. Send an email to explaining who you are and why you would like to join the Teaching Yeast Slack Team in 250 words or less.
  2. You will receive an email invitation.
  3. Click on “Join now”.
  4. After you are redirected to the Slack webpage, enter your full name and click the button below it.
  5. Follow the instructions and contact if you encounter any issues
  6. You are all set to explore! Introduce yourself in the #general channel and browse the available channels that may be of interest to you!