New & Noteworthy

Explore the S288C Transcriptome in JBrowse

April 25, 2019

We have recently equipped our genome browsing tool JBrowse with 9 new Transcriptome data tracks, making JBrowse an even more powerful way to explore the vast heterogeneity of the S288C transcriptome. These information-rich data tracks visualize RNA transcripts from the TIF-seq dataset published by Pelechano et al. (2013), enabling quick and easy viewing of the position, length, and abundance of transcript isoforms sequenced in the study.

You can easily access these new tracks by entering JBrowse and clicking on the left-hand “Select tracks” tab. They are located in the Transcriptome category. In addition to viewing the data in JBrowse, you can also download the .gff3 and .bw files for these tracks for use in your own analyses.

Check out our video tutorial from the SGD YouTube channel at the top of this page for a quick overview of the new transcriptome data tracks and how to access them. More information about these tracks and how SGD created them can also be found on our Genome Browser help page.

If you have any questions or feedback about the new Transcriptome data tracks or about our genome browser, please don’t hesitate to contact us.

Data tracks that visualize transcript isoforms that fully overlap a gene coding region:

Data Track Title | Description
longest_full-ORF_transcripts_ypd | This track contains the longest transcript overlapping each individual ORF completely for WT cells grown in glucose (ypd) media.
longest_full-ORF_transcripts_gal | This track contains the longest transcript overlapping each individual ORF completely for WT cells grown in galactose (gal) media.
most_abundant_full-ORF_transcripts_ypd | This track contains the most abundant transcript overlapping each individual ORF completely for WT cells grown in glucose (ypd) media.
most_abundant_full-ORF_transcripts_gal | This track contains the most abundant transcript overlapping each individual ORF completely for WT cells grown in galactose (gal) media.
unfiltered_full-ORF_transcripts | This track contains all transcripts that completely overlapped an individual open reading frame (ORF) for WT cells grown in either glucose (ypd) or galactose (gal) media.
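For readers who download the .gff3 files, the selection logic behind the "longest" and "most abundant" tracks can be sketched in a few lines of Python. This is only an illustration, not SGD's actual pipeline: the Transcript fields, the 1-based inclusive coordinates, and the read-count field are assumptions, and strand is ignored for brevity.

```python
from typing import Iterable, NamedTuple, Optional


class Transcript(NamedTuple):
    start: int  # 1-based inclusive start (assumed convention)
    end: int    # 1-based inclusive end
    count: int  # hypothetical abundance value, e.g. supporting read count


def covers_orf(t: Transcript, orf_start: int, orf_end: int) -> bool:
    """True if the transcript spans the whole ORF."""
    return t.start <= orf_start and t.end >= orf_end


def longest_full_orf(transcripts: Iterable[Transcript],
                     orf_start: int, orf_end: int) -> Optional[Transcript]:
    """Longest transcript that completely overlaps the ORF, or None."""
    full = [t for t in transcripts if covers_orf(t, orf_start, orf_end)]
    return max(full, key=lambda t: t.end - t.start + 1, default=None)


def most_abundant_full_orf(transcripts: Iterable[Transcript],
                           orf_start: int, orf_end: int) -> Optional[Transcript]:
    """Most abundant transcript that completely overlaps the ORF, or None."""
    full = [t for t in transcripts if covers_orf(t, orf_start, orf_end)]
    return max(full, key=lambda t: t.count, default=None)


# Made-up isoforms around a hypothetical ORF at positions 100-200.
isoforms = [
    Transcript(90, 210, 3),
    Transcript(50, 300, 1),
    Transcript(120, 250, 10),  # starts inside the ORF, so it is excluded
]
```

The unfiltered track corresponds to the intermediate list of every isoform that passes the covers_orf check.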

Data tracks that quantify the number of transcripts that cover a given nucleotide in the S288C genome:

Data Track Title | Description
plus_strand_coverage_ypd | For WT cells grown in glucose media (ypd), the number of transcripts covering each position on the plus strand is represented in this track.
plus_strand_coverage_gal | For WT cells grown in galactose media (gal), the number of transcripts covering each position on the plus strand is represented in this track.
minus_strand_coverage_ypd | For WT cells grown in glucose media (ypd), the number of transcripts covering each position on the minus strand is represented in this track.
minus_strand_coverage_gal | For WT cells grown in galactose media (gal), the number of transcripts covering each position on the minus strand is represented in this track.
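To make the idea of per-nucleotide coverage concrete, here is a minimal Python sketch that counts how many transcript intervals cover each position on one strand. It uses a standard difference-array approach and is not a description of how SGD generated the .bw files; the interval representation (1-based, inclusive) is an assumption.

```python
from typing import Iterable, List, Tuple


def coverage(intervals: Iterable[Tuple[int, int]], length: int) -> List[int]:
    """Number of intervals covering each position 1..length.

    Each interval is (start, end), 1-based and inclusive. Only intervals
    from a single strand should be passed in.
    """
    # diff[p] holds the change in coverage when stepping onto position p.
    diff = [0] * (length + 2)
    for start, end in intervals:
        diff[start] += 1
        diff[end + 1] -= 1

    counts = []
    running = 0
    for pos in range(1, length + 1):
        running += diff[pos]
        counts.append(running)
    return counts


# Two made-up transcripts on a 6-nucleotide toy sequence.
toy = coverage([(1, 3), (2, 5)], 6)
```

Running this once per strand and per growth condition would give the four coverage profiles listed in the table above.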

 

Categories: Tutorial, New Data

Proteome-wide abundance data

March 11, 2019

SGD has now incorporated proteome-wide protein abundance data obtained from a comprehensive meta-analysis by Ho et al., 2018. The authors normalized and combined 21 different S. cerevisiae protein abundance datasets—including data from both untreated cells and cells treated with various environmental stressors—to create a unified protein abundance dataset where all values are in the intuitive units of molecules per cell. The original datasets were initially obtained using different methodologies (mass spectrometry, fluorescence microscopy, flow cytometry, and TAP-immunoblot), allowing Ho et al. to evaluate the strengths and weaknesses of these methods in addition to providing the community with a comprehensive reference map of the yeast proteome.

Normalized abundance measurements and associated metadata from untreated and treated cells are displayed in tabular form in the experimental data section of protein-tabbed pages (e.g. CDC28). Several different controlled vocabularies have been employed to standardize the metadata display. In addition, calculated median abundance and median absolute deviation (MAD) values are displayed in the protein section of Locus Summary pages (e.g. PHO85). Two new YeastMine templates have been created to provide access to these data: Gene -> Protein Abundance and Gene -> Median Protein Abundance.
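As a quick illustration of the two summary statistics mentioned above, the following Python sketch computes a median and a median absolute deviation from a list of abundance values in molecules per cell. The numbers are invented, and the unscaled MAD used here is an assumption; the published values may have been derived differently.

```python
from statistics import median
from typing import Sequence, Tuple


def median_and_mad(values: Sequence[float]) -> Tuple[float, float]:
    """Return (median, median absolute deviation) of the values.

    MAD here is the plain median of |x - median|, with no scaling
    constant applied.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med, mad


# Invented abundance measurements for one protein across five datasets.
measurements = [1000, 1200, 900, 5000, 1100]
summary = median_and_mad(measurements)
```

Because both statistics are rank-based, a single outlying dataset (5000 in the example) has little effect on the summary.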

Special thanks to Brandon Ho and Grant Brown for generating this comprehensive reference map of protein abundance, and for their help in making these data available to the larger community.

Categories: New Data

New Data Tracks added to JBrowse

January 15, 2019


SGD has updated our JBrowse genome browser with 157 new data tracks related to genome-wide experiments and omics data for you to explore. You can easily access these new tracks, which visualize data from the twenty publications listed below, by entering JBrowse and clicking on the left-hand “Select tracks” tab. Then, search for the PMID associated with the reference of interest.

Note that some references appear more than once, as they have multiple data tracks associated that belong to different categories in JBrowse.

For more information on using JBrowse, be sure to check out our playlist of JBrowse video tutorials on YouTube. If you have any questions or feedback about the new tracks or about our genome browser, please don’t hesitate to contact us.

Transcription & Transcriptional Regulation

Reference | PMID | Description in JBrowse
Baptista et al. (2017) | 28918903 | ChEC-seq to map the genome-wide binding of the SAGA coactivator complex in budding yeast.
Castelnuovo et al. (2014) | 24497191 | Genome-wide measurement of whole transcriptome versus histone modified mutants
El Hage et al. (2014) | 25357144 | Genome-wide distribution of RNA-DNA hybrids identifies RNase H targets in tRNA genes, retrotransposons and mitochondria.
Freeberg et al. (2013) | 23409723 | Mapped regions of untranslated, polyadenylated transcriptome bound by RNA-binding proteins (RBPs)
Kang et al. (2015) | 25213602 | Genome-wide transcript profiling by paired-end ditag sequencing
Lee et al. (2018) | 29339748 | ChIP-Seq, mRNA-seq, ATAC-seq, and MNase-seq samples in wild-type (WT) and various mutants were prepared using Saccharomyces cerevisiae.
Park et al. (2014) | 24413663 | Simultaneous mapping of RNA ends by sequencing (SMORE-seq) to identify the strongest transcription start sites and polyadenylation sites genome-wide
Rossbach et al. (2017) | 28924058 | Authors utilized the Calling Cards Ty5 retrotransposon insertion method to identify binding sites of cdc7kd, cdc7kdΔcterm and Gal4 transcription factor within the yeast genome.
Schaughency et al. (2014) | 25299594 | Genome-wide identification of transcription termination sites; pA pathway and non-polyadenylation pathway in strains missing Sen1p or Nrd1p

Histone Modification

Reference | PMID | Description in JBrowse
Castelnuovo et al. (2014) | 24497191 | Genome-wide measurement of whole transcriptome versus histone modified mutants
Hu J. et al. (2015) | 26628362 | ChIP-seq and MNase-seq to determine how histone modifications and chromatin structure directly regulate meiotic recombination. Identified acetylation of histone H4 at Lys44 (H4K44ac) as a new histone modification
Joo et al. (2017) | 29203645 | Next-Generation-Sequencing (NGS)-derived genome-wide occupancy of TAF (Taf1) compared with other basal initiation components (TBP and TFIIB), histones (H3, H4, Htz1 and H4 acetylation) and histone regulator complexes (Swr1, Bdf1) in S. cerevisiae
Kniewel et al. (2017) | 28986445 | ChIP-seq to determine the whole-genome enrichment of Mek1 targeted histone H3 threonine 11 phosphorylation (H3 T11ph) during Saccharomyces cerevisiae meiosis.
Lee et al. (2018) | 29339748 | ChIP-Seq, mRNA-seq, ATAC-seq, and MNase-seq samples in wild-type (WT) and various mutants were prepared using Saccharomyces cerevisiae.
Weiner et al. (2015) | 25801168 | Examining chromatin dynamics through genome-wide mapping of 26 histone modifications at 0, 4, 8, 15, 30, and 60 minutes after diamide addition using MNase-ChIP

Chromatin Organization

Reference | PMID | Description in JBrowse
Chereji et al. (2018) | 29426353 | Genome binding/occupancy profiling of single nucleosomes and linkers by high throughput sequencing
Gutierrez et al. (2017) | 29212533 | Authors sought to correct sequence bias of MNase-Seq with a method based on the digestion of naked DNA and the use of the bioinformatic tool DANPOS
Hu Z. et al. (2014) | 24532716 | Genome-wide measurement of nucleosome occupancy during cell aging
Hu J. et al. (2015) | 26628362 | ChIP-seq and MNase-seq to determine how histone modifications and chromatin structure directly regulate meiotic recombination. Identified acetylation of histone H4 at Lys44 (H4K44ac) as a new histone modification
Joo et al. (2017) | 29203645 | Next-Generation-Sequencing (NGS)-derived genome-wide occupancy of TAF (Taf1) compared with other basal initiation components (TBP and TFIIB), histones (H3, H4, Htz1 and H4 acetylation) and histone regulator complexes (Swr1, Bdf1) in S. cerevisiae
Lee et al. (2018) | 29339748 | ChIP-Seq, mRNA-seq, ATAC-seq, and MNase-seq samples in wild-type (WT) and various mutants were prepared using Saccharomyces cerevisiae.

RNA Catabolism

Reference | PMID | Description in JBrowse
Geisberg et al. (2014) | 24529382 | Half-lives of 21,248 mRNA 3′ isoforms in yeast were measured by rapidly depleting RNA polymerase II from the nucleus and performing direct RNA sequencing throughout the decay process.
Smith et al. (2014) | 24931603 | Identification of genome-wide transcripts; looking at nonsense-mediated RNA decay pathway

Transposons

Reference | PMID | Description in JBrowse
Lee et al. (2018) | 29339748 | ChIP-Seq, mRNA-seq, ATAC-seq, and MNase-seq samples in wild-type (WT) and various mutants were prepared using Saccharomyces cerevisiae.
Michel et al. (2017) | 28481201 | Genome-wide examination of protein function by using transposons for targeted gene disruption
Rossbach et al. (2017) | 28924058 | Authors utilized the Calling Cards Ty5 retrotransposon insertion method to identify binding sites of cdc7kd, cdc7kdΔcterm and Gal4 transcription factor within the yeast genome.

DNA Replication, Recombination, and Repair

Reference | PMID | Description in JBrowse
Mao et al. (2017) | 28912372 | Map of N-methylpurine (NMP) lesion alkylation damage across the yeast genome

 

Categories: New Data

Disease Pages at SGD: Linking Yeast Genetics and Human Disease

December 22, 2018


SGD’s Disease Ontology page for neurodegenerative disease

To promote the use of yeast as a catalyst for biomedical research, SGD utilizes the Disease Ontology (DO) to describe human diseases that are associated with yeast homologs. Disease Ontology annotations to yeast genes are now available through SGD’s new Disease pages. Each page corresponds to a Disease Ontology term, such as amyotrophic lateral sclerosis, and lists out all yeast genes annotated to the term by SGD.

Yeast genes with one or more human disease associations will also have a new Disease Summary tab (example: MIP1), accessible from the genes’ respective locus pages. The Disease Summary tab shows all manually curated, high-throughput, and computational disease annotations for the yeast gene. Additionally, these pages feature a network diagram that depicts shared disease annotations for other yeast genes and their human homologs.

The shared disease annotations diagram for MIP1

For more information, check out SGD’s Disease Ontology help page. Explore the new Disease pages and features, and be sure to let us know if you have any feedback or questions.

Categories: New Data

Macromolecular Complex Pages Now Available

December 14, 2018


Macromolecular complexes, already retrievable from SGD’s YeastMine data warehouse, are now available on new pages on the SGD website. These new Complex pages (example: GAL3-GAL80 complex) provide manually curated information about the complex as well as helpful links and diagrams. Key features of Complex pages include:

  • Manually curated summaries of the complex’s function and biology
  • A list of all known subunits and other complex participants
  • A Complex Diagram that shows the physical interactions between each subunit
  • Gene Ontology (GO) terms annotated to the complex
  • Images of complex structure from the Protein Data Bank (PDB), if available
  • A network diagram that shows how the complex relates to other complexes in terms of function and shared subunits

Complex pages can be accessed by running a search for the complex, or by visiting the gene summary pages of its subunits. For example, to find the GAL3-GAL80 complex page, simply run a search for “GAL3-GAL80” and click on the Complexes category (symbolized by the gold dot). Or, go to the GAL3 or GAL80 gene page and locate the Complex section.

SGD curated these macromolecular complex data in collaboration with curators at EMBL-EBI’s Complex Portal. Be sure to check out the page for your favorite complex, and let us know if you have any feedback or questions.

Categories: New Data

Out of China: Changing our Views on the Origins of Budding Yeast

April 17, 2018

1,011. That’s the number of different Saccharomyces cerevisiae yeast strains that were whole-genome sequenced and phenotyped by a team of researchers jointly led by Joseph Schacherer and Gianni Liti, published this week in Nature (Peter et al., 2018; data at: http://bit.ly/1011genomes-DataAtSGD).

Ecological origins of the 1,011 isolates (from Peter et al., 2018; Creative Commons license)

Scrupulously gathering isolates of S. cerevisiae from as many diverse geographical locations and ecological niches as possible, the authors and their collaborators plucked yeast cells not only from the familiar wine, beer and bread sources, but also from rotting bananas, sea water, human blood, sewage, termite mounds, and more. The authors then surveyed the evolutionary relationships among the strains to describe the worldwide population distribution of this species and deduce its historical spread.

They found that the greatest amount of genome sequence diversity existed among the S. cerevisiae strains collected from Taiwan, mainland China, and other regions of East Asia. This means that in all likelihood the geographic origin of S. cerevisiae lies somewhere in East Asia. According to the authors, our budding yeast friend began spreading around the globe about 15,000 years ago, undergoing several independent domestication events during its worldwide journey. For example, it turns out that wine yeast and sake yeast were domesticated from different ancestors, thousands of years apart from each other. Whereas genomic markers of domestication appeared about 4,000 years ago in sake yeast, such markers appeared in wine yeast only 1,500 years ago.

Additionally — and similar to the situation where human interspecific hybridization with Neanderthals occurred only after humans migrated out of Africa — it appears that S. cerevisiae has inter-bred very frequently with other Saccharomyces species, especially S. paradoxus, but that most of these interspecific hybridization events occurred after the out-of-China dispersal.

There are many more gems to be found among the treasure trove of information in this paper. Some notable conclusions from the authors include: diploids are the most fit ploidy; copy number variation (CNV) is the most prevalent type of variation; most single nucleotide polymorphisms (SNPs) are very rare alleles in the population; extensive loss of heterozygosity is observed among many strains. There are also phenotype results (fitness values) for 971 strains across 36 different growth conditions.

As is often the case for yeast, the ability to sequence and analyze whole genomes at very deep coverage has yielded broad insights on eukaryotic genome evolution. The team’s work highlights this by presenting a comprehensive view of genome evolution on many different levels (e.g., differences in ploidy, aneuploidy, genetic variants, hybridization, and introgressions) that is difficult to obtain at the same scale and accuracy for other eukaryotic organisms.

SGD is happy to announce that in conjunction with the authors and publishers, we are hosting the datasets from the paper at this SGD download site. These datasets include: the actual genome sequences of the 1,011 isolates; the list of 4,940 common “core” ORFs plus 2,856 ORFs that are variable within the population (together these make up the “pangenome”); copy number variation (CNV) data; phenotyping data for 36 conditions; SNPs and indels relative to the S288C genome; and much more. We hope that the easy availability of these large datasets will be useful to many yeast (and non-yeast) researchers, and as the authors say, will help to “guide future population genomics and genotype–phenotype studies in this classic model system.”

Categories: Announcements, New Data

Tags: strains, evolution, genome wide association study, Saccharomyces cerevisiae

New Protein Half-life Data in SGD and YeastMine

September 08, 2016


Protein turnover for budding and fission yeast proteins, and scatterplot comparing homologous protein half-lives. Image from Cell Reports via Creative Commons license.

Ever wonder how quickly your favorite protein turns over within the cell? SGD has just incorporated half-life data for 3700 yeast proteins from a paper by Christiano et al., 2014. In this study, Christiano and colleagues pulse-labeled exponentially growing wild-type yeast cells in synthetic medium with a heavy lysine isotope (pulse SILAC), and followed the decay of native untagged proteins using high-resolution mass spectrometry-based proteomics. The data generated in this study can be accessed by viewing the Experimental Data section of the Protein tab for your favorite gene, such as the short-lived Ctk1p or the long-lived Rsc1p.

In addition, you can retrieve this half-life data using YeastMine for one or more proteins with the Gene–>Protein Half-life template or obtain a list of proteins with half-lives within a given range using the Retrieve–>Proteins with half-life in a given range template. Both of these templates can be found in the “Templates” section of YeastMine under the “Protein” category.
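If you export half-life results and want to apply the same kind of range filter locally, a small Python sketch might look like the following. The protein names and values are placeholders rather than real data, and the use of hours as the unit is an assumption.

```python
from typing import Dict, List


def proteins_in_range(half_lives: Dict[str, float],
                      low: float, high: float) -> List[str]:
    """Names of proteins whose half-life lies in [low, high], shortest first."""
    hits = [(value, name) for name, value in half_lives.items()
            if low <= value <= high]
    return [name for value, name in sorted(hits)]


# Placeholder names and half-lives (assumed to be in hours).
example = {"PROT_A": 0.5, "PROT_B": 8.0, "PROT_C": 12.5, "PROT_D": 40.0}
selected = proteins_in_range(example, 5, 15)
```

Both bounds are inclusive in this sketch; the YeastMine template may treat its bounds differently.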

Thanks to Romain Christiano and Tobias Walther for their help integrating this information into SGD.

Categories: New Data

New High-throughput GO Annotations Added to SGD

June 06, 2016


We’ve added 1,400 high-throughput (HTP) cellular component GO annotations from a new paper published by Maya Schuldiner’s lab. In this paper, Yofe et al., 2016 devised and implemented a methodology called SWAT (short for SWAp-Tag) and used it to create a parental library of 1,800 strains, each tagging a protein known or predicted to localize to the yeast endomembrane system. Once created, this novel acceptor library serves as a template that can be ‘swapped’ into other libraries, thus facilitating the rapid interconversion to new libraries by simply replacing the acceptor module with a new tag or sequence of choice. As proof of principle, this paper describes the parental library (N’ SWAT-GFP), and its utility as a gateway to the construction of two additional libraries (N’ mCherry and N’ seamless GFP).

A high-content screening platform was used to generate images that were then manually reviewed and used to assign subcellular locations for proteins in these collections. Based on these results, SGD has incorporated GO annotations for proteins when at least two of three tags gave the same cellular localization. In addition, Locus Summary page descriptions for genes within this collection that did not have a known cellular location prior to this study have been updated. Finally, this study also provides access to a list of proteins predicted to contain signal peptides using three different algorithms.

We would like to thank Maya Schuldiner and members of her lab for help with the integration of this information into SGD.
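The "at least two of three tags" rule described above can be expressed as a simple majority check. The Python sketch below is only an illustration of that rule, not the curation procedure itself; the representation of each tag's result as a string (or None when no localization could be assigned) is an assumption.

```python
from collections import Counter
from typing import Optional, Sequence


def consensus_localization(calls: Sequence[Optional[str]],
                           min_agree: int = 2) -> Optional[str]:
    """Return the localization reported by at least min_agree tags, else None.

    calls holds one entry per tagged library; None means that library
    gave no usable localization for the protein.
    """
    counts = Counter(c for c in calls if c is not None)
    if not counts:
        return None
    location, votes = counts.most_common(1)[0]
    return location if votes >= min_agree else None


# Hypothetical results for one protein from the three libraries.
agreed = consensus_localization(["ER", "ER", "vacuole"])
```

With three tags and a threshold of two, at most one localization can reach the threshold, so ties are not a concern.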

Categories: New Data

New SGD Help Video: Yeast-Human Functional Complementation Data

June 30, 2015


Yeast and humans diverged about a billion years ago, but there’s still enough functional conservation between some pairs of yeast and human genes that they can be substituted for each other. How cool is that?! Which genes are they? What do they do?

This two-minute video explains how to find, search, and download the yeast-human functional complementation data in SGD. You can find help with many other aspects of SGD in the tutorial videos on our YouTube channel. And as always, please be sure to contact us with any questions or suggestions.

Categories: Homologs, Tutorial, New Data

Tags: yeast model for human disease, video

Yeast-Human Functional Complementation Data Now in SGD

June 10, 2015


Yeast and humans diverged about a billion years ago. So if there’s still enough functional conservation between a pair of similar yeast and human genes that they can be substituted for each other, we know they must be critically important for life. An added bonus is that if a human protein works in yeast, all of the awesome power of yeast genetics and molecular biology can be used to study it.

To make it easier for researchers to identify these “swappable” yeast and human genes, we’ve started collecting functional complementation data in SGD. The data are all curated from the published literature, via two sources. One set of papers was curated at SGD, including the recent systematic study of functional complementation by Kachroo and colleagues. Another set was curated by Princeton Protein Orthology Database (P-POD) staff and is incorporated into SGD with their generous permission.

As a starting point, we’ve collected a relatively simple set of data: the yeast and human genes involved in a functional complementation relationship, with their respective identifiers; the direction of complementation (human gene complements yeast mutation, or vice versa); the source of curation (SGD or P-POD); the PubMed ID of the reference; and an optional free-text note adding more details. In the future we’ll incorporate more information, such as the disease involvement of the human protein and the sequence differences found in disease-associated alleles that fail to complement the yeast mutation.

You can access these data in two ways: using two new templates in YeastMine, our data warehouse; or via our Download page. Please take a look, let us know what you think, and point us to any published data that’s missing. We always appreciate your feedback!

Using YeastMine to Access Functional Complementation Data

YeastMine is a versatile tool that lets you customize searches and create and manipulate lists of search results. To help you get started with YeastMine we’ve created a series of short video tutorials explaining its features.

Gene –> Functional Complementation template

This template lets you query with a yeast gene or list of genes (either your own custom list, or a pre-made gene list) and retrieve the human gene(s) involved in cross-species complementation along with all of the data listed above.

Human Gene –> Functional Complementation template

This template takes either human gene names (HGNC-approved symbols) or Entrez Gene IDs for human genes and returns the yeast gene(s) involved in cross-species complementation, along with the data listed above. You can run the query using a single human gene as input, or create a custom list of human genes in YeastMine for the query. We’ve created two new pre-made lists of human genes that can also be used with this template. The list “Human genes complementing or complemented by yeast genes” includes only human genes that are currently included in the functional complementation data, while the list “Human genes with yeast homologs” includes all human genes that have a yeast homolog as predicted by any of several methods.

Downloading Functional Complementation Data

If you’d prefer to have all the data in one file, simply visit our Curated Data download page and download the file “functional_complementation.tab”.
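Once downloaded, a tab-delimited file like this can be read with Python's standard csv module. The sketch below groups human genes by the yeast gene they are paired with. The column order is an assumption based on the data types listed earlier in this post, so check it against the actual file before relying on it; the sample rows are placeholders, not real annotations.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Set

# Assumed column order; the real file may differ.
COLUMNS = ["yeast_gene", "human_gene", "direction", "source", "pmid", "note"]


def human_partners(tab_text: str) -> Dict[str, Set[str]]:
    """Map each yeast gene to the set of human genes paired with it."""
    partners: Dict[str, Set[str]] = defaultdict(set)
    reader = csv.DictReader(io.StringIO(tab_text), fieldnames=COLUMNS,
                            delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row["yeast_gene"].startswith("#"):
            continue  # skip comment or header lines, if any
        partners[row["yeast_gene"]].add(row["human_gene"])
    return dict(partners)


# Placeholder rows standing in for the contents of the downloaded file.
sample = (
    "YFG1\tHUMAN_A\thuman complements yeast\tSGD\t0\t\n"
    "YFG1\tHUMAN_B\tyeast complements human\tP-POD\t0\texample note\n"
    "YFG2\tHUMAN_A\thuman complements yeast\tSGD\t0\t\n"
)
pairs = human_partners(sample)
```

To use it on the real download, read the file's text with open(...).read() and pass it to human_partners.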

Categories: Yeast and Human Disease, New Data

Tags: yeast model for human disease
