US HUPO: Proteomics Informatics (2010)

Sunday, March 7, 2010, 9:00 am - 4:00 pm (with one hour break for lunch-on-your-own)


Nathan Edwards (Georgetown) and Martin McIntosh (Fred Hutchinson Cancer Research Center)

Course Description

This class will cover introductory and selected advanced informatics and data analysis topics related to tandem mass spectrometry proteomics. It is intended for applied laboratory or computational researchers and aims to provide basic, practical insight into a variety of topics, including search engines, protein sequence databases, statistical significance for peptide and protein identification, and quantitative proteomics.

Advanced topics include combining and refining the results of multiple search engines, and evaluating the statistical significance of quantitative experiments using pathway or gene-set style analyses adapted from genomics. Case studies drawn from the instructors' experience will be used to demonstrate the basic principles. The class will not include a survey of tools or emphasize any specific workflow; rather, the instructors will focus on general ideas useful for multiple strategies and provide specific examples using a variety of tools familiar to them.

Basics I: Search engines and protein/peptide inference
MS/MS Search Engines: Framework, understanding when they do and do not work.
Protein Sequence Databases: Origins, protein families, redundancy & isoforms.
Peptide and protein inference: P-values, E-values and decoy searching.
Case Study: Searching genomic sequence evidence

Basics II: Quantitative analysis
Label-free and labeled analysis: Strengths and weaknesses of each.
Design: pools versus individual level, selecting among various strategies.
Case studies: two SILAC experiments with very different analysis strategies.

Advanced topic I: Combining & Refining Database Searches. Increasing the sensitivity of scores, score calibration, and other aspects of combining and refining search engine results.

Advanced topic II: Analyzing proteomics results using basic tools borrowed from genomics, including gene-set enrichment and other pathway analyses, in order to increase the statistical power and biological relevance.


Hands-on Exercises