PyNISTPL

Introduction

PyNISTPL is a python module that interfaces with the NIST MS Search Engine libraries to provide easy-to-use command-line searching and construction of peptide spectrum libraries.

PyNISTPL includes python scripts, run from the Windows or Cygwin command prompt, that read many common peptide MS/MS spectral formats, including mzXML, mzData, and MGF.

Installation

There are three installation options currently available.

Binary distribution
Requires no Python installation. Executables and NIST DLLs (by permission) are contained in a single zip file, with no external dependencies (other than the NIST spectral libraries). Use this unless you want to modify the python scripts.
Python package distribution
Requires a Python 2.4 installation. Python scripts and NIST DLLs are supplied via Windows installers. The python scripts can be modified in their installation directory. Various support packages must also be installed. The binary interface to the NIST DLLs cannot be modified. Use this if you want to write new python scripts that interface to the NIST DLLs.
Source distribution
Requires a Python installation and Visual C++ for compilation. Presumes that the NIST library is available for compilation; it must be obtained from NIST as necessary. Various support packages must also be installed. The binary interface to the NIST DLLs can be modified at will. Use this if you want to write a new python API to the NIST DLLs.

Binary distribution
  1. Download the pymssearch.zip file from edwardslab.bmcb.georgetown.edu.
  2. Unzip the pymssearch.zip file wherever you like.
  3. Note that the python modules assume the peptide libraries are in the default location: C:\NIST_PEPLIB\LIBS.

Python package distribution
  1. If necessary, download and install the latest version of the ActiveState ActivePython (Python version 2.4 only!) package for Windows. The default installation location (C:\Python24) will require the least tweaking of paths.
  2. Download and install the python elementtree package from effbot.org.
  3. Download and install the python cjson package from edwardslab.bmcb.georgetown.edu.
  4. Download and install the python PyMSIO package from edwardslab.bmcb.georgetown.edu.
  5. Download and install the python PyNISTPL package from edwardslab.bmcb.georgetown.edu.
  6. Scripts are installed in C:\Python24\Scripts.
  7. Note that the python modules assume the peptide libraries are in the default location: C:\NIST_PEPLIB\LIBS.

Source distribution
  1. If necessary, download and install the latest version of the ActiveState ActivePython (Python version 2.4 only!) package for Windows. The default installation location (C:\Python24) will require the least tweaking of paths.
  2. Download and install the python elementtree package from effbot.org.
  3. Download and install the python cjson package from edwardslab.bmcb.georgetown.edu.
  4. Download the source for the PyMSIO package from edwardslab.bmcb.georgetown.edu.
  5. Unzip the PyMSIO-0.9.zip file, and run the following command inside the PyMSIO-0.9 folder (adjust paths as necessary):
    C:\Python24\python.exe setup.py install
    
  6. Obtain the file NIST_PEP_SRCH_ENGINE.zip from NIST, and unzip to C:\.
  7. Download the source for the PyNISTPL package from edwardslab.bmcb.georgetown.edu.
  8. Unzip the pynistpeplib-0.95.zip file, and run the following command inside the pynistpeplib-0.95 folder (adjust paths as necessary) from the Visual Studio Command Prompt:
    C:\Python24\python.exe setup.py install
    
  9. Copy the NIST DLLs NISTDL32.dll and ctNt66.dll from C:\NIST_PEPLIB\SRCH_ENGINE\CODE to C:\Python24.
  10. Note that the python modules assume the peptide libraries are in the default location: C:\NIST_PEPLIB\LIBS.
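
The default locations named in the steps above can be checked before running any of the scripts. The following is a minimal sketch and not part of PyNISTPL: the helper name missing_paths is hypothetical, and it is written for a current Python 3 interpreter rather than the Python 2.4 installation the package itself requires. The two DLL entries apply to the source distribution only.

```python
import os

# Default locations taken from the installation steps above.
DEFAULT_PATHS = [
    r"C:\NIST_PEPLIB\LIBS",        # peptide spectrum libraries
    r"C:\Python24\NISTDL32.dll",   # NIST DLLs copied in the source distribution
    r"C:\Python24\ctNt66.dll",
]

def missing_paths(paths):
    """Return the entries of `paths` that do not exist on this machine."""
    return [p for p in paths if not os.path.exists(p)]

if __name__ == "__main__":
    for path in missing_paths(DEFAULT_PATHS):
        print("missing: " + path)
```

An empty result means every listed path was found; adjust DEFAULT_PATHS if Python or the NIST files were installed elsewhere.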

Usage

PyNISTPL provides three scripts that can be run from the Windows or Cygwin command line. pymssearch searches a peptide spectrum library, such as the human peptide spectrum library from NIST; pymakelib makes a spectrum library from a set of input spectra; and pylibspectra outputs the number of spectra in a given spectrum library.

pymssearch [ options ] spectrum-files

Options:

-L library, --lib library
Name, or full path, of spectrum library to search. Required.

-o file, --output file
Send output to designated file. If the filename ends in .gz, then output is automatically gzipped. Default: - (stdout).

-s sim, --similarity_threshold sim
Minimum similarity of hits to output. Default: All.

-n hits, --number hits
Maximum number of spectral hits to output. Default: All.

-q, --quiet
Only output query spectra with at least one hit. Default: False.

-N peaks, --npeaks peaks
Lower bound on number of peaks for query spectra. Default: 0.

-t tol, --precursor_tolerance tol
Precursor tolerance, in Daltons, within which library spectra are considered for matching. Default: 2 Da.

-f tol, --fragment_tolerance tol
Fragment tolerance, in Daltons. Default: 2 Da.

-E, --EOMSSA_weighting
Use OMSSA E-value to weight match factor. Default: False.

-e, --output_eomssa
Display OMSSA E-value. Default: False.

-Q, --QTOF_weighting
Use QTOF weighting in match factor. Default: False.

-R, --NumRep_weighting
Use number of replicates weighting in match factor. Default: False.

-T, --output_tf
Display Query and Lib T/F scores. Default: False.

-P, --no_precursor_filter
Disable the precursor filter, so that library spectra are matched regardless of precursor. Default: False (the filter is applied).

--qformat fmt
Format for query spectrum output. To see valid fields, use --qformat "-". Default: ">%(base)s %(scan)d %(title)s\n".

--rformat fmt
Format for reference spectrum output. To see valid fields, use --rformat "-". Default: "%(rank)d. %(peptide)s/%(charge)d ID %(libid)d Similarity %(sim).3f RevSim %(revsim).3f Probability %(prob).4f Mods %(Mods)s Spec %(Spec)s\n".

-W work-directory, --workdir work-directory
Search engine working directory. According to the NIST documentation, concurrent search engine processes should have different work directories. Default: C:\NIST_PEPLIB.

-h, --help
Help on command line options.
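
The --qformat and --rformat defaults are python %-style format strings with named fields. As a minimal sketch of how such a string expands, the snippet below applies the documented defaults to hand-written dictionaries. Every field value is made up for illustration; the real values, in particular for Mods and Spec, come from the scripts and the library being searched.

```python
# Documented default format strings for query and reference (hit) output.
QFORMAT = ">%(base)s %(scan)d %(title)s\n"
RFORMAT = ("%(rank)d. %(peptide)s/%(charge)d ID %(libid)d "
           "Similarity %(sim).3f RevSim %(revsim).3f "
           "Probability %(prob).4f Mods %(Mods)s Spec %(Spec)s\n")

# Hypothetical field values, for illustration only.
query = {"base": "spectra", "scan": 1234, "title": "example scan"}
hit = {"rank": 1, "peptide": "AAGLEK", "charge": 2, "libid": 42,
       "sim": 0.912, "revsim": 0.934, "prob": 0.9876,
       "Mods": "0", "Spec": "Consensus"}

# The % operator fills each %(name) placeholder from the dictionary.
query_line = QFORMAT % query
hit_line = RFORMAT % hit
print(query_line + hit_line, end="")
```

A custom format can use any subset of the fields that --qformat "-" and --rformat "-" list.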

pymakelib [ options ] spectrum-files

Options:

-L library, --lib library
Name, or full path, of the spectrum library to create or add to. If not supplied, and there is only one spectrum file on the command line, then the spectrum library name is inferred from the spectrum filename.

-N name, --name name
Spectra in the library are identified by their input spectrum filename and their index within the file. This option sets the name that is used to identify the set of spectra.

-W work-directory, --workdir work-directory
Search engine working directory. According to the NIST documentation, concurrent search engine processes should have different work directories. Default: C:\NIST_PEPLIB.

-v, --verbose
Provide feedback on whether or not the library is created or already exists, and the number of spectra contained in it before and after the input spectra are added.

-h, --help
Help on command line options.

pylibspectra [ options ]

Options:

-L library, --lib library
Name, or full path, of the spectrum library whose spectra are to be counted.

-W work-directory, --workdir work-directory
Search engine working directory. According to the NIST documentation, concurrent search engine processes should have different work directories. Default: C:\NIST_PEPLIB.

-h, --help
Help on command line options.
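
Because concurrent search engine processes should use different work directories, a batch driver can give each process its own -W directory. The sketch below only builds the command lines. The helper name build_search_commands and the work directory naming scheme are hypothetical, and it assumes pymssearch is on the PATH; only the documented -L, -W, and -o options are used.

```python
import os

def build_search_commands(library, spectrum_files, work_root):
    """Return one pymssearch argument list per input file,
    each with its own work directory under `work_root`."""
    commands = []
    for index, spectra in enumerate(spectrum_files):
        # A distinct directory per process, e.g. work_root/work0, work_root/work1.
        workdir = os.path.join(work_root, "work%d" % index)
        output = spectra + ".out"
        commands.append(["pymssearch", "-L", library,
                         "-W", workdir, "-o", output, spectra])
    return commands

if __name__ == "__main__":
    for cmd in build_search_commands("human", ["a.mgf", "b.mgf"], "nist_work"):
        print(" ".join(cmd))
```

Whether a work directory must exist beforehand, or must contain particular NIST files, is not stated here; the NIST documentation is the place to check before launching the commands.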

Examples

Search a gzipped mzXML file of MS/MS spectra against the NIST human peptide spectrum library:

C:\> \Python24\Scripts\pymssearch -L human spectra.mzXML.gz > spectra_v_human.out
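
The same search can also be driven from a script. This is a sketch under stated assumptions: it is written for a current Python 3 interpreter, assumes pymssearch is on the PATH, and uses the documented -o option with a filename ending in .gz instead of shell redirection, so that the output is gzipped. The function names run_search and read_output are hypothetical.

```python
import gzip
import subprocess

def run_search(library, spectra, output):
    """Run pymssearch, writing hits to `output` (gzipped if it ends in .gz)."""
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(["pymssearch", "-L", library, "-o", output, spectra],
                   check=True)

def read_output(path):
    """Return the lines of a result file, plain or gzipped."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.read().splitlines()
```

For the example above, run_search("human", "spectra.mzXML.gz", "spectra_v_human.out.gz") followed by read_output("spectra_v_human.out.gz") would return the result lines as a list of strings.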

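Compute all-vs-all spectral similarity for spectra in MGF format by building a library from the spectra, checking its size, and searching the same spectra against it: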

C:\> \Python24\Scripts\pymakelib -L myspectra -v spectra.mgf
C:\> \Python24\Scripts\pylibspectra -L myspectra
C:\> \Python24\Scripts\pymssearch -L myspectra spectra.mgf > spectra_v_myspectra.out
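
The all-vs-all output can then be reduced to (query, library ID, similarity) records. The parser below is a minimal sketch that assumes the default --qformat and --rformat were used, so that query lines start with > and hit lines follow the documented default layout. The function name parse_hits is hypothetical, and the sample text is hand-written for illustration, not real search output.

```python
import re

# Matches the start of a hit line in the default --rformat layout:
# "<rank>. <peptide>/<charge> ID <libid> Similarity <sim> ..."
HIT_RE = re.compile(
    r"^(?P<rank>\d+)\. (?P<peptide>.+?)/(?P<charge>\d+) "
    r"ID (?P<libid>\d+) Similarity (?P<sim>[0-9.]+) ")

def parse_hits(lines):
    """Yield (query_header, libid, similarity) for each hit line."""
    query = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new query spectrum; later hit lines belong to it.
            query = line[1:]
            continue
        match = HIT_RE.match(line)
        if match and query is not None:
            yield query, int(match.group("libid")), float(match.group("sim"))

# Hypothetical sample in the default output layout.
SAMPLE = [
    ">spectra 1 scan one\n",
    "1. AAGLEK/2 ID 1 Similarity 0.998 RevSim 0.999 Probability 0.9900 Mods 0 Spec x\n",
    "2. AAGIEK/2 ID 7 Similarity 0.412 RevSim 0.500 Probability 0.0100 Mods 0 Spec x\n",
]
print(list(parse_hits(SAMPLE)))
```

If a custom --rformat is used, the regular expression has to be adapted to that layout.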