PRIDE-NMR
Protein fold estimation
from NMR distance restraints
 

PRIDE-NMR Server description

The PRIDE-NMR Server compares NOE data sets to H-H distributions back-calculated from protein 3D coordinates. The basic idea is similar to the PRIDE method, originally described in:

Carugo O and Pongor S, Protein fold similarity estimated by a probabilistic approach based on Cα-Cα distance comparison, J. Mol. Biol. (2002) 315:887-898.
[abstract]

The PRIDE-NMR approach is described in:

Angyan A, Perczel A, Pongor S and Gaspari Z, Fast protein fold estimation from NMR-derived distance restraints, Bioinformatics (2008) 24:272-275.
[abstract]


Brief theory of the PRIDE-NMR approach

The PRIDE-NMR method compares a single input histogram to those stored in a database. The input histogram is generated from the NOE dataset: each bin corresponds to a sequential distance (the separation of the two residues along the sequence) and contains the number of intrabackbone NOE restraints found at that distance. The backbone is defined here as comprising the amide H, Hα and Hβ atoms; restricting the analysis to these atoms is necessary to render the method largely sequence-independent. The histograms in the database are the analogous H-H distributions, back-calculated from 3D coordinates using cutoff distances of 5, 6 and 7 Ångströms. The input and database histograms are compared by contingency analysis, analogously to the procedure described in Carugo and Pongor, 2002. The resulting probability is the PRIDE-NMR score. Multiple cutoff distances can be used for the database search; in this case, the scores obtained with the different cutoffs are averaged.
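The comparison step can be illustrated with a minimal Python sketch. It assumes that the contingency analysis is a standard chi-square test on a 2 x N table built from the two histograms; the exact procedure is the one given in Carugo and Pongor, 2002, and may differ in detail. The function names (chi2_survival, contingency_score, average_over_cutoffs) are hypothetical and are not part of the server.

```python
import math

def chi2_survival(x, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from the closed form for df = 2 (even df) or df = 1 (odd df) ...
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # ... and climb with Q(a + 1, h) = Q(a, h) + h^a * e^(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return min(q, 1.0)

def contingency_score(hist_a, hist_b):
    """Chi-square probability for a 2 x N contingency table (assumed test).

    Bins are paired by index; extra bins of the longer histogram are ignored.
    """
    columns = [(a, b) for a, b in zip(hist_a, hist_b) if a + b > 0]
    total_a = sum(a for a, _ in columns)
    total_b = sum(b for _, b in columns)
    if total_a == 0 or total_b == 0 or len(columns) < 2:
        raise ValueError("need two non-empty histograms with at least two used bins")
    total = total_a + total_b
    chi2 = 0.0
    for a, b in columns:
        expected_a = total_a * (a + b) / total
        expected_b = total_b * (a + b) / total
        chi2 += (a - expected_a) ** 2 / expected_a
        chi2 += (b - expected_b) ** 2 / expected_b
    return chi2_survival(chi2, len(columns) - 1)

def average_over_cutoffs(input_hist, db_hists_by_cutoff):
    """Average the score over database histograms made with different cutoffs."""
    scores = [contingency_score(input_hist, h) for h in db_hists_by_cutoff.values()]
    return sum(scores) / len(scores)
```

In this sketch, two histograms with the same shape give a score of 1, and the score falls towards 0 as the shapes diverge. Note that such a test compares the shapes of the distributions, not the absolute counts.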


To filter out hits that correspond to proteins that are either too long or too short, a weighting approach is introduced:

Using the PRIDE-NMR server

The server expects an NOE restraint list in X-PLOR / CNS format. (Scripts exist to convert other formats into this one; e.g. for the xeasy .upl format you may use the upl2tbl.awk script by Eiso.) Files that contain one-line assign records are recognized best, although every effort was made to read the X-PLOR / CNS datasets we found in databases. Nevertheless, your file may not be accepted by the server; if you feel that it should be, please send an e-mail to Zoltán Gáspári.
If you know the length of the protein in question, supply it (or a good estimate) to the server for best results. If the length option is set to zero, the server estimates the length of the protein from the input file as the maximum sequential distance found in the restraint file.
The database cutoff option tells the server to use the database histograms that were generated with the selected cutoff distance(s). If you choose multiple distances, the PRIDE-NMR score is the average of the scores calculated with the corresponding histograms.
The hit length restriction option may be used to filter out hits whose length deviates from that of the protein corresponding to the input dataset by more than the percentage set. By default, this option is turned off (set to zero, meaning that no such filtering is done).
The minimum sequential distance controls the "effective start" of the histograms: only bins corresponding to sequential distances equal to or greater than the selected value are used for comparison.
The score used for ranking the hits can be selected by an option. All four score variants are returned (PRIDE-NMR and its three weighted variants); the one used for sorting is shown with a yellow background.
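To make the input handling and the options more concrete, here is a minimal Python sketch of how one-line assign records could be turned into an input histogram. It is an illustration under stated assumptions, not the server's code: the regular expressions handle only simple selections of the form (resid N and name X), the backbone atom names (H, HN, HA*, HB*) are assumed naming conventions, and all function names are hypothetical.

```python
import re

# Simplified patterns: two flat selections per assign record, no nested
# parentheses, no segid handling, no ambiguous "or" selections.
ASSIGN_RE = re.compile(r"assi\w*\s*\(([^()]*)\)\s*\(([^()]*)\)", re.IGNORECASE)
RESID_RE = re.compile(r"resid\w*\s+(-?\d+)", re.IGNORECASE)
NAME_RE = re.compile(r"name\s+([^\s)]+)", re.IGNORECASE)

def is_backbone_h(name):
    """Amide H, H-alpha or H-beta, under the assumed atom naming."""
    n = name.upper().strip('"')
    return n in ("H", "HN") or n.startswith(("HA", "HB"))

def parse_backbone_pairs(text):
    """Return (resid_i, resid_j) for every intrabackbone restraint found."""
    pairs = []
    for match in ASSIGN_RE.finditer(text):
        atoms = []
        for selection in match.groups():
            resid = RESID_RE.search(selection)
            name = NAME_RE.search(selection)
            if resid and name and is_backbone_h(name.group(1)):
                atoms.append(int(resid.group(1)))
        if len(atoms) == 2:
            pairs.append((atoms[0], atoms[1]))
    return pairs

def build_histogram(pairs, length, min_seq_dist):
    """Count restraints per sequential distance |i - j|.

    length == 0 means: estimate the length as the largest sequential
    distance in the restraint list, as the length option describes.
    Bins below min_seq_dist are dropped (the "effective start").
    """
    distances = [abs(i - j) for i, j in pairs]
    if not distances:
        raise ValueError("no intrabackbone restraints found")
    if length == 0:
        length = max(distances)
    hist = [0] * (length + 1)
    for d in distances:
        if d <= length:
            hist[d] += 1
    return hist[min_seq_dist:], length

def passes_length_filter(query_length, hit_length, percent):
    """Hit length restriction; percent == 0 switches the filter off."""
    if percent == 0:
        return True
    return abs(hit_length - query_length) * 100 <= percent * query_length
```

For example, a file with restraints between residues 2 and 6 (HN to HA) and between residues 3 and 10 (HA to HB1) gives sequential distances of 4 and 7. With the length option set to zero, the estimated length is 7, and a minimum sequential distance of 3 leaves the bins for distances 3 to 7.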

PRIDE-NMR output interpretation

By definition, a high PRIDE-NMR score means high similarity between the NOE pattern and the back-calculated H-H distance distribution. However, even true positive hits may get a relatively low score (below 0.5). Nevertheless, in an ideal case (when the number of restraints per residue is sufficiently high and the pattern is informative), the first 10 hits should contain more than one homolog of the query protein (if there are any in the database). If you know which SCOP family/superfamily your protein belongs to and no positive hits are obtained, it may be worth trying several non-default options, e.g. turning on length filtering or choosing a minimum sequential distance larger than 3. If you still do not get positive hits, then it may well be that the information content of the NOE restraints is not sufficient to represent the structure in question.

Evaluation of the PRIDE-NMR method

Detailed evaluation data are available in xls format: pridenmreval.xls. These data correspond to the results described in the PRIDE-NMR paper.