In this tutorial we will
We will use the software CYANA.
For viewing the molecular structures, we will use VMD.
In a real experimental situation, we never have a complete set of distances between all pairs of protons. NOE cross-peaks are detectable only for distances up to around 5 Å. Of these, many signals share the
same frequency in the spectrum, so the assignment of a signal to an atom (or atom pair) can be made only within some group of candidates, or not at all. Furthermore, the experimentally derived distances contain errors from various sources.
These are partly due to random noise, partly due to incompletely resolved relayed transfer, and partly due to different (local) dynamics influencing the cross-relaxation rate.
Let's start anyway with the unrealistic situation in which we know all the distances within 5.5 Å accurately.
They are written to an upper distance limit file (here PxP.upl), which is used by CYANA.
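As an illustration of this idealized setup, the following minimal sketch lists all proton pairs within 5.5 Å and formats each as one line of an upper-limit file. The three atoms and their coordinates are hypothetical, and the column layout (residue number, residue name and atom name for both atoms, followed by the limit) is an assumption that should be checked against the CYANA manual before use.

```python
import math

def upper_limits(atoms, cutoff=5.5):
    """Return (atom_a, atom_b, distance) for every pair closer than cutoff (in Angstrom)."""
    limits = []
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            d = math.dist(atoms[i][3], atoms[j][3])
            if d <= cutoff:
                limits.append((atoms[i], atoms[j], d))
    return limits

def format_upl_line(a, b, d):
    # Assumed layout: residue number, residue name, atom name (twice), then the limit.
    return f"{a[0]:4d} {a[1]:<4s} {a[2]:<4s} {b[0]:4d} {b[1]:<4s} {b[2]:<4s} {d:7.2f}"

# Hypothetical protons: (residue number, residue name, atom name, (x, y, z))
atoms = [
    (1, "ALA", "H", (0.0, 0.0, 0.0)),
    (1, "ALA", "HA", (2.0, 0.0, 0.0)),
    (2, "GLY", "H", (0.0, 7.0, 0.0)),
]

limits = upper_limits(atoms)
upl_lines = [format_upl_line(a, b, d) for a, b, d in limits]
for line in upl_lines:
    print(line)
```

Only the first pair is closer than the cutoff here, so a single line is produced.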
There is no closed-form formula to calculate the conformation (structure) from a set of distances. The setup starts by defining an energy penalty for every experimental distance that is not fulfilled by the molecular conformation. These penalty terms are also called distance restraints. Starting from one chosen conformation and minimizing its energy with steepest descent or another local method, in order to fulfill the distances measured by NOE (or by any other means), would fail: the structure would end up in a local minimum. Instead, we have to search for the global minimum. A commonly used algorithm for global minimization is simulated annealing, in which the molecule is heated up so that high energy barriers (due to van der Waals clashes) can be crossed. During the subsequent cooling, the imposed distance restraints drive the molecule towards the conformation with minimal violation of the distance restraints. Many attempts will nevertheless end up in different local minima; hence only a subset of the resulting conformers, those with the lowest energy, is likely to represent the global minimum.
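The idea can be sketched on a toy problem. The code below is not CYANA's algorithm: it is a plain Metropolis Monte Carlo annealing of three free points in Cartesian space, with a penalty made only of squared violations of upper distance limits and no van der Waals term. All names, the cooling schedule, and the numbers are hypothetical choices made for illustration.

```python
import math
import random

def penalty(coords, restraints):
    """Sum of squared violations of upper distance limits (i, j, upper)."""
    total = 0.0
    for i, j, upper in restraints:
        d = math.dist(coords[i], coords[j])
        if d > upper:
            total += (d - upper) ** 2
    return total

def anneal(coords, restraints, steps=5000, t_start=5.0, t_end=0.01, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule."""
    rng = random.Random(seed)
    coords = [list(c) for c in coords]
    energy = penalty(coords, restraints)
    for step in range(steps):
        # Temperature decreases geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        k = rng.randrange(len(coords))
        old = coords[k][:]
        coords[k] = [x + rng.gauss(0.0, 0.3) for x in old]
        new_energy = penalty(coords, restraints)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if new_energy <= energy or rng.random() < math.exp((energy - new_energy) / t):
            energy = new_energy
        else:
            coords[k] = old
    return coords, energy

# Hypothetical start: three points far apart, all pairs restrained to 3 Angstrom.
start = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
restraints = [(0, 1, 3.0), (0, 2, 3.0), (1, 2, 3.0)]

# Several independent attempts; keep the conformer with the lowest penalty.
conformers = [anneal(start, restraints, seed=s) for s in range(5)]
best_coords, best_energy = min(conformers, key=lambda c: c[1])
print(penalty(start, restraints), best_energy)
```

Running several seeds and keeping the lowest-penalty result mirrors the selection of the lowest-energy conformers described above.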
In practice, we have to supply the knowledge about the covalent (bonding) structure of the molecule, together with the distance restraints. The bonding structure can be as simple as the chain of amino acids, because the standard programs have libraries of the actual atomic bonding (topology) for those. For an unknown molecule, we have to supply the full topology ourselves. The topology format differs between programs.
We will use a program specialized in structure calculation from NMR restraints: CYANA by Prof. Dr. Peter Güntert.
CYANA can obtain the bonding topology from a .mol2 molecular structure file, converting it into its own library format, a .lib file. This library file contains information about one molecule. Since biopolymers such as proteins consist of a chain (sequence) of building blocks, like amino acid residues or nucleotides, there also has to be information about the sequence of those building blocks. In our case, it contains only one line: the name and the index of our ligand molecule.
Hint: More information on the CYANA commands etc. is in the CYANA 3.0 Reference Manual.
Remark: CYANA is proprietary software. For any installation problem, contact Peter Güntert, the author of CYANA.
It used to be common to simply classify the experimental NOE intensities as strong, medium, and weak, and to assign corresponding distance limits ranging from around 2 Å for the strongest to 5.5 Å for the weakest.
Here we use a technique that can serve as a starting point for more accurate methods. It uses spectra recorded at different mixing times. Since a NOESY spectrum of a protein can take days to record, recording a series of them is a large investment. In this technique, the NOE cross-peak volume (intensity) is divided by the geometric mean of the corresponding diagonal-peak volumes, and the result is followed as a function of mixing time (the buildup curve). The slope of this curve is a better approximation of the cross-relaxation rate than the slope obtained without this "linearization" or "normalization". In the case of two isolated nuclei, such a buildup curve is approximately linear over longer mixing times.
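The normalization can be sketched in a few lines. The peak volumes and mixing times below are made-up numbers chosen so that the normalized curve is exactly linear. The conversion to a distance assumes the usual r^-6 dependence of the cross-relaxation rate and a reference pair with known distance and rate (both hypothetical here), and only the magnitude of the rate is used.

```python
import math

def normalized_buildup(cross, diag_i, diag_j):
    """Cross-peak volume divided by the geometric mean of the two diagonal volumes."""
    return [c / math.sqrt(a * b) for c, a, b in zip(cross, diag_i, diag_j)]

def slope_through_origin(x, y):
    """Least-squares slope k of the line y = k * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def distance_from_rate(sigma, sigma_ref, r_ref):
    """Assumes sigma is proportional to r**-6: r = r_ref * (sigma_ref / sigma)**(1/6)."""
    return r_ref * (sigma_ref / sigma) ** (1.0 / 6.0)

# Hypothetical data: mixing times in seconds and peak volumes in arbitrary units.
mixing_times = [0.02, 0.04, 0.06, 0.08]
diag_i = [90.0, 80.0, 70.0, 60.0]
diag_j = [90.0, 80.0, 70.0, 60.0]
cross = [0.9, 1.6, 2.1, 2.4]

ratios = normalized_buildup(cross, diag_i, diag_j)
sigma = slope_through_origin(mixing_times, ratios)  # magnitude of the rate, in 1/s

# Hypothetical reference pair: 1.8 Angstrom apart with a rate of 32 1/s.
r = distance_from_rate(sigma, sigma_ref=32.0, r_ref=1.8)
print(sigma, r)
```

Note that the raw cross-peak volumes level off at longer mixing times, whereas the normalized ratios grow linearly, which is the point of the normalization.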
Here we probe three effortless options to obtain a better set of distances. The first comes from the idea that short distances are much less likely to be affected by relayed transfer (spin diffusion), so we try to keep only those within 2.5 Å.
In the next simple attempt, we multiply all the distances by a constant factor (1.75) and again use those within 5.5 Å.
Next, we can suspect that the number of (inaccurate) restraints is simply too large. Commonly, we would expect up to
around 10 restraints per amino acid residue. In the previous example, there were still around 12800/97 > 100 restraints
per residue. Here we try to use only every 10th restraint.
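The three options can be written as simple filters on a list of restraints. This is a sketch with hypothetical atom names and distances; in particular, applying the 5.5 Å cutoff after the scaling (rather than before) is an assumption about the intended order.

```python
def keep_short(restraints, cutoff=2.5):
    """Option 1: keep only restraints with a distance within the cutoff."""
    return [(a, b, d) for a, b, d in restraints if d <= cutoff]

def scale_and_cut(restraints, factor=1.75, cutoff=5.5):
    """Option 2: loosen every distance by a constant factor, then apply the cutoff."""
    scaled = [(a, b, d * factor) for a, b, d in restraints]
    return [r for r in scaled if r[2] <= cutoff]

def every_nth(restraints, n=10):
    """Option 3: thin out the list by keeping every n-th restraint."""
    return restraints[::n]

# Hypothetical restraints: (atom_a, atom_b, distance in Angstrom)
restraints = [("H1", "H2", 2.0), ("H1", "H3", 3.0), ("H2", "H3", 4.0)]

short = keep_short(restraints)       # only the 2.0 A restraint survives
loosened = scale_and_cut(restraints) # 3.5 and 5.25 A survive, 7.0 A is dropped

many = [("HA", "HB", 3.0)] * 25      # stand-in for a long restraint list
thinned = every_nth(many)            # keeps items 0, 10 and 20
print(len(short), len(loosened), len(thinned))
```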
We will investigate the effect of exchange of the ligand between the free state in solution and the state bound to the protein.
The NOESY spectrum of the ligand (the intramolecular NOEs) is formed as a population average of the bound and free forms. In the free form, the cross-relaxation rate is commonly negative, whereas in the bound form it is positive, the same as for the protein. Moreover, in the bound form it is commonly much larger in absolute value.
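The population average can be illustrated with a two-state sketch, assuming exchange that is fast enough for the observed rate to be a simple weighted mean. The rates and the bound fraction are hypothetical numbers, with signs following the convention used in the text above.

```python
def averaged_rate(p_bound, sigma_bound, sigma_free):
    """Population-weighted cross-relaxation rate of a ligand in fast exchange."""
    return p_bound * sigma_bound + (1.0 - p_bound) * sigma_free

def crossover_population(sigma_bound, sigma_free):
    """Bound fraction at which the averaged rate changes sign."""
    return -sigma_free / (sigma_bound - sigma_free)

# Hypothetical rates in 1/s: small in the free form, large in the bound form.
sigma_free = -0.05
sigma_bound = 2.0

observed = averaged_rate(0.1, sigma_bound, sigma_free)
p_zero = crossover_population(sigma_bound, sigma_free)
print(observed, p_zero)
```

Because the bound-state rate is much larger in magnitude, even a bound fraction of a few percent is enough for the bound form to dominate the averaged rate in this example.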
What is even more important is the population weighting of the intermolecular cross-relaxation rates (and hence also of the NOEs).
The "Re weighted" matrix includes the correction, whereas the "weight averaged over P, L, PL" matrix does not.
We have exact distances (assume we are able to obtain them), but we ignore the populations, so the intermolecular calibration is wrong.
Here the populations are taken into account correctly.
Assembled by Dr. Jiří Mareš, shaped by discussion with Prof. Julien Orts and other members of the research group (https://bionmr.univie.ac.at/people/)