Protein NMR - A Practical Guide
Isotopic Labelling

This page describes some of the different isotopic labelling strategies that are commonly used in Protein NMR.

In most cases (cell-free labelling being the exception), the protein is expressed in bacteria. The isotopic labels are introduced by feeding the bacteria specific nutrients. The basis is usually a so-called minimal medium, which contains all the salts and trace elements needed by the bacteria but no carbon or nitrogen sources. These elements can then be introduced using a variety of isotopically labelled carbon and nitrogen sources.

Protein Preparation
15N-labelling is the simplest and cheapest form of labelling. The protein is produced by expression from bacteria which are grown on minimal medium supplemented with 15NH4Cl and unlabelled (natural-abundance) glucose.

Applications
Using 15N-labelled protein you can record the standard solution-NMR HSQC experiment. This is a good spectrum with which to check whether your protein is folded and to assess the spectral quality, so that you can decide whether it is worth recording other spectra and possibly proceeding to other, more expensive labelling schemes. It is also possible to record a variety of dynamics experiments using 15N-labelled protein, as well as N-H residual dipolar couplings (RDCs). If the protein is comparatively small (< ~150 amino acids) you can assign the 1H and 15N backbone resonances using 15N-NOESY and 15N-TOCSY experiments. 15N-labelled protein can also be useful for titrations with ligands or other proteins with which it forms a complex.

Reference:
L.P. McIntosh and F.W. Dahlquist (1990) Quart. Rev. Biophys. 23 1-38.

Protein Preparation
15N,13C-labelling is commonly referred to as double labelling. The protein is produced by expression from bacteria which are grown on minimal medium supplemented with 15NH4Cl and 13C-glucose.

Applications
This is a very common form of labelling and enables straightforward assignment of both the backbone and side-chain 1H, 13C and 15N atoms using triple-resonance spectra. A high proportion of these assignments is required if you wish to calculate the structure of your protein accurately.

Reference:
L.P. McIntosh and F.W. Dahlquist (1990) Quart. Rev. Biophys. 23 1-38.

Protein Preparation
15N,13C,2H-labelling is commonly referred to as triple labelling. The protein is produced by expression from bacteria which are grown on minimal medium supplemented with 15NH4Cl and 13C-glucose and using D2O instead of H2O. This will result in about 70-80% deuteration of the side-chains, as there is a certain amount of contaminating 1H present from the glucose. Higher levels of deuteration of around 95% can be achieved if 13C,2H-glucose is used. However, this is more expensive and in many cases the cheaper version is sufficient. Note that the NH groups are exchangeable. This means that they will back-exchange to 1H when the protein is purified in normal aqueous solution. In this way, many of the normal NH-based experiments can be carried out on triple-labelled protein.

Applications
Deuteration is mainly used for large proteins. On account of their slower tumbling, the relaxation properties become less favourable and the spectral quality deteriorates; eventually the peaks become so broad that no signal can be detected. By deuterating the protein and thus removing most 1H atoms (protons), the relaxation properties are improved again. Backbone and Cβ assignment of triply labelled proteins is possible using out-and-back versions of the HNCACB and HN(CO)CACB experiments. Note that the Cα and Cβ chemical shifts will be shifted by up to half a ppm or more from their values in 15N,13C-labelled protein due to the deuterium isotope effect.

Reference:
D.M. LeMaster (1990) Quart. Rev. Biophys. 23 133-174.

Protein Preparation
The IVL labelling scheme produces protein which is uniformly 2H,13C,15N-labelled, except for the Ile, Val and Leu side-chains which are labelled as follows:

IVL labelling pattern

Isoleucine, valine and leucine therefore each contain one methyl group which is 2H12C labelled and one which is 1H13C labelled. In isoleucine it is always the δ1 methyl group which is protonated and 13C labelled and the γ2 methyl group which is deuterated and 12C labelled. For leucine and valine the methyl groups are not stereoselectively labelled. This means that half the valine/leucine in the protein will be 13C and 1H labelled at the γ1/δ1 methyl group and the other half at the γ2/δ2 group. For this reason both valine and leucine methyl groups give rise to peaks in the NMR spectra, but at only about half the intensity of the isoleucine δ1 methyl group.
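
The intensity argument above can be made concrete with a small sketch. The dictionary and function names are hypothetical, and the numbers assume complete precursor incorporation and identical relaxation behaviour for every methyl group, which will not hold exactly for a real sample.

```python
# Fraction of protein molecules in which each methyl group is 1H,13C-labelled
# under the IVL scheme (idealised: complete precursor incorporation assumed).
IVL_METHYL_LABELLING = {
    ("Ile", "delta1"): 1.0,  # always protonated and 13C labelled
    ("Ile", "gamma2"): 0.0,  # always deuterated and 12C labelled
    ("Val", "gamma1"): 0.5,  # not stereoselective: half of the molecules
    ("Val", "gamma2"): 0.5,
    ("Leu", "delta1"): 0.5,
    ("Leu", "delta2"): 0.5,
}


def relative_intensity(residue, methyl):
    """Expected peak intensity relative to an Ile delta1 methyl group."""
    reference = IVL_METHYL_LABELLING[("Ile", "delta1")]
    return IVL_METHYL_LABELLING[(residue, methyl)] / reference


for residue, methyl in IVL_METHYL_LABELLING:
    print(f"{residue} {methyl}: {relative_intensity(residue, methyl):.1f}")
```

In practice the observed ratios will deviate from these ideal values, since methyl groups in different environments relax differently.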

The protein is produced by expression from bacteria which are grown on minimal medium in D2O using 13C,2H-glucose as the main carbon source and 15NH4Cl as the nitrogen source. One hour prior to induction, α-ketobutyrate and α-keto-isovalerate (labelled as shown below) are added to the growth medium; these lead to the desired labelling of the Ile residues and of the Val and Leu residues, respectively.

Labelled precursors used for IVL labelling

Applications
The labelling scheme was developed in order to improve structure calculations of large proteins. Large proteins are usually triply labelled and thus only the (mainly backbone) NH groups are 1H-labelled and visible in NMR spectra. NOEs are thus only observed between backbone NH groups, which leads to lower quality structures. Tugarinov and Kay developed the IVL labelling scheme based on the fact that methyl groups (a) give strong signals due to the rapid rotational averaging of their three equivalent protons, and (b) are usually found buried in the core. They are thus close to one another and able to give rise to NOE signals, and they provide very important structural restraints which are highly complementary to the NH-NH restraints.

Reference:
V. Tugarinov and L.E. Kay (2003) J. Am. Chem. Soc. 125 13868-13878.

Protein Preparation
SAIL-labelled protein is prepared using cell-free methods. The amino acids used for the protein production are prepared using chemical and enzymatic syntheses. Their labelling is guided by the following principles:
- in each methylene group, one of the 1H atoms is stereo-selectively replaced by a 2H atom
- in each methyl group, two of the 1H atoms are replaced by 2H atoms
- the prochiral methyl groups of Leu and Val are stereo-selectively 12C2H3 and 13C1H2H2 labelled
- six-membered aromatic rings have alternating 12C2H and 13C1H moieties
The SAIL Wiki site has a figure illustrating the labelling scheme.
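
As a rough illustration of how these rules thin out the side-chain protons, the sketch below counts the 1H atoms left on a leucine side chain. The group names and the helper function are hypothetical, and the sketch assumes that methine (CH) protons are left as 1H, which the rules above do not state explicitly.

```python
# 1H atoms per group in a uniformly 15N,13C-labelled (fully protonated) protein.
UNIFORM_PROTONS = {"CH": 1, "CH2": 2, "CH3": 3, "CH3_silent": 3}

# 1H atoms per group after applying the SAIL rules listed above.
SAIL_PROTONS = {
    "CH": 1,          # assumption: methine protons are kept as 1H
    "CH2": 1,         # one of the two 1H atoms replaced by 2H
    "CH3": 1,         # two of the three 1H atoms replaced by 2H
    "CH3_silent": 0,  # the 12C2H3 prochiral methyl group of Leu/Val
}


def count_protons(groups, table):
    """Sum the 1H atoms over a list of side-chain group types."""
    return sum(table[group] for group in groups)


# Leucine side chain: beta CH2, gamma CH and the two prochiral delta methyls.
leucine = ["CH2", "CH", "CH3", "CH3_silent"]

print(count_protons(leucine, UNIFORM_PROTONS))  # fully protonated side chain
print(count_protons(leucine, SAIL_PROTONS))     # SAIL-labelled side chain
```

Under these assumptions the leucine side chain goes from nine 1H atoms to three.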

Applications
The SAIL labelling pattern is particularly useful for the structure calculation of large proteins, with advantages seen in the NOESY spectra. Non-exchangeable side-chain protons are essential for defining side-chain conformations, but are often prone to overlap. By reducing the number of these protons in SAIL-labelled samples, the overlap is decreased and the quality of NOESY spectra is improved. Indeed, although there are fewer NOE peaks for a SAIL-labelled protein than for a uniformly 15N,13C-labelled one, it is likely that there are more interpretable NOE peaks. In any case, many of the peaks which are removed have a low information content because they involve fixed (geminal) distances or duplicate information already present from other NOEs. In addition, the information obtained is immediately stereo-specific. A further advantage is that the most serious sources of spin diffusion are removed, which improves the accuracy of inter-proton distance measurements.
More general advantages of SAIL-labelled samples include the fact that lines become sharper due to a decrease in long-range couplings and the elimination of dipolar relaxation pathways. The aromatic ring labelling strategy removes one-bond 13C-13C couplings, which often complicate spectra or require the use of constant-time data collection methods to reduce spectral complexity. The measurement of couplings is also made easier because CH2 and CH3 groups are converted to CHD and CHD2 groups.
The labelling of methyl and methylene groups in SAIL-labelled samples also makes them amenable to straightforward relaxation experiments for the investigation of side-chain motions.
Despite the clear advantages of using SAIL-labelled protein for structure determination, this labelling method is not very common, probably because it is very expensive and requires cell-free technology which is not available in many labs.

Reference:
M. Kainosho, T. Torizawa, Y. Iwashita, T. Terauchi, A.M. Ono and P. Güntert (2006) Nature 440 52-57.

SAIL Wiki run by the Kainosho, Jee and Güntert labs

Protein Preparation
The protein is produced by expression from bacteria which are grown on minimal medium supplemented with 15NH4Cl and either 1,3-13C-labelled glycerol or 2-13C-labelled glycerol.

The labelling scheme obtained is somewhat involved. In the following figure the sites coloured in blue are labelled in the 1,3-13C glycerol sample and the sites coloured in red are labelled in the 2-13C glycerol sample.

Glycerol labelling pattern

For those amino acids synthesised via the citric acid cycle several so-called isotopomers are obtained which then result in the average labelling shown above. For threonine the individual isotopomers which give rise to the average labelling are shown. A complete figure of all the isotopomers is available here. Note that the relative populations of isotopomers shown in these figures are those determined for SH3 (Castellani et al. 2002) - for other proteins they may be slightly different.

Applications
Originally, labelling with 2-13C-glycerol was used for relaxation measurements in solution (LeMaster and Kushlan, 1996). The Cα atoms in such samples do not have any neighbouring 13C-labelled atoms (with the sole exception of valine). Relaxation measurements on the isolated 13Cα-1Hα groups can therefore be interpreted more easily without having to take couplings to the neighbouring carbon atoms into account.
More recently, Castellani et al. (2002) showed that 1,3-13C- and 2-13C-glycerol labelled samples are useful for protein structure determination in solid-state MAS NMR. The samples have several advantages: (a) many J-couplings are removed, which results in sharper lines, (b) very strong dipolar couplings between neighbouring atoms are removed, which enables more long-range correlations via weak dipolar couplings to be observed, and (c) the spectral overlap is decreased. Currently this is probably the most common application of 1,3-13C- and 2-13C-glycerol labelling. The characteristic labelling patterns for each amino acid can also be exploited for assignment purposes (Higman et al. 2009).
Hong and Jakes (1999) combined 2-13C-glycerol labelling with amino acid specific labelling in order to label only half the amino acids with 2-13C-glycerol.

References:
D.M. LeMaster and D.M. Kushlan (1996) J. Am. Chem. Soc. 118 9255-9264.
M. Hong and K. Jakes (1999) J. Biomol. NMR 14 71-74.
F. Castellani, B.-J. van Rossum, A. Diehl, M. Schubert, K. Rehbein and H. Oschkinat (2002) Nature 420 98-102.
V.A. Higman, J. Flinders, M. Hiller, S. Jehle, S. Markovic, S. Fiedler, B.-J. van Rossum and H. Oschkinat (2009) J. Biomol. NMR 44 245-260.

Protein Preparation
The protein is produced by expression from bacteria which are grown on minimal medium supplemented with small amounts of 15NH4Cl and 13C-labelled glucose as well as labelled and unlabelled amino acids. The idea is that only those amino acids which are added in labelled form become labelled in the protein. Unfortunately, this may not always work as desired, since the E. coli metabolism and catabolism cause a degree of interconversion between amino acids. Thus, it is not possible to create a sample with an arbitrary combination of labelled amino acids. The situation can be improved somewhat by using auxotrophic bacterial strains or incorporating enzyme inhibitors. However, if complete control over the incorporation of amino acids is required, then cell-free methods must be used.

A cheaper way of labelling only certain amino acids, often called reverse labelling, involves expression from bacteria which are grown on minimal medium supplemented with 15NH4Cl and 13C-labelled glucose as well as unlabelled amino acids. This suppresses the labelling of these amino acids, and only those which have not been added in unlabelled form will be synthesised by the bacteria using the 13C-glucose as the carbon source. Again, a certain amount of scrambling may occur.
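
The difference between the two approaches can be sketched with simple set operations. This is an idealised illustration only: the function names are hypothetical, the example amino acids are arbitrary, and scrambling between amino acids is ignored completely.

```python
# The 20 proteinogenic amino acids as one-letter codes.
ALL_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def labelled_after_selective(added_labelled):
    """Selective labelling: only the amino acids supplied in labelled form."""
    return set(added_labelled) & ALL_AMINO_ACIDS


def labelled_after_reverse(added_unlabelled):
    """Reverse labelling: everything except the amino acids supplied unlabelled."""
    return ALL_AMINO_ACIDS - set(added_unlabelled)


# Arbitrary example choices, not a recommendation.
print(sorted(labelled_after_selective("KR")))  # Lys and Arg added labelled
print(sorted(labelled_after_reverse("FWY")))   # Phe, Trp and Tyr added unlabelled
```

In a real sample the sets will be blurred by interconversion between amino acids, so the result should be read as the intended pattern rather than the observed one.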

Applications
Amino acid specific labelling is usually used in order to reduce the spectral overlap and to monitor certain amino acids without interference from other signals. In some cases it may also be used to help with the assignment process.

Reference:
L.P. McIntosh and F.W. Dahlquist (1990) Quart. Rev. Biophys. 23 1-38.

Protein Preparation
Rather than expressing the protein in a cell, the protein is expressed in vitro. The DNA or mRNA for the target protein is added to a cell extract containing the transcription and translation machinery along with a variety of other compounds including the 20 amino acids, nucleoside triphosphates (NTPs), several enzymes as well as buffers, salts etc. The reaction mixture typically has a volume of only a few μl to a few ml, but yields of several mg of protein per ml of reaction mixture can be achieved. Chaperones, detergents and other compounds that facilitate folding can also be added.
It is very easy to label individual amino acids simply by adding them to the reaction mixture in labelled form and adding the other amino acids unlabelled. Although labelled amino acids are quite expensive, the small reaction volumes used in cell-free protein expression keep the costs manageable.

Applications
Cell-free protein expression has several advantages, both for creating specific isotopic labelling schemes and for efficient expression in general. Amino-acid specific labelling is possible without scrambling from the E. coli catabolism and metabolism. Furthermore, it is possible to introduce alternative amino acids, e.g. fluorotryptophan, with far greater efficiency than when using cell-based systems. More generally, cell-free protein expression can be very useful for proteins which do not express well in cells, for example because they are toxic to the cell. Cell-free expression is also proving to be successful for membrane proteins, which have traditionally been very difficult to express in large amounts in cells.

References:
G. Zubay (1973) Ann. Rev. Genet. 7 267-287.
T. Kigawa, Y. Muto and S. Yokoyama (1995) J. Biomol. NMR 6 129-134.
D. Schwarz, V. Dötsch and F. Bernhard (2008) Proteomics 8 3933-3946.

Protein Preparation
Segmentally labelled proteins are only labelled along one section of the amino acid chain (usually the N- or C-terminal section). Sample preparation falls into three broad categories. Native Chemical Ligation (NCL) involves the preparation of one labelled and one unlabelled section of the polypeptide chain using solid-phase peptide synthesis (SPPS). These two sections are then joined using chemical ligation. This method is restricted to relatively short polypeptides which can be chemically synthesised using SPPS.

By contrast, Expressed Protein Ligation (EPL) involves recombinant expression of one or both protein halves followed by chemical ligation. Since chemical ligation requires a C-terminal α-thioester, which an expressed polypeptide chain does not carry, the N-terminal fragment is expressed C-terminally fused to an intein. Attack by a thiol results in the intein being cleaved off and the N-terminal fragment forming an α-thioester. The C-terminal section meanwhile is prepared with an N-terminal cysteine. This can now attack the α-thioester and form a peptide bond between the two protein halves. Thus the point at which the two protein segments are joined will contain a cysteine; this may have to be added to the protein sequence if there is no suitable native cysteine which can be used as the dividing point.

Diagram illustrating the principles of Expressed Protein Ligation
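
When planning an EPL experiment, a first step is to see whether the sequence already offers a native cysteine as the dividing point. The sketch below lists the possible splits; the function name is hypothetical and the sequence is made up purely for demonstration. It only considers the position of the cysteine and says nothing about whether a given site is structurally sensible or ligates efficiently.

```python
def cysteine_junctions(sequence):
    """Return (n_fragment, c_fragment) pairs split in front of each native Cys.

    The C-terminal fragment always starts with the cysteine, as required for
    the attack on the alpha-thioester of the N-terminal fragment.
    """
    sequence = sequence.upper()
    return [
        (sequence[:i], sequence[i:])
        for i, residue in enumerate(sequence)
        if residue == "C" and i > 0  # a split at position 0 leaves no N-fragment
    ]


# Made-up sequence, used only to demonstrate the call.
for n_fragment, c_fragment in cysteine_junctions("MKTAYCIAKQRCLE"):
    print(n_fragment, "+", c_fragment)
```

An empty result means that a cysteine would have to be introduced at the desired junction.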

In Protein trans-Splicing (PTS) the ligation step is carried out by a functionally reconstituted split intein. An intein can covalently link the two peptide segments flanking it, while excising itself. For protein trans-splicing, the N-terminal parts of the target protein and of the intein are encoded as a single peptide chain in one plasmid and are expressed. Meanwhile the C-terminal parts of the target protein and of the intein are encoded in a second plasmid and expressed separately using different isotopic labelling. When the two protein-intein chains are mixed under the correct conditions, the intein can fold into its functionally correct form and can covalently link the two target protein segments.

Diagram illustrating the principles of Protein trans-splicing

As with EPL, a Cys residue must be present at the join of the two segments. In fact, for the trans-splicing reaction to occur efficiently, the residues flanking the Cys must also be chosen carefully depending on which intein is used. If this requires mutations or insertions to the native protein, it is advisable to check that these do not significantly affect protein structure and function.

Otomo et al. (1999) have even gone so far as to use two split inteins in order to create a protein in which only the middle segment is labelled.

Diagram illustrating the use of two inteins to fuse three protein segments

Applications
Segmental isotope labelling is not entirely straightforward and thus not particularly common. However, it is extremely useful for the study of multidomain proteins in which signal overlap can be reduced by isotopically labelling only one of the domains. Alternatively, it can be used to reduce signal overlap in a large protein by labelling only part of it at a time. A further application involves labelling only a small segment of a protein which is of functional interest and which can then be studied without other signals cluttering the spectra.

References:
T. Yamazaki, T. Otomo, N. Oda, Y. Kyogoku, K. Uegaki, N. Ito, Y. Ishino and H. Nakamura (1998) J. Am. Chem. Soc. 120 5591-5592.
T. Otomo, N. Ito, Y. Kyogoku and T. Yamazaki (1999) Biochemistry 38 16040-16044.
V. Muralidharan and T.W. Muir (2006) Nature Methods 3 429-438.

Here are a few other labelling schemes which I am aware of - but this is by no means a complete list. Please look at the references for more details.

Alanine Methyl Labelling

An extension of, or alternative to, the IVL labelling method in which alanine residues are 13C labelled at the Cβ position and deuterated at the Hα position.

Reference:
R.L. Isaacson, P.J. Simpson, M. Liu, E. Cota, X. Zhang, P. Freemont and S. Matthews (2007) J. Am. Chem. Soc. 129 15428-15429.

10% Glucose Labelling

This results in a somewhat complicated labelling scheme rather like the glycerol labelling. It can be used to enable stereospecific assignment of Val and Leu methyl groups and also simplifies MAS-NMR 13C-13C correlation experiments.

References:
H. Senn, B. Werner, B.A. Messerle, C. Weber, R. Traber and K. Wüthrich (1989) FEBS Letters 249 113-118.
D. Neri, T. Szyperski, G. Otting, H. Senn and K. Wüthrich (1989) Biochemistry 28 7510-7516.
M. Schubert, T. Manolikas, M. Rogowski and B.H. Meier (2006) J. Biomol. NMR 35 167-173.

1-Glucose Labelling

This scheme introduces isolated 13C atoms into aromatic side chains. This was used to study the dynamics of aromatic side chains using relaxation methods.

Reference:
K. Teilum, U. Brath, P. Lundström and M. Akke (2006) J. Am. Chem. Soc. 128 2506-2507.

Succinic Acid Labelling

[1,2,3,4-13C], [1,4-13C] or [2,3-13C] succinic acid was used as the carbon source in the bacterial growth medium to label a light-harvesting complex in the photosynthetic purple bacterium R. acidophila. The resulting MAS-NMR spectra show an increase in resolution and reduced spectral crowding.

Reference:
A.J. van Gammeren, F.B. Hulsbergen, J.G. Hollander and H.J.M. de Groot (2004) J. Biomol. NMR 30 267-274.

Acetate Labelling

The protein is produced using a mixture of 15% 13C1-acetate, 15% 13C2-acetate and 70% 12C1,12C2-acetate. This reduces (but does not eliminate) the number of directly bonded 13C atoms and makes relaxation studies easier.
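
The reason this mixture reduces, but does not eliminate, directly bonded 13C pairs can be seen from a short probability estimate. The sketch below uses hypothetical names and rests on strong simplifying assumptions: each carbon in the protein is traced back to either C1 or C2 of a single acetate molecule, different acetate molecules are incorporated independently, and both natural-abundance 13C and metabolic scrambling are ignored.

```python
# Mole fractions of the three acetate species in the growth medium.
FRACTION_13C1 = 0.15  # labelled at C1 only
FRACTION_13C2 = 0.15  # labelled at C2 only
FRACTION_12C = 0.70   # unlabelled

# Sanity check: the three fractions describe the complete mixture.
assert abs(FRACTION_13C1 + FRACTION_13C2 + FRACTION_12C - 1.0) < 1e-9


def p_labelled(position):
    """Probability that a carbon derived from acetate C1 or C2 is 13C."""
    return FRACTION_13C1 if position == 1 else FRACTION_13C2


def p_both_labelled(pos_a, pos_b, same_acetate):
    """Probability that two directly bonded carbons are both 13C."""
    if same_acetate:
        # No species in the mixture carries 13C at both C1 and C2.
        return 0.0
    # Carbons from different acetate molecules are labelled independently.
    return p_labelled(pos_a) * p_labelled(pos_b)


print(p_both_labelled(1, 2, same_acetate=True))
print(p_both_labelled(2, 2, same_acetate=False))
```

Under these assumptions a bonded pair that originates from two different acetate molecules is doubly labelled in roughly 2% of molecules, while a pair from the same acetate molecule is never doubly labelled.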

Reference:
A.J. Wand, R.J. Bieber, J.L. Urbauer, R.P. McEvoy and Z. Gan (1995) J. Mag. Res. 108 173-175.
