Lee Makowski - Principal Investigator

 

4HHB: The crystal structure of human deoxyhaemoglobin at 1.74 Å resolution

Project Members

Lee Makowski
Diane J. Rodi           
Bob Fischetti (GM/CA-CAT, APS)
Satish Devarapalli
Suneeta Mandava

 

 

 

The high-resolution, three-dimensional structure of a protein provides an important basis for evaluating protein function. Unfortunately, high-resolution structural imaging via macromolecular crystallography and NMR spectroscopy is applicable only to a relatively small proportion of the proteome, and published success rates for high-throughput (HTP) structural genomics centers are currently less than 2% from target identification to fold characterization. For proteins with detectable sequence similarity to a structurally characterized protein, it is possible to predict the fold or structural class with a level of confidence that depends on the degree of similarity. A number of biophysical tools can provide information on the size, shape and secondary structure of a protein, but experimental verification of a fold prediction is only possible using protein crystallography or NMR.

Wide angle x-ray scattering from proteins in solution produces data that contain information relevant to determination of protein fold. But at the relevant scattering angles these data are weak, and the degree to which they might be used to categorize the fold of a protein is unknown. Our preliminary work at BioCAT (sector 18 of the Advanced Photon Source) has demonstrated the capability of collecting scattering data from proteins in solution to spacings of 2.2 Å (q = 2.8 Å-1), but the potential for the use of these data in structural characterization of proteins has not yet been explored. This project seeks to optimize and enhance the collection and analysis of solution scattering data and to determine the extent to which it is possible to accurately determine the fold of a protein from these data.
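The spacing and q values quoted above are consistent with the convention q = 2π/d, where d is the real-space spacing; the figure axes elsewhere on this page use the reciprocal spacing 1/d. The following Python sketch shows the conversion under that assumption. The function names are illustrative and are not taken from the project's software.

```python
import math


def q_from_spacing(d_angstrom):
    """Momentum transfer q (in 1/Å) for a spacing d (in Å), assuming q = 2*pi/d."""
    if d_angstrom <= 0:
        raise ValueError("spacing must be positive")
    return 2.0 * math.pi / d_angstrom


def spacing_from_q(q_inv_angstrom):
    """Spacing d (in Å) for a momentum transfer q (in 1/Å), assuming q = 2*pi/d."""
    if q_inv_angstrom <= 0:
        raise ValueError("q must be positive")
    return 2.0 * math.pi / q_inv_angstrom


def reciprocal_spacing(d_angstrom):
    """Reciprocal spacing 1/d (in 1/Å), the quantity used on the figure axes."""
    if d_angstrom <= 0:
        raise ValueError("spacing must be positive")
    return 1.0 / d_angstrom


if __name__ == "__main__":
    d = 2.2  # Å, the limiting spacing quoted in the text
    print("q   =", round(q_from_spacing(d), 2), "1/Å")
    print("1/d =", round(reciprocal_spacing(d), 2), "1/Å")
```

Under this convention a spacing of 2.2 Å corresponds to q of roughly 2.8 to 2.9 Å-1 and to 1/d of about 0.45 Å-1.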

Small angle x-ray scattering (SAXS) of proteins and macromolecular complexes in solution has long been used to reliably yield information about the size and shape of proteins. More recently it has been demonstrated that wide angle x-ray scattering (WAXS) patterns obtained at high-flux, third-generation synchrotron beam lines are sensitive to protein conformational states and can be quantitatively compared to data calculated from detailed structural models (Svergun et al., 1995; Hirai et al., 2002). Our recent data collection demonstrates that it is possible to extend this work to even higher scattering angles.

This data, made possible by third generation synchrotron sources, provides a rich source of structural information that has not yet been exploited. Given the broad range of conditions and particle sizes amenable to solution x-ray scattering, a combination of HTP SAXS and WAXS analysis downstream from a large-scale protein production facility has the potential to generate information on the size, shape and structural class (i.e. fold) of every expressed protein. It is applicable to all classes of proteins including membrane proteins, large protein complexes and proteins with substantial amounts of disordered material. An unambiguous determination of fold cannot be obtained directly from solution scattering data (Svergun et al., 2001), but comparison of solution scattering from proteins of unknown structure with data from proteins of known structure has the potential for reducing the number of possible folds to a very short list, if not a unique designation. In addition to information on the structure of the protein, information about processes accompanied by large structural changes (that cannot be accommodated within a crystal lattice) can be obtained from solution scattering. These include protein folding and unfolding, protein-ligand interactions and domain movement.
 

Preliminary Results

Feasibility studies performed at the BioCAT undulator beam line at the APS indicate that WAXS studies of protein solutions have significant potential for closing the gap between target and fold designation that presently exists in the structural genomics pipeline. Our group has designed an apparatus for the initial studies using WAXS of protein solutions with minimal air gaps, coupled with a CCD detector specially designed for imaging measurements requiring relatively high sensitivity (a dynamic range of 10,000 to 1) and high spatial resolution (50 μm) (Phillips et al., 2002). Initial scattering experiments with well-characterized, commercially available protein samples indicated that radiation damage was reduced to a negligible level through the use of a sample flow cell. Under these conditions, a scattering curve for a q range of between 1/100 and 1/2.5 Å-1 is measured in less than ten seconds for protein solutions of concentration 5 mg/ml or higher. Smaller angle information, when required, can be obtained with an alternate camera using very short exposures.

Two sessions of small- and wide-angle x-ray solution scattering were carried out at BioCAT in late 2002. During the first session, two camera arrangements were used: a small angle camera with a specimen to detector distance of 186 cm, and a wide angle camera with a specimen to detector distance of 147 mm. Collection of diffraction data from each protein solution was preceded by collection of data from an identical buffer in the absence of protein. Use of a flow through specimen cell allowed collection of data from protein and from buffer using identical portions of the cell.
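The buffer measurement described above serves as the background that is subtracted from the protein measurement. A minimal Python sketch of that subtraction follows, assuming both curves are sampled at the same scattering angles. The function name and the scale parameter are hypothetical; the source does not state how the buffer curve was scaled before subtraction.

```python
def subtract_buffer(protein, buffer, scale=1.0):
    """Point-by-point protein-minus-buffer intensity.

    protein, buffer: intensities sampled at the same scattering angles.
    scale: hypothetical multiplier applied to the buffer curve before
           subtraction (1.0 means the buffer is subtracted as measured).
    """
    if len(protein) != len(buffer):
        raise ValueError("protein and buffer curves must have the same length")
    return [p - scale * b for p, b in zip(protein, buffer)]


if __name__ == "__main__":
    # Toy numbers for illustration only, not measured data.
    protein_curve = [105.0, 52.0, 21.0]
    buffer_curve = [100.0, 50.0, 20.0]
    print(subtract_buffer(protein_curve, buffer_curve))
```

Because the protein signal is a small difference between two large numbers, a small error in the scale factor produces a comparatively large error in the subtracted curve.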

Solution scattering from myoglobin (red curve) compared to that from hemoglobin (black curve). The general forms of the two scattering curves are very similar, reflecting the similarity in the structure of the two proteins. In diffraction from hemoglobin additional structure is observed in the 0.02-0.1 Å-1 range due to its quaternary structure.

Thirteen proteins were analyzed using both SAXS and WAXS. Of these 13, eight were provided by the MCSG and two were integral membrane proteins supplied by collaborators in the Biosciences Division at Argonne. The remaining three were purchased. Experiments were carried out to investigate the effect of radiation damage and the effect of concentration on the quality of the data. Repeated exposure of stationary protein solutions was used to gauge the effect of high exposure on wide angle scattering. These experiments verified that high quality WAXS data could be collected prior to protein degradation in the beam. Nevertheless, the effect of radiation was minimized through the use of a flow cell in which protein flowed across the beam path during the exposure, so that x-ray exposure of any one portion of the specimen was less than 100 ms. Wide-angle data from solutions of proteins at higher concentration showed no signs of aggregation, and the higher concentration lowered potential errors due to mis-scaling of diffraction from the buffer solution prior to subtraction from the data.

During the second session data from an additional 25 proteins were collected. A newly designed camera was used for collection of WAXS data, enabling collection of observable data to spacings of approximately 0.45 Å-1 (q = 2.8 Å-1). Data from myoglobin and hemoglobin were remarkably similar, except for the presence, in the hemoglobin data, of a modulation corresponding to the distance between the four myoglobin-like domains (see figure). Data from several proteins were collected as a function of GuHCl concentration. In each case, at higher concentrations, the higher frequency modulations weakened relative to the overall distribution of diffracted intensity, suggesting a loss of structural coherence on the scale of the diameter of the protein. Data from myoglobin in 2M and 4M GuHCl are included in the second figure. Comparison of this effect in two immunoglobulin domains (one engineered for additional stability) indicated that the observed effect of GuHCl on diffracted intensity was significantly different in proteins known to have different levels of structural stability.

Comparison of the solution scattering data with that calculated using the program CRYSOL showed that the calculated data differed from that observed in several interesting and informative ways. Out to spacings of about 0.15 Å-1, the positions of major features in the observed diffraction usually corresponded well to those calculated. At higher angles, the number of harmonics used by CRYSOL appears to be inadequate to duplicate the detail in the observed patterns. Furthermore, the relative scaling of small angle diffraction and wide angle diffraction is different in the calculated and observed patterns. In the observed patterns, the ratio of intensity of peaks at high angle to those at low angle is greater than in the calculated patterns. This could also be due to the limited number of harmonics available in CRYSOL, or could be due to a mis-scaling (within CRYSOL) of the contribution of internal (atomic) detail with the contribution of the solvent excluding volume.
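One simple way to put a calculated and an observed curve on a common scale before comparing them is a least-squares scale factor followed by a normalized residual. The Python sketch below illustrates that idea only; it is not CRYSOL's own fitting procedure, the function names are hypothetical, and it assumes both curves are sampled at the same scattering angles.

```python
def best_scale(observed, calculated):
    """Least-squares factor k that minimizes sum((obs - k * calc)**2)."""
    if len(observed) != len(calculated):
        raise ValueError("curves must have the same length")
    denominator = sum(c * c for c in calculated)
    if denominator == 0:
        raise ValueError("calculated curve is all zeros")
    return sum(o * c for o, c in zip(observed, calculated)) / denominator


def r_factor(observed, calculated):
    """Normalized residual sum|obs - k * calc| / sum|obs| after scaling."""
    k = best_scale(observed, calculated)
    total = sum(abs(o) for o in observed)
    if total == 0:
        raise ValueError("observed curve is all zeros")
    return sum(abs(o - k * c) for o, c in zip(observed, calculated)) / total


if __name__ == "__main__":
    # Toy numbers for illustration only, not measured data.
    observed_curve = [2.0, 4.0, 7.0]
    calculated_curve = [1.0, 2.0, 3.0]
    print("scale factor:", best_scale(observed_curve, calculated_curve))
    print("R factor:", r_factor(observed_curve, calculated_curve))
```

A single global scale factor cannot remove the kind of discrepancy described above, in which the high angle peaks are relatively stronger in the observed pattern than in the calculated one; that discrepancy would show up as a residual that grows with angle.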
  

Wide angle scattering curves for cytochrome C. The red curve is the theoretical solution scattering curve calculated using CRYSOL with 50 spherical harmonics as described in Methods. The black curve is the measured solution scattering curve obtained with data collection mode RD as described in Methods. The y axis is in arbitrary units for relative intensity, and the x axis is in Å-1.

The correspondence of calculated and observed diffraction for cytochrome C, shown in Figure 1, indicated that the measured scattering from the protein, although low compared to background scatter, was consistent with expectations based on the atomic coordinates of cytochrome C. Specifically, there was good agreement in both peak position and relative heights for both plots. Roughly the same degree of correspondence was obtained for both myoglobin (see below) and hemoglobin (see below).

Wide angle scattering curves for myoglobin. The red curve is the theoretical solution scattering curve calculated using CRYSOL with 50 spherical harmonics as described in Methods. The black curve is the measured solution scattering curve obtained with data collection mode RD as described in Methods. The y axis is in arbitrary units for relative intensity, and the x axis is in Å-1.


Wide angle scattering curves for hemoglobin. The red curve is the theoretical solution scattering curve calculated using CRYSOL with 50 spherical harmonics as described in Methods. The black curve is the measured solution scattering curve obtained with data collection mode RD as described in Methods. The y axis is in arbitrary units for relative intensity, and the x axis is in Å-1. Note that the quaternary structure of hemoglobin as compared to the monomeric myoglobin leads to additional high frequency fluctuations in the 1/d range of 0.021 to 0.09 Å-1.

Hemoglobin exhibited greater loss of structure (when compared to myoglobin) in the presence of 2 M GuHCl, but maintained a greater degree of secondary structure in the presence of 4 M GuHCl.  

Effect of increasing concentration of guanidine hydrochloride on the solution scattering profile from hemoglobin. The black curve is 0 M guanidine hydrochloride, the red curve is 2M guanidine hydrochloride and the blue curve is 4M guanidine hydrochloride. Note that the addition of guanidine hydrochloride to 2M completely obliterates the peak at 1/d of around 0.21 Å-1, a spacing which is consistent with the tetrameric form of hemoglobin. The y axis is in arbitrary units for relative intensity, and the x axis is in Å-1.

A major concern in the use of third generation sources is radiation damage to proteins. Most crystallographic stations use crystals flash frozen in the presence of cryo-protectants to minimize the effect of radiation on protein structure. The experiments described here were carried out at room temperature and heating due to x-ray exposure was not monitored. Data collection protocols were designed to assess the effect of radiation damage under these conditions. In order to assess the effect of radiation dose on proteins, three scattering data collection modes were employed.

(i) Mode one collected a series of 0.7 second exposures from protein samples sitting stationary within the sample cell in the beam path. (ST mode)

(ii) Mode two collected a series of 8.3 second exposures from protein samples that were oscillated within the sample capillary at a rate of 10.3 oscillations/minute during beam exposure. (FRY mode)

(iii) Mode three collected a series of 8.3 second exposures from protein samples that were kept flowing unidirectionally through the beam during exposure, so that no one part of the solution was exposed more than once in the direct beam. (RD mode) The flow rate (2.4 μL per second) was adjusted so that no single protein molecule spent more than 100 ms in the direct beam.
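The relationship between flow rate and time in the beam for the RD mode can be sketched as follows, assuming ideal plug flow (every part of the solution moves at the same speed). The illuminated volume is a hypothetical parameter: the source gives the flow rate and the 100 ms limit but not the beam or capillary dimensions.

```python
def residence_time_s(illuminated_volume_ul, flow_rate_ul_per_s):
    """Time (s) a plug of solution spends in the beam, assuming plug flow."""
    if flow_rate_ul_per_s <= 0:
        raise ValueError("flow rate must be positive")
    return illuminated_volume_ul / flow_rate_ul_per_s


def max_illuminated_volume_ul(flow_rate_ul_per_s, max_time_s):
    """Largest illuminated volume (μL) that keeps residence time under max_time_s."""
    if flow_rate_ul_per_s <= 0 or max_time_s <= 0:
        raise ValueError("flow rate and time limit must be positive")
    return flow_rate_ul_per_s * max_time_s


FLOW_RATE_UL_PER_S = 2.4  # flow rate quoted in the text
MAX_TIME_S = 0.100        # residence time limit quoted in the text

if __name__ == "__main__":
    limit = max_illuminated_volume_ul(FLOW_RATE_UL_PER_S, MAX_TIME_S)
    print("illuminated volume limit (μL):", round(limit, 3))
```

With the quoted numbers, the 100 ms limit holds as long as the volume of solution in the beam is no more than about 0.24 μL. Real flow in a capillary is slower near the walls than at the center, so the plug-flow figure is only an average.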

The diffraction data shown below were collected from cytochrome C in mode 3 (RD-continuous flow) at a flow rate that ensured proteins were in the direct beam for no more than 100 ms. The close correspondence of calculated and observed diffraction from cytochrome C in the figure suggests that little radiation damage occurred to the protein molecules during collection of data in the flow cell. Background-subtracted scattering profiles for all three proteins in all three beam exposure modes (i.e. RD, FRY and ST mode-generated profiles) were calculated and contrasted. Neither cytochrome C nor myoglobin exhibited any observable difference between any of the three beam exposure protocols. In the figure below, the data for myoglobin in RD mode (red) is compared to data collected using 0.5 second exposures to a stationary sample or ST mode (black; shown is the average of ten 0.5 second exposures). The two data sets are identical to within the counting errors. (The signal to noise ratio is much lower for ST mode because those data represent a total exposure of about 0.7s X 8 = 5.6s, or about 15% of that used to collect data in RD mode (8.3s X 5 or 41.5s).)

The samples were not moved between exposures during the series of still shots (ST mode). Nevertheless, no progressive degradation was observed within the myoglobin or cytochrome C series - data from the first and last exposures were identical to within counting statistics. As it is unlikely that either cytochrome C or myoglobin is impervious to 5 seconds of exposure to the direct x-ray beam, we consider it more likely that the beam resulted in heating-induced convection within the capillary, minimizing the amount of time any one molecule was exposed to the beam. The data does not, however, unequivocally prove this assertion.
 

Wide angle scattering curves for cytochrome C. The red curve is the theoretical solution scattering curve calculated using CRYSOL with 50 spherical harmonics as described in Methods. The black curve is the measured solution scattering curve obtained with data collection mode RD as described in Methods. The y axis is in arbitrary units for relative intensity, and the x axis is in Å-1.


 

Comparison of the average solution scattering profiles obtained from myoglobin using RD data collection mode (red) and ST data collection mode (black). The y axis is in arbitrary units for relative intensity, and the x axis is in Å-1.

In contrast to the above data, scattering patterns for hemoglobin demonstrated significant deviations between the ST mode and the RD and FRY modes (which were virtually identical to each other). The figure below shows the average from ten ST shots (10 X 0.5s = 5s total exposure) in black contrasted with the average of four RD shots (8.1s X 4 = 32.4s total exposure) in red. The protein sample in the stationary (ST) mode shows clear signs of degradation across the entire pattern, indicating a breakdown of features within the size range for both secondary and tertiary structure (note the partial loss of peaks at spacings of roughly 0.08 and 0.095 Å-1). To explore this breakdown further, hemoglobin samples were intentionally denatured by the addition of guanidine hydrochloride to 2 molar and then 4 molar final concentration, then exposed to the beam in RD mode (figure above). It can be seen that upon the addition of guanidine-HCl to 2M (black to red) there is a significant loss of features at both very high spacings (over 0.3 Å-1) and lower spacings. Further addition of guanidine-HCl to 4M shows a catastrophic loss of structure over the entire pattern (blue curve). Identical treatment of myoglobin with 2M and 4M guanidine-HCl resulted in a similar pattern, with the tertiary structure-associated doublet stronger at 2M, suggesting that myoglobin is somewhat more resistant to the effect of guanidine-HCl than hemoglobin (data not shown).

Comparison of the average solution scattering profiles obtained from hemoglobin using RD data collection mode (red) and ST data collection mode (black). Note that the ST mode shows clear signs of degradation. The y axis is in arbitrary units for relative intensity, and the x axis is in Å-1.


Denaturing gel electrophoresis of aliquots from all three protein samples pre- and post-beam exposure was carried out to determine whether any chemical degradation occurred as a result of x-ray exposure. Non-reducing SDS gel electrophoresis revealed no detectable crosslinking or cleavage reactions occurring upon irradiation (see figure below). No new bands of either higher or lower molecular weight were seen following irradiation (lanes c, e and g), even in grossly overloaded lanes (lanes c and e), and no detectable disulfide bond cleavage of globin homodimers was observed (lanes c and f), as the ratio of dimer to monomer remained roughly unchanged. These data imply that the only x-ray beam induced damage detectable at this level of analysis was at the secondary and tertiary structure level. 


 

Non-reducing SDS polyacrylamide gel of unexposed and exposed cytochrome C (lanes b and c); myoglobin (lanes d and e); and hemoglobin (lanes f and g). Lanes a and h are duplicate loadings of standard proteins of molecular weight (in descending order from top to bottom): 112, 81, 49.9, 36.2, 29.9 and 21.3 kDa.


It is becoming increasingly apparent that three-dimensional structural information will be critical in making a comprehensive functional analysis of many, if not most, proteins. For the majority of proteins that cannot be readily crystallized new methods of structural characterization are needed. For those proteins that can be crystallized, methods for characterization of functional processes that are accompanied by large structural changes will be required. X-ray scattering from proteins in solution provides direct structural information about the secondary, tertiary and quaternary organization of a protein. With optimized hardware and software, this data can be collected in a high throughput fashion to provide information about three dimensional structures and structural changes that occur in solution.

A major concern for x-ray scattering of protein solutions is radiation induced degradation by the third generation synchrotron radiation source utilized in these studies. Earlier work performed at JASRI at SPring-8 (Hirai et al., 2002) involving 60s exposure times to collect data at spacings of ~ 0.003 to ~ 0.4 Å-1 did not address this issue. The data presented here indicate that x-ray damage is observable when proteins are intentionally overexposed to x-rays, and that experimental protocols can be designed to minimize that damage. In the case of hemoglobin, protein degradation associated with overexposure includes breakdown of secondary and tertiary, but not quaternary structure. Although similar in some aspects to the effect of a denaturant, radiation damage to hemoglobin results in degradation that is distinctly different in detail.

These solution scattering experiments on cytochrome C, myoglobin and hemoglobin indicate that accurate solution scattering data can be collected to spacings approaching 2.2 Å; that this data can be adequately predicted from the atomic coordinates of crystallized proteins; that it can be used for comparative structural analyses; and that it can monitor structural changes that occur in the sample. Unlike circular dichroism (CD) spectroscopy, which provides extensive short range information on the percentage content of α-helices, β-sheets, etc., the data shown here clearly demonstrate a sensitivity to tertiary and quaternary structural influences that are not apparent in CD spectra. This suggests that this technique may ultimately prove to be a valuable tool for the rapid confirmation or rejection of structural hypotheses derived from amino acid sequence data via bioinformatic analysis. The impact of WAXS data would grow substantially if an extensive database of solution scattering from proteins of known structure were constructed. This database could provide the basis for making predictions about the domain and fold structure of proteins of unknown structure and make possible detailed structural analyses of dynamic functional processes.
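As an illustration of how such a database might be queried, the Python sketch below ranks reference curves by how well each one matches a query curve after least-squares scaling. It is a hypothetical sketch rather than the project's method: the function names, the discrepancy measure and the toy curves are all assumptions, and every curve is assumed to be sampled at the same scattering angles.

```python
def scaled_discrepancy(query, reference):
    """Normalized squared residual between query and best-scaled reference."""
    if len(query) != len(reference):
        raise ValueError("curves must have the same length")
    denominator = sum(r * r for r in reference)
    norm = sum(q * q for q in query)
    if denominator == 0 or norm == 0:
        raise ValueError("curves must not be all zeros")
    # Least-squares scale factor that brings the reference onto the query.
    k = sum(q * r for q, r in zip(query, reference)) / denominator
    return sum((q - k * r) ** 2 for q, r in zip(query, reference)) / norm


def rank_matches(query, database):
    """Return database entry names ordered from best to worst match."""
    scores = [(scaled_discrepancy(query, curve), name)
              for name, curve in database.items()]
    return [name for _, name in sorted(scores)]


if __name__ == "__main__":
    # Toy curves for illustration only, not measured data.
    toy_database = {
        "fold_A": [10.0, 5.0, 2.0, 1.0],
        "fold_B": [10.0, 8.0, 6.0, 4.0],
    }
    toy_query = [20.0, 10.0, 4.0, 2.0]  # same shape as fold_A, twice the scale
    print(rank_matches(toy_query, toy_database))
```

A ranking of this kind would at best produce a short list of candidate folds, in keeping with the point made earlier that solution scattering alone cannot determine a fold unambiguously.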

This work was supported by Laboratory Directed Research and Development funding provided by the Department of Energy. Use of the Advanced Photon Source was supported by the U.S. Department of Energy, Basic Energy Sciences, Office of Science. BioCAT is a National Institutes of Health-supported Research Center RR-08630.

Most recent beamline configuration (Design by Bob Fischetti) 


Picture courtesy of Lee Makowski  

2005 © Argonne National Laboratory