Biosciences Division
>> Structural Biology
   
Homology Analysis for Structure, Function, and Malfunction

Fred Stevens
Sr. Biophysicist 

Bldg: 202. Room: B229
E-mail: fstevens@anl.gov  

Phone: (630) 252-3837 

 


> Research

A major challenge for structural biology is to maximize the information that can be extracted from the amino acid sequence of a protein. The information relevant to functional genomics includes function(s), stability, and interaction partners. For two proteins closely linked by evolution, comparison of global amino acid sequences can provide good guidance for structural recognition and, in fewer cases than is often acknowledged, for function. However, confident assignment of functional attributes requires knowledge of the limited number of amino acids that impart those functions. This information is usually the product of multiple crystallographic analyses of a protein of known function, which reveal its interactions with substrates and cofactors or their analogs. To accurately assign a function or functions to a protein based on its amino acid sequence, it is necessary to first recognize the fold of the protein. Fold recognition identifies one or more structural homologs of the protein of interest; the structural and functional studies of these homologs can then be mined to achieve accurate annotation. In some cases the outcome will be an accurate determination that the function remains unknown or only partially known. In other cases, substantive hypotheses suitable for efficient experimental testing will emerge.

Thus, maximizing the information content of protein sequence data depends on increasing our ability to recognize the fold of a protein. Most soluble bacterial proteins appear to have known or partially known folds. In many of the cases in which we do not recognize a fold, it is probable that the fold has already been characterized in another protein. Although our coverage of the structures of human proteins is less complete, it remains likely that a higher percentage of these proteins can be structurally analyzed at a level sufficient to facilitate generation of hypotheses to guide experimentation. Since most proteins with related structure and function are too evolutionarily dispersed to be considered “significantly similar” by statistically based comparisons, a fundamental need exists to replace “statistical significance” with some form of “evolutional significance,” in which significance is assessed not on a global comparison, which is dominated by evolutionary noise, but at the limited number of positions that dominate the structural and functional properties of a protein.
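
The contrast between a global comparison and one restricted to a few key positions can be illustrated with a minimal sketch. The two pre-aligned sequences and the set of key positions below are invented for illustration, and simple identity counting stands in for a real statistical score; it is not the scoring scheme used in this work.

```python
def identity(aligned_a, aligned_b, positions=None):
    """Fraction of identical residues over the chosen alignment columns.

    If positions is None, every column is used (a global comparison).
    Columns where both sequences carry a gap ('-') do not count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = list(range(len(aligned_a))) if positions is None else list(positions)
    if not columns:
        raise ValueError("no alignment columns selected")
    matches = sum(
        1 for i in columns
        if aligned_a[i] == aligned_b[i] and aligned_a[i] != "-"
    )
    return matches / len(columns)


# Hypothetical pre-aligned sequences (not real proteins).
SEQ_A = "MKTAHDLGSCEWRPLVNQ"
SEQ_B = "LRSGHEIGTCDWKPMINE"

# Hypothetical 0-based columns assumed to carry structure or function.
KEY_POSITIONS = [4, 7, 9, 11, 13]

global_identity = identity(SEQ_A, SEQ_B)
key_identity = identity(SEQ_A, SEQ_B, KEY_POSITIONS)
print(f"global identity: {global_identity:.2f}")
print(f"identity at key positions: {key_identity:.2f}")
```

In this toy case the global identity is low while the key positions are fully conserved, which is the situation the paragraph above describes: the signal sits in a handful of columns and is diluted when every column is weighted equally.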

> Tri-Psi BLAST against NR database

  • Psi-BLAST iteratively uses sequence data to effectively optimize similarity parameters to recognize homologs.
  • The NCBI database of non-redundant protein sequences provides much more “training” data than is present in the database of PDB sequences.
  • Three rounds of Psi-BLAST appear to be near optimal for fold recognition while minimizing drift to identification of distant evolutionary remnants.
  • Late-round Psi-BLAST recognition does not imply functional relationships.
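
As a rough sketch of what a three-round search against NR might look like, the snippet below assembles a command line for the `psiblast` program from NCBI BLAST+. The file names, the database name `nr`, the inclusion threshold, and the tabular output format are assumptions chosen for illustration; the page does not state the parameters actually used.

```python
import shlex


def build_psiblast_command(query_fasta, db="nr", iterations=3,
                           inclusion_evalue=0.002, out_path="hits.tsv"):
    """Assemble an argument list for a PSI-BLAST run (nothing is executed here).

    All default values are illustrative assumptions, not the authors' settings.
    """
    return [
        "psiblast",
        "-query", query_fasta,                         # FASTA file with the query sequence
        "-db", db,                                     # locally formatted protein database
        "-num_iterations", str(iterations),            # three rounds, as described above
        "-inclusion_ethresh", str(inclusion_evalue),   # E-value cutoff for profile inclusion
        "-outfmt", "6",                                # tabular output
        "-out", out_path,
    ]


# Hypothetical query file name.
cmd = build_psiblast_command("query.fasta")
print(shlex.join(cmd))
```

The command is only built and printed, so the sketch runs without BLAST+ installed; passing `cmd` to `subprocess.run` would execute it where `psiblast` and a local copy of the database are available.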

Updated 23.08.06 2:25 PM © ANL Biosciences Division, 2005