Protein Structure Comparison
Disciplines
Biology (40%); Computer Sciences (30%); Mathematics (30%)
Keywords
Protein Structures, Structural Proteomics, Protein Classification, Rigid Body Superimposition
The major challenge of the post-genomic era is the functional annotation of genes and their corresponding proteins. Knowing the three-dimensional structure of a protein is important for understanding and annotating its function. Currently, 17,000 protein structures are publicly available. Structural genomics or proteomics initiatives have been launched to determine 3D structures on a large scale, which will increase the number of known structures tremendously. Biologists will be confronted with an enormous amount of structural data that needs to be processed: known structures have to be classified into protein families, newly determined structures have to be compared to classified proteins, structural frames for modeling unknown structures have to be prepared, and structure prediction methods have to be developed and optimized. Powerful protein structure comparison methods are essential for these tasks, and they must be precise, flexible and fast. We have developed an approach for protein structure comparison called ProSup. ProSup is based on rigid body superimposition. It provides a set of alternative solutions and implements several filters to ensure alignment quality. ProSup has been applied in a number of different fields in bioinformatics, for example in the optimization of fold recognition and in the evaluation of the large-scale structure prediction experiments CASP3 and CASP4. During these applications we encountered several limitations of the approach, in particular (1) the comparison of flexible structures, (2) the treatment of topological differences and (3) scoring and speed in database searches. The goal of this project is to develop an approach for protein structure comparison that addresses these problems. The outcome will be a program that can be installed locally as well as a web server that can be accessed over the Internet.
Thus it can be used for large-scale applications as well as for occasional studies of individual protein pairs. Both the program and the web service will be made available to the scientific community.
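To illustrate the general idea of rigid body superimposition mentioned above, the following is a minimal sketch using the Kabsch algorithm with NumPy. It is not ProSup's implementation: the function name `superimpose` and the toy coordinates are assumptions made for this example, and the sketch assumes that the pairing of equivalent atoms between the two structures is already known, which is the difficult part that a real structure comparison method has to solve.

```python
import numpy as np


def superimpose(P, Q):
    """Least-squares rigid body fit of point set P onto point set Q.

    P and Q are (N, 3) arrays of paired coordinates (row i of P is
    assumed to be equivalent to row i of Q). Returns a rotation matrix R,
    a translation vector t and the RMSD after applying x -> R @ x + t to P.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)

    # Move both point sets to their centroids.
    p_centroid = P.mean(axis=0)
    q_centroid = Q.mean(axis=0)
    P0 = P - p_centroid
    Q0 = Q - q_centroid

    # Covariance matrix and its singular value decomposition.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (a reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    t = q_centroid - R @ p_centroid
    P_fit = P @ R.T + t
    rmsd = float(np.sqrt(np.mean(np.sum((P_fit - Q) ** 2, axis=1))))
    return R, t, rmsd


if __name__ == "__main__":
    # Toy example: Q is P rotated about the z axis and shifted.
    theta = 0.7
    Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta), np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
    P = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0],
                  [0.0, 1.5, 1.0], [2.0, 0.5, 3.0]])
    Q = P @ Rz.T + np.array([3.0, -2.0, 1.0])
    R, t, rmsd = superimpose(P, Q)
    print("RMSD after superimposition:", rmsd)
```

A single rigid transformation of this kind cannot account for hinge motions or a changed order of structural elements, which is why flexible structures and topological differences are listed among the limitations above.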
Protein molecules play a major role in living systems. They show an enormous variety in size, shape, and physical and chemical properties. The three-dimensional (3D) structure of a protein, i.e. the exact location of its atoms, determines its biological function. Chemically, proteins are linear chains of amino acids, of which 20 different types are known, and the arrangement of the amino acids in this linear chain determines the 3D structure. Experimentalists have been determining protein sequences and structures for more than 30 years. Almost three million sequences and about 33,000 structures have been collected in public databases. Proteins with similar sequences usually have similar functions. Thus, in many cases, sequence comparison can be used to predict the function of a protein that has not yet been characterized experimentally. However, sequences often diverged considerably during evolution even where the function of the protein was retained, which means that structure is better conserved than sequence. So-called structural genomics projects are producing an increasing number of 3D structures of proteins whose function is unknown. Here, protein structure comparison is the method of choice for predicting a possible function. This is just one application of protein structure comparison among several. Unfortunately, comparing the 3D structures of two proteins is a much more difficult computational problem than comparing two linear protein sequences. A number of methods have been developed by several groups. What all these methods have in common is that they work well and give identical results if the similarity is high, but fail or disagree considerably if the similarity is low. During this project, we collected a dataset of related protein pairs with low similarity that exhibit many different kinds of problems, such as flexibility and changes in sequence order. We then analyzed how the current methods behave in these cases, i.e. we identified their strengths and weaknesses.
We implemented a software platform that serves as a basis for developing new approaches. Based on what we learned during the analysis phase, we implemented a protein structure comparison method that is able to handle the majority of the problematic cases covered by the dataset. As the structure comparison problem is difficult, there is still room for improvement in the methods. However, our software platform allows us to develop new solutions more quickly than before, and the test set allows us to evaluate them better than before. Ultimately, improved structure comparison methods allow us to annotate and classify newly determined protein structures with greater success and precision.
- Universität Salzburg - 100%
Research Output
- 78 Citations
- 2 Publications
- 2007: Title: Comparative Analysis of Protein Structure Alignments; Type: Journal Article; Author: Mayr G; Journal: BMC Structural Biology; Pages: 50; DOI: 10.1186/1472-6807-7-50
- 2008: Title: Automated Quantitative Assessment of Proteins' Biological Function in Protein Knowledge Bases; Type: Journal Article; Author: Mayr G; Journal: Advances in Bioinformatics; Pages: 897019; DOI: 10.1155/2008/897019