Sequence Analysis Methods for Protein Function Prediction
Disciplines
Other Human Medicine, Health Sciences (50%); Biology (30%); Mathematics (20%)
Keywords
-
SEQUENCE ANALYSIS,
POSTTRANSLATIONAL MODIFICATION PREDICTIO,
LOW COMPLEXITY REGION,
SEQUENCE DATABASE SEARCH,
LARGE-SCALE SEQUENCE ANNOTATION,
PROTEIN FUNCTION PREDICTION
The advent of complete genome sequencing has opened new possibilities and promises. However, the mere availability of genomic sequences is of little help in life science research, because the development of sequence analysis techniques has not kept pace with the production of raw sequence data. Whereas the delineation of protein sequences possibly encoded in given genome data is sometimes difficult but generally possible, the insufficient understanding of the biological function of proteins known only as conceptual translations has become the major bottleneck. With this project, we aim to solve three important tasks of general impact for protein sequence analysis:
Part 1: Low complexity regions (LCRs) are generally not well characterized in terms of function and structure. This might improve if a BLAST-like database search tool became available that clusters LCRs into families by some type of similarity criterion. We want to develop such a tool. This task cannot be solved with the apparatus currently applied in homology searches, since the statistical criteria used there do not work for segments with strongly biased amino acid composition. Due to the degeneracy of the sequence within LCRs, it appears that the actual order of residues is very often less critical for their biological function than certain integral sequence characteristics (for example, their composition). We want to create a tool that selects families of sequence segments defined by integral sequence similarity criteria and to apply this method to LCR classification and functional characterization.
Part 2: Many biological features (structures and functions), such as numerous posttranslational modifications of amino acid residues in proteins or proteolytic cleavage of polypeptide chains in cellular processes, cannot yet be predicted with theoretical methods at all. Thus, large classes of sequences will remain unannotated or incompletely annotated, since the scope of the existing prediction techniques is limited. We want to develop techniques for the recognition of various lipid anchor sites in protein sequences (for example, GPI lipid anchors in non-animal taxa, and myristoyl, farnesyl and palmitoyl anchors) as well as of cell cycle-specific protein cleavage sites (for example, in the substrate proteins of separins in early anaphase). If our resources and the research progress allow, we will extend the effort to other types of posttranslational modifications.
Part 3: The existing methods (algorithms and software solutions) are not designed for annotating large numbers of sequences. In particular, automated inferences (deduction heuristics) based on the combined output of different prediction methods are not possible. We want to create a portable software solution ("Automated Sequence Analyzer") that can invoke different prediction methods, parse their output, and store the data in an electronically retrievable form. Specifically within this project, we want to develop deduction algorithms that mimic the routine activities of bioinformatics researchers when they invoke prediction programs and analyse the outputs manually. This effort is of great importance for the functional annotation of sequences encoded on DNA chips.
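To make the idea of an "integral sequence characteristic" from Part 1 more concrete, the following minimal sketch compares sequence segments by amino acid composition alone, ignoring residue order, and groups them greedily into families. It is an illustration only and not the method developed in the project: the distance measure (total variation distance), the threshold of 0.2, the function names, and the example segments are all assumptions chosen for this sketch.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(segment):
    """Return the fraction of each standard amino acid in a segment."""
    counts = Counter(segment.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("segment contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def composition_distance(seg_a, seg_b):
    """Total variation distance between two compositions.

    0 means identical composition, 1 means no residue type in common.
    Residue order is deliberately not taken into account.
    """
    pa, pb = composition(seg_a), composition(seg_b)
    return 0.5 * sum(abs(x - y) for x, y in zip(pa, pb))


def group_by_composition(segments, threshold=0.2):
    """Greedy single-pass grouping (hypothetical, for illustration).

    A segment joins the first family whose first member lies within
    `threshold`; otherwise it starts a new family.
    """
    families = []
    for seg in segments:
        for fam in families:
            if composition_distance(seg, fam[0]) <= threshold:
                fam.append(seg)
                break
        else:
            families.append([seg])
    return families


# Hypothetical low-complexity segments: two Q-rich, two S/G-rich.
segments = ["QQQQPQQQQQ", "QPQQQQQQQQ", "SSGSSGSSGS", "GSSGSSSGSS"]
families = group_by_composition(segments)
for fam in families:
    print(fam)
```

In this toy input, the two glutamine-rich segments have the same composition although the proline sits at different positions, so they fall into one family, and the two serine/glycine-rich segments form a second one. A real tool would additionally need a statistical significance measure suited to biased compositions, which this sketch does not attempt.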
Research Output
- 2234 Citations
- 15 Publications
- 2005
  Title: Proteins with two SUMO-like domains in chromatin-associated complexes: The RENi (Rad60-Esc2-NIP45) family
  DOI: 10.1186/1471-2105-6-22
  Type: Journal Article
  Author: Novatchkova M
  Journal: BMC Bioinformatics
  Pages: 22
- 2005
  Title: Refinement and prediction of protein prenylation motifs
  DOI: 10.1186/gb-2005-6-6-r55
  Type: Journal Article
  Author: Maurer-Stroh S
  Journal: Genome Biology
- 2004
  Title: A Sensitive Predictor for Potential GPI Lipid Modification Sites in Fungal Protein Sequences and its Application to Genome-wide Studies for Aspergillus nidulans, Candida albicans, Neurospora crassa, Saccharomyces cerevisiae and Schizosaccharomyces pom
  DOI: 10.1016/j.jmb.2004.01.025
  Type: Journal Article
  Author: Eisenhaber B
  Journal: Journal of Molecular Biology
  Pages: 243-253
- 2004
  Title: Human Rif1, ortholog of a yeast telomeric protein, is regulated by ATM and 53BP1 and functions in the S-phase checkpoint
  DOI: 10.1101/gad.1216004
  Type: Journal Article
  Author: Silverman J
  Journal: Genes & Development
  Pages: 2108-2119
- 2004
  Title: Myristoylation of viral and bacterial proteins
  DOI: 10.1016/j.tim.2004.02.006
  Type: Journal Article
  Author: Maurer-Stroh S
  Journal: Trends in Microbiology
  Pages: 178-185
- 2004
  Title: Crystal structure of the p14/MP1 scaffolding complex: How a twin couple attaches mitogen-activated protein kinase signaling to late endosomes
  DOI: 10.1073/pnas.0403435101
  Type: Journal Article
  Author: Kurzbauer R
  Journal: Proceedings of the National Academy of Sciences
  Pages: 10984-10989
- 2004
  Title: MYRbase: analysis of genome-wide glycine myristoylation enlarges the functional spectrum of eukaryotic myristoylated proteins
  DOI: 10.1186/gb-2004-5-3-r21
  Type: Journal Article
  Author: Maurer-Stroh S
  Journal: Genome Biology
- 2004
  Title: Hidden localization motifs: naturally occurring peroxisomal targeting signals in non-peroxisomal proteins
  DOI: 10.1186/gb-2004-5-12-r97
  Type: Journal Article
  Author: Neuberger G
  Journal: Genome Biology
- 2003
  Title: Kleisins: A Superfamily of Bacterial and Eukaryotic SMC Protein Partners
  DOI: 10.1016/s1097-2765(03)00108-4
  Type: Journal Article
  Author: Schleiffer A
  Journal: Molecular Cell
  Pages: 571-575
- 2003
  Title: Prediction of Peroxisomal Targeting Signal 1 Containing Proteins from Amino Acid Sequence
  DOI: 10.1016/s0022-2836(03)00319-x
  Type: Journal Article
  Author: Neuberger G
  Journal: Journal of Molecular Biology
  Pages: 581-592
- 2003
  Title: Protein prenyltransferases
  DOI: 10.1186/gb-2003-4-4-212
  Type: Journal Article
  Author: Maurer-Stroh S
  Journal: Genome Biology
  Pages: 212
- 2003
  Title: Motif Refinement of the Peroxisomal Targeting Signal 1 and Evaluation of Taxon-specific Differences
  DOI: 10.1016/s0022-2836(03)00318-8
  Type: Journal Article
  Author: Neuberger G
  Journal: Journal of Molecular Biology
  Pages: 567-579
- 2002
  Title: On filtering false positive transmembrane protein predictions
  DOI: 10.1093/protein/15.9.745
  Type: Journal Article
  Author: Cserzö M
  Journal: Protein Engineering
  Pages: 745-752
- 2002
  Title: N-terminal N-myristoylation of proteins: refinement of the sequence motif and its taxon-specific differences
  DOI: 10.1006/jmbi.2002.5425
  Type: Journal Article
  Author: Maurer-Stroh S
  Journal: Journal of Molecular Biology
  Pages: 523-540
- 2002
  Title: N-terminal N-myristoylation of proteins: prediction of substrate proteins from amino acid sequence
  DOI: 10.1006/jmbi.2002.5426
  Type: Journal Article
  Author: Maurer-Stroh S
  Journal: Journal of Molecular Biology
  Pages: 541-557