CCMB
Research @ BIC

Research Activities

Sequence-Structure Relatedness in Proteins

1.1. Protein Sequence-Structure Analysis Relational Database :

Unraveling the relatedness that exists between sequences and structures of proteins is of interest to us. In this context, we have generated a web-based application called PSSARD - Protein Sequence-Structure Relatedness Database. This bioinformatics application can be used to identify the conformations corresponding to any amino acid sequence from several proteins of known three-dimensional structure simultaneously. Alternatively, it can be used to specify a particular conformation and to identify the amino acid sequences that are compatible with it. The application also indicates the native amino acid sequence of the protein of known 3-D structure and the regions that could not be defined in the experimental structure. The database can be used in different ways. For instance, it can be used to identify the conformation of a particular sequence motif characteristic of protein function. We have used PSSARD to analyze the conformations associated with hexapeptide and large single amino acid repeats in proteins.
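The two query directions described above can be illustrated with a minimal sketch. It is not the PSSARD implementation: the data layout, the chain identifiers, the toy sequences and the three-state conformation alphabet (H = helix, E = strand, C = coil) are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy data: chain id -> (amino acid sequence, per-residue
# conformation string of the same length; H = helix, E = strand, C = coil).
STRUCTURES: Dict[str, Tuple[str, str]] = {
    "prot_A": ("MKVLAAGIVKVLAAG", "CCHHHHHHCEEEEEC"),
    "prot_B": ("GSKVLAAPT", "CCEEEEECC"),
}


def _occurrences(text: str, pattern: str) -> Iterator[int]:
    """Yield every (possibly overlapping) 0-based start of pattern in text."""
    start = text.find(pattern)
    while start != -1:
        yield start
        start = text.find(pattern, start + 1)


def conformations_for(query_seq: str, structures) -> List[Tuple[str, int, str]]:
    """Return (chain, 1-based start, conformation) for each sequence match."""
    return [
        (chain, i + 1, ss[i:i + len(query_seq)])
        for chain, (seq, ss) in structures.items()
        for i in _occurrences(seq, query_seq)
    ]


def sequences_for(query_ss: str, structures) -> List[Tuple[str, int, str]]:
    """Return (chain, 1-based start, sequence) for each conformation match."""
    return [
        (chain, i + 1, seq[i:i + len(query_ss)])
        for chain, (seq, ss) in structures.items()
        for i in _occurrences(ss, query_ss)
    ]


if __name__ == "__main__":
    for hit in conformations_for("KVLAA", STRUCTURES):
        print(hit)
```

In this toy data set the peptide KVLAA is found three times, once in a mostly helical context and twice as a strand, which is the kind of comparison across several proteins that the text describes.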

Researcher(s) / References :

  1. Guruprasad, K., Srikanth, K., Babu, A.V.N. (2005) Int. J. Biol. Macromol. 36, 259-262.
  2. Sridhar, S., Babu, A.V.N., Guruprasad, K. (2007) Int. J. Biol. Macromol. 41, 109-113.

1.2. Conformations of Hexapeptide and Large Continuous Single Amino Acid Repeats in Proteins :

Although proteins are known to comprise long stretches of single amino acid repeats, their corresponding conformations have not been analyzed in detail. We therefore examined the conformations associated with hexapeptide and large continuous single amino acid repeat peptides (CARPs) in representative proteins of known three-dimensional structure. The type of amino acid residues observed, the maximum length of the peptide sequence, the conformation of these peptides, proteins containing the CARPs, their location in protein structure and solvent accessibility were some of the features analyzed.
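As a rough illustration of how continuous single amino acid repeats of six or more residues might be located in a sequence, the sketch below uses a regular-expression back-reference. The function name, the default minimum length and the example sequence are hypothetical; the published analysis also considered conformation, location and solvent accessibility, which are not modelled here.

```python
import re
from typing import List, Tuple


def find_single_residue_repeats(seq: str, min_len: int = 6) -> List[Tuple[str, int, int]]:
    """Return (residue, 1-based start, run length) for every run of one
    amino acid that is at least min_len residues long."""
    # (.) captures one residue; \1{n,} requires at least n further copies.
    pattern = re.compile(r"(.)\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start() + 1, len(m.group(0)))
        for m in pattern.finditer(seq)
    ]


if __name__ == "__main__":
    # Hypothetical sequence with a 7-residue Gln run and a 6-residue Gly run.
    print(find_single_residue_repeats("MAQQQQQQQLSGGGGGGAT"))
```

Raising `min_len` restricts the search to longer repeats, so the same function covers both the hexapeptide and the larger cases.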

Researcher(s) / References :

  1. Gayatri, M., Guruprasad, K. (2010) Protein Pept. Lett. 17, 1459-1465.

1.3. Certain Heptapeptide and Large Chameleon Sequences in Proteins :

Identical peptide sequences with different conformations or ‘chameleon’ sequences observed in proteins have been implicated in a variety of biological roles. For instance, they have been suggested to play a role in the structural fold conservation and functional diversity of alternative splicing protein isoforms, theories on immune recognition, induction of amyloid-related fatal diseases, “conformational contagion” hypothesis, or as targets of regulation.

In this project, we identified for the first time heptapeptide and large sequences that correspond to a single complete helix, strand or coil and that adopt an entirely different secondary structure in another protein. Our analysis also suggests that the quality of protein structure data may be important for identifying chameleon sequences in proteins. We have generated a tool (CHAMPEP) to query whether a given penta-, hexa- or heptapeptide sequence is observed as a chameleon in proteins of known three-dimensional structure.
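The idea of a chameleon query can be sketched as follows. This is a simplified illustration and not the CHAMPEP method: it only requires that a peptide be uniformly helix, strand or coil in each occurrence, rather than spanning a single complete secondary structure element, and the data layout and toy sequences are assumptions.

```python
from collections import defaultdict
from typing import Dict, Tuple


def find_chameleons(structures: Dict[str, Tuple[str, str]], k: int = 7) -> Dict[str, str]:
    """Map each k-residue peptide that is observed in more than one uniform
    conformation (all H, all E or all C) to the sorted states it adopts."""
    states_seen = defaultdict(set)  # peptide -> set of uniform states
    for seq, ss in structures.values():
        for i in range(len(seq) - k + 1):
            window_states = set(ss[i:i + k])
            if len(window_states) == 1:  # the whole window is one state
                states_seen[seq[i:i + k]].add(window_states.pop())
    return {
        peptide: "".join(sorted(states))
        for peptide, states in states_seen.items()
        if len(states) > 1
    }


if __name__ == "__main__":
    # Hypothetical chains: the same heptapeptide is a helix in one chain
    # and a strand in the other.
    toy = {
        "p1": ("GSAVLKELAGT", "CCHHHHHHHCC"),
        "p2": ("TTAVLKELAKR", "CCEEEEEEECC"),
    }
    print(find_chameleons(toy, k=7))
```

Setting `k` to 5 or 6 gives the penta- and hexapeptide variants of the same query.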

Researcher(s):

  1. Dr. K. Guruprasad

1.4. Single Amino Acid Periodicity Peptides in Proteins :

In the ‘leucine zipper’ structural motif, a leucine residue is repeated at every 7th position and the peptide is associated with an amphipathic alpha helix with a hydrophobic region running along one side. These leucines form the hydrophobic core of a coiled coil. These motifs are usually found as part of a DNA-binding domain in various transcription factors, and are therefore involved in regulating gene expression.

In order to explore whether there are other amino acid repeats that are likely to be associated with a biological role, we are carrying out a systematic analysis of the different amino acid repeats, with varying repeat interval and repeat number, observed in proteins. In this context, we have also generated a web-based tool called Amino Acid Periodicity Peptides (AAPPs). It can be used to select a particular amino acid type, repeat interval and repeat number, in order to identify the proteins of known structure or annotated proteins containing the specific amino acid periodicity pattern. Alternatively, any protein sequence can be analyzed to identify the amino acid periodicities.
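A query by amino acid type, repeat interval and repeat number can be sketched as below. This is a minimal illustration under assumed conventions (1-based start positions, overlapping hits reported separately), not the AAPPs implementation; the function name and the synthetic sequence are hypothetical.

```python
from typing import List


def find_periodic_residue(seq: str, residue: str, interval: int, repeats: int) -> List[int]:
    """Return the 1-based start positions at which `residue` occurs at every
    `interval`-th position for at least `repeats` consecutive occurrences."""
    span = interval * (repeats - 1)  # distance from first to last repeat
    hits = []
    for start in range(len(seq) - span):
        if all(seq[start + j * interval] == residue for j in range(repeats)):
            hits.append(start + 1)
    return hits


if __name__ == "__main__":
    # Synthetic zipper-like sequence: Leu at positions 1, 8, 15 and 22.
    seq = "LAAAAAA" * 3 + "L"
    print(find_periodic_residue(seq, "L", interval=7, repeats=4))
```

With `residue="L"` and `interval=7` the query corresponds to the leucine zipper periodicity described above; other residues and intervals use the same call.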

Researcher(s):

  1. Dr. K. Guruprasad

1.5. Heteroatom Groups and their Neighbors in Proteins :

Certain proteins of known three-dimensional structure comprise different heteroatom groups, such as metal ions and co-factors. We have developed a bioinformatics application tool that can be used to identify proteins of known three-dimensional structure comprising a particular heteroatom group along with the total numbers observed in each protein. A web-based application is also provided that identifies the surrounding amino acid residues and solvent interactions in the protein defined by a certain distance cut-off value in Angstrom units. We have carried out a detailed analysis on the identification of sequence templates that characterize certain metal ion binding in proteins.
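The distance cut-off search for neighbours of a heteroatom can be sketched with plain coordinate arithmetic. The `Atom` record, the 3.0 Angstrom default and the toy coordinates are assumptions for illustration; a real application would read ATOM and HETATM records from a Protein Data Bank file.

```python
import math
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    res_name: str   # residue or solvent name, e.g. "HIS" or "HOH"
    res_num: int
    atom_name: str
    xyz: Tuple[float, float, float]  # coordinates in Angstrom


def neighbours(het_xyz, atoms: List[Atom], cutoff: float = 3.0):
    """Return ((res_name, res_num), shortest distance) for every residue or
    solvent molecule with an atom within `cutoff` of the heteroatom,
    sorted by increasing distance."""
    nearest = {}
    for atom in atoms:
        d = math.dist(het_xyz, atom.xyz)
        if d <= cutoff:
            key = (atom.res_name, atom.res_num)
            nearest[key] = min(d, nearest.get(key, d))
    return sorted(nearest.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical metal ion at the origin with made-up surrounding atoms.
    atoms = [
        Atom("HIS", 94, "NE2", (2.0, 0.0, 0.0)),
        Atom("HIS", 94, "CE1", (2.9, 0.5, 0.0)),
        Atom("CYS", 10, "SG", (0.0, 2.3, 0.0)),
        Atom("HOH", 301, "O", (0.0, 0.0, 2.1)),
        Atom("ALA", 50, "CB", (5.0, 5.0, 5.0)),
    ]
    for (name, num), dist in neighbours((0.0, 0.0, 0.0), atoms):
        print(name, num, round(dist, 2))
```

Each residue is reported once with its closest atom, and water molecules are treated like any other neighbour, so solvent interactions appear in the same list.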

Researcher(s) / References :

  1. Guruprasad, K., Savitha, S., Babu, A.V.N. (2005) Int. J. Biol. Macromol. 37, 35-41.

1.6. Disulphide Bridge Connectivity Patterns in Proteins :

A disulphide bridge connectivity pattern is generated for proteins of known three-dimensional structure along with the associated protein superfamily name defined according to the Structural Classification of Proteins. Together, these serve as a useful reference to identify suitable templates in order to model distantly related proteins of known disulphide bridge connectivity.

We have also analyzed the conformations corresponding to short intra-chain disulphide bridged peptides (ICDBPs) in proteins. A web-based application, ICDBP, was generated that makes it possible to retrieve peptides in proteins of known three-dimensional structure, using queries based on peptide sequence length, amino acid residues flanking the disulphide-bridged half-cystines and their conformations. The peptide length, amino acid sequence, location along the protein chain, peptide conformation and the protein superfamily name are amongst the details that can be retrieved. The ICDBP application may be useful for modeling and design. Our analysis revealed certain unexpected instances of juxtaposed half-cystines as disulphide bridged partners in protein tertiary structure.
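One common way to write a disulphide connectivity pattern is to number the half-cystines in their order along the chain and list the bridged pairs. The sketch below assumes that convention and an output format such as "1-6, 2-5, 3-4"; neither is confirmed by the text above, and the residue numbers are illustrative.

```python
from typing import Iterable, Tuple


def connectivity_pattern(bonds: Iterable[Tuple[int, int]]) -> str:
    """Convert bridged residue-number pairs into a connectivity pattern in
    which half-cystines are renumbered 1..n by position along the chain."""
    bonds = list(bonds)
    positions = sorted({pos for pair in bonds for pos in pair})
    rank = {pos: i + 1 for i, pos in enumerate(positions)}
    pairs = sorted(tuple(sorted((rank[a], rank[b]))) for a, b in bonds)
    return ", ".join(f"{a}-{b}" for a, b in pairs)


if __name__ == "__main__":
    # Illustrative bridges between residues 3-40, 4-32 and 16-26.
    print(connectivity_pattern([(3, 40), (4, 32), (16, 26)]))
```

Because the pattern depends only on the order of the half-cystines and not on their residue numbers, proteins with different lengths can share a pattern, which is what makes it usable for template selection.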

Researcher(s) / References :

  1. Kartik, V.J., Lavanya, T., Guruprasad, K. (2006) Int. J. Biol. Macromol. 38, 174-179.
  2. Guruprasad, K., Kartik, V.J., Lavanya, T., Guruprasad, L. (2006) Protein Pept. Lett. 13, 577-579.

1.7. Analysis, Detection and the Prediction of Structural Motifs in Proteins :

Turns are structural motifs in proteins that play an important role in protein folding, stability and molecular recognition processes. We analyzed the statistically significant amino acid preferences at individual positions for the beta turns and gamma turns in proteins. We extended this analysis to multiple turns. We reported for the first time continuous turns in proteins. We examined the hydrogen-bond patterns that are likely to stabilize these turns. Further, we observed that certain proteins contain combinations of turns that span large segments of the polypeptide chain. We reported the occurrence of certain unexpected isolated and multiple beta-turns with a proline residue at the third position in beta-turns.

We evaluated the accuracy of prediction of beta-turns and gamma-turns in proteins using the residue-coupled model and demonstrated that it was restricted to 68% and 57%, respectively.

We analyzed the ‘structural plasticity’ associated with the ‘beta-propeller’ structural motif that is known for structural rigidity and functional diversity of proteins in which it is observed. The number of strands associated with ‘blades’ of the beta-propeller, the number of amino acid residues associated with each beta-strand within the blade, the presence of helices and insertions between the blades and their twist are some factors that contribute to the ‘structural plasticity’. Based on this, a method was developed to automatically detect beta-propellers from the three-dimensional structure co-ordinates of several proteins available in the Protein Data Bank format.

Researcher(s) / References :

  1. Guruprasad, K., Dhamayanthi, P. (2004) Int. J. Biol. Macromol. 34, 55-61.
  2. Kotu, A., Guruprasad, K. (2005) Int. J. Biol. Macromol. 36, 176-183.
  3. Guruprasad, K., Rao, M.J., Adindla, S., Guruprasad, L. (2003) J. Peptide Res. 62, 167-174.
  4. Guruprasad, K., Shukla, S. (2003) J. Peptide Res. 61, 159-162.
  5. Guruprasad, K., Shukla, S., Adindla, S., Guruprasad, L. (2003) J. Peptide Res. 61, 243-251.
  6. Guruprasad, K., Prasad, M.S., Kumar, G.R. (2001) J. Peptide Res. 57, 292-300.
  7. Guruprasad, K., Prasad, M.S., Kumar, G.R. (2000) J. Peptide Res. 56, 250-263.
  8. Guruprasad, K., Rajkumar, S. (2000) J. Bioscience 25, 143-156.
  9. Guruprasad, K., Pavan, M.N., Rajkumar, S., Swaminathan, S. (2000) Current Science 79, 992-994.

1.8. Database of Structural Motifs in Proteins :

There are a number of well-characterized structural motifs in protein structures. The Promotif program identifies these structural motifs from the Protein Data Bank coordinates. In order to facilitate retrieval of motifs from several proteins simultaneously, based on a combination of queries, we have implemented a web-based application - DSMPO. This application can be used in a number of ways. For instance, it may be used to identify the different proteins containing a given sequence that is associated with a structural motif. It is also useful to retrieve a particular motif from several proteins that satisfy a specific geometrical criterion defined for the individual motif, for instance, helices with a desired pitch (rise per turn). It is also possible to retrieve entries that satisfy a particular motif type, for instance, beta-turns of type II or classic gamma turns. A combination of features defined for each motif may also be used to retrieve entries that meet the query, for instance, alpha-helices with a certain pitch containing the desired amino acid sequence, or beta-hairpins of a particular type containing a defined amino acid sequence within the strands of the beta-hairpin. It is also possible to search for sequences that represent coils observed in protein structures. We have incorporated the domains associated with the proteins according to CATH. Thereby, it is possible to infer not only the helix or strand in which the sequence is observed in certain proteins, but also the protein domains associated with these helices or strands. This provides a higher level of hierarchical relationship of sequence-structure relatedness in proteins.

Researcher(s) / References :

  1. Guruprasad, K., Prasad, M.S., Kumar, G.R. (2000) Bioinformatics 16, 372-375.
  2. Guruprasad, K. (2000) Nature NewsIndia Sep 7.