Wednesday, December 12, 2007

Secondary protein structure prediction

What does secondary structure mean?

In biochemistry and structural biology, secondary structure is the general three-dimensional form of local segments of biopolymers such as proteins and nucleic acids (DNA/RNA).
It does not, however, describe specific atomic positions in three-dimensional space, which are considered to be tertiary structure.
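To make the idea of "local segments" concrete, here is a minimal sketch. It assumes the common three-letter convention (H = helix, E = strand, C = coil), which is not taken from this post, and the 20-residue label string is made up purely for illustration.

```python
from itertools import groupby

def segments(ss):
    """Group a per-residue label string into (label, start, length) runs."""
    out = []
    pos = 0
    for label, run in groupby(ss):
        n = len(list(run))
        out.append((label, pos, n))
        pos += n
    return out

# Hypothetical labels for a 20-residue protein (H = helix, E = strand, C = coil)
EXAMPLE_SS = "CCHHHHHHCCEEEEECCCCC"

if __name__ == "__main__":
    for label, start, length in segments(EXAMPLE_SS):
        print(label, start, length)
```

Note that the labels say what kind of local shape each residue sits in, but nothing about where the atoms are in space, which is the tertiary structure.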



Protein Structure Prediction

-One of the most important goals pursued by bioinformatics and theoretical chemistry.

-The aim is to predict the three-dimensional structure of proteins from their amino acid sequences, sometimes including additional relevant information such as the structures of related proteins.

-It deals with the prediction of a protein's tertiary structure from its primary structure.

-Of high importance in medicine (for example, in drug design) and biotechnology (for example, in the design of novel enzymes).

Some examples of prediction methods are:

-Ab initio protein modelling
(Ab initio protein modelling methods seek to build three-dimensional protein models "from scratch", i.e., based on physical principles rather than (directly) on previously solved structures.)

-Comparative protein modelling

o Homology modelling (based on the reasonable assumption that two homologous proteins will share very similar structures.)

o Protein threading (scans the amino acid sequence of an unknown structure against a database of solved structures)

-Side chain geometry prediction
(Even structure prediction methods that are reasonably accurate for the peptide backbone often get the orientation and packing of the amino acid side chains wrong.

Methods that specifically address the problem of predicting side chain geometry include dead-end elimination and the self-consistent mean field method. Both discretize the continuously varying dihedral angles that determine a side chain's orientation relative to the backbone into a set of rotamers with fixed dihedral angles. The methods then attempt to identify the set of rotamers that minimize the model's overall energy. Rotamers are the side chain conformations with low energy. Such methods are most useful for analyzing the protein's hydrophobic core, where side chains are more closely packed; they have more difficulty addressing the looser constraints and higher flexibility of surface residues.)
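As a rough illustration of the rotamer idea, here is a minimal sketch of dead-end elimination on a toy problem. It assumes the simplest form of the elimination criterion (a rotamer is dropped when its best case is still worse than another rotamer's worst case). The energy tables `SELF_E` and `PAIR_E` are made-up numbers, not real force-field values, and all function names are hypothetical.

```python
from itertools import product

# Toy energies (made up): SELF_E[i][r] is the energy of rotamer r at position i,
# PAIR_E[(i, j)][r][s] is the interaction of rotamer r at i with rotamer s at j.
SELF_E = [[0.0, 1.0, 10.0], [0.0, 3.0], [1.0, 0.0]]
PAIR_E = {
    (0, 1): [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]],
    (0, 2): [[0.0, 0.5], [0.5, 0.0], [1.0, 1.0]],
    (1, 2): [[0.0, 1.0], [1.0, 0.0]],
}

def pair(pair_e, i, r, j, s):
    """Pair energy lookup that works for either ordering of the positions."""
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]

def total_energy(choice, self_e, pair_e):
    """Energy of one full rotamer assignment: self terms plus all pair terms."""
    n = len(choice)
    e = sum(self_e[i][choice[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            e += pair_e[(i, j)][choice[i]][choice[j]]
    return e

def dead_end_eliminate(self_e, pair_e):
    """Repeatedly drop rotamers that cannot be part of the lowest-energy model."""
    n = len(self_e)
    alive = [set(range(len(self_e[i]))) for i in range(n)]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            others = [j for j in range(n) if j != i]
            for r in list(alive[i]):
                # Best case for r: most favourable surviving partner at every other position
                best_r = self_e[i][r] + sum(
                    min(pair(pair_e, i, r, j, s) for s in alive[j]) for j in others)
                for t in alive[i] - {r}:
                    # Worst case for t: least favourable surviving partner everywhere
                    worst_t = self_e[i][t] + sum(
                        max(pair(pair_e, i, t, j, s) for s in alive[j]) for j in others)
                    if best_r > worst_t:
                        alive[i].discard(r)
                        changed = True
                        break
    return alive

def best_assignment(self_e, pair_e, allowed):
    """Brute-force search over the allowed rotamers at each position."""
    return min(product(*[sorted(a) for a in allowed]),
               key=lambda c: total_energy(c, self_e, pair_e))

if __name__ == "__main__":
    survivors = dead_end_eliminate(SELF_E, PAIR_E)
    print(survivors, best_assignment(SELF_E, PAIR_E, survivors))
```

The point of the elimination step is that the final search only has to cover the surviving rotamers, which matters when a real protein has many positions and the number of combinations explodes.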

MudPIT

Introduction
Before Multidimensional Protein Identification Technology (MudPIT) came about, Liquid Chromatography (LC) and Mass Spectrometry (MS) were used separately to fractionate and then identify the protein composition of a biological sample.

Disadvantage of using Liquid Chromatography
- Loss of material that commonly occurs in chromatographic processes

Disadvantages of using Mass Spectrometry (gel-based)
Although gel-based methods are widely used for protein identification, they have several drawbacks:
- Problem identifying hydrophobic proteins
- Difficulty detecting low-level proteins (dye staining is not sensitive)
- Long experiment duration
- Inability to be automated
- Biological samples need to undergo solubilization

These problems decrease the sensitivity of the protein identification process. MudPIT seeks to address them by improving the separation and identification of proteins.


So what is MudPIT?
Multidimensional Protein Identification Technology, or MudPIT, is a largely unbiased method for rapid and large-scale proteome analysis by multidimensional liquid chromatography, tandem mass spectrometry, and database searching by the SEQUEST algorithm.

Advantages of MudPIT
- Eliminates the problems of the gel-based approach to MS
- More sensitive and thus able to detect low abundance proteins
- Two-dimensional chromatography technique reduces sample loss


Workflow of MudPIT
1. Preparation of protein sample
2. Digest protein sample to peptides
3. Peptides are then separated in two liquid column chromatography steps:
-strong cation exchange (SCX)
-reversed-phase high performance liquid chromatography (HPLC)
4. Acquire tandem mass spectra of the peptides
5. Search the mass spectra against a protein sequence database
6. Identify the proteins in the sample using SEQUEST
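The digestion and database-search steps above can be sketched in a very simplified way. This is not SEQUEST, which scores tandem (fragment ion) spectra. The sketch only matches whole-peptide masses, and it assumes trypsin as the digesting enzyme (cutting after K or R unless followed by P), which this post does not specify. The two-protein database and the names `protA` and `protB` are made up.

```python
# Monoisotopic residue masses in daltons
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_digest(protein):
    """Cut after K or R, except when the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def identify(observed_masses, database, tol=0.01):
    """Count, per protein, how many observed masses match one of its peptides."""
    scores = {}
    for name, seq in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_digest(seq)]
        scores[name] = sum(
            any(abs(m - t) <= tol for t in theoretical) for m in observed_masses)
    return max(scores, key=scores.get), scores

# Hypothetical toy database
DATABASE = {"protA": "MKTAYIAKQRQISFVK", "protB": "GSHMLEDPRVVGKAAW"}

if __name__ == "__main__":
    # Pretend the instrument measured exactly the peptides of protA
    observed = [peptide_mass(p) for p in tryptic_digest(DATABASE["protA"])]
    print(identify(observed, DATABASE))
```

A real search engine compares the fragment ions within each tandem spectrum against predicted fragments, which is far more specific than a single mass per peptide.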



Some projects using MudPIT:
1) Protein pathway and complex clustering of correlated mRNA and protein expression analyses in Saccharomyces cerevisiae

2) Food Standards Agency
This agency is researching the feasibility of using MudPIT as an alternative to the gel-based approach for the rigorous safety assessment of GM plants.

3) Chloroplast proteomics: potentials and challenges
This is botany research on the analysis of the chloroplast proteome.


source:

Technologies and Strategies for Research and Development

DRUG Discovery & Development

http://www.hupo.org/educational/past_congresses/2007_seoul/3_MacCoss_color.pdf

Nature Publishing Group

posted by Alvin

Tertiary Structure: New Breakthrough...

For years.. bioinformaticians have been trying to solve the great mystery of protein structure by prediction.
In the past.. experts used to say that determining the shape of a protein is just a matter of firing an X-ray beam @_@ at its crystalline form and measuring the result... and chemists were way too sceptical to think prediction could prove them WRONG!
Until recently...
Ah ha!! On 14/10/2007, David Baker,
a biochemist at the University of Washington, and his colleagues reported an astonishing result that proves what the experts said WRONG!!.. breaking the scepticism in the chemists' world!
Baker came up with a new technique, which combines the sum of
what is already known about protein structures with the vast computing power available
{what does this mean? => getting 150k volunteers to run his program at home}
So.. how does this program work?
In basic terms:
1) Break the protein sequence into small stretches
2) Match the stretches against all the known protein structures {Logically, if a stretch is a 100% match, the result for it should also be 100% accurate}
3) Minimize the free energy of the structure, as a measure of its stability!
4) Repeat all these steps over and over again UNTIL it lurches towards an ever more accurate protein model!~ Woo {There's so much data out there that getting everything done will take quite some time.. but the result should be darn good.}
And this program is called.... Rosetta@home, and it runs on BOINC
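The four steps above can be sketched roughly like this. It is a toy, not the actual program: the "structure" is just one made-up angle per residue, `toy_energy` is a hypothetical stand-in for a real free-energy function, and `toy_library` fakes the database of known structures by giving every stretch the same three candidates.

```python
import math
import random

def stretches(sequence, k=3):
    """Step 1: overlapping stretches (start index, residues) of length k."""
    return [(i, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]

def toy_library(sequence, k=3):
    """Step 2 (faked): every stretch maps to the same three candidate shapes."""
    candidates = [(-60.0,) * k, (-120.0,) * k, (60.0,) * k]
    return {residues: candidates for _, residues in stretches(sequence, k)}

def toy_energy(angles):
    """Step 3 (hypothetical): lower is 'more stable'; penalises neighbour differences."""
    return sum((a - b) ** 2 for a, b in zip(angles, angles[1:]))

def predict(sequence, library, k=3, steps=2000, temperature=500.0, seed=0):
    """Step 4: repeat fragment swaps, keeping the lowest-energy model seen."""
    rng = random.Random(seed)
    windows = stretches(sequence, k)
    angles = [rng.choice([-120.0, -60.0, 60.0]) for _ in sequence]
    energy = toy_energy(angles)
    best, best_energy = list(angles), energy
    for _ in range(steps):
        start, residues = rng.choice(windows)
        fragment = rng.choice(library[residues])
        trial = angles[:start] + list(fragment) + angles[start + k:]
        trial_energy = toy_energy(trial)
        delta = trial_energy - energy
        # Always accept improvements; sometimes accept worse moves to escape traps
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            angles, energy = trial, trial_energy
            if energy < best_energy:
                best, best_energy = list(angles), energy
    return best, best_energy

if __name__ == "__main__":
    seq = "MKTAYIAKQR"  # made-up 10-residue sequence
    model, e = predict(seq, toy_library(seq))
    print(model, e)
```

Each volunteer's computer can run a search like this from a different random seed, which is why having lots of machines helps.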

So in full, it's like this: you take a sequence, a 112-amino-acid protein for example, and the network will turn it into several million structures. After "some very long time" these millions of structures will then be whittled down to 5 {6 zeros down to 5... @_@!}
until 1 structure is found, which is then compared with the structure determined from its crystal.
But.. even after this..., the result might not be as precise as the crystal, BUT at least it's good enough to "throw away" the X-ray technique.
Even so... there's still room for improvement, but for now.. structure prediction opens up the prospect of custom-made proteins, where it is used to hunt for sequences that correspond to the desired structures.
Now.. the team is currently redesigning the gp120 protein of HIV, hoping to make a vaccine that could stimulate the immune system in a different way from the natural virus. If this is possible.. "the reshaped protein should attack the virus more effectively than antibodies created" {If this succeeds, I believe.. humans will no longer take the danger of HIV seriously, after all.. our hearts contain the 7 deadly sins. On the other hand, innocent children might be saved....}

Baker's words.. "The days when protein modellers thought they could make crystallization obsolete are long gone"

"If you really care about the structure of your protein, you should get some experimental data and combine it with modelling"
So, by the way things are progressing, "light" will soon be seen by us..

For more info : Rosetta@Home
For more info on people helping out with BOINC : http://www.youtube.com/watch?v=GzATbET3g54

Bye Bye!
Post by Zhong En

Sunday, December 9, 2007

European Bioinformatics Institute (EBI)

About the Institute

The EBI is an organisation that forms part of the European Molecular Biology Laboratory (EMBL).

- Conducts research in bioinformatics

- Manages databases of biological data

http://www.ebi.ac.uk/

About the Research groups

Of the more than 25 groups at the EBI, only about 5 or so are research groups.

http://www.ebi.ac.uk/Groups/

The research groups include the Bertone group, Luscombe group, Huber group, Thornton group and more.

Firstly, I'll talk about the Huber group. Its research focuses on gene transcription and the binding of proteins to DNA, studied with DNA microarrays.

This group's work aids the understanding of functional genomics data.

http://www.ebi.ac.uk/huber/

The next group I'll talk about is the Luscombe group, which focuses on the genomic analysis of regulatory systems.

http://www.ebi.ac.uk/luscombe/

This group studies how the biology of an organism works by investigating the cells of different species, the expression of their genes, the production of proteins, etc.

http://www.ebi.ac.uk/luscombe/research.html

Current status of the research:

  • Providing graphical models for understanding the relationships within the regulatory network.

  • Identifying the vulnerabilities of the regulatory network that make it prone to disease.

  • Studying how the transcriptional regulatory network interacts with other cellular components.

  • Analysing transcription factors in the human genome.

  • Studying complex bacterial behaviour.

Future enhancements:

  • Advancing analysis techniques to better understand regulatory networks.

  • Consolidating their research on bacteria and other organisms.

  • Interacting with research groups performing genome-scale experiments.

Lastly, the Thornton group researches how biology works at the molecular level, which is a very broad area of research.

http://www.ebi.ac.uk/Thornton/research.html

One of the many research topics in this group is enzyme activity: the study of how enzymes work, their functions, and how they have evolved.

Current status of this research: http://www.ebi.ac.uk/Thornton/group_publications.html