FYI: The CASP Protein Folding Competition

09-23-08, 12:20 PM
Further info on our Rosetta competition in CASP8, plus an informative and funny post about X-ray protein crystallography (considered the most definitive way of finding a protein's shape).

The CASP7 website you saw shows the list of proteins sent out for analysis by the scientific teams participating in the "event". I think of it as a "contest". The list is made up of proteins whose structure (i.e. physical shape) NO ONE actually knows yet. They know the ordering of the amino acids, but that's all. The "X-ray" or "NMR" label indicates the technique currently being used on each protein. A protein marked "X-ray", for example, is undergoing X-ray analysis to determine its physical structure... but that analysis is not complete yet, or will be kept secret for a few weeks during the contest.

So there are over 200 scientific teams participating in CASP7, trying to predict the structures. The technique called Rosetta was around long before BOINC, but this is the first time that Rosetta has used distributed computing to crunch the numbers for the event.

...and so the question is "If we know how to use X-ray and NMR techniques to find the structure, why mess with Rosetta?". And the answer is that Rosetta hopes to be able to find the structure much faster and cheaper than the current techniques.

Currently it can take months for X-ray analysis of a protein to be completed, and it costs from $10,000 to $100,000. That wouldn't be so bad if you only needed to learn about, say, SARS, HIV, and avian flu virus proteins... but the goal in the end is to learn about all of them, and there are hundreds of thousands! It is just not possible to produce results at the pace that will be needed to gain a complete understanding of how all these proteins are affecting our bodies and health.
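
To put rough numbers on that, here is a small back-of-the-envelope sketch. The per-protein cost range comes from the figures above; the protein count of 100,000 is only an assumed illustrative floor for "hundreds of thousands", and `total_cost_range` is a made-up helper name.

```python
def total_cost_range(n_proteins, cost_low, cost_high):
    """Return the (low, high) total cost of solving n_proteins by X-ray analysis."""
    return n_proteins * cost_low, n_proteins * cost_high


# Assumption: 100,000 proteins as an illustrative lower bound.
# $10,000 to $100,000 per protein is the range quoted in the post.
low, high = total_cost_range(100_000, 10_000, 100_000)
print(f"${low:,} to ${high:,}")
```

Even at the assumed floor of 100,000 proteins, that works out to somewhere between one and ten billion dollars, before counting the months each analysis takes.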

So, for these few proteins, different teams will attempt to predict the structure using various techniques, at the same time as other scientists are solving it with the existing X-ray techniques. The X-ray result will then be used to gauge which predictions were most accurate and most useful.
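
The post does not say how accuracy is actually scored. As a hedged illustration only, one common way to compare a predicted structure with an experimental one is the root-mean-square deviation (RMSD) between matching atom positions. The sketch below assumes the two structures are already superimposed and list the same atoms in the same order; the coordinates and team names are invented for the example.

```python
import math


def rmsd(predicted, experimental):
    """RMSD between two equally long lists of (x, y, z) atom positions.

    Assumes the structures are already superimposed and atom i in one
    list corresponds to atom i in the other.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = 0.0
    for p, e in zip(predicted, experimental):
        squared += sum((a - b) ** 2 for a, b in zip(p, e))
    return math.sqrt(squared / len(predicted))


# Invented coordinates, purely for illustration.
xray = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
predictions = {
    "team_a": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    "team_b": [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)],
}

# Lower RMSD means the prediction sits closer to the X-ray structure.
ranking = sorted(predictions, key=lambda team: rmsd(predictions[team], xray))
print(ranking)
```

A real assessment would also have to superimpose the structures first and would likely use more than one measure, so treat this as a sketch of the idea, not of the contest's scoring.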

And the winner of CASP7 was -- wait for it -- Rosetta!!

Just to show how difficult it is to do an X-ray analysis of a protein:

A colleague of mine was investigating the structure of a protein called "violet-colored acid phosphatase". To do so she first had to process hundreds of kilos of sweet potatoes. Then she had to pre-separate the different ingredients by methods I no longer really remember, something like precipitating things with special chemicals. Then the protein-containing remainder was subjected to gel electrophoresis and tiny amounts of the protein were isolated. This had to be done with numerous samples until in the end they had 500 mg of the protein.

The next step was crystallization, which again took many samples and a lot of time. Finally they had some single crystals they could use for structure determination, but it turned out the protein tended to disintegrate under X-rays. They finally got time at DESY, where they could do measurements with synchrotron radiation, which worked out (much higher intensity, thus much shorter measuring time).

The step of solving the structure is the next ordeal: carbon, nitrogen and oxygen have almost identical electron densities and will thus show up in the electron density map the X-ray yields as hardly distinguishable "bulges". Hydrogen, very important for distinguishing those atoms, is hardest to find as it has only one electron; it is just a speck in a sea of possible bulges and will only be seen when the electron densities of the heavier atoms are assigned correctly. So you just work your way forward, piecing the backbone together by guesswork, assigning CH, CH2, CH3, NH, NH2, and OH groups where you think the heavier atoms carry hydrogen, and slowly move towards your aim of a totally solved structure.
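
To see why those bulges are so hard to tell apart, it helps to simply count electrons. This is a rough sketch that only adds up the electrons of neutral atoms for the groups named above; it ignores everything else that shapes a real electron density map (resolution, thermal motion and so on), and `group_electrons` is a made-up helper name.

```python
# Electrons per neutral atom (equal to the atomic number).
ELECTRONS = {"H": 1, "C": 6, "N": 7, "O": 8}


def group_electrons(heavy_atom, n_hydrogens):
    """Total electron count of one heavy atom plus its attached hydrogens."""
    return ELECTRONS[heavy_atom] + n_hydrogens * ELECTRONS["H"]


# The groups mentioned in the post: (heavy atom, number of hydrogens).
GROUPS = {
    "CH": ("C", 1),
    "CH2": ("C", 2),
    "CH3": ("C", 3),
    "NH": ("N", 1),
    "NH2": ("N", 2),
    "OH": ("O", 1),
}

for name, (atom, n_h) in GROUPS.items():
    print(f"{name:>3}: {group_electrons(atom, n_h)} electrons")
```

By this simple count CH3, NH2 and OH all come out at 9 electrons, and CH2 and NH both at 8, so the totals alone do not tell you which heavy atom you are looking at.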

You also have to account for water, which shows up the same way the backbone and residue atoms do. Adjacent water molecules often mimic structures of the backbone or residues, so you have to separate them from the protein itself (when the assignment of atom species is only a rough one, the atoms seem "smeared", so you cannot determine their distances exactly, which complicates the separation of water and protein a lot).

So it is a long and troublesome thing to do (at least it was ten years ago; maybe things have improved a little with faster PCs and better search algorithms). How well you get on depends mainly on your experience.

The whole thing took three years and three people were working on it in different fields and with different samples (one tried to extract the enzyme from uteri of pregnant pigs). Some attempts were simply fruitless and you need a high frustration threshold to work in that field.

After going through all that labour, trouble and waiting comes the next problem: you have treated the enzyme so brutally and exposed it to so many chemical and physical procedures. Who is now going to guarantee you still have a native enzyme in your crystal? Maybe all those changes in pH and chemical composition of the solution have denatured your target.

So, to me, the bottom line is: folding is not only much less troublesome, cheaper and faster, it is also much likelier to yield the correct structure. And that is what you finally want.

The CASP competition is run every two years. It usually begins on May 10th and finishes on or near Aug. 1st.

After Aug. 1st, the judges will begin reviewing the predictions made by each team, with the final placements being announced a few months later in early December, at their convention. There are several categories of competition, so one team is very unlikely to win more than one or two of those categories, in any year.