Archive for the 'Protein Design' Category

Solve Puzzles for Science – FoldIt: An online protein folding game

David Baker is one of my favorite scientists. His group performs the best at CASP. He started the Rosetta protein folding and design software, as well as Rosetta@home, a distributed computing network to run it. And now he’s behind one of the coolest projects I’ve ever seen. FoldIt is an amazing community-based game where players compete by folding proteins in a graphical, point-and-click manner. It has a beautiful UI and molecular graphics not unlike the ones you’ve come to expect from VMD, PyMOL, and UCSF Chimera. Most importantly, this highly addictive puzzle game has real scientific value. Each time you solve a folding puzzle, the software sends your results back to the FoldIt team. With that data they hope to gain insight into the powerful human capacity to recognize patterns and apply it to new protein structure prediction methods. Players can create and join groups to compete against other players for high scores.

After playing FoldIt for about an hour, I can say the game is actually very fun and addictive! Any game with actions like “Shake Sidechains” and “Wiggle Backbone” is guaranteed to make any biochemist or biophysicist smile. While it may have to compete with GTA4 for attention, this game is a huge step in educating students about protein structure. It’s truly brilliant. Thanks to Andrew Perry for pointing this out.

FoldIt – Crowdsourcing to solve the protein folding problem

Around the web 3/21/08

Around the web, week of March 21, 2008

    Journals
    Big science from Andrej Sali and David Baker

  • The molecular architecture of the nuclear pore complex
  • De Novo Computational Design of Retro-Aldol Enzymes

    Blogs

  • Nature archive visualized – a Processing sketch to visualize the keywords from Nature over the last 30 years. Some of the more spurious terms could probably be cleaned up but even as a draft the effect is pretty neat.
  • Research streaming is born. Mike from Bioinformatics Zen is auto-publishing his svn commit messages and uploading the figures he generates to Flickr. This would be well suited to someone like me, who has too many projects going on to stop and dedicate time to blogging about them here.
  • Universal Parallel Computing Research Centers are being heavily funded by Microsoft and Intel. One will be at the University of Illinois at Urbana-Champaign, well known for the Charm++ parallel library and the super-scalable NAMD molecular dynamics package built on top of it. The other will be located at UC Berkeley.
  • The end of the relational era: is SQL dying? Bill McColl of Computing at Scale says it is. I would argue that relational databases have received the golden hammer treatment over the years. But I totally agree with his prediction that SQL will ultimately be replaced by DSLs with implicit data parallelism.
  • The YouTube API has been updated with some significant improvements for developers. Uploads, comments, and video playlists can all be manipulated outside of YouTube. This makes a convincing case for leveraging the massive YouTube userbase if your site deals with video content.

    Tech

  • I’ve finally moved most of my projects from SVN to Git. I’m now a ‘branch-a-holic’ and git definitely fits my workflow better than subversion now that I’m used to it.
  • Capistrano is typically used for Rails deployment, but I’m finding it’s good for just about anything you want to run across multiple remote hosts. It is a great mini-language for cluster admins who don’t want to struggle with something like mpirun.

Protein Design is inverse Structure Prediction

If we imagine protein folding as an explorer navigating a rugged landscape in search of its lowest point, then how do we describe protein design? If we are still baffled by structure prediction in the ‘midnight zone’, how can we have a bioengineering strategy that is worthy of the title ‘protein design’?

When does Bioinformatics become Bioengineering? Is protein design really just inverse structure prediction? Does that work to simplify the problem?

I have always looked at the problem from the perspective of predicting tertiary structure from primary structure. It almost seems more practical to flip the problem around: given a structure and function, can we predict the amino acid sequence?

DNA synthesizers are rapidly improving, yet we still do not fully understand how a change in the amino acid sequence changes the protein structure. Template-based modeling is progressing: anything above a sequence identity of 30% can be reliably modeled with a good pipeline.
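To make the 30% sequence identity figure concrete, here is a minimal Python sketch of how percent identity might be computed from two already-aligned sequences. The function names and the choice of denominator (only columns where neither sequence has a gap) are my own assumptions for illustration; real pipelines define identity in several different ways and get the alignment from a dedicated tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: gaps are written as '-', and columns containing a gap
    in either sequence are skipped rather than counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # columns where those residues are identical
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def is_template_candidate(aligned_a: str, aligned_b: str,
                          threshold: float = 30.0) -> bool:
    """True if identity is above the threshold (30% by default,
    the rule of thumb quoted in the text)."""
    return percent_identity(aligned_a, aligned_b) > threshold


if __name__ == "__main__":
    # Hypothetical toy alignment, not real protein data.
    target = "MKT-AYIAKQR"
    template = "MKSLAYLAKHR"
    print(percent_identity(target, template))
    print(is_template_candidate(target, template))
```

Skipping gapped columns keeps the number from being dragged down by insertions, but it can overstate similarity for short alignments, so a real pipeline would also look at alignment coverage.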

Reliable function prediction is obviously the next milestone, but I think the gap from structure to function is still enormous. How do we standardize function specification? Enzyme functions can be described by the reaction they catalyze or the pathway they are part of. However, not all proteins are enzymes, and not all enzymes are proteins. Hormones, transcription factors, and membrane receptors all have some characteristic structure that supports their function.

I think function prediction needs about 10 more years before it can provide the kind of infrastructure needed for robust bioengineering and truly rational drug design.