Thirty Years of (Bio)Molecular Simulation: How Far Have We Come?

This was originally intended to be a micro-blogged talk, probably on FriendFeed. But when I walked into the old Chevron building on the Pitt campus to listen to Professor Wilfred van Gunsteren, the wireless was spotty, so I saved my notes for a triumphant return to normal blogging. The talk is part of a lecture series presented by the CMMS at the University of Pittsburgh. Since this was probably what I had in mind when I started Bleeding Edge Biotech, here is my notepad of the distinguished lecturer’s slides and talking points.

Computation based on molecular models is playing an increasingly important role in biology, biological chemistry, and biophysics. Since only a very limited number of properties of biomolecular systems is actually accessible to measurement by experimental means, computer simulation can complement experiment by providing not only averages, but also distributions and time series of any definable – observable or non-observable – quantity, for example conformational distributions or interactions between parts of molecular systems. Present day biomolecular modelling is limited in its application by four main problems: 1) the force-field problem, 2) the search (sampling) problem, 3) the ensemble (sampling) problem, and 4) the experimental problem. These four problems will be discussed and illustrated by practical examples. Progress over the past thirty years will be briefly reviewed. Perspectives will be outlined for pushing forward the limitations of molecular modelling.

Why Thirty Years?

…first simulations were performed in 1976.

Molecular modeling choices to make:

Simulations can:

  • explain experiment
  • provoke experiment
  • replace experiment
  • aid in establishing intellectual property

The four problems

  • Force field problem
  • The search (sampling) problem
  • The ensemble sampling problem
  • The experimental problem

The Force Field problem

  • small free energy differences
  • account for entropic effects
  • variety of atoms and molecules (keep it simple; transferable parameters)

…using only the PDB for force field development just doesn’t work out.

The most dominant fold is not difficult; the equilibria between folds are more important.  Simulations should be able to give melting temperatures.  Solvent viscosity drives the kinetics of folding.  To do: polarizable force fields.

The searching (sampling) problem

A. convergence
B. alleviated
C. aggravated

Methods to compute free energy

  • counting configurations
  • thermodynamic integration (many simulations)
  • perturbation formula (one simulation)
  • One-step perturbation (few simulations)

- use “soft-core” atoms for each site where the inhibitors will interact.

Original Viagra and Levitra could have benefitted from this method (IP, patents)
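
As a rough illustration of the "perturbation formula (one simulation)" item above, here is a minimal sketch. I am assuming it refers to the standard Zwanzig free energy perturbation expression, ΔF = -kT ln⟨exp(-(U_B - U_A)/kT)⟩_A, where the average runs over configurations sampled in state A only. The toy system (two 1D harmonic wells), the function names and the parameter values are all my own assumptions for illustration; none of this was shown in the talk.

```python
import math
import random


def fep_delta_f(k_a, k_b, kT=1.0, n_samples=200_000, seed=0):
    """Estimate F_B - F_A from samples of state A only (Zwanzig formula).

    Toy model (an assumption): U_A = 0.5*k_a*x^2 and U_B = 0.5*k_b*x^2.
    """
    rng = random.Random(seed)
    # The Boltzmann distribution of a harmonic well is a Gaussian,
    # so state A can be sampled directly without running dynamics.
    sigma_a = math.sqrt(kT / k_a)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma_a)
        delta_u = 0.5 * (k_b - k_a) * x * x  # U_B(x) - U_A(x)
        total += math.exp(-delta_u / kT)
    return -kT * math.log(total / n_samples)


def exact_delta_f(k_a, k_b, kT=1.0):
    """Analytic result for the same toy model, used as a check."""
    return 0.5 * kT * math.log(k_b / k_a)


estimate = fep_delta_f(1.0, 2.0)
reference = exact_delta_f(1.0, 2.0)
print(f"FEP estimate: {estimate:.4f}  exact: {reference:.4f}")
```

The estimate converges well here because the two wells overlap strongly. When states A and B overlap poorly, a single simulation stops being enough, which is presumably why the list also includes methods that need several simulations.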

The ensemble (sampling) problem

  • Entropy
  • Averaging
  • Non-linear averaging

Coiled-coil stability has a strong entropic component.  For monomers the solute-solvent interaction decreases.  For trimers the solute-solute interaction decreases.  Entropy increases with temperature.  In trimers atomic fluctuations do not increase with temperature but solute entropy increases with temperature.

The experimental problem

  • Averaging
  • Insufficient data
  • Insufficient accuracy

“Averages are dangerous”

Conclusions:

  • Experimental data cannot determine the average structure
  • Experimental data cannot determine the biomolecular structure

Artifacts of XPLOR NMR refinement disagree with simulations guided by NOE-restraints
- Two ensembles with no ensemble overlap and given same experimental data
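
One way to see how two non-overlapping ensembles can fit the same data is the non-linear averaging in NOE measurements, where the signal scales roughly as r^-6, so the "observed" distance is ⟨r^-6⟩^(-1/6) rather than the arithmetic mean. The sketch below is a toy example under that assumption; the distances, the 10% population and the helper names are made up for illustration and are not from the talk.

```python
def noe_effective_distance(distances):
    """Distance an r^-6 averaged NOE would report for an ensemble."""
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


def short_distance_to_match(target, r_long, frac_short):
    """Short distance that, mixed with r_long, reproduces `target`."""
    inv6 = (target ** -6 - (1.0 - frac_short) * r_long ** -6) / frac_short
    return inv6 ** (-1.0 / 6.0)


# Ensemble A: every conformer has the atom pair at 4.0 Angstrom.
ensemble_a = [4.0] * 10

# Ensemble B: nine conformers at 8.0 Angstrom and one close contact,
# chosen so that the r^-6 average matches ensemble A.
r_short = short_distance_to_match(target=4.0, r_long=8.0, frac_short=0.1)
ensemble_b = [r_short] + [8.0] * 9

for name, ens in (("A", ensemble_a), ("B", ensemble_b)):
    arithmetic_mean = sum(ens) / len(ens)
    print(name, round(noe_effective_distance(ens), 3), round(arithmetic_mean, 3))
```

Both ensembles report the same effective NOE distance even though they share no conformer and their arithmetic mean distances differ by several Angstrom.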

“Experimental data is not sufficient”

Don’t rely on structural data (It’s derived; strive for primary data)

History

1957 First molecule
1964 atomic liquid (argon)
1971 molecular liquid (water)

Future

2001 –
2029 Biomolecules in water
2034 E. coli
2056 Mammalian cell (10^-9 sec)
2080 Biomolecules in water (fast as nature) 10^6
2172 Human body (10^27 atoms) 1 sec

So what if you could simulate every atom in your body for 1 second?

– There are much better things simulation can answer; ask better questions.

Polarizable Force Field

- improves transferability between different environments
- working on these force fields
- solvation drives protein processes

Coarse-graining

- Need to switch FG/CG, back and forth
- Run simulations in parallel
- Easy to clamp 5 atoms to 1 but not easy to map 1 to 5
- FG/CG replica-exchange simulation enhances sampling
- Much faster to cross barriers in CG mode if you can switch
- Both force-fields must be thermodynamically calibrated
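
To make the "5 atoms to 1" point concrete, here is a minimal sketch that assumes the coarse-grained bead sits at the center of mass of its atoms. That mapping choice, the coordinates and the function name are my own assumptions for illustration, not the scheme presented in the talk.

```python
def center_of_mass(positions, masses):
    """Map a group of atoms to one bead at their center of mass."""
    total_mass = sum(masses)
    return tuple(
        sum(m * p[axis] for p, m in zip(positions, masses)) / total_mass
        for axis in range(3)
    )


masses = [12.0] * 5  # five equal-mass atoms (arbitrary choice)

# Two clearly different fine-grained arrangements of the five atoms.
config_a = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
config_b = [(0, 0, 0), (0, 0, 1), (0, 0, -1), (2, 0, 0), (-2, 0, 0)]

bead_a = center_of_mass(config_a, masses)
bead_b = center_of_mass(config_b, masses)
print(bead_a, bead_b, bead_a == bead_b)
```

Going from atoms to a bead is a single function call, but both arrangements land on the same bead, so the reverse step has no unique answer.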

We need simulations to explain experiment; so we can see the numbers.  For molecular modelers, there’s still enough work to do at least until 2172!

Questions from the audience

Q: What’s the state of NMR determination?
A: It depends; narrow bundles should have more motion.  Stable proteins are easy.  The averaging problem is present even in crystallography.  Can’t get R-values.  Many, many structures are not that good (the XPLOR FF is simple, no solvent).  Found that 20% of side-chain J-values cannot be right.  Simulation is getting to the point where it can correct experiment.

Q: Could you comment on the CG model ‘clamping atoms’ and potential problems related to entropy?
A: Take 5 atoms, make a ball, you lose entropy.  You should compensate that in the energy level?  You must balance it.

Q: Is the path integral still useful?
A: No, we’d like to remove it in the next version of GROMOS.

Professor van Gunsteren is a big believer in using all the data you can get your hands on.

  • March 18, 2009 at 10:22 pm Neil Saunders
    "Don’t rely on structural data (It’s derived; strive for primary data)" - yes, it's often not appreciated that PDB structures are themselves essentially refined models. Though I shudder to think of the number-crunching involved if we start from e.g. electron density.
  • March 18, 2009 at 11:07 pm Bosco Ho
    @Neil: although low resolution PDB structures need to be refined, high-resolution structures are remarkably good (better than 1.5 Å). A reviewer for a paper I wrote recently made me look at high-resolution electron density, and with high-resolution structures, it's very clear where the heavy atoms are, and sometimes, you can even make out the hydrogen atoms.
  • March 21, 2009 at 10:50 pm Deepak Singh
    I forget who (Peter Pulay?), said something along the lines of experimental science is nothing more than applying very simple minded theories :)
  • March 21, 2009 at 10:53 pm Deepak Singh
    Bosco is right. Though the static nature of crystal structures is a major problem in my book, plus crystallization conditions are always artificial, which is another thing to be aware of
  • March 21, 2009 at 10:55 pm Deepak Singh
    And isn't it sad that we are still using those same old force fields, or small improvements of those
  • March 22, 2009 at 10:03 am marcin
    @Neil: IMHO there is no such thing as "primary data" in crystallography, even the raw output from the detector is processed to "correct" for reflection angles, noise levels, sensitivity differences of different parts of the detector.
