Potential of Mean Force Approach for Describing Biomolecular Hydration

Co-Investigators: Shekhar Garde
Assistant Professor, Chemical and Biological Engineering, Rensselaer Polytechnic Institute
Angel Garcia
Professor of Physics & Senior Constellation Chaired Professor in Biocomputation and Bioinformatics, Rensselaer Polytechnic Institute
Background
For broader applications that involve hydration studies of libraries of
several hundred (or even thousands of) molecules of biological, pharmaceutical,
or health interest, we have developed an efficient method based on a
potentials-of-mean-force (PMF) expansion. The method employs a library of
lower-order correlation functions derived from explicit simulations to predict
the average equilibrium density and the orientation profile of water in the
space surrounding biomolecules or ligands.
The method efficiently approximates the effective free energy of interaction
of a biomolecule with the surrounding water in terms of two-particle and
multi-particle potentials of mean force between constituent sites and water
molecules. We have previously shown that truncating the expansion (Figure 2)
at the three-particle level provides sufficient accuracy for a variety of
applications of interest. The method has been validated through exhaustive
comparisons of PMF results with those from detailed protein MD simulations.
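As a rough illustration of how such an expansion turns site-water correlation
functions into a water density, the sketch below evaluates only the
two-particle level: the total potential of mean force at a point is taken as
the sum of pair terms W2 = -kT ln g(r) over all solute sites, and the density
is the bulk density times exp(-W/kT). The three-particle correction described
above is omitted for brevity. The function names, the analytic form of g(r),
and all of its parameters are hypothetical stand-ins for the simulation-derived
library, not the actual data or code used in this project.

```python
import math

RHO_BULK = 0.0334  # approximate bulk water number density, molecules per cubic angstrom


def toy_pair_correlation(r, core=2.5, shell=3.2, width=0.4, height=1.5):
    """Hypothetical site-water g(r): excluded core, one hydration-shell peak, 1 at long range.

    This analytic form is an assumption for illustration; a real library would
    tabulate g(r) from explicit-water simulations for each site type.
    """
    if r < core:
        return 0.0
    return 1.0 + height * math.exp(-((r - shell) / width) ** 2)


def water_density(point, sites, rho_bulk=RHO_BULK):
    """Two-particle-level estimate: rho = rho_bulk * exp(-sum_i W2_i / kT), W2_i / kT = -ln g_i."""
    total_pmf = 0.0  # accumulated potential of mean force, in units of kT
    for site in sites:
        g = toy_pair_correlation(math.dist(point, site))
        if g == 0.0:
            return 0.0  # the point lies inside a site's excluded core
        total_pmf += -math.log(g)
    return rho_bulk * math.exp(-total_pmf)


# Two hypothetical solute sites, coordinates in angstroms.
sites = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(water_density((2.0, 2.5, 0.0), sites))   # in the first hydration shell of both sites
print(water_density((50.0, 0.0, 0.0), sites))  # far from the solute, close to bulk density
```

Because the pair terms are simple table lookups in practice, the cost of
evaluating the density on a grid scales with the number of grid points times
the number of sites, which is what makes screening large molecular libraries
feasible.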
The PMF expansion uses pre-calculated libraries of two- and three-particle
hydration correlation functions between protein constituent sites and water
molecules. Only simplistic descriptions of proteins (using three site
types: hydrophobic, polar, and ionic) have been employed to date. In our
new approach, we include 11 different constituent protein sites that
represent various sizes and charge densities, so as to describe the protein
chemistry with much higher fidelity. Clustering algorithms were applied in
the force-field parameter space to obtain these different sites. We are
currently generating a large library of two- and three-particle hydration
correlations for these sites (requiring over 2000 different MD simulations,
and over 100 microseconds of simulation data). The database needs to be
calculated only once and can be extended as necessary with inclusion of new
canonical sites.
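The clustering step mentioned above, which reduces force-field parameters to a
small set of canonical sites, could be sketched as follows. The text does not
say which clustering algorithm was used, so plain k-means is assumed here
purely for illustration. The (size, charge) values are made up and do not come
from any real force field, and three clusters are used instead of 11 to keep
the example small.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain k-means on tuples of floats; returns (centers, labels).

    Initial centers are picked at evenly spaced positions in the input list,
    which keeps the sketch deterministic (an assumption, not a recommendation).
    """
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: group each point with its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments have stopped changing
        centers = new_centers
    labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
    return centers, labels


# Hypothetical atom types as (size in angstroms, partial charge in e).
atom_params = [
    (3.4, 0.00), (3.5, 0.05),    # weakly charged, nonpolar-like
    (3.0, -0.80), (3.1, -0.75),  # strongly negative
    (3.2, 0.90), (3.3, 0.95),    # strongly positive
    (3.6, -0.05),                # weakly charged, nonpolar-like
]
centers, labels = kmeans(atom_params, k=3)
for params, label in zip(atom_params, labels):
    print(params, "-> canonical site", label)
```

Each cluster center then plays the role of one canonical site, and every atom
in a protein is mapped to the site whose parameters are closest to its own. In
a real application the size and charge axes would likely need rescaling so that
neither dominates the distance measure.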