You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Molecular Density Fitting Using 3D Gaussian Functions
Description
This project implements a molecular density fitting approach that optimizes the correlation between an experimental density map and a synthetic density map originating from molecular dynamics using 3D Gaussian functions. The algorithm generates a force in the direction that increases the correlation between the simulated and effective maps. It also allows adjusting anisotropically the spread parameter (sigma) in each direction.
Features
Efficient simulation of density maps using 3D Gaussian functions.
Optimization of molecular coordinates and sigma parameters to improve fit.
Calculation of the correlation coefficient to measure fit quality.
Use of numerical and analytical methods to compute derivatives.
Implementation optimized with Numba for high performance.
Installation
Prerequisites
Python 3.6+
Numpy
SciPy
Numba
Usage
Quick Start
Here's a quick example to get started with the Fit class:
Fit particles without a forcefield
importnumpyasnpfromopenfitimportFit# Define your parameters: coordinates, sigma, experimental_map, and voxel_sizecoordinates=np.array([...]) # Particle coordinates (n, 3)sigma=np.array([...]) # Standard deviation (n, 3)density_map=np.array([...]) # Density mapvoxel_size= [1, 1, 1] # Voxel size# Initialize the Fit objectmd_fit=Fit(density_map, voxel_size)
# Calculate the derivative of the correlation over the coordinates and sigmadiff=md_fit.dcorr_coef(coordinates, sigma)
# Perform the fitting process just with the coordinatesmd_fit.fit(coordinates, sigma)
# Access the optimized coordinates and sigmaoptimized_coordinates=md_fit.coordinatesoptimized_sigma=md_fit.sigma
Build a simulated map from coordinates
importopenfitimportnumpyasnp# Parse the pdbscene=openfit.Scene.from_pdb('pdb_file.pdb')
# Corrects missing element namesscene['element'] =scene['name'].str[0]
scene['mass'] =scene['element'].replace({'N': 14, 'C': 12, 'O': 16, 'S': 32})
# Creates an empty voxel arraycoords=scene.get_coordinates()
voxel_size= (coords.max() -coords.min()) /40# ~40x40x40Fit=openfit.Fit.from_dimensions(min_coords=coords.min() -10, max_coords=coords.max() +10, voxel_size=voxel_size)
# Sets the coordinatesFit.set_coordinates(coords.values(), sigma=np.ones(coords.shape) *2, epsilon=scene['mass'].values)
# Saves the density mapFit.save_mrc('sample_map.mrc')
Derivation
Energy
This class implements V_fit, a forcefield potential that can be included in an OpenMM molecular dynamics simulation.
$$ V = V_{ff} + V_{Fit} $$
The potential is defined in terms of the correlation coefficient (c.c.), which is a function of the experimental and simulated densities at each voxel $(i, j, k)$:
where $\rho_{\text{exp}}(i,j,k)$ and $\rho_{\text{sim}}(i,j,k)$ represent the experimental and synthetically simulated density of each voxel $(i,j,k)$, respectively. The synthetically simulated density $\rho_{\text{sim}}(i,j,k)$ is obtained by integrating the three-dimensional Gaussian function over each voxel:
Where $x_i^{min}$, $x_i^{max}$, $y_j^{min}$, $y_j^{max}$, $z_j^{min}$, and $z_j^{max}$ are the boundaries in x, y, and z for the voxel $(i,j,k)$. The solution to this integral involves the error function ($\text{erf}$). For a Gaussian distribution with mean $\mu$ and standard deviation $\sigma$, the integral over a finite range can be expressed using the error function as follows:
Where $\Phi(x; \mu, \sigma)$ represents the cumulative distribution function (CDF) for a normal distribution at point $x$. The error function ($\text{erf}(x)$) is defined as:
To compute the derivative of $V_{Fit}$ with respect to the coordinates of the nth atom $(x_n, y_n, z_n)$, we need to apply the chain rule to the derivative of $V_{Fit}$ in terms of c.c. and then the derivative of c.c. in terms of $(x_n, y_n, z_n)$. Like $V_{Fit}$ with respect to a variable $v$ is:
Where $\rho_{\text{exp}}(i,j,k)$ represents the experimental density at the voxel located at indices (i, j, k), $\rho_{\text{sim}}(i,j,k)$ is the simulated density at the voxel $(i, j, k)$, which is a function of the molecular coordinates and parameters like sigma, and $\frac{\partial \rho_{\text{sim}}(i,j,k)}{\partial v}$ denotes the partial derivative of the simulated density at voxel $(i, j, k)$ with respect to a variable $v$ which can be the coordinates or sigma.
The derivatives of the function $\Phi(x; \mu, \sigma) = \frac{1}{2} \left[ 1 + \text{erf}\left( \frac{x-\mu}{\sigma\sqrt{2}} \right) \right]$ are as follows:
Carlos Bueno was supported by the MolSSI Software Fellowship, under the mentorship of Jessica Nash. We thank AMD (Advanced Micro Devices, Inc.) for the donation of high-performance computing hardware and HPC resources. This project is also supported by the Center for Theoretical Biological Physics (NSF Grants PHY-2019745 and PHY-1522550), with additional support from the D.R. Bullard Welch Chair at Rice University (Grant No. C-0016 to PGW). Project based on the Computational Molecular Science Python Cookiecutter version 1.10.
About
Molecular Density Fitting with 3D Gaussian Functions