Beyond ATCG: “Dixel” representations of DNA-protein interactions
DIXEL Coordinate System for DNA
Because DNA is, to a first approximation, fairly rigid, the unrelaxed structure of B-form DNA provides a natural starting coordinate system for these calculations. The most important sequence-specific interactions between proteins and DNA often occur in the major groove, so we examined electron density features such as electrostatic potential (EP), local average ionization potential (PIP), and other charge and electronic kinetic energy features on the accessible surfaces of the major groove. Our methods permit these features to be calculated on a grid of rectangles with sides of under 0.5 Ångstroms. We abstracted this high-resolution data to a “Dixel” coordinate system, in which each base pair is represented by 10 surface pixels, or “Dixels”, each measuring 1.6 Ångstroms (along the base pair) by 3.4 Ångstroms (parallel to the axis of the DNA helix), for each of 10 TAE properties of the electron densities. The result is an abstract representation of the chemical features of the accessible major groove surface, built from TAE properties for base pairs in their native electronic environments.
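The abstraction from the fine grid to Dixels can be illustrated with a minimal sketch. The text does not say how the high-resolution values are combined, so plain averaging is assumed here; the function name `pool_to_dixels`, the (u, v, value) sample format, and the choice of origin are all hypothetical and used only for illustration.

```python
# Dixel dimensions taken from the text (in Ångstroms).
DIXEL_ALONG_PAIR = 1.6       # extent along the base pair
DIXEL_ALONG_AXIS = 3.4       # extent parallel to the helix axis
DIXELS_PER_BASE_PAIR = 10


def pool_to_dixels(samples, n_base_pairs):
    """Average fine-grid surface samples into Dixels for one property.

    ASSUMPTION: averaging is a stand-in; the source does not state the
    pooling rule. `samples` is an iterable of (u, v, value), where u is
    the position along the base pair and v the position along the helix
    axis, both in Ångstroms from a hypothetical origin at one corner of
    the major groove surface.

    Returns n_base_pairs rows of 10 values; a Dixel that received no
    sample is reported as None.
    """
    sums = [[0.0] * DIXELS_PER_BASE_PAIR for _ in range(n_base_pairs)]
    counts = [[0] * DIXELS_PER_BASE_PAIR for _ in range(n_base_pairs)]
    for u, v, value in samples:
        col = int(u // DIXEL_ALONG_PAIR)   # which Dixel along the base pair
        row = int(v // DIXEL_ALONG_AXIS)   # which base pair along the axis
        if 0 <= row < n_base_pairs and 0 <= col < DIXELS_PER_BASE_PAIR:
            sums[row][col] += value
            counts[row][col] += 1
    return [
        [s / c if c else None for s, c in zip(sum_row, count_row)]
        for sum_row, count_row in zip(sums, counts)
    ]
```

Calling this once per property would give, for each base pair, a 10 by 10 table of Dixels by properties, which matches the dimensions described above.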
Figure 5 shows a schematic diagram of the mapping from the DNA major groove surface to a Dixel representation, along with a color-coded cartoon of the electrostatic potential Dixels for the 32 triples. For each central base type, the “Dixel” distributions show a substantial spread across the base pair triples, indicating that the local electronic environments induced by neighboring base pairs strongly influence this property. A similar though less marked effect is observed for other features such as PIP, but little or no such effect is apparent for electronic kinetic energy features. This finding supported our hypothesis that DNA electronic structure information could capture effects from at least 3 base pairs without requiring additional sequence data, and encouraged us to explore its potential to improve the identification of regulatory protein binding sites.
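One way to arrive at 32 triples is to treat a triple and its reverse complement as the same double-stranded object; the text does not state that this is how the count was obtained, so that reading is an assumption. The sketch below enumerates the triples under that assumption and measures the spread of a per-triple value within each central base type. The helper names and the max-minus-min spread measure are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def canonical(triple):
    """Pick the strand whose central base is A or C.

    ASSUMPTION: a triple and its reverse complement describe the same
    double-stranded site, so one representative per pair suffices.
    """
    return triple if triple[1] in "AC" else reverse_complement(triple)


def canonical_triples():
    """Enumerate one representative per reverse-complement pair."""
    return ["".join(t) for t in product("ACGT", repeat=3) if t[1] in "AC"]


def spread_by_central_base(values_by_triple):
    """Return max - min of the supplied values for each central base type.

    `values_by_triple` maps a triple to a single number, for example a
    summary of its electrostatic potential Dixels (hypothetical input).
    """
    grouped = {}
    for triple, value in values_by_triple.items():
        grouped.setdefault(canonical(triple)[1], []).append(value)
    return {base: max(vals) - min(vals) for base, vals in grouped.items()}
```

A large spread for one central base type would correspond to the neighbor-induced variation described above, while a spread near zero would correspond to a property that is insensitive to its neighbors.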