The Binding Of Vitamin B12 To Transcobalamin(II); Structural Considerations For Bioconjugate Design – A Molecular Dynamics Study

In press in the journal Molecular BioSystems. This is a first official foray into molecular dynamics-only (MD-only) computational work, and I am pleased to report that the computational results not only make sense with respect to the experimental results but also indicate a possible new way to use vitamin B12 for the oral delivery of bio-active molecules more complicated than the binary bioconjugates considered to date.

The Interesting Result

The conclusion from the previous study was that the insulin B Chain (figure below) acts as a tether to separate the structured region of insulin (the region with the largest inflexible steric bulk, see below) from the region of transcobalamin II (TCII) that binds vitamin B12. It was then determined that the approach employed for the B12-insulin bioconjugate, simply linking one biomolecule onto another with known binding and transport properties (this is a common theme in all bioconjugate design), worked because the last nine residues in the insulin B Chain (B22 to B30) are flexible in solution (they, in fact, cover the insulin binding region in the crystal form, then uncover this region in the biologically active form).

As a general procedure for B12 bioconjugate design, one of the key requirements for a functional product is a tether length that provides sufficient separation between B12 and any molecular structure large enough to affect B12 binding within its transport proteins (this makes sense, as a tethered structure that prevents B12 binding within its transport proteins will leave the B12 bioconjugate in the gut, where acids and digestive enzymes will degrade it). This leads to the question, "How long must a tether be to meet this rather general criterion?" This is only partly the correct question, as the retention of B12 binding within its transport proteins is a function of both proper tether length and the [transport protein]-["other molecule"] interaction (in this first case, "other molecule" = insulin).

Saving the exhaustive analysis for the paper, this new study used this flexible region of human insulin (that is, B22 to B30, with the B12 linkage occurring on the B29 lysine side chain) as a proxy for any arbitrary tether, then used MD simulations to consider how the flexibility of this tether might lead to changes in B12 binding within its TCII pocket (TCII being the transport protein for which we have the best crystal structure). The simulations identified the lysine side chain itself as being just long enough to separate the B Chain tether region from the TCII protein surface. This does not mean that lysine will always serve as a perfect linkage. It means that, if the tether structure is effectively non-interacting with TCII (so not sterically demanding by itself), the lysine side chain is long enough to span the solvent-accessible hole produced by the encapsulation of B12 in (in this case) TCII.

The result is a design constraint when using lysine that is quite fortuitous! If the target peptide (insulin or whatnot) has a surface-accessible lysine side chain within a region that is flexible in solution, some simple amide chemistry may produce a viable B12 bioconjugate for delivering that peptide orally (thereby avoiding complete peptide degradation in the G.I. tract).

The More Interesting Result

This one is buried deep within the bottom of the Discussion section. If you watch the dynamics simulation of the TCII-[B12-tether] complex (shown below for a 300 K 50 ns simulation with 1.5 fs time steps in 14,000 waters (not shown)), you see that the binding of B12 within TCII and the geometry of the encapsulation complex are strongly linked. That is, TCII (and, presumably, its cohorts in the B12 transport pathway) can be thought of as two quite rigid fragments (Red and Blue in the animation) connected by a long tether (Green) that are separated in solution but brought into contact by the binding of vitamin B12 (Gold). The B12 is a glue that holds the fragments together, and a simple tabulation of hydrogen-bonding interactions in the crystal structure reveals that the B12 has more interactions with the A and B fragments of TCII individually than A and B have with each other (which is to say, the B12-A Segment interaction and B12-B Segment interaction are stronger than the A-B Segment interaction). From a biological perspective, this should make perfect sense. B12 is a large, extremely important biomolecule that, since we do not make it ourselves, must be captured and transported as effectively as possible. The best way to bind this molecule is not to wait for it to burrow into a binding pocket, but rather to encapsulate it in a "clam shell" maneuver that provides "maximum embedding." The tether between the A and B Segments technically would not have to be present if the A and B fragments were present in large quantities (although, as you might expect, the A-B tether does considerably reduce the time to complete encapsulation by keeping these fragments in close proximity).

According to the crystal structure, the B12 is entirely embedded within TCII, with only the solvent-accessible hole at the 5'-ribose position readily accessible for bioconjugate formation. If the overall structure were as rigid as a crystal structure might lead one to believe, functionalization at the cobalt position in the corrin ring would be out of the question.

Since I just stated that such a binding mode would otherwise be unlikely, you can guess that there are, in fact, B12 bioconjugates linked at the cobalt position that are bio-active.

If you watch the dynamics simulation of the TCII-[B12-tether] complex, you see that the clam shell binding mode of TCII is one with a "loose hinge." This loose hinge is really a result of the flexibility of the two protein fragments (typical protein motion) and flexibility in the short propionamide side chains of vitamin B12 that provide a bit of "spring" in the complete complex. In effect, the flexibility within the structure provides a means for cobalt to be coordinated to something without loss of B12 binding provided that the tether linking the cobalt and the "other" molecule is small enough that it does not require a large change in the A-B binding arrangement (that is, does not affect B12-A and B12-B binding).

And Then There Were Three…

The expectation/prediction/untested hypothesis is that vitamin B12 may be able to happily accommodate two additional molecules at the 5'-ribose and cobalt positions (properly designed) that then provide for the transport of two molecules and/or the delivery of three molecules (one being vitamin B12). This opens the door to a wealth of possibilities, from trinary delivery to combined drug delivery + radiopharma characterization. This is the possibility I'm most interested in pursuing in the next rounds of calculations, with the theory (presumably) providing a very good initial guess about the ideal tether designs to use with B12 for enabling delivery and bio-activity.

And Now For The Hard Work

Stepping back from the theoretical analysis for a moment, the most difficult obstacles to overcome in this study were the generation AND incorporation of force field parameters for vitamin B12 and a B12-Lysine mini-bioconjugate into GROMACS, a problem that I've addressed only in passing in several previous posts. What I won't do in this post is explain the procedure (a single blog post will not do the procedure justice given the complexity of force field parameter generation). What I will do is provide the files for the topology for these systems and a short list of the modifications one needs to make in order to get these systems working. For additional reference, the same topology files are provided in the Supplemental Material for the paper (so, if you find yourself using these, obviously cite the paper and not my humble blog).

Files And Contents:

These are not files to be placed in a single directory, but are segments of files that are going to be pasted directly into pre-existing topology files. This is not the best way to do it, but it is the procedure I began with and will not be changing without finding a very simple tutorial on how to do it properly (which, if you have one, I'd be happy to read).

The contents of the topology file (which I assume for you will be ffG53a6 but should work generally) are provided below:

ffG53a6_B12_BCN_LYB_LCB_topology.txt

This file contains the topology specifications for vitamin B12 (nothing bound to the cobalt in the corrin ring), cyanocobalamin (CN-B12, with a cyanide bound to the cobalt), B12 with a lysine residue attached to the 5'-ribose hydroxyl position (the tether linkage for the GROMACS prep programs), and CN-B12 with a lysine residue attached to the 5'-ribose hydroxyl position.

I am assuming that you're using the ffG53a6 force field, meaning you add the topology sets to the bottom of the ffG53a6.rtp file.

GROMACS Modifications:

GROMACS force field and topology files must be modified slightly in order to read the topologies generated above and, depending on where you got the B12 structure, add/correct the hydrogen atoms in the B12 molecule.

In a typical UNIX/Linux installation (which I have provided compilation instructions for in a previous post), the files to be modified can be found in /usr/local/gromacs. And, if you're using Ubuntu like I am, you'll need to "sudo" these modifications.

1. aminoacids.dat

If you open this file, you see a list of three- and four-letter codes in the format:

50
ABU
ACE
...
VAL
PGLU

The "50" refers to the number of codes. As we're going to be adding the codes B12, BCN, LYB, and LCB into GROMACS, we first change 50 to 54, then just list the four codes at the bottom of the file:

54
ABU
ACE
...
VAL
PGLU
B12
BCN
LYB
LCB

You'll note that B12 and BCN aren't like the others, LYB is not LYS, and LCB is also nowhere to be seen. The codes in this file are STANDARD, so make sure you don't inadvertently give your inserted structure a name that is already in the list.
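The edit described above (bump the count, append the new codes, avoid name collisions) can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `add_residue_codes` is hypothetical, the sample text is an abbreviated stand-in for the real 50-entry file, and you would still need to read and write the actual aminoacids.dat yourself (with "sudo" where required).

```python
def add_residue_codes(text, new_codes):
    """Return aminoacids.dat-style content with new_codes appended and the count updated."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    codes = lines[1:]  # the first line is the declared number of codes
    if int(lines[0]) != len(codes):
        raise ValueError("declared count does not match the number of codes listed")
    for code in new_codes:
        if code in codes:
            # Guard against reusing a name that is already in the list.
            raise ValueError(f"{code} is already defined")
        codes.append(code)
    return "\n".join([str(len(codes))] + codes) + "\n"


# Abbreviated stand-in for the real file (assumption: four codes instead of 50).
sample = "4\nABU\nACE\nVAL\nPGLU\n"
updated = add_residue_codes(sample, ["B12", "BCN", "LYB", "LCB"])
print(updated)
```

Applied to the full file, the same logic would turn the leading 50 into 54 and refuse to add a code such as LYS that is already present.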

2. ffG53a6.hdb

I specifically used the ffG53a6 force field for the TCII-B12 work, meaning I only made modifications to these force field files. The ffG53a6.hdb file is responsible for adding/correcting hydrogen atoms in your structure (just because the crystallographers do not see them does not mean they aren't there) and contains hydrogen-beautification information for all of the three/four-letter codes recognized in aminoacids.dat. The content below is the hydrogen-correcting data for the B12, BCN, LYB, and LCB structures. Simply paste this into the bottom of the ffG53a6.hdb file.

B12     19
1    2    HAO    N62    C61    O63
1    2    HAN    N62    C61    C60
1    2    HAM    N52    C50    O51
1    2    HAL    N52    C50    C49
1    2    HAK    N45    C43    O44
1    2    HAJ    N45    C43    C42
1    2    HAI    N40    C38    O39
1    2    HAH    N40    C38    C37
1    2    HAE    N29    C27    O28
1    2    HAD    N29    C27    C26
1    2    HAG    N33    C32    O34
1    2    HAF    N33    C32    C31
1    2    HAA    O7R    C2R    C1R
1    2    HAB    O8R    C5R    C4R
1    2    HAC    N59    C57    O58
1    1    H2B    C2B    N1B    N3B 
1    1    H4B    C4B    C5B    C9B
1    1    H7B    C7B    C8B    C6B
1    1    H10    C10    C9     C11
LYB     20       
1    1    H      N      -C     CA
1    4    HZ1    NZ     CE     CD
1    2    HAO    N62    C61    O63
1    2    HAN    N62    C61    C60
1    2    HAM    N52    C50    O51
1    2    HAL    N52    C50    C49
1    2    HAK    N45    C43    O44
1    2    HAJ    N45    C43    C42
1    2    HAI    N40    C38    O39
1    2    HAH    N40    C38    C37
1    2    HAE    N29    C27    O28
1    2    HAD    N29    C27    C26
1    2    HAG    N33    C32    O34
1    2    HAF    N33    C32    C31
1    2    HAA    O7R    C2R    C1R
1    2    HAC    N59    C57    O58
1    1    H2B    C2B    N1B    N3B
1    1    H4B    C4B    C5B    C9B
1    1    H7B    C7B    C8B    C6B
1    1    H10    C10    C9     C11
BCN     19
1    2    HAO    N62    C61    O63
1    2    HAN    N62    C61    C60
1    2    HAM    N52    C50    O51
1    2    HAL    N52    C50    C49
1    2    HAK    N45    C43    O44
1    2    HAJ    N45    C43    C42
1    2    HAI    N40    C38    O39
1    2    HAH    N40    C38    C37
1    2    HAE    N29    C27    O28
1    2    HAD    N29    C27    C26
1    2    HAG    N33    C32    O34
1    2    HAF    N33    C32    C31
1    2    HAA    O7R    C2R    C1R
1    2    HAB    O8R    C5R    C4R
1    2    HAC    N59    C57    O58
1    1    H2B    C2B    N1B    N3B 
1    1    H4B    C4B    C5B    C9B
1    1    H7B    C7B    C8B    C6B
1    1    H10    C10    C9     C11
LCB     20       
1    1    H      N      -C     CA
1    4    HZ1    NZ     CE     CD
1    2    HAO    N62    C61    O63
1    2    HAN    N62    C61    C60
1    2    HAM    N52    C50    O51
1    2    HAL    N52    C50    C49
1    2    HAK    N45    C43    O44
1    2    HAJ    N45    C43    C42
1    2    HAI    N40    C38    O39
1    2    HAH    N40    C38    C37
1    2    HAE    N29    C27    O28
1    2    HAD    N29    C27    C26
1    2    HAG    N33    C32    O34
1    2    HAF    N33    C32    C31
1    2    HAA    O7R    C2R    C1R
1    2    HAC    N59    C57    O58
1    1    H2B    C2B    N1B    N3B
1    1    H4B    C4B    C5B    C9B
1    1    H7B    C7B    C8B    C6B
1    1    H10    C10    C9     C11

As a brief explanation, the three-letter code is followed by the number of Hydrogen atoms that are to be added. Each line can be read:

First Column – The number of hydrogen atoms added (so all of these entries on the far left mean "add ONE hydrogen")

Second Column – The manner by which the hydrogen atom is to be added (this is listed in section 5.5 of the GROMACS 3.3 Manual (page 93))

Third Column – The name of the Hydrogen atom to be added

Fourth Column – The atom to which the H is going to be directly linked in the topology file

Fifth – Seventh Columns – The atoms that define how the Hydrogen is added with respect to (1) the code in Column 2 and (2) the atom to which the Hydrogen is added.
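As a sanity check on pasted entries, the column layout described above can be read with a short Python sketch. The names `HdbRule` and `parse_hdb` are hypothetical (they are not part of GROMACS), and the sketch assumes that the number after each residue code equals the number of entry lines that follow, which matches the listings above. The sample is an abbreviated two-line excerpt of the B12 block, so its count is 2 rather than 19.

```python
from collections import namedtuple

# One hydrogen-addition line: count, method code, H name, bonded atom, control atoms.
HdbRule = namedtuple("HdbRule", "n_h method h_name heavy_atom control_atoms")


def parse_hdb(text):
    """Parse hdb-style text into {residue: [HdbRule, ...]} and check declared counts."""
    residues, declared, current = {}, {}, None
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        if len(fields) == 2:
            # Header line: residue code followed by the number of entries.
            current = fields[0]
            declared[current] = int(fields[1])
            residues[current] = []
        else:
            if current is None:
                raise ValueError("entry line found before any residue header")
            residues[current].append(
                HdbRule(int(fields[0]), int(fields[1]), fields[2],
                        fields[3], tuple(fields[4:]))
            )
    for name, rules in residues.items():
        if declared[name] != len(rules):
            raise ValueError(f"{name}: declared {declared[name]}, found {len(rules)}")
    return residues


# Abbreviated excerpt of the B12 block (assumption: count reduced to 2).
sample = """\
B12     2
1    2    HAO    N62    C61    O63
1    1    H10    C10    C9     C11
"""
rules = parse_hdb(sample)["B12"]
for rule in rules:
    print(rule.h_name, "is attached to", rule.heavy_atom, "using method", rule.method)
```

A count mismatch (for example, pasting 18 lines under a header that says 19) raises an error here instead of surfacing later during structure preparation.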

3. ffG53a6bon.itp

There are a few subtle tweaks to the force constants for a few bonds that I perform here right within the file and that proper MD people likely would scream at. I note that, when you do this, you are making changes to numbers that will affect the results if you somehow start doing heme MD simulations.

Change the gb_NN values to those provided below.

#define gb_34        0.198  0.6400e+06
; NR  -   FE    120
#define gb_4         0.1142  3.7000e+07
; C - O (CO in heme)  2220
#define gb_14       0.1340  1.1000e+07
; C  -  NR (heme)       1000
#define gb_30       0.1880  2.7200e+06
; FE  -  C (Heme)
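If you would rather script this change than hand-edit the file, the following Python sketch shows one way to do it. It is an illustration only: `set_bond_defines` is a hypothetical helper, the regular expression assumes each definition sits on a single line of the form "#define gb_N length force-constant", and the numbers in `sample` are placeholders rather than the values shipped with the force field. Values are passed as strings so the number formatting is preserved exactly.

```python
import re

# Replacement values taken from the listing above, kept as strings.
NEW_VALUES = {
    "gb_34": ("0.198", "0.6400e+06"),
    "gb_4": ("0.1142", "3.7000e+07"),
    "gb_14": ("0.1340", "1.1000e+07"),
    "gb_30": ("0.1880", "2.7200e+06"),
}

DEFINE_LINE = re.compile(r"^(#define\s+(gb_\d+)\s+)\S+\s+\S+(.*)$")


def set_bond_defines(text, new_values):
    """Replace the bond length and force constant on matching '#define gb_N' lines."""
    out = []
    for line in text.splitlines():
        match = DEFINE_LINE.match(line)
        if match and match.group(2) in new_values:
            length, force_constant = new_values[match.group(2)]
            line = f"{match.group(1)}{length}  {force_constant}{match.group(3)}"
        out.append(line)
    return "\n".join(out) + "\n"


# Placeholder content (assumption: NOT the real shipped values).
sample = (
    "#define gb_1         0.0000  0.0000e+00\n"
    "#define gb_34        0.0000  0.0000e+00\n"
    "; NR  -   FE    120\n"
)
print(set_bond_defines(sample, NEW_VALUES))
```

Comment lines and any gb_NN definitions not named in the dictionary pass through untouched, which keeps the change limited to the four bonds listed above.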

You will note that I have not done anything to make cobalt appear in the topology or force field files. For the sake of running a simulation, Fe and Co are close enough that simply replacing CO with FE in the PDB file is sufficient. You can, of course, do the completely proper job of adding cobalt to the force field to get the mass right.
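The CO-to-FE swap can be done in a text editor, but a column-aware Python sketch avoids touching anything other than the cobalt atom. Everything here is an assumption for illustration: `rename_cobalt` and `make_hetatm` are hypothetical helpers, the cobalt atom is assumed to be named CO in your PDB file, the residue names are assumed to be the four codes used in this post, and the coordinates in the sample are dummies. The column positions follow the standard fixed-width PDB layout (atom name in columns 13-16, residue name in 18-20, element in 77-78).

```python
def rename_cobalt(pdb_text, residue_names=("B12", "BCN", "LYB", "LCB")):
    """Rename atom CO to FE on ATOM/HETATM lines of the given residues."""
    out = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 20:
            atom_name = line[12:16]           # columns 13-16
            res_name = line[17:20].strip()    # columns 18-20
            if atom_name.strip() == "CO" and res_name in residue_names:
                line = line[:12] + atom_name.replace("CO", "FE") + line[16:]
                # Update the element field (columns 77-78) when it is present.
                if len(line) >= 78 and line[76:78].strip().upper() == "CO":
                    line = line[:76] + "FE" + line[78:]
        out.append(line)
    return "\n".join(out) + "\n"


def make_hetatm(serial, name, res, element):
    """Build a fixed-width HETATM line with dummy coordinates for the demo."""
    return (f"HETATM{serial:5d} {name:<4s} {res:>3s} A{1:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{0.0:6.2f}"
            f"          {element:>2s}")


sample = "\n".join([
    make_hetatm(1, "CO", "B12", "CO"),   # the cobalt atom
    make_hetatm(2, "C10", "B12", "C"),   # a corrin carbon that must not change
])
converted = rename_cobalt(sample)
print(converted)
```

Restricting the swap to named residues and to an exact atom-name match keeps atoms such as C10 or O63 untouched; check the atom and residue names in your own structure file before relying on it.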

And that is the bare basics for getting a run to happen. A proper tutorial on how to generate force field parameters and topologies may be forthcoming, depending largely on interest and my ability to find time to do it.

Article citation: Damian G. Allis, Mol. BioSyst., 2010, DOI: 10.1039/c003476b

Damian G. Allis (1), Timothy J. Fairchild (2) and Robert P. Doyle (1)

1. Department of Chemistry, Syracuse University, Syracuse, NY 13244, USA
2. School of Chiropractic and Sports Science, Murdoch University, Murdoch, WA 6150, Australia

As part of ongoing research into the use of vitamin B12 (B12; cobalamin; Cbl)-based bioconjugate approaches for the oral delivery of peptides/proteins, a molecular dynamics (MD) study of the binding of a cyanocobalamin-insulin (CN-Cbl-insulin) conjugate to human transcobalamin(II) (TCII) was recently reported that provides a qualitative picture of how the human insulin protein in its open T-state geometry affects CN-Cbl binding to TCII. This initial analysis revealed that the B22-B30 segment of the insulin B-chain acts as a long tether that connects the larger combined insulin A/B region to CN-Cbl when this conjugation is performed at the CN-Cbl ribose 5-hydroxy position. The experimental support for this model of the binding interaction is provided by the consequences of the successful delivery of the CN-Cbl-insulin conjugate in the production of significantly decreased blood glucose levels in diabetic STZ-rat models. In efforts to provide a more detailed description of the (CN-Cbl)-TCII complex for modeling Cbl-based bioconjugate designs, the (CN-Cbl)-TCII system and a CN-Cbl conjugate incorporating a flexible tether composed of only the B22-B30 segment of human insulin have been examined by MD simulations. The implications of these simulations are discussed in terms of successful conjugate positioning on Cbl, especially when such sites are not apparent from the diffraction studies alone, and the possibilities, as yet not reported, for dual-tethered Cbl bioconjugates for multi-component drug delivery applications.

The Vibrational Spectrum Of Parabanic Acid By Inelastic Neutron Scattering Spectroscopy And Simulation By Solid-State DFT

Available as an ASAP in The Journal of Physical Chemistry A. As a general rule in computational chemistry, the smaller the molecule, the harder it is to get right. As a brief summary, parabanic acid has several interesting properties of significance to computational chemists as both a model for other systems containing similar sub-structures and as a complicated little molecule in its own right.

1. The solid-state spectrum requires solid-state modeling. This should be of no surprise (see the figure below for the difference in solid-state (top) and isolated-molecule (bottom)). This task was undertaken with both DMol3 and Crystal06, with DMol3 calculations responsible for the majority of the analysis of this system (as has always been the case in the neutron studies reported on this site).

2. The simulated hydrogen-bonded N-H…O vibrations are, starting from the crystal structure, in poor agreement with experiment. You'll note that the region between 750 and 900 cm-1 is a little too high (for clarification, the simulated spectrum is in red below). According to the kitchen sink that Matt threw at the structure, the problem is not the kind of anharmonicity one would invoke under what Dr. Walnut calls the "catalytic handwaving" approach to spectrum assignment (Dr. Walnut does not engage in this behavior, but rather endeavors to find it in others where it should not be).

3. The local geometry of the hydrogen-bonding network in this molecular solid leads to notable changes in parabanic acid structure that, in turn, lead to the different behavior of the N-H…O vibrational motions. There is one potentially inflammatory comment in the Conclusions section that results from this identification. The parabanic acid molecule is, at the sub-structure level, a set of three constrained peptide linkages that undergo subtle but vibrationally-observable changes to their geometry because of crystal packing and intermolecular hydrogen bond formation. This means that the isolated molecule and solid-state forms are different and that peptide groups are influenced by neighboring interactions.

So, why should one care? Suppose one is parameterizing a biomolecular force field (CHARMM, AMBER, GROMOS, etc.) using bond lengths, bond angles, etc., for the amino acid geometry and vibrational data for some aspect of the force constant analysis. The structural data for these force fields often originates with solid-state studies (diffraction results). This means, to those very concerned with structural accuracy, that a geometry we know to be influenced by solid-state interactions is being used as the basis for molecular dynamics calculations that will NOT be used in their solid-state forms. Coupled with the different spectral properties due to intermolecular interactions, the description being used as the basis for the biomolecular force field likely being used in solution (solvent box approaches) is based on data in a phase where the structure and dynamics are altered from their less conformationally-restricted counterpart (in this case, solid-state).

A subtle point, but that's where applied theoreticians do some of their best work.

Matthew R. Hudson, Damian G. Allis, and Bruce S. Hudson

Department of Chemistry, 1-014 Center for Science and Technology, Syracuse University, Syracuse, New York 13244-4100

Abstract: The incoherent inelastic neutron scattering spectrum of parabanic acid was measured and simulated using solid-state density functional theory (DFT). This molecule was previously the subject of low-temperature X-ray and neutron diffraction studies. While the simulated spectra from several density functionals account for relative intensities and factor group splitting regardless of functional choice, the hydrogen-bending vibrational energies for the out-of-plane modes are poorly described by all methods. The disagreement between calculated and observed out-of-plane hydrogen bending mode energies is examined along with geometry optimization differences of bond lengths, bond angles, and hydrogen-bonding interactions for different functionals. Neutron diffraction suggests nearly symmetric hydrogen atom positions in the crystalline solid for both heavy-atom and N-H bond distances but different hydrogen-bonding angles. The spectroscopic results suggest a significant factor group splitting for the out-of-plane bending motions associated with the hydrogen atoms (N-H) for both the symmetric and asymmetric bending modes, as is also supported by DFT simulations. The differences between the quality of the crystallographic and spectroscopic simulations by isolated-molecule DFT, cluster-based DFT (that account for only the hydrogen-bonding interactions around a single molecule), and solid-state DFT are considered in detail, with parabanic acid serving as an excellent case study due to its small size and the availability of high-quality structure data. These calculations show that hydrogen bonding results in a change in the bond distances and bond angles of parabanic acid from the free molecule values.

pubs.acs.org/doi/abs/10.1021/jp9114095