The Binding Of Vitamin B12 To Transcobalamin(II); Structural Considerations For Bioconjugate Design – A Molecular Dynamics Study

In press, in the journal Molecular BioSystems. This is a first official foray into molecular dynamics-only (MD-only) computational work, and I am pleased to report that the computational results not only make sense with respect to the experimental results, but also indicate a possible new way to use vitamin B12 for the oral delivery of bio-active molecules more complicated than the binary bioconjugates considered to date.

The Interesting Result

The conclusion from the previous study was that the insulin B Chain (figure below) acts as a tether to separate the structured region of insulin (the region with the largest inflexible steric bulk, see below) from the region of transcobalamin II (TCII) that binds vitamin B12. It was then determined that the approach employed for the B12-insulin bioconjugate, simply linking one biomolecule onto another with known binding and transport properties (this is a common theme in all bioconjugate design), worked because the last nine residues in the insulin B Chain (B22 to B30) are flexible in solution (they, in fact, cover the insulin binding region in the crystal form, then uncover this region in the biologically active form).

As a general procedure for B12 bioconjugate design, one of the key requirements for a functional product is a tether length that provides sufficient separation between B12 and any molecular structure large enough to affect B12 binding within its transport proteins (makes sense, as a tethered structure that does not enable B12 binding in its transport proteins will find the B12 bioconjugate delivered unprotected to the gut, where acids and digestive enzymes will degrade it). This leads to the question, "How long must a tether be to meet this rather general criterion?" This is only partly the correct question, as the retention of B12 binding within its transport proteins is a function of both proper tether length and [transport protein]-["other molecule"] interaction (in this first case, "other molecule" = insulin).

Saving the exhaustive analysis for the paper, this new study used this flexible region of human insulin (that is, B22 to B30, with the B12 linkage occurring on the B29 lysine side chain) as a proxy for any arbitrary tether, then used MD simulations to consider how the flexibility of this tether might lead to changes in B12 binding within its TCII pocket (the transport protein for which we have the best crystal structure). The result of these simulations was the identification of the side chain of lysine itself being just long enough to separate the B Chain tether region from the TCII protein surface. This does not mean that lysine will always serve as a perfect linkage. This means that, if the tether structure is effectively non-interacting with TCII (so not sterically demanding by itself), the lysine side chain is long enough to span the solvent-accessible hole produced by the encapsulation of B12 in (in this case) TCII.

The result is a design constraint when using lysine that is quite fortuitous! If the target peptide (insulin or whatnot) has a surface-accessible lysine side chain within a region that is flexible in solution, some simple amide chemistry may produce a viable B12 bioconjugate for delivering that peptide orally (thereby avoiding complete peptide degradation in the G.I. tract).

The More Interesting Result

Buried deep within the bottom of the Discussion section. If you watch the dynamics simulation of the TCII-[B12-tether] complex (shown below for a 300 K 50 ns simulation with 1.5 fs time steps in 14,000 waters (not shown)), you see that the binding of B12 within TCII and the geometry of the encapsulation complex are strongly linked. That is, TCII (and, presumably, its cohorts in the B12 transport pathway) can be thought of as two quite rigid fragments (Red and Blue in the animation) connected by a long tether (Green) that are separated in solution but brought into contact by the binding of vitamin B12 (Gold). The B12 is a glue that holds the fragments together, and a simple tabulation of hydrogen-bonding interactions in the crystal structure reveals that the B12 has more interactions with the A and B fragments of TCII individually than A and B have with each other (which is to say, the B12-A Segment interaction and B12-B Segment interaction are stronger than the A-B Segment interaction). From a biological perspective, this should make perfect sense. B12 is a large, extremely important biomolecule that, since we do not make it ourselves, is to be captured and transported as effectively as possible. The best way to bind this molecule is not to wait for it to burrow into a binding pocket, but rather to encapsulate it in a "clam shell" maneuver that provides "maximum embedding." The tether between the A and B Segments technically would not have to be present if the A and B fragments were present in large quantities (although, as you might expect, the A-B tether does considerably reduce the time to complete encapsulation by forcing these fragments within close proximity).

According to the crystal structure, the B12 is entirely embedded within TCII, with only the solvent-accessible hole at the 5'-ribose position readily accessible for bioconjugate formation. If the overall structure were as rigid as a crystal structure might lead one to believe, functionalization at the cobalt position in the corrin ring would be out of the question.

As I just stated that such a binding mode would otherwise be unlikely, you can guess that there are bio-active B12 bioconjugates linked at the cobalt position.

If you watch the dynamics simulation of the TCII-[B12-tether] complex, you see that the clam shell binding mode of TCII is one with a "loose hinge." This loose hinge is really a result of the flexibility of the two protein fragments (typical protein motion) and flexibility in the short propionamide side chains of vitamin B12 that provide a bit of "spring" in the complete complex. In effect, the flexibility within the structure provides a means for cobalt to be coordinated to something without loss of B12 binding provided that the tether linking the cobalt and the "other" molecule is small enough that it does not require a large change in the A-B binding arrangement (that is, does not affect B12-A and B12-B binding).

And Then There Were Three…

The expectation/prediction/untested hypothesis is that vitamin B12 may be able to happily accommodate two additional molecules at the 5'-ribose and cobalt positions (properly designed) that then provide for the transport of two molecules and/or the delivery of three molecules (one being vitamin B12). This opens the door to a wealth of possibilities, from trinary delivery to combined drug delivery + radiopharma characterization. This is the possibility I'm most interested in pursuing in the next rounds of calculations, with the theory (presumably) providing a very good initial guess about the ideal tether designs to use with B12 for enabling delivery and bio-activity.

And Now For The Hard Work

Stepping back from the theoretical analysis for a moment, the most difficult obstacles to overcome in this study were the generation AND incorporation of force field parameters for vitamin B12 and a B12-Lysine mini-bioconjugate into GROMACS, a problem that I've addressed only in passing in several previous posts. What I won't do in this post is explain the procedure (a single blog post will not do the procedure justice given the complexity of force field parameter generation). What I will do is provide the files for the topology for these systems and a short list of the modifications one needs to make in order to get these systems working. For additional reference, the same topology files are provided in the Supplemental Material for the paper (so, if you find yourself using these, obviously cite the paper and not my humble blog).

Files And Contents:

These are not files to be placed in a single directory, but segments of files that are to be pasted directly into pre-existing topology files. This is not the best way to do it, but it is the procedure I began with and will not be changing without finding a very simple how-to tutorial (which, if you have one, I'd be happy to read).

The contents of the topology file (which I assume for you will be ffG53a6 but should work generally) are provided below:

ffG53a6_B12_BCN_LYB_LCB_topology.txt

The topology specifications for vitamin B12 (nothing bound to the cobalt in the corrin ring), cyanocobalamin (CN-B12, with a cyanide bound to the cobalt), B12 with a lysine residue attached to the 5'-ribose hydroxyl position (the tether linkage for the GROMACS prep programs), and CN-B12 with a lysine residue attached to the 5'-ribose hydroxyl position.

I am assuming that you're using the ffG53a6 force field, meaning you add the topology sets to the bottom of the ffG53a6.rtp file.
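For those who would rather script the paste than do it by hand, here is a minimal sketch in Python. It assumes you have saved the topology text locally and that your ffG53a6.rtp lives wherever your GROMACS install keeps it. The helper name append_block and the paths in the usage comment are my own placeholders, not anything GROMACS provides. It makes a one-time backup and refuses to paste the same block twice.

```python
from pathlib import Path
import shutil


def append_block(target: Path, block: str) -> None:
    """Append `block` to the end of `target`, keeping a one-time .bak copy.

    Hypothetical helper: nothing here is GROMACS-specific, it only edits text.
    """
    backup = target.with_name(target.name + ".bak")
    if not backup.exists():
        shutil.copy2(target, backup)  # preserve the untouched original

    text = target.read_text()
    if block.strip() in text:
        return  # block already present, do not paste it a second time
    if not text.endswith("\n"):
        text += "\n"
    target.write_text(text + block.rstrip("\n") + "\n")


# Example use (paths are assumptions, adjust to your installation):
# rtp = Path("/path/to/your/gromacs/top/ffG53a6.rtp")
# block = Path("ffG53a6_B12_BCN_LYB_LCB_topology.txt").read_text()
# append_block(rtp, block)
```

The same helper should work for any of the other "paste this at the bottom of the file" steps, since it only deals with plain text.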

GROMACS Modifications:

GROMACS force field and topology files must be modified slightly in order to read the topologies generated above and, depending on where you got the B12 structure, add/correct the hydrogen atoms in the B12 molecule.

In a typical UNIX/Linux installation (which I have provided compilation instructions for in a previous post), the files to be modified can be found in /usr/local/gromacs. And, if you're using Ubuntu like I am, you'll need to "sudo" these modifications.

1. aminoacids.dat

If you open this file, you see a list of three- and four-letter codes in the format:

50
ABU
ACE
...
VAL
PGLU

The "50" refers to the number of codes. As we're going to be adding the codes B12, BCN, LYB, and LCB into GROMACS, we first change 50 to 54, then just list the four codes at the bottom of the file:

54
ABU
ACE
...
VAL
PGLU
B12
BCN
LYB
LCB

You'll note that B12 and BCN aren't like the others, LYB is not LYS, and LCB is also nowhere to be seen. The codes already in this file are STANDARD, so make sure you don't inadvertently give your inserted structure a name that is already in the list.
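The count-plus-codes edit above is easy to get wrong by one, so here is a small sketch of the same change done programmatically. It assumes the file really is just a count on the first line followed by one code per line, as shown above. The function name add_residue_codes is hypothetical. It rejects any new code that clashes with an existing one and recomputes the count instead of trusting arithmetic done by hand.

```python
def add_residue_codes(text: str, new_codes: list[str]) -> str:
    """Return aminoacids.dat-style text with `new_codes` appended.

    Assumed layout: first non-blank line is the number of codes,
    every following non-blank line is one residue code.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    declared = int(lines[0])
    codes = lines[1:]
    if declared != len(codes):
        raise ValueError(f"header says {declared} codes, found {len(codes)}")

    # Refuse to reuse a name that is already taken by a standard residue.
    clashes = sorted(set(new_codes) & set(codes))
    if clashes:
        raise ValueError(f"codes already present: {clashes}")

    codes.extend(new_codes)
    return "\n".join([str(len(codes)), *codes]) + "\n"


# Example use on a shortened, made-up file body:
# add_residue_codes("3\nABU\nACE\nVAL\n", ["B12", "BCN", "LYB", "LCB"])
```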

2. ffG53a6.hdb

I specifically used the ffG53a6 force field for the TCII-B12 work, meaning I only made modifications to these force field files. The ffG53a6.hdb file is responsible for adding/correcting hydrogen atoms in your structure (just because the crystallographers do not see them does not mean they aren't there) and contains hydrogen-beautification information for all of the three/four-letter codes recognized in aminoacids.dat. The content below is the hydrogen-correcting data for the B12, BCN, LYB, and LCB structures. Simply paste this into the bottom of the ffG53a6.hdb file.

B12     19
1    2    HAO    N62    C61    O63
1    2    HAN    N62    C61    C60
1    2    HAM    N52    C50    O51
1    2    HAL    N52    C50    C49
1    2    HAK    N45    C43    O44
1    2    HAJ    N45    C43    C42
1    2    HAI    N40    C38    O39
1    2    HAH    N40    C38    C37
1    2    HAE    N29    C27    O28
1    2    HAD    N29    C27    C26
1    2    HAG    N33    C32    O34
1    2    HAF    N33    C32    C31
1    2    HAA    O7R    C2R    C1R
1    2    HAB    O8R    C5R    C4R
1    2    HAC    N59    C57    O58
1    1    H2B    C2B    N1B    N3B 
1    1    H4B    C4B    C5B    C9B
1    1    H7B    C7B    C8B    C6B
1    1    H10    C10    C9     C11
LYB     20       
1    1    H      N      -C     CA
1    4    HZ1    NZ     CE     CD
1    2    HAO    N62    C61    O63
1    2    HAN    N62    C61    C60
1    2    HAM    N52    C50    O51
1    2    HAL    N52    C50    C49
1    2    HAK    N45    C43    O44
1    2    HAJ    N45    C43    C42
1    2    HAI    N40    C38    O39
1    2    HAH    N40    C38    C37
1    2    HAE    N29    C27    O28
1    2    HAD    N29    C27    C26
1    2    HAG    N33    C32    O34
1    2    HAF    N33    C32    C31
1    2    HAA    O7R    C2R    C1R
1    2    HAC    N59    C57    O58
1    1    H2B    C2B    N1B    N3B
1    1    H4B    C4B    C5B    C9B
1    1    H7B    C7B    C8B    C6B
1    1    H10    C10    C9     C11
BCN     19
1    2    HAO    N62    C61    O63
1    2    HAN    N62    C61    C60
1    2    HAM    N52    C50    O51
1    2    HAL    N52    C50    C49
1    2    HAK    N45    C43    O44
1    2    HAJ    N45    C43    C42
1    2    HAI    N40    C38    O39
1    2    HAH    N40    C38    C37
1    2    HAE    N29    C27    O28
1    2    HAD    N29    C27    C26
1    2    HAG    N33    C32    O34
1    2    HAF    N33    C32    C31
1    2    HAA    O7R    C2R    C1R
1    2    HAB    O8R    C5R    C4R
1    2    HAC    N59    C57    O58
1    1    H2B    C2B    N1B    N3B 
1    1    H4B    C4B    C5B    C9B
1    1    H7B    C7B    C8B    C6B
1    1    H10    C10    C9     C11
LCB     20       
1    1    H      N      -C     CA
1    4    HZ1    NZ     CE     CD
1    2    HAO    N62    C61    O63
1    2    HAN    N62    C61    C60
1    2    HAM    N52    C50    O51
1    2    HAL    N52    C50    C49
1    2    HAK    N45    C43    O44
1    2    HAJ    N45    C43    C42
1    2    HAI    N40    C38    O39
1    2    HAH    N40    C38    C37
1    2    HAE    N29    C27    O28
1    2    HAD    N29    C27    C26
1    2    HAG    N33    C32    O34
1    2    HAF    N33    C32    C31
1    2    HAA    O7R    C2R    C1R
1    2    HAC    N59    C57    O58
1    1    H2B    C2B    N1B    N3B
1    1    H4B    C4B    C5B    C9B
1    1    H7B    C7B    C8B    C6B
1    1    H10    C10    C9     C11

As a brief explanation, the three-letter code is followed by the number of hydrogen-addition lines listed beneath it (here, one hydrogen atom per line). Each line can be read:

First Column – The number of hydrogen atoms added (so all of these entries on the far left mean "add ONE hydrogen")

Second Column – The manner by which the hydrogen atom is to be added (this is listed in section 5.5 of the GROMACS 3.3 Manual (page 93))

Third Column – The name of the Hydrogen atom to be added

Fourth Column – The atom to which the H is going to be directly linked in the topology file

Fifth to Seventh Columns – The atoms that define how the Hydrogen is added with respect to (1) the code in Column 2 and (2) the atom to which the Hydrogen is added.
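To make the column layout concrete, here is a short parser sketch that reads blocks in the format described above and checks that the number after each residue code matches the number of lines that follow it. The class and function names (HydrogenRule, parse_hdb) are mine, and the sketch only understands the whitespace-separated layout shown in this post. It is not a full reader for every .hdb variant.

```python
from dataclasses import dataclass


@dataclass
class HydrogenRule:
    n_hydrogens: int       # column 1: how many hydrogens to add
    method: int            # column 2: how the hydrogen is placed
    h_name: str            # column 3: name of the new hydrogen
    bonded_to: str         # column 4: atom the hydrogen is attached to
    control_atoms: tuple   # remaining columns: atoms that fix the geometry


def parse_hdb(text: str) -> dict:
    """Parse 'CODE count' headers followed by `count` rule lines."""
    rows = [ln.split() for ln in text.splitlines() if ln.strip()]
    entries = {}
    i = 0
    while i < len(rows):
        header = rows[i]
        if len(header) != 2:
            raise ValueError(f"expected 'CODE count', got: {' '.join(header)}")
        name, declared = header[0], int(header[1])
        rules = [
            HydrogenRule(int(f[0]), int(f[1]), f[2], f[3], tuple(f[4:]))
            for f in rows[i + 1 : i + 1 + declared]
        ]
        if len(rules) != declared:
            raise ValueError(f"{name}: header says {declared}, found {len(rules)}")
        entries[name] = rules
        i += 1 + declared
    return entries
```

Running the pasted block through something like this before touching ffG53a6.hdb catches the most likely copy-and-paste mistake, which is a header count that no longer matches the lines beneath it.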

3. ffG53a6bon.itp

There are a few subtle tweaks to the force constants for a few bonds that I perform here right within the file and that proper MD people likely would scream at. I note that, when you do this, you are making changes to numbers that will affect the results if you somehow start doing heme MD simulations.

Change the gb_NN values to those provided below.

#define gb_34        0.198  0.6400e+06
; NR  -   FE    120
#define gb_4         0.1142  3.7000e+07
; C - O (CO in heme)  2220
#define gb_14       0.1340  1.1000e+07
; C  -  NR (heme)       1000
#define gb_30       0.1880  2.7200e+06
; FE  -  C (Heme)
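If you would rather not hand-edit the force constants, the following sketch rewrites only the named gb_NN lines and leaves every other line (including the comment lines) alone. It assumes each definition sits on a single line of the form "#define gb_NN length force-constant". The function name update_bond_defines is hypothetical, and the values are kept as strings so the formatting you type is the formatting that gets written.

```python
import re

# Matches "#define gb_<number> <b0> <kb>" at the start of a line.
DEFINE_RE = re.compile(r"^#define\s+(gb_\d+)\s+\S+\s+\S+")


def update_bond_defines(text: str, new_values: dict) -> str:
    """Replace the two numbers on selected '#define gb_NN' lines.

    `new_values` maps a name such as 'gb_34' to a (b0, kb) pair of strings.
    Raises KeyError if a requested name is not found in `text`.
    """
    out, seen = [], set()
    for line in text.splitlines():
        match = DEFINE_RE.match(line)
        if match and match.group(1) in new_values:
            name = match.group(1)
            b0, kb = new_values[name]
            line = f"#define {name:<12}{b0}  {kb}"
            seen.add(name)
        out.append(line)
    missing = sorted(set(new_values) - seen)
    if missing:
        raise KeyError(f"not found in file: {missing}")
    return "\n".join(out) + "\n"


# The four changes listed above, as (b0, kb) string pairs:
B12_BOND_TWEAKS = {
    "gb_34": ("0.198", "0.6400e+06"),
    "gb_4": ("0.1142", "3.7000e+07"),
    "gb_14": ("0.1340", "1.1000e+07"),
    "gb_30": ("0.1880", "2.7200e+06"),
}
```

Keeping the tweaks in one dictionary also gives you a record of exactly what was changed, which is handy if you later need the stock values back for heme work.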

You will note that I have not done anything to make cobalt appear in the topology or force field files. For the sake of running a simulation, Fe and Co are close enough that simply replacing CO with FE in the PDB file is sufficient. If you want to do the completely proper job, you can add cobalt to the force field to get the mass right.
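The CO-to-FE swap can be done in a text editor, but PDB files are fixed-column, so it is easy to shift something by accident. Here is a hedged sketch that touches only the atom-name field (columns 13 to 16) and, when present, the element field (columns 77 to 78) of HETATM records. It assumes the cobalt atom is named CO in your PDB file and that FE is the name your topology expects. Check both against your own files. The function name cobalt_to_iron is made up for this example.

```python
def cobalt_to_iron(pdb_text: str) -> str:
    """Rename cobalt HETATM records (atom name 'CO') to 'FE'.

    Only lines that start with HETATM, carry the atom name CO, and have
    either no element entry or the element CO are changed. Line length
    and all other columns are left untouched.
    """
    out = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and line[12:16].strip() == "CO":
            element = line[76:78].strip().upper()
            if element in ("", "CO"):
                # Atom name field, columns 13-16 (0-based slice 12:16).
                line = line[:12] + "FE  " + line[16:]
                if element:
                    # Element field, columns 77-78 (0-based slice 76:78).
                    line = line[:76] + "FE" + line[78:]
        out.append(line)
    return "\n".join(out) + "\n"
```

Checking the element column guards against renaming some other atom that happens to be called CO, though files without element columns fall back to the atom name alone.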

And that is the bare basics for getting a run to happen. A proper tutorial on how to generate force field parameters and topologies may be forthcoming, depending largely on interest and my ability to find time to do it.

Article citation: Damian G. Allis, Mol. BioSyst., 2010, DOI: 10.1039/c003476b

Damian G. Allis (1), Timothy J. Fairchild (2) and Robert P. Doyle (1)

1. Department of Chemistry, Syracuse University, Syracuse, NY 13244, USA
2. School of Chiropractic and Sports Science, Murdoch University, Murdoch, WA 6150, Australia

As part of ongoing research into the use of vitamin B12 (B12; cobalamin; Cbl)-based bioconjugate approaches for the oral delivery of peptides/proteins, a molecular dynamics (MD) study of the binding of a cyanocobalamin-insulin (CN-Cbl-insulin) conjugate to human transcobalamin(II) (TCII) was recently reported that provides a qualitative picture of how the human insulin protein in its open T-state geometry affects CN-Cbl binding to TCII. This initial analysis revealed that the B22-B30 segment of the insulin B-chain acts as a long tether that connects the larger combined insulin A/B region to CN-Cbl when this conjugation is performed at the CN-Cbl ribose 5-hydroxy position. The experimental support for this model of the binding interaction is provided by the consequences of the successful delivery of the CN-Cbl-insulin conjugate in the production of significantly decreased blood glucose levels in diabetic STZ-rat models. In efforts to provide a more detailed description of the (CN-Cbl)-TCII complex for modeling Cbl-based bioconjugate designs, the (CN-Cbl)-TCII system and a CN-Cbl conjugate incorporating a flexible tether composed of only the B22-B30 segment of human insulin have been examined by MD simulations. The implications of these simulations are discussed in terms of successful conjugate positioning on Cbl, especially when such sites are not apparent from the diffraction studies alone, and the possibilities, as yet not reported, for dual-tethered Cbl bioconjugates for multi-component drug delivery applications.

New B12-Insulin-TCII-Insulin Receptor Cover Image For This Month's ChemMedChem (March 2009)

As was the case for the first ChemMedChem cover in December 2007 (posted previously), the cover story in this month's issue is a communication by members and collaborators of the Robert Doyle Group here at Syracuse University, myself included. In this case, the work for the cover image actually went into computational research published in the associated article (instead of being just a pretty cover image to complement the associated article, which was the intent of the previous cover).

The image below shows the Transcobalamin II (TCII) protein (teal ribbons) with a bound cyanocobalamin (B12, red; the PDB code for this complex is 2BB5) sitting within the surface-accessible fragment of the gigantic insulin receptor (PDB code 2DTG; the cell membrane would be at the bottom of this image, with the remainder of the complete protein sitting within the cell membrane and extending into the cytoplasm). Saving the lead-up to this structure generation for the associated published article, this image was created to show one of the most important steps in the Oral Insulin project being worked on in the Doyle Group, with the fact that we know it works making the validity of the image content all the more relevant. In brief, this figure shows that the TCII/B12-Insulin complex can fit within the insulin receptor (IR) such that the insulin molecule can reach its binding position on the receptor, thereby instigating the cascade of events that leads to cellular glucose uptake.

Like many of the protein structures I render, this image would not have been possible without VMD and MegaPOV, my favorite OSX POV-Ray variant (there's quite a bit of Photoshop layering as well).  The final layout for the cover is below, which I think would have benefited from the aerial view on the upper left side being shifted slightly to the left to fill out the black square.

According to the ChemMedChem website:

The cover picture shows three views of a vitamin B12-insulin conjugate bound to transcobalamin II, docked in the insulin receptor (IR). This study reveals how the structure of an orally deliverable insulin changes in solution after vitamin B12 conjugation and its effect on IR binding capacity. The results demonstrate that chemical modification of insulin by linking relatively large pendant groups does not interfere with IR recognition. For more details, see the Full Paper by T. J. Fairchild, R. P. Doyle, et al. on p. 421 ff.

To date, the associated work has received some additional linkage, both in the form of inclusion in the Spotlight list in Angew. Chem. Int. Ed. 2009, 48, 2072 – 2073 and, for those looking for a more pop-sci discussion of the applications of the research, New Scientist (Insulin Chewing Gum, 14 January 2009).  PDFs of the associated content are provided here for Angewandte Chemie and New Scientist.

There is a considerable amount of additional computational work being done on this system and the complete B12 pathway for potential use in various other applications.  Stay tuned for next year's cover.

Exploring the Implications of Vitamin B12 Conjugation to Insulin on Insulin Receptor Binding and Cellular Uptake

In press, in the journal ChemMedChem (and, because I think it's hip, I note that the current "obligatory" image for the wikipedia article for ChemMedChem features the image I made for the review article on the topic addressed in this new study). As with many theory papers (there's some experiment in there, too), this very brief article summarizes several months of cyanocobalamin (B12) parameterization and molecular dynamics (MD) simulations. The purpose of the theory was to address all of the major structural snapshots in the uptake process associated with the insulin-B12 bioconjugate being developed as part of the much heralded oral insulin project in Robert Doyle's group here at Syracuse. These structures include:

1. The structure and dynamic properties of the insulin-B12 bioconjugate
2. The binding of B12 to Transcobalamin II (TCII) (for B12 parameterization)
3. The binding of the insulin-B12 bioconjugate to TCII (and the steric demands therein)
4. The interaction of the insulin-B12 bioconjugate, bound to TCII, with the insulin Receptor (IR)

The quantum chemical (for the B12 geometry and missing force constants) and molecular dynamics (GROMACS with the GROMOS96 53a6 force field) simulation work is going to serve as the basis for several posts here (eventually) about parameterization, topology generation, and force field development.

As an example of some of the insights modeling provides, the figure above shows the insulin-B12 bioconjugate (the insulin is divided into A and B chains, the A chain in blue and the important division of the insulin B chain in the front half of the rainbow). Insulin is a rather large-scale example of many of the same molecular issues that arise in the analysis of solid-state molecular crystals by either terahertz or inelastic neutron scattering spectroscopy. The packing of molecules in their crystal lattices can lead to significant changes in molecular geometry, be these changes in the stabilization of higher-energy molecular conformations or even deformations in the covalent framework. In the case of insulin, it is found that the crystal geometry (also the geometry of stored insulin in the body) is quite different from the solution-phase form. It's even worse! The B chain end (B20-B30) in the solid-state geometry covers (protects?) the business-end of the insulin binding region to the Insulin Receptor. One can imagine the difficulty in proposing the original binding model for insulin to its receptor from the original crystal data given that the actual binding region is blocked off in the solid-state form! The "Extended" form in the figure is representative of "multiple other" conformations of the B20-B30 region (which mimics the characterized T-state of insulin), those geometries for which the insulin binding region (blue and green) is completely exposed. This extended geometry is also the one that separates the bulk of the insulin structure from the covalently-linked B12 (at Lys29) and, it is argued from the MD simulations in the paper, enables the B12 to still tightly bind to TCII despite the presence of all this steric bulk.

Amanda K. Petrus (1), Damian G. Allis (1), Robert P. Smith (2), Timothy J. Fairchild (3) and Robert P. Doyle (1)

1. Department of Chemistry, Syracuse University, Syracuse, NY 13244, USA
2. Department of Construction Management and Wood Products Engineering, SUNY, College of Environmental Science and Forestry, Syracuse, NY 13210, USA
3. Department of Exercise Science, Syracuse University, Syracuse, NY 13244, USA

Extract: We recently reported a vitamin B12 (B12) based insulin conjugate that produced significantly decreased blood glucose levels in diabetic STZ-rat models. The results of this study posed a fundamental question, namely what implications does B12 conjugation have on insulin's interaction with its receptor? To explore this question we used a combination of molecular dynamics (MD) simulations and immuno-electron microscopy (IEM).
