This tutorial is from this website.
Download 3HTB from the RCSB website.
View the structure in VMD.
cd ../gromacs-work/gromacs-prot-ligand-tutorial
mol new 3htb.pdb
mol modstyle 0 0 licorice
A few of the amino acids have 2 conformers that are present in the structure.
awk '!(/HETATM/ && /HOH/ || /HETATM/ && /BME/ || /HETATM/ && /PO4/)' 3htb.pdb > 3htb-clean.pdb
# && is AND, and || is OR
# ! shows the lines that do not have the pattern
The tutorial-supplied file is 3htb_clean.pdb
. For fun, run:
git diff --no-index --word-diff 3htb-clean.pdb 3htb_clean.pdb
Because I used the more specific awk
command instead of multiple grep
commands, I have a few lines in the 3htb-clean.pdb
file that are missing from the 3htb_clean.pdb
file. Also the 3htb_clean.pdb
file is missing all the CONECT
entries. I’ll remove them from 3htb-clean.pdb
:
awk '!/CONECT/' 3htb-clean.pdb > 3htb-tmp.pdb
mv -i 3htb-tmp.pdb 3htb-clean.pdb
Nice!
JZ4
coordinates into its own file.grep JZ4 3HTB-clean.pdb > jz4.pdb
awk '!(/HETATM/ && /JZ4/)' 3HTB-clean.pdb > 3htb-tmp.pdb
mv -i 3htb-tmp.pdb 3htb-clean.pdb
python
you are using with python --version
.CHARMM36
force field file from the MacKerell lab website.cgenff_charmm2gmx.py
script from the same website.tar -zxvf charmm36-feb2021.ff.tgz
rm charmm36-feb2021.ff.tgz
pdb2gmx
:gmx pdb2gmx -f 3htb-clean.pdb -o 3htb_processed.gro
# choose the CHARMM36 force field (option 1)
# choose the default water model (TIP 3-point, recommended, by default uses CHARMM TIP3 with LJ on H)
.mol2
file.@<TRIPOS>MOLECULE
heading, change *****
to JZ4
.@<TRIPOS>ATOM
1 C4 24.2940 -24.1240 -0.0710 C.3 167 JZ4167 -0.0650
2 C7 21.5530 -27.2140 -4.1120 C.ar 167 JZ4167 -0.0613
3 C8 22.0680 -26.7470 -5.3310 C.ar 167 JZ4167 -0.0583
4 C9 22.6710 -25.5120 -5.4480 C.ar 167 JZ4167 -0.0199
5 C10 22.7690 -24.7300 -4.2950 C.ar 167 JZ4167 0.1200
6 C11 21.6930 -26.4590 -2.9540 C.ar 167 JZ4167 -0.0551
7 C12 22.2940 -25.1870 -3.0750 C.ar 167 JZ4167 -0.0060
8 C13 22.4630 -24.4140 -1.8080 C.3 167 JZ4167 -0.0245
9 C14 23.9250 -24.7040 -1.3940 C.3 167 JZ4167 -0.0518
10 OAB 23.4120 -23.5360 -4.3420 O.3 167 JZ4167 -0.5065
11 H 25.3133 -24.3619 0.1509 H 1 UNL1 0.0230
12 H 23.6591 -24.5327 0.6872 H 1 UNL1 0.0230
13 H 24.1744 -23.0611 -0.1016 H 1 UNL1 0.0230
14 H 21.0673 -28.1238 -4.0754 H 1 UNL1 0.0618
15 H 21.9931 -27.3472 -6.1672 H 1 UNL1 0.0619
16 H 23.0361 -25.1783 -6.3537 H 1 UNL1 0.0654
17 H 21.3701 -26.8143 -2.0405 H 1 UNL1 0.0621
18 H 21.7794 -24.7551 -1.0588 H 1 UNL1 0.0314
19 H 22.2659 -23.3694 -1.9301 H 1 UNL1 0.0314
20 H 24.5755 -24.2929 -2.1375 H 1 UNL1 0.0266
21 H 24.0241 -25.7662 -1.3110 H 1 UNL1 0.0266
22 H 23.7394 -23.2120 -5.1580 H 1 UNL1 0.2921
to
@<TRIPOS>ATOM
1 C4 24.2940 -24.1240 -0.0710 C.3 1 JZ4 -0.0650
2 C7 21.5530 -27.2140 -4.1120 C.ar 1 JZ4 -0.0613
3 C8 22.0680 -26.7470 -5.3310 C.ar 1 JZ4 -0.0583
4 C9 22.6710 -25.5120 -5.4480 C.ar 1 JZ4 -0.0199
5 C10 22.7690 -24.7300 -4.2950 C.ar 1 JZ4 0.1200
6 C11 21.6930 -26.4590 -2.9540 C.ar 1 JZ4 -0.0551
7 C12 22.2940 -25.1870 -3.0750 C.ar 1 JZ4 -0.0060
8 C13 22.4630 -24.4140 -1.8080 C.3 1 JZ4 -0.0245
9 C14 23.9250 -24.7040 -1.3940 C.3 1 JZ4 -0.0518
10 OAB 23.4120 -23.5360 -4.3420 O.3 1 JZ4 -0.5065
11 H 25.3133 -24.3619 0.1509 H 1 JZ4 0.0230
12 H 23.6591 -24.5327 0.6872 H 1 JZ4 0.0230
13 H 24.1744 -23.0611 -0.1016 H 1 JZ4 0.0230
14 H 21.0673 -28.1238 -4.0754 H 1 JZ4 0.0618
15 H 21.9931 -27.3472 -6.1672 H 1 JZ4 0.0619
16 H 23.0361 -25.1783 -6.3537 H 1 JZ4 0.0654
17 H 21.3701 -26.8143 -2.0405 H 1 JZ4 0.0621
18 H 21.7794 -24.7551 -1.0588 H 1 JZ4 0.0314
19 H 22.2659 -23.3694 -1.9301 H 1 JZ4 0.0314
20 H 24.5755 -24.2929 -2.1375 H 1 JZ4 0.0266
21 H 24.0241 -25.7662 -1.3110 H 1 JZ4 0.0266
22 H 23.7394 -23.2120 -5.1580 H 1 JZ4 0.2921
sort_mol2_bonds.pl
script.perl sort_mol2_bonds.pl jz4.mol2 jz4_fix.mol2
jz4_fix.mol2
file for the next step.Choose File
button and choose the jz4_fix.mol2
file. Then click the Upload File
button.git diff --no-index --word-diff jz4_fix.str jz4.str
The tutorial file has atoms labeled C1, C2, etc.
whereas the file I generated has atoms labeled C4, C5, etc.
but none labeled C1 etc.
. This is due to how CGenFF
processed the file, because the jz4_fix.mol2
file I generated was the same as the one downloaded from the tutorial site.
All of my penalty scores were less than 10, so the file should be good to go.
cgenff_charmm2gmx_py3_nx2.py
script to convert the topology from CHARMM
to GROMACS
format.python cgenff_charmm2gmx_py3_nx2.py JZ4 jz4_fix.mol2 jz4_fix.str charmm36-feb2021.ff
# I got this error:
# File "cgenff_charmm2gmx_py3_nx2.py", line 53, in <module>
# import numpy as np
# ModuleNotFoundError: No module named 'numpy'
Try to install numpy
.
python3 -m pip install numpy
Looks like it installed okay. Try script again:
python cgenff_charmm2gmx_py3_nx2.py JZ4 jz4_fix.mol2 jz4_fix.str charmm36-feb2021.ff
# got this error:
# ModuleNotFoundError: No module named 'networkx'
# try to fix:
python3 -m pip install networkx
# try again:
python cgenff_charmm2gmx_py3_nx2.py JZ4 jz4_fix.mol2 jz4_fix.str charmm36-feb2021.ff
# Got this error:
# File "cgenff_charmm2gmx_py3_nx2.py", line 991, in <module>
# if(float(nx.__version__) < 2.0):
# ValueError: could not convert string to float: '2.6.3'
# try to use the other script:
python cgenff_charmm2gmx_py3_nx1.py JZ4 jz4_fix.mol2 jz4_fix.str charmm36-feb2021.ff
python3 -m pip uninstall networkx
python3 -m pip install networkx==2.3
python cgenff_charmm2gmx_py3_nx2.py JZ4 jz4_fix.mol2 jz4_fix.str charmm36-feb2021.ff
I got the error:
NOTE 3: Please be sure to use the same version of CGenFF in your simulations that was used during parameter generation:
--Version of CGenFF detected in jz4_fix.str : 4.5
--Version of CGenFF detected in charmm36-feb2021.ff/forcefield.doc : 4.4
WARNING: CGenFF versions are not equivalent!
From the web:
Don’t worry about this difference. The web server needs to be updated, but not in any way that affects the end user.
—Justin A. Lemkul, Ph.D.
jz4_ini.pdb
to .gro
format with editconf
:gmx editconf -f jz4_ini.pdb -o jz4.gro
3htb_processed.gro
to a new file to be manipulated, for instance, call it complex.gro
.cp 3htb_processed.gro complex.gro
jz4.gro
and paste it into complex.gro
, below the last line of the protein atoms, and before the box vectors, like so: 163ASN HD22 2611 0.944 -0.584 -0.525
163ASN C 2612 0.621 -0.740 -0.126
163ASN OT1 2613 0.624 -0.616 -0.140
163ASN OT2 2614 0.683 -0.703 -0.011
1JZ4 C4 1 2.429 -2.412 -0.007
1JZ4 C7 2 2.155 -2.721 -0.411
1JZ4 C8 3 2.207 -2.675 -0.533
1JZ4 C9 4 2.267 -2.551 -0.545
1JZ4 C10 5 2.277 -2.473 -0.430
1JZ4 C11 6 2.169 -2.646 -0.295
1JZ4 C12 7 2.229 -2.519 -0.308
1JZ4 C13 8 2.246 -2.441 -0.181
1JZ4 C14 9 2.392 -2.470 -0.139
1JZ4 OAB 10 2.341 -2.354 -0.434
1JZ4 H1 11 2.531 -2.436 0.015
1JZ4 H2 12 2.366 -2.453 0.069
1JZ4 H3 13 2.417 -2.306 -0.010
1JZ4 H4 14 2.107 -2.812 -0.407
1JZ4 H5 15 2.199 -2.735 -0.617
1JZ4 H6 16 2.304 -2.518 -0.635
1JZ4 H7 17 2.137 -2.681 -0.204
1JZ4 H8 18 2.178 -2.476 -0.106
1JZ4 H9 19 2.227 -2.337 -0.193
1JZ4 H10 20 2.458 -2.429 -0.214
1JZ4 H11 21 2.402 -2.577 -0.131
1JZ4 H12 22 2.374 -2.321 -0.516
5.99500 5.19182 9.66100 0.00000 0.00000 -2.99750 0.00000 0.00000 0.00000
.gro
file, increment the second line of complex.gro
to reflect this change. There should be 2636 atoms in the coordinate file now.#include "jz4.itp"
into topol.top
after the position restraint file is included, like so:; Include Position restraint file
#ifdef POSRES
#include "posre.itp"
#endif
; Include water topology
#include "./charmm36-feb2021.ff/tip3p.itp"
Becomes …
; Include Position restraint file
#ifdef POSRES
#include "posre.itp"
#endif
; Include ligand topology
#include "jz4.itp"
; Include water topology
#include "./charmm36-feb2021.ff/tip3p.itp"
#include
statement to add the jz4.prm
parameters at the TOP of topol.top
, like so:; Include forcefield parameters
#include "./charmm36-feb2021.ff/forcefield.itp"
[ moleculetype ]
; Name nrexcl
Protein_chain_A 3
Becomes
; Include forcefield parameters
#include "./charmm36-feb2021.ff/forcefield.itp"
; Include ligand parameters
#include "jz4.prm"
[ moleculetype ]
; Name nrexcl
Protein_chain_A 3
It is important to place this statement before any [ moleculetype ]
entry. It must also appear AFTER the #include
statement for the parent force field.
[ molecules ]
directive, like so:[ molecules ]
; Compound #mols
Protein_chain_A 1
JZ4 1
gmx editconf -f complex.gro -o newbox.gro -bt dodecahedron -d 1.0
gmx solvate -cp newbox.gro -cs spc216.gro -p topol.top -o solv.gro
grompp
to assemble a .tpr
file, using any .mdp
file. I downloaded the ions.mdp
file from this website.gmx grompp -f ions.mdp -c solv.gro -p topol.top -o ions.tpr
.tpr
file to genion
:gmx genion -s ions.tpr -o solv_ions.gro -p topol.top -pname NA -nname CL -neutral
# when prompted, choose group 15:
# Group 15 ( SOL) has 30882 elements
The names of the ions specified with -pname
and -nname
are always the elemental symbol in all capital letters. Refer to ions.itp
for proper nomenclature if you encounter difficulties.
topol.top
file. Your [ molecules ]
directive should now look like:[ molecules ]
; Compound #mols
Protein_chain_A 1
JZ4 1
SOL 10288
CL 6
grompp
using the em.mdp
input parameter file from this website:gmx grompp -f em.mdp -c solv_ions.gro -p topol.top -o em.tpr
mdrun
to carry out the energy minimization (EM):gmx mdrun -v -deffnm em
For the tutorial author, the system converged relatively quickly:
Steepest Descents converged to Fmax < 1000 in 143 steps
Potential Energy = -4.9014547e+05
Maximum force = 8.7411469e+02 on atom 27
Norm of force = 5.6676244e+01
My system also converged relatively quickly:
Steepest Descents converged to Fmax < 1000 in 156 steps
Potential Energy = -4.9085631e+05
Maximum force = 8.8324829e+02 on atom 2045
Norm of force = 5.6558131e+01
gmx make_ndx -f jz4.gro -o index_jz4.ndx
...
> 0 & ! a H*
> q
genrestr
module and select this newly created index group (which will be group 3 in the index_jz4.ndx
file):gmx genrestr -f jz4.gro -n index_jz4.ndx -o posre_jz4.itp -fc 1000 1000 1000
topol.top
) in the location indicated (Location matters!):; Include Position restraint file
#ifdef POSRES
#include "posre.itp"
#endif
; Include ligand topology
#include "jz4.itp"
; Ligand position restraints
#ifdef POSRES
#include "posre_jz4.itp"
#endif
; Include water topology
#include "./charmm36-feb2021.ff/tip3p.itp"
Important: Do not couple every single species in your system separately.
Group JZ4 with the protein for the purposes of temperature coupling. In the same way, the few Cl- ions we inserted are considered part of the solvent. We need a special index group that merges the protein and JZ4. We accomplish this with the make_ndx
module, supplying any coordinate file of the complete system. Here, we use em.gro
, the output (minimized) structure of our system:
gmx make_ndx -f em.gro -o index.ndx
…
0 System : 33506 atoms
1 Protein : 2614 atoms
2 Protein-H : 1301 atoms
3 C-alpha : 163 atoms
4 Backbone : 489 atoms
5 MainChain : 651 atoms
6 MainChain+Cb : 803 atoms
7 MainChain+H : 813 atoms
8 SideChain : 1801 atoms
9 SideChain-H : 650 atoms
10 Prot-Masses : 2614 atoms
11 non-Protein : 30892 atoms
12 Other : 22 atoms
13 JZ4 : 22 atoms
14 CL : 6 atoms
15 Water : 30864 atoms
16 SOL : 30864 atoms
17 non-Water : 2642 atoms
18 Ion : 6 atoms
19 JZ4 : 22 atoms
20 CL : 6 atoms
21 Water_and_ions : 30870 atoms
>
indicates the make_ndx
prompt:> 1 | 13
Copied index group 1 'Protein'
Copied index group 13 'JZ4'
Merged two groups with OR: 2614 22 -> 2636
22 Protein_JZ4 : 2636 atoms
> q
We can now set tc-grps = Protein_JZ4 Water_and_ions
to achieve our desired “Protein Non-Protein” effect.
nvt.mdp
file from this website.gmx grompp -f nvt.mdp -c em.gro -r em.gro -p topol.top -n index.ndx -o nvt.tpr
gmx mdrun -deffnm nvt
Laptop fan started running almost immediately. I used an ice pack to cool the computer.
Started run at 2:02PM, finished at 2:12PM (634 s).
npt.mdp
file from this website:gmx grompp -f npt.mdp -c nvt.gro -t nvt.cpt -r nvt.gro -p topol.top -n index.ndx -o npt.tpr
gmx mdrun -deffnm npt
Run started at 2:25PM, finished at 2:38PM (690 s).
md.mdp
file from this website.gmx grompp -f md.mdp -c npt.gro -t npt.cpt -p topol.top -n index.ndx -o md_0_10.tpr
gmx mdrun -deffnm md_0_10
The 100 ps runs took \(\approx\) 10 min. For the production MD run of 10 ns, expect:
\[ \mathsf{10\ ns\times\cfrac{1000\ ps}{ns}\times\cfrac{10\ min}{100\ ps}\times\cfrac{1\ hr}{60\ min}=16.7\ hr} \]
Note: to print the math, I used MathJax.
Core t (s) Wall t (s) (%)
Time: 593342.460 74167.808 800.0
20h36:07
(ns/day) (hour/ns)
Performance: 11.649 2.060
trjconv
:gmx trjconv -s md_0_10.tpr -f md_0_10.xtc -o md_0_10_center.xtc -center -pbc mol -ur compact
Protein
for centering and System
for output.
-Extract the first frame (t = 0 ns) of the trajectory; use trjconv -dump
with the recentered trajectory:gmx trjconv -s md_0_10.tpr -f md_0_10_center.xtc -o start.pdb -dump 0
gmx trjconv -s md_0_10.tpr -f md_0_10_center.xtc -o md_0_10_fit.xtc -fit rot+trans
Backbone
to perform least-squares fitting to the protein backbone, and “System” for output.