Tutorial 1: A Standard Protein
==================================

.. contents::
   :local:

This tutorial shows how to use pdbtop to prepare a protein for calculations, using a real structure as an example.

Download Structure
--------------------------

Suppose we want to study the protein with PDB ID ``5VBM``. You can download it from the RCSB PDB database at https://www.rcsb.org/. The file is called ``5vbm.pdb``, and an excerpt is shown below:

.. code-block:: bash
   :caption: 5vbm.pdb
   :linenos:

   HEADER    HYDROLASE    5VBM
   TITLE     CRYSTAL STRUCTURE OF SMALL MOLECULE DISULFIDE 2C07 BOUND TO K-RAS CYS
   TITLE    2 LIGHT M72C GDP
   KEYWDS    GTPASE,INHIBITOR,GDP,HYDROLASE
   EXPDTA    X-RAY DIFFRACTION
   REMARK   2
   REMARK   2 RESOLUTION.    1.49 ANGSTROMS.
   ...
   ATOM     11  CA  MET A   1     -19.356 -15.943  24.176  1.00 17.37           C
   ANISOU   11  CA  MET A   1     1877   1974   2749   -442    959    401       C
   ATOM     12  C   MET A   1     -17.969 -16.202  23.628  1.00 16.64           C
   ANISOU   12  C   MET A   1     1846   1881   2596   -347    853    470       C
   ATOM     13  O   MET A   1     -17.377 -17.264  23.848  1.00 17.10           O
   ANISOU   13  O   MET A   1     1987   1897   2613   -333    837    563       O
   ATOM     14  CB  MET A   1     -19.217 -14.992  25.375  1.00 18.43           C
   ANISOU   14  CB  MET A   1     2017   2205   2782   -486   1034    371       C
   ATOM     15  CG  MET A   1     -18.149 -15.406  26.395  1.00 38.06           C
   ANISOU   15  CG  MET A   1     4623   4745   5091   -513   1025    485       C
   ATOM     16  SD  MET A   1     -17.536 -14.004  27.341  1.00 44.81           S
   ANISOU   16  SD  MET A   1     5501   5720   5806   -545   1049    443       S
   ATOM     17  CE  MET A   1     -18.860 -13.868  28.537  1.00 67.52           C
   ANISOU   17  CE  MET A   1     8364   8638   8651   -676   1205    342       C
   ATOM     18  H   MET A   1     -19.406 -17.766  24.902  1.00 22.03           H
   ATOM     19  HA  MET A   1     -19.898 -15.510  23.498  1.00 20.85           H
   ...

The structure is shown below:

.. image:: _static/figs/p2.png

The file contains water molecules, ions, and ligands in addition to the protein. In this tutorial we focus on the protein alone; the covalently bonded ligand ``92V`` is covered in :doc:`tutorial-3`. The following sections show how to prepare the protein for calculations.

``check`` Structure
--------------------------

This file cannot be used directly in calculations, because it contains unneeded information and problematic atoms. The first step is therefore to check the structure with the following command:

.. code-block:: bash

   $ pdbtop check -i 5vbm.pdb -o 5vbm-1

This command tells pdbtop to read the input (``-i``) file ``5vbm.pdb``, check the structure, and write the output (``-o``) to ``5vbm-1.pdb``. The output is shown below:

.. code-block:: bash

   $ pdbtop check -i 5vbm.pdb -o 5vbm-1
   Read: 5VBM.pdb
   Warning: The residue name of the 423-th atom is changed from "HIS" to "HSE".
   Warning: The residue name of the 424-th atom is changed from "HIS" to "HSE".
   ...
   Warning: The atom name of the 325-th atom is changed from "CD1" to "CD".
   Warning: The atom name of the 380-th atom is changed from "CD1" to "CD".
   ...
   Warning: The atom N in residue LYS16 at chain A has an occupancy of 0.490. Probably, only 1 of atom N220 and N221 can be kept!
   Warning: The atom N in residue LYS16 at chain A has an occupancy of 0.510.
   Warning: The atom CA in residue LYS16 at chain A has an occupancy of 0.490. Probably, only 1 of atom CA222 and CA223 can be kept!
   Warning: The atom CA in residue LYS16 at chain A has an occupancy of 0.510.
   ...
   Current molecule:
   Molecule: 5vbm.pdb
   Number of atoms: 2849
   Number of residues: 276
   Number of amino acids: 168
   Number of nucleic acids: 0
   Number of waters: 105
   Number of ions: 1
   Number of ligands: 2
   Write PDB: 5vbm-1.pdb

We strongly recommend that you read all ``Warning`` statements carefully:

#. ``Warning: The residue name of the 423-th atom is changed from "HIS" to "HSE".``
   pdbtop has renamed the residue that contains atom 423 from ``HIS`` to ``HSE``.

#. ``Warning: The atom name of the 325-th atom is changed from "CD1" to "CD".``
   pdbtop has renamed atom 325 from ``CD1`` to ``CD``.

#. ``The atom N in residue LYS16 at chain A has an occupancy of 0.490. Probably, only 1 of atom N220 and N221 can be kept!``
   This warning is very important.
   This warning and the ones that follow it mean that residue ``LYS16`` in chain ``A`` has two alternate conformations. We can check this in the file ``5vbm-1.pdb``:

   .. code-block:: bash
      :caption: 5vbm-1.pdb
      :emphasize-lines: 3,5,7,9,11,13
      :linenos:

      ...
      ATOM    219  HA3 GLY A  15     -11.046   9.692  14.589  1.00 13.71      H    H
      ATOM    220  N   LYS A  16      -9.204   7.791  16.304  0.49  9.02      N    N
      ATOM    221  N   LYS A  16      -9.203   7.790  16.301  0.51  8.97      N    N
      ATOM    222  CA  LYS A  16      -9.168   6.488  16.966  0.49 10.42      C    C
      ATOM    223  CA  LYS A  16      -9.163   6.488  16.967  0.51 10.37      C    C
      ATOM    224  C   LYS A  16     -10.013   6.496  18.236  0.49  9.00      C    C
      ATOM    225  C   LYS A  16     -10.009   6.495  18.237  0.51  9.03      C    C
      ATOM    226  O   LYS A  16     -10.840   5.600  18.455  0.49 10.68      O    O
      ATOM    227  O   LYS A  16     -10.835   5.597  18.456  0.51 10.55      O    O
      ATOM    228  CB  LYS A  16      -7.724   6.108  17.278  0.49 10.05      C    C
      ATOM    229  CB  LYS A  16      -7.717   6.106  17.274  0.51 10.08      C    C
      ATOM    230  CG  LYS A  16      -6.871   6.043  16.021  0.49 11.81      C    C
      ATOM    231  CG  LYS A  16      -6.901   5.904  16.006  0.51 12.28      C    C
      ...

Residue LYS16 in chain A has two ``N`` atoms, with occupancies of ``0.49`` and ``0.51``. In the same way, it has two ``CA`` atoms with occupancies of ``0.49`` and ``0.51``. Only one of the two conformations should be kept (see :doc:`check` for more information). Note that pdbtop will **NOT** do this for you; you have to do it manually.

Here we keep only the conformation with occupancy ``0.51``. Delete all atoms with occupancy ``0.49`` in LYS16 of chain A and save the result to a new file, say ``5vbm-2.pdb``:

.. code-block:: bash
   :caption: 5vbm-2.pdb
   :linenos:

   ...
   ATOM    219  HA3 GLY A  15     -11.046   9.692  14.589  1.00 13.71      H    H
   ATOM    221  N   LYS A  16      -9.203   7.790  16.301  0.51  8.97      N    N
   ATOM    223  CA  LYS A  16      -9.163   6.488  16.967  0.51 10.37      C    C
   ATOM    225  C   LYS A  16     -10.009   6.495  18.237  0.51  9.03      C    C
   ATOM    227  O   LYS A  16     -10.835   5.597  18.456  0.51 10.55      O    O
   ATOM    229  CB  LYS A  16      -7.717   6.106  17.274  0.51 10.08      C    C
   ATOM    231  CG  LYS A  16      -6.901   5.904  16.006  0.51 12.28      C    C
   ...

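
The manual deletion described above can also be scripted. The following is a minimal sketch and is not part of pdbtop: the function name ``drop_altconf`` is hypothetical, and it assumes that the file uses the standard fixed-column PDB layout (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26, occupancy in columns 55-60). If your file is laid out differently, adjust the column positions.

```shell
#!/bin/sh
# Hypothetical helper (not part of pdbtop): drop one alternate conformation
# of a single residue, selected by its occupancy value.
# Assumes fixed-column PDB records: residue name in columns 18-20,
# chain ID in column 22, residue number in 23-26, occupancy in 55-60.
drop_altconf() {
    # $1 = residue name, $2 = chain ID, $3 = residue number,
    # $4 = occupancy of the conformation to delete
    awk -v res="$1" -v chain="$2" -v num="$3" -v occ="$4" '
        /^(ATOM|HETATM)/ &&
        substr($0, 18, 3) == res &&
        substr($0, 22, 1) == chain &&
        substr($0, 23, 4) + 0 == num + 0 &&
        substr($0, 55, 6) + 0 == occ + 0 { next }   # skip the unwanted atoms
        { print }                                   # keep everything else
    '
}

# Example (reads standard input, writes standard output):
#   drop_altconf LYS A 16 0.49 < 5vbm-1.pdb > 5vbm-2.pdb
```

This only works when the two conformations have different occupancies, as they do here; for a 0.50/0.50 split you would have to select atoms in another way. Always inspect the resulting file before continuing.
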
You only need to delete the heavy atoms, because the hydrogen atoms will be removed in the next step.

``remove`` Structure
--------------------------

Now we remove everything except the protein:

.. code-block:: bash

   $ pdbtop remove -i 5vbm-2.pdb -o 5vbm-3 --waters --ions --ligands --Hs

The additional options are:

#. ``--waters``: remove water molecules.
#. ``--ions``: remove ions.
#. ``--ligands``: remove ligands.
#. ``--Hs``: remove hydrogen atoms. The hydrogen atoms in this file have nonstandard names, so we need to remove them.

The structure now looks like this, with no hydrogens, waters, ions, or ligands:

.. image:: _static/figs/p3.png

Generate ``topol``\ ogy
--------------------------

Now we generate the topology:

.. code-block:: bash

   $ pdbtop topol -i 5vbm-3.pdb -o 5vbm-4

The output is shown below:

.. code-block:: bash

   $ pdbtop topol -i 5vbm-3.pdb -o 5vbm-4
   Read: 5vbm-3.pdb
   ...
   Building topology ...
   Building topology done.
   Patching N-terminus in residue GLY0 at chain A.
   Patching C-terminus in residue LYS169 at chain A.
   Write PDB: 5vbm-4.pdb
   Write PSF: 5vbm-4.psf
   Total charge: -5.00000

The output indicates that pdbtop has built the topology and patched the N- and C-termini of each protein chain. The output files are ``5vbm-4.pdb`` and ``5vbm-4.psf``; the resulting structure is shown below:

.. image:: _static/figs/p4.png

At this stage, with ``5vbm-4.pdb`` and ``5vbm-4.psf``, you can start calculations on the protein.

``solvate`` System
--------------------------

Next, we add water to solvate the system and ions to neutralize it. The box size is 70 x 70 x 70 Angstrom^3, and pdbtop uses NaCl to neutralize the system. The command is:

.. code-block:: bash

   $ pdbtop solvate -i 5vbm-4.pdb -t 5vbm-4.psf -o 5vbm-sol --box "70 70 70"

The output is:

.. code-block:: bash

   $ pdbtop solvate -i 5vbm-4.pdb -t 5vbm-4.psf -o 5vbm-sol --box "70 70 70"
   ...
   Building water box: 70.000 x 70.000 x 70.000 Angstrom^3.
   12552 water molecules are added.
   Add ions:
   Target charge: 0
   Target ionic strength: 0.010 mol/L
   5 cations and 0 anions are added.
   Final ionic strength: 0.012 mol/L
   Write PDB: 5vbm-sol.pdb
   Write PSF: 5vbm-sol.psf
   Total charge: -0.00000

Five cations are added, which balances the total charge of -5 reported by ``topol``.

You can set the target ionic strength (in mol/L) with ``--ionic-strength``:

.. code-block:: bash

   $ pdbtop solvate -i 5vbm-4.pdb -t 5vbm-4.psf -o 5vbm-5 --box "70 70 70" --ionic-strength 0.02
   ...
   12552 water molecules are added.
   Add ions:
   Target charge: 0
   Target ionic strength: 0.020 mol/L
   6 cations and 1 anions are added.
   Final ionic strength: 0.017 mol/L
   Write PDB: 5vbm-5.pdb
   Write PSF: 5vbm-5.psf
   Total charge: -0.00000

This time more ions are added: 6 cations and 1 anion instead of 5 cations and no anions.

We now have a solvated, neutralized protein system that is ready for calculations:

.. image:: _static/figs/p8.png

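
The final ionic strengths reported by ``solvate`` can be checked by hand. This page does not say how pdbtop computes the value, so the sketch below is an assumption: it applies the usual definition, I = 1/2 * sum(c_i * z_i^2), to the added monovalent ions only, using the full box volume. The helper name ``ionic_strength`` is hypothetical and not part of pdbtop. Under these assumptions the result agrees with both runs above.

```shell
#!/bin/sh
# Hypothetical helper (not part of pdbtop): ionic strength in mol/L of
# monovalent ions in a cubic box, using I = 1/2 * sum(c_i * z_i^2).
# $1 = number of cations, $2 = number of anions, $3 = box edge in Angstrom
ionic_strength() {
    awk -v nc="$1" -v na="$2" -v edge="$3" 'BEGIN {
        avogadro = 6.02214076e23      # particles per mole
        litres   = edge^3 * 1e-27     # 1 cubic Angstrom = 1e-27 L
        # |z| = 1 for Na+ and Cl-, so each ion contributes c_i / 2
        printf "%.3f\n", 0.5 * (nc + na) / (avogadro * litres)
    }'
}

ionic_strength 5 0 70   # first run: 5 cations, 0 anions -> prints 0.012
ionic_strength 6 1 70   # second run: 6 cations, 1 anion -> prints 0.017
```

Because ions can only be added in whole numbers, the final ionic strength in a box of this size moves in steps of about 0.0024 mol/L per ion, which is why it does not match the target exactly.
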