Homology modelling of a protein-ligand complex

In [51]:
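import sys
import modeller
import _modeller
import modeller.automodel
import nglview                      # used further down to display the model
from IPython.display import Image   # used further down to show a saved screenshot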
In [3]:
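# create the MODELLER environment; hetatm=True makes it read HETATM records
# (ligands) from PDB files, which are skipped by default
env = modeller.environ()
env.io.hetatm = True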
                         MODELLER 9.19, 2017/07/19, r11078

     PROTEIN STRUCTURE MODELLING BY SATISFACTION OF SPATIAL RESTRAINTS


                     Copyright(c) 1989-2017 Andrej Sali
                            All Rights Reserved

                             Written by A. Sali
                               with help from
              B. Webb, M.S. Madhusudhan, M-Y. Shen, G.Q. Dong,
          M.A. Marti-Renom, N. Eswar, F. Alber, M. Topf, B. Oliva,
             A. Fiser, R. Sanchez, B. Yerkovich, A. Badretdinov,
                     F. Melo, J.P. Overington, E. Feyfant
                 University of California, San Francisco, USA
                    Rockefeller University, New York, USA
                      Harvard University, Cambridge, USA
                   Imperial Cancer Research Fund, London, UK
              Birkbeck College, University of London, London, UK


Kind, OS, HostName, Kernel, Processor: 4, Linux shadbox 3.2.0-29-generic x86_64
Date and time of compilation         : 2017/07/19 14:40:43
MODELLER executable type             : x86_64-intel8
Job starting time (YY/MM/DD HH:MM:SS): 2017/12/22 17:19:12

In [5]:
%%bash

# get protein template with known structure
wget http://www.pdb.org/pdb/files/1lmp.pdb

# get sequence of related protein with "unknown" structure (human lysozyme)
wget http://www.uniprot.org/uniprot/B2R4C5.fasta
--2017-12-22 17:27:28--  http://www.pdb.org/pdb/files/1lmp.pdb
Resolving www.pdb.org (www.pdb.org)... 128.6.244.52
Connecting to www.pdb.org (www.pdb.org)|128.6.244.52|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.rcsb.org/pdb/files/1lmp.pdb [following]
--2017-12-22 17:27:29--  https://www.rcsb.org/pdb/files/1lmp.pdb
Resolving www.rcsb.org (www.rcsb.org)... 132.249.213.110
Connecting to www.rcsb.org (www.rcsb.org)|132.249.213.110|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://files.rcsb.org/view/1lmp.pdb [following]
--2017-12-22 17:27:30--  http://files.rcsb.org/view/1lmp.pdb
Resolving files.rcsb.org (files.rcsb.org)... 132.249.213.140
Connecting to files.rcsb.org (files.rcsb.org)|132.249.213.140|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: `1lmp.pdb'

     0K .......... .......... .......... .......... ..........  127K
    50K .......... .......... .......... .......... ..........  248K
   100K .......... .......... ........                          221K=0.7s

2017-12-22 17:27:31 (177 KB/s) - `1lmp.pdb' saved [131301]

--2017-12-22 17:27:31--  http://www.uniprot.org/uniprot/B2R4C5.fasta
Resolving www.uniprot.org (www.uniprot.org)... 193.62.193.81, 128.175.240.211
Connecting to www.uniprot.org (www.uniprot.org)|193.62.193.81|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 217 [text/plain]
Saving to: `B2R4C5.fasta'

     0K                                                       100% 39.6M=0s

2017-12-22 17:27:33 (39.6 MB/s) - `B2R4C5.fasta' saved [217/217]

In [8]:
# create alignment object
alignment=modeller.alignment(env)

# add the target sequence
alignment.append(file='B2R4C5.fasta', align_codes='all', alignment_format='FASTA')

# add the template structure (chain A only)
mdl = modeller.model(env, file='1lmp.pdb', model_segment=('FIRST:A', 'LAST:A'))
alignment.append_model(mdl, atom_files='1lmp.pdb', align_codes='1lmp')

# align and save
alignment.salign()
alignment.write(file='all_in_one.ali', alignment_format='PIR')
SALIGN_____> adding the next group to the alignment; iteration    1
In [9]:
cat 'all_in_one.ali'
>P1;tr|B2R4C5|B2R4C5_HUMAN
sequence::     : :     : :::-1.00:-1.00
MKALIVLGLVLLSVTVQGKVFERCELARTLKRLGMDGYRGISLANWMCLAKWESGYNTRATNYNAGDRSTDYGIF
QINSRYWCNDGKTPGAVNACHLSCSALLQDNIADAVACAKRVVRDPQGIRAWVAWRNRCQNRDVRQYVQGCGV--
-*

>P1;1lmp
structureX:1lmp.pdb:   1 :A:+132 :A:MOL_ID  1; MOLECULE  LYSOZYME; CHAIN  A; SYNONYM  MUCOPEPTIDE N-ACETYLMURAMYLHYDROLASE; EC  3.2.1.17:MOL_ID  1; ORGANISM_SCIENTIFIC  ONCORHYNCHUS MYKISS; ORGANISM_COMMON  RAINBOW TROUT; ORGANISM_TAXID  8022; ORGAN  KIDNEY: 2.00: 0.16
------------------KVYDRCELARALKASGMDGYAGNSLPNWVCLSKWESSYNTQATNRNT-DGSTDYGIF
QINSRYWCDDGRTPGAKNVCGIRCSQLLTDDLTVAIRCAKRVVLDPNGIGAWVAWRLHCQNQDLRSYVAGCGV..
.*
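
The PIR file above can be sanity-checked before any model is built. The sketch below is plain Python and does not need MODELLER; `parse_pir` and `identity` are illustrative helper names, not library functions, and the long description line of the template record is shortened. It parses both records and reports the fraction of identical residues over the aligned columns, skipping gaps (`-`) and ligand placeholders (`.`).

```python
PIR_TEXT = """\
>P1;tr|B2R4C5|B2R4C5_HUMAN
sequence::     : :     : :::-1.00:-1.00
MKALIVLGLVLLSVTVQGKVFERCELARTLKRLGMDGYRGISLANWMCLAKWESGYNTRATNYNAGDRSTDYGIF
QINSRYWCNDGKTPGAVNACHLSCSALLQDNIADAVACAKRVVRDPQGIRAWVAWRNRCQNRDVRQYVQGCGV--
-*

>P1;1lmp
structureX:1lmp.pdb:   1 :A:+132 :A::: 2.00: 0.16
------------------KVYDRCELARALKASGMDGYAGNSLPNWVCLSKWESSYNTQATNRNT-DGSTDYGIF
QINSRYWCDDGRTPGAKNVCGIRCSQLLTDDLTVAIRCAKRVVLDPNGIGAWVAWRLHCQNQDLRSYVAGCGV..
.*
"""


def parse_pir(text):
    """Return {align_code: aligned_sequence} for every '>P1;' record."""
    records = {}
    lines = iter(text.splitlines())
    for line in lines:
        if not line.startswith('>P1;'):
            continue
        code = line[4:].strip()
        next(lines)                      # skip the colon-separated description line
        chunks = []
        for seq_line in lines:           # sequence runs until the terminating '*'
            chunks.append(seq_line.strip())
            if seq_line.rstrip().endswith('*'):
                break
        records[code] = ''.join(chunks).rstrip('*')
    return records


def identity(a, b):
    """Fraction of identical residues over columns where both rows hold a residue."""
    if len(a) != len(b):
        raise ValueError('aligned sequences must have equal length')
    pairs = [(x, y) for x, y in zip(a, b) if x not in '-.' and y not in '-.']
    return sum(x == y for x, y in pairs) / len(pairs)


records = parse_pir(PIR_TEXT)
target = records['tr|B2R4C5|B2R4C5_HUMAN']
template = records['1lmp']

print('aligned length:', len(target), len(template))
print('target residues:', len(target.replace('-', '')))
print('template ligand placeholders:', template.count('.'))
print('identity over aligned residues: %.2f' % identity(target, template))
```

The leading gaps in the template row correspond to the N-terminal stretch of the target that has no counterpart in the structure, and the three trailing `.` characters are the HETATM residues of 1lmp, which the target row does not match yet.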
In [20]:
s = alignment[0]
pdb = alignment[1]

# build models (the log below reports two: B99990001 and B99990002)
a = modeller.automodel.automodel(env, alnfile='all_in_one.ali', knowns=pdb.code, sequence=s.code)
a.name = 'mod' + s.code
a.starting_model = 1
a.ending_model = 2
a.make()
tr|B2R4C5|B2R4C5_HUMAN 1lmp
fndatmi_285W> Only      129 residues out of      132 contain atoms of type  CA
              (This is usually caused by non-standard residues, such
              as ligands, or by PDB files with missing atoms.)
fndatmi_285W> Only      129 residues out of      132 contain atoms of type  CA
              (This is usually caused by non-standard residues, such
              as ligands, or by PDB files with missing atoms.)

check_ali___> Checking the sequence-structure alignment. 

Implied intrachain target CA(i)-CA(i+1) distances longer than  8.0 angstroms:

ALN_POS  TMPL  RID1  RID2  NAM1  NAM2     DIST
----------------------------------------------
END OF TABLE
read_to_681_> topology.submodel read from topology file:        3
fndatmi_285W> Only      129 residues out of      132 contain atoms of type  CA
              (This is usually caused by non-standard residues, such
              as ligands, or by PDB files with missing atoms.)
patch_s_522_> Number of disulfides patched in MODEL:        4
mdtrsr__446W> A potential that relies on one protein is used, yet you have at
              least one known structure available. MDT, not library, potential is used.
iup2crm_280W> No topology library in memory or assigning a BLK residue.
              Default CHARMM atom type assigned:  C1 -->  CT2
              This message is written only for the first such atom.
0 atoms in HETATM/BLK residues constrained
to protein atoms within 2.30 angstroms
and protein CA atoms within 10.00 angstroms
0 atoms in residues without defined topology
constrained to be rigid bodies
condens_443_> Restraints marked for deletion were removed.
              Total number of restraints before, now:    13194    12190
iupac_m_397W> Atoms were not swapped because of the uncertainty of how to handle the H atom.


>> ENERGY; Differences between the model's features and restraints:
Number of all residues in MODEL                   :      148
Number of all, selected real atoms                :     1157    1157
Number of all, selected pseudo atoms              :        0       0
Number of all static, selected restraints         :    12190   12190
COVALENT_CYS                                      :        F
NONBONDED_SEL_ATOMS                               :        1
Number of non-bonded pairs (excluding 1-2,1-3,1-4):     2338
Dynamic pairs routine                             : 2, NATM x NATM cell sorting
Atomic shift for contacts update (UPDATE_DYNAMIC) :    0.390
LENNARD_JONES_SWITCH                              :    6.500   7.500
COULOMB_JONES_SWITCH                              :    6.500   7.500
RESIDUE_SPAN_RANGE                                :        0   99999
NLOGN_USE                                         :       15
CONTACT_SHELL                                     :    4.000
DYNAMIC_PAIRS,_SPHERE,_COULOMB,_LENNARD,_MODELLER :        T       T       F       F       F
SPHERE_STDV                                       :    0.050
RADII_FACTOR                                      :    0.820
Current energy                                    :         784.9276





Summary of the restraint violations: 

   NUM     ... number of restraints.
   NUMVI   ... number of restraints with RVIOL > VIOL_REPORT_CUT[i].
   RVIOL   ... relative difference from the best value.
   NUMVP   ... number of restraints with -Ln(pdf) > VIOL_REPORT_CUT2[i].
   RMS_1   ... RMS(feature, minimally_violated_basis_restraint, NUMB).
   RMS_2   ... RMS(feature, best_value, NUMB).
   MOL.PDF ... scaled contribution to -Ln(Molecular pdf).

 #                     RESTRAINT_GROUP      NUM   NUMVI  NUMVP   RMS_1   RMS_2         MOL.PDF     S_i
------------------------------------------------------------------------------------------------------
 1 Bond length potential              :    1177       0      0   0.006   0.006      11.531       1.000
 2 Bond angle potential               :    1592       0      6   2.099   2.099      141.87       1.000
 3 Stereochemical cosine torsion poten:     738       0     27  47.960  47.960      267.80       1.000
 4 Stereochemical improper torsion pot:     481       0      0   1.360   1.360      19.813       1.000
 5 Soft-sphere overlap restraints     :    2338       0      0   0.001   0.001     0.56110       1.000
 6 Lennard-Jones 6-12 potential       :       0       0      0   0.000   0.000      0.0000       1.000
 7 Coulomb point-point electrostatic p:       0       0      0   0.000   0.000      0.0000       1.000
 8 H-bonding potential                :       0       0      0   0.000   0.000      0.0000       1.000
 9 Distance restraints 1 (CA-CA)      :    2405       0      0   0.126   0.126      37.277       1.000
10 Distance restraints 2 (N-O)        :    2564       0      0   0.152   0.152      66.586       1.000
11 Mainchain Phi dihedral restraints  :       0       0      0   0.000   0.000      0.0000       1.000
12 Mainchain Psi dihedral restraints  :       0       0      0   0.000   0.000      0.0000       1.000
13 Mainchain Omega dihedral restraints:     147       0      3   4.455   4.455      34.406       1.000
14 Sidechain Chi_1 dihedral restraints:     120       0      1  65.571  65.571      23.458       1.000
15 Sidechain Chi_2 dihedral restraints:      86       0      0  57.460  57.460      30.234       1.000
16 Sidechain Chi_3 dihedral restraints:      35       0      0  73.876  73.876      23.280       1.000
17 Sidechain Chi_4 dihedral restraints:      20       0      0  99.655  99.655      16.666       1.000
18 Disulfide distance restraints      :       4       0      0   0.008   0.008     0.39921E-01   1.000
19 Disulfide angle restraints         :       8       0      0   3.078   3.078      1.6733       1.000
20 Disulfide dihedral angle restraints:       4       0      0  23.403  23.403      2.1872       1.000
21 Lower bound distance restraints    :       0       0      0   0.000   0.000      0.0000       1.000
22 Upper bound distance restraints    :       0       0      0   0.000   0.000      0.0000       1.000
23 Distance restraints 3 (SDCH-MNCH)  :    1691       0      0   0.382   0.382      32.072       1.000
24 Sidechain Chi_5 dihedral restraints:       0       0      0   0.000   0.000      0.0000       1.000
25 Phi/Psi pair of dihedral restraints:     146      15     15  20.399  56.678      23.883       1.000
26 Distance restraints 4 (SDCH-SDCH)  :     972       0      0   0.605   0.605      51.593       1.000
27 Distance restraints 5 (X-Y)        :       0       0      0   0.000   0.000      0.0000       1.000
28 NMR distance restraints 6 (X-Y)    :       0       0      0   0.000   0.000      0.0000       1.000
29 NMR distance restraints 7 (X-Y)    :       0       0      0   0.000   0.000      0.0000       1.000
30 Minimal distance restraints        :       0       0      0   0.000   0.000      0.0000       1.000
31 Non-bonded restraints              :       0       0      0   0.000   0.000      0.0000       1.000
32 Atomic accessibility restraints    :       0       0      0   0.000   0.000      0.0000       1.000
33 Atomic density restraints          :       0       0      0   0.000   0.000      0.0000       1.000
34 Absolute position restraints       :       0       0      0   0.000   0.000      0.0000       1.000
35 Dihedral angle difference restraint:       0       0      0   0.000   0.000      0.0000       1.000
36 GBSA implicit solvent potential    :       0       0      0   0.000   0.000      0.0000       1.000
37 EM density fitting potential       :       0       0      0   0.000   0.000      0.0000       1.000
38 SAXS restraints                    :       0       0      0   0.000   0.000      0.0000       1.000
39 Symmetry restraints                :       0       0      0   0.000   0.000      0.0000       1.000



# Heavy relative violation of each residue is written to: tr|B2R4C5|B2R4C5_HUMAN.V99990001
# The profile is NOT normalized by the number of restraints.
# The profiles are smoothed over a window of residues:    1
# The sum of all numbers in the file:   14242.2764



List of the violated restraints:
   A restraint is violated when the relative difference
   from the best value (RVIOL) is larger than CUTOFF.

   ICSR   ... index of a restraint in the current set.
   RESNO  ... residue numbers of the first two atoms.
   ATM    ... IUPAC atom names of the first two atoms.
   FEAT   ... the value of the feature in the model.
   restr  ... the mean of the basis restraint with the smallest
              difference from the model (local minimum).
   viol   ... difference from the local minimum.
   rviol  ... relative difference from the local minimum.
   RESTR  ... the best value (global minimum).
   VIOL   ... difference from the best value.
   RVIOL  ... relative difference from the best value.


-------------------------------------------------------------------------------------------------

Feature 25                           : Phi/Psi pair of dihedral restraints     
List of the RVIOL violations larger than   :       6.5000

    #   ICSR  RESNO1/2 ATM1/2   INDATM1/2    FEAT   restr    viol   rviol   RESTR    VIOL   RVIOL
    1   4005   1M   2K C   N       7    9 -103.64 -118.00   17.23    0.83  -62.90  175.38   25.17
    1          2K   2K N   CA      9   10  148.62  139.10                  -40.80
    2   4006   2K   3A C   N      16   18  -67.75  -68.20    4.24    0.36  -62.50  169.66   28.06
    2          3A   3A N   CA     18   19  149.52  145.30                  -40.90
    3   4007   3A   4L C   N      21   23  -68.87  -70.70   12.19    0.86  -63.50  170.84   23.42
    3          4L   4L N   CA     23   24  129.55  141.60                  -41.20
    4   4010   6V   7L C   N      44   46 -130.01 -108.50   40.66    1.92  -63.50  165.72   27.13
    4          7L   7L N   CA     46   47  167.01  132.50                  -41.20
    5   4011   7L   8G C   N      52   54  -96.26  -80.20   43.78    1.06   82.20 -124.49   18.26
    5          8G   8G N   CA     54   55 -145.17  174.10                    8.50
    6   4012   8G   9L C   N      56   58  -66.89  -70.70    8.62    0.54  -63.50  175.11   24.12
    6          9L   9L N   CA     58   59  133.87  141.60                  -41.20
    7   4014  10V  11L C   N      71   73  -77.22  -70.70   39.05    3.21  -63.50  144.95   19.38
    7         11L  11L N   CA     73   74  103.10  141.60                  -41.20
    8   4015  11L  12L C   N      79   81  -97.30 -108.50   44.46    2.27  -63.50  134.98   17.12
    8         12L  12L N   CA     81   82   89.47  132.50                  -41.20
    9   4016  12L  13S C   N      87   89 -143.24 -136.60   15.64    0.63  -64.10  178.18   18.58
    9         13S  13S N   CA     89   90  165.36  151.20                  -35.00
   10   4018  14V  15T C   N     100  102 -136.93 -124.80   35.12    1.18  -63.20  159.51   25.23
   10         15T  15T N   CA    102  103  176.45  143.50                  -42.10
   11   4020  16V  17Q C   N     114  116 -126.28 -121.10   46.54    2.12  -63.80  147.62   24.97
   11         17Q  17Q N   CA    116  117 -174.05  139.70                  -40.30
   12   4021  17Q  18G C   N     123  125   85.99   78.70    9.43    0.25   82.20  179.45    8.90
   12         18G  18G N   CA    125  126 -172.09 -166.10                    8.50
   13   4022  18G  19K C   N     127  129 -112.37 -118.00   12.62    0.51  -62.90  175.72   20.66
   13         19K  19K N   CA    129  130  127.81  139.10                  -40.80
   14   4057  53E  54S C   N     415  417 -139.06  -64.10   82.37    8.75  -64.10   82.37    8.75
   14         54S  54S N   CA    417  418   -0.84  -35.00                  -35.00
   15   4069  65A  66G C   N     508  510  -66.34  -62.40    6.16    1.12   82.20  158.20   12.08
   15         66G  66G N   CA    510  511  -45.94  -41.20                    8.50


report______> Distribution of short non-bonded contacts:


DISTANCE1:  0.00 2.10 2.20 2.30 2.40 2.50 2.60 2.70 2.80 2.90 3.00 3.10 3.20 3.30 3.40
DISTANCE2:  2.10 2.20 2.30 2.40 2.50 2.60 2.70 2.80 2.90 3.00 3.10 3.20 3.30 3.40 3.50
FREQUENCY:     0    0    0    0    0    2    4   31   62  117   96  135  173  167  201


<< end of ENERGY.
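
The "Current energy" printed at the top of this report is the sum of the MOL.PDF column in the restraint summary. A quick check for the first model, using the non-zero contributions copied from the table (the dictionary keys are shortened labels; the log rounds each value, so agreement is only expected to about two decimals):

```python
# Non-zero MOL.PDF contributions for model 1, copied from the summary table.
mol_pdf = {
    'bond length': 11.531,
    'bond angle': 141.87,
    'cosine torsion': 267.80,
    'improper torsion': 19.813,
    'soft-sphere overlap': 0.56110,
    'distance CA-CA': 37.277,
    'distance N-O': 66.586,
    'omega dihedral': 34.406,
    'chi1': 23.458,
    'chi2': 30.234,
    'chi3': 23.280,
    'chi4': 16.666,
    'disulfide distance': 0.39921E-01,
    'disulfide angle': 1.6733,
    'disulfide dihedral': 2.1872,
    'distance SDCH-MNCH': 32.072,
    'phi/psi pair': 23.883,
    'distance SDCH-SDCH': 51.593,
}

reported_energy = 784.9276          # "Current energy" line of the same report
total = sum(mol_pdf.values())

print('sum of MOL.PDF terms: %.4f' % total)
print('difference from reported energy: %.4f' % (total - reported_energy))

# Largest contributors, which is where to look first when a model scores badly.
for name, value in sorted(mol_pdf.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print('%-20s %8.3f' % (name, value))
```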
iupac_m_397W> Atoms were not swapped because of the uncertainty of how to handle the H atom.


>> ENERGY; Differences between the model's features and restraints:
Number of all residues in MODEL                   :      148
Number of all, selected real atoms                :     1157    1157
Number of all, selected pseudo atoms              :        0       0
Number of all static, selected restraints         :    12190   12190
COVALENT_CYS                                      :        F
NONBONDED_SEL_ATOMS                               :        1
Number of non-bonded pairs (excluding 1-2,1-3,1-4):     2411
Dynamic pairs routine                             : 2, NATM x NATM cell sorting
Atomic shift for contacts update (UPDATE_DYNAMIC) :    0.390
LENNARD_JONES_SWITCH                              :    6.500   7.500
COULOMB_JONES_SWITCH                              :    6.500   7.500
RESIDUE_SPAN_RANGE                                :        0   99999
NLOGN_USE                                         :       15
CONTACT_SHELL                                     :    4.000
DYNAMIC_PAIRS,_SPHERE,_COULOMB,_LENNARD,_MODELLER :        T       T       F       F       F
SPHERE_STDV                                       :    0.050
RADII_FACTOR                                      :    0.820
Current energy                                    :         742.3640





Summary of the restraint violations: 

   NUM     ... number of restraints.
   NUMVI   ... number of restraints with RVIOL > VIOL_REPORT_CUT[i].
   RVIOL   ... relative difference from the best value.
   NUMVP   ... number of restraints with -Ln(pdf) > VIOL_REPORT_CUT2[i].
   RMS_1   ... RMS(feature, minimally_violated_basis_restraint, NUMB).
   RMS_2   ... RMS(feature, best_value, NUMB).
   MOL.PDF ... scaled contribution to -Ln(Molecular pdf).

 #                     RESTRAINT_GROUP      NUM   NUMVI  NUMVP   RMS_1   RMS_2         MOL.PDF     S_i
------------------------------------------------------------------------------------------------------
 1 Bond length potential              :    1177       0      0   0.006   0.006      10.282       1.000
 2 Bond angle potential               :    1592       0      6   2.029   2.029      133.80       1.000
 3 Stereochemical cosine torsion poten:     738       0     25  47.656  47.656      267.93       1.000
 4 Stereochemical improper torsion pot:     481       0      0   1.341   1.341      19.336       1.000
 5 Soft-sphere overlap restraints     :    2411       0      0   0.001   0.001     0.36823       1.000
 6 Lennard-Jones 6-12 potential       :       0       0      0   0.000   0.000      0.0000       1.000
 7 Coulomb point-point electrostatic p:       0       0      0   0.000   0.000      0.0000       1.000
 8 H-bonding potential                :       0       0      0   0.000   0.000      0.0000       1.000
 9 Distance restraints 1 (CA-CA)      :    2405       0      0   0.117   0.117      31.519       1.000
10 Distance restraints 2 (N-O)        :    2564       0      0   0.147   0.147      62.931       1.000
11 Mainchain Phi dihedral restraints  :       0       0      0   0.000   0.000      0.0000       1.000
12 Mainchain Psi dihedral restraints  :       0       0      0   0.000   0.000      0.0000       1.000
13 Mainchain Omega dihedral restraints:     147       0      2   4.286   4.286      31.839       1.000
14 Sidechain Chi_1 dihedral restraints:     120       0      2  63.503  63.503      19.095       1.000
15 Sidechain Chi_2 dihedral restraints:      86       0      0  59.230  59.230      29.127       1.000
16 Sidechain Chi_3 dihedral restraints:      35       0      0  76.183  76.183      22.230       1.000
17 Sidechain Chi_4 dihedral restraints:      20       0      0 103.696 103.696      15.298       1.000
18 Disulfide distance restraints      :       4       0      0   0.010   0.010     0.74938E-01   1.000
19 Disulfide angle restraints         :       8       0      0   2.127   2.127     0.79892       1.000
20 Disulfide dihedral angle restraints:       4       0      0  26.422  26.422      2.8147       1.000
21 Lower bound distance restraints    :       0       0      0   0.000   0.000      0.0000       1.000
22 Upper bound distance restraints    :       0       0      0   0.000   0.000      0.0000       1.000
23 Distance restraints 3 (SDCH-MNCH)  :    1691       0      0   0.403   0.403      32.118       1.000
24 Sidechain Chi_5 dihedral restraints:       0       0      0   0.000   0.000      0.0000       1.000
25 Phi/Psi pair of dihedral restraints:     146      14     14  19.548  54.435      16.560       1.000
26 Distance restraints 4 (SDCH-SDCH)  :     972       0      0   0.584   0.584      46.236       1.000
27 Distance restraints 5 (X-Y)        :       0       0      0   0.000   0.000      0.0000       1.000
28 NMR distance restraints 6 (X-Y)    :       0       0      0   0.000   0.000      0.0000       1.000
29 NMR distance restraints 7 (X-Y)    :       0       0      0   0.000   0.000      0.0000       1.000
30 Minimal distance restraints        :       0       0      0   0.000   0.000      0.0000       1.000
31 Non-bonded restraints              :       0       0      0   0.000   0.000      0.0000       1.000
32 Atomic accessibility restraints    :       0       0      0   0.000   0.000      0.0000       1.000
33 Atomic density restraints          :       0       0      0   0.000   0.000      0.0000       1.000
34 Absolute position restraints       :       0       0      0   0.000   0.000      0.0000       1.000
35 Dihedral angle difference restraint:       0       0      0   0.000   0.000      0.0000       1.000
36 GBSA implicit solvent potential    :       0       0      0   0.000   0.000      0.0000       1.000
37 EM density fitting potential       :       0       0      0   0.000   0.000      0.0000       1.000
38 SAXS restraints                    :       0       0      0   0.000   0.000      0.0000       1.000
39 Symmetry restraints                :       0       0      0   0.000   0.000      0.0000       1.000



# Heavy relative violation of each residue is written to: tr|B2R4C5|B2R4C5_HUMAN.V99990002
# The profile is NOT normalized by the number of restraints.
# The profiles are smoothed over a window of residues:    1
# The sum of all numbers in the file:   13769.2471



List of the violated restraints:
   A restraint is violated when the relative difference
   from the best value (RVIOL) is larger than CUTOFF.

   ICSR   ... index of a restraint in the current set.
   RESNO  ... residue numbers of the first two atoms.
   ATM    ... IUPAC atom names of the first two atoms.
   FEAT   ... the value of the feature in the model.
   restr  ... the mean of the basis restraint with the smallest
              difference from the model (local minimum).
   viol   ... difference from the local minimum.
   rviol  ... relative difference from the local minimum.
   RESTR  ... the best value (global minimum).
   VIOL   ... difference from the best value.
   RVIOL  ... relative difference from the best value.


-------------------------------------------------------------------------------------------------

Feature 25                           : Phi/Psi pair of dihedral restraints     
List of the RVIOL violations larger than   :       6.5000

    #   ICSR  RESNO1/2 ATM1/2   INDATM1/2    FEAT   restr    viol   rviol   RESTR    VIOL   RVIOL
    1   4005   1M   2K C   N       7    9  -63.54  -70.20   12.73    0.78  -62.90  170.35   22.09
    1          2K   2K N   CA      9   10  129.55  140.40                  -40.80
    2   4006   2K   3A C   N      16   18 -144.91 -134.00   12.02    0.27  -62.50 -173.72   34.94
    2          3A   3A N   CA     18   19  152.04  147.00                  -40.90
    3   4007   3A   4L C   N      21   23 -114.90 -108.50   12.24    0.58  -63.50 -176.77   28.61
    3          4L   4L N   CA     23   24  142.93  132.50                  -41.20
    4   4010   6V   7L C   N      44   46  -67.27  -70.70   17.50    1.20  -63.50  165.68   22.79
    4          7L   7L N   CA     46   47  124.44  141.60                  -41.20
    5   4011   7L   8G C   N      52   54   85.29   78.70   51.99    1.21   82.20  133.86    6.65
    5          8G   8G N   CA     54   55  142.33 -166.10                    8.50
    6   4012   8G   9L C   N      56   58 -102.65 -108.50    5.89    0.29  -63.50  178.68   22.87
    6          9L   9L N   CA     58   59  133.14  132.50                  -41.20
    7   4014  10V  11L C   N      71   73 -115.39 -108.50   20.11    1.00  -63.50  175.26   27.54
    7         11L  11L N   CA     73   74  151.40  132.50                  -41.20
    8   4016  12L  13S C   N      87   89 -137.19 -136.60    7.53    0.37  -64.10 -178.36   18.37
    8         13S  13S N   CA     89   90  158.71  151.20                  -35.00
    9   4018  14V  15T C   N     100  102 -145.41 -124.80   37.72    1.17  -63.20  164.78   26.40
    9         15T  15T N   CA    102  103  175.09  143.50                  -42.10
   10   4020  16V  17Q C   N     114  116 -112.47 -121.10    9.66    0.45  -63.80 -177.72   29.40
   10         17Q  17Q N   CA    116  117  144.04  139.70                  -40.30
   11   4021  17Q  18G C   N     123  125   60.93   78.70   30.66    0.52   82.20  151.13    8.39
   11         18G  18G N   CA    125  126 -141.12 -166.10                    8.50
   12   4022  18G  19K C   N     127  129 -114.43 -118.00   26.01    1.21  -62.90  162.53   18.96
   12         19K  19K N   CA    129  130  113.34  139.10                  -40.80
   13   4057  53E  54S C   N     415  417 -137.30  -64.10   81.48    8.51  -64.10   81.48    8.51
   13         54S  54S N   CA    417  418    0.79  -35.00                  -35.00
   14   4069  65A  66G C   N     508  510  -67.13  -62.40    5.23    0.99   82.20  158.10   12.03
   14         66G  66G N   CA    510  511  -43.43  -41.20                    8.50


report______> Distribution of short non-bonded contacts:


DISTANCE1:  0.00 2.10 2.20 2.30 2.40 2.50 2.60 2.70 2.80 2.90 3.00 3.10 3.20 3.30 3.40
DISTANCE2:  2.10 2.20 2.30 2.40 2.50 2.60 2.70 2.80 2.90 3.00 3.10 3.20 3.30 3.40 3.50
FREQUENCY:     0    0    0    0    0    5    9   47   70  102  100  130  170  181  194


<< end of ENERGY.

>> Summary of successfully produced models:
Filename                          molpdf
----------------------------------------
tr|B2R4C5|B2R4C5_HUMAN.B99990001.pdb      784.92761
tr|B2R4C5|B2R4C5_HUMAN.B99990002.pdb      742.36395
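
The table lists one molpdf value per model. molpdf is the objective function MODELLER minimises, so among models built from the same alignment the lower value indicates fewer restraint violations. It is not an absolute quality measure. After `make()`, `automodel` keeps the same information in `a.outputs`, a list of dictionaries with keys such as `name`, `molpdf` and `failure`. The sketch below mocks that list with the values from the log so that it runs without MODELLER; `pick_best` is a hypothetical helper.

```python
# Stand-in for a.outputs, filled with the values from the summary table above.
outputs = [
    {'name': 'tr|B2R4C5|B2R4C5_HUMAN.B99990001.pdb', 'molpdf': 784.92761, 'failure': None},
    {'name': 'tr|B2R4C5|B2R4C5_HUMAN.B99990002.pdb', 'molpdf': 742.36395, 'failure': None},
]


def pick_best(outputs, key='molpdf'):
    """Return the successfully built model with the lowest score for `key`."""
    ok_models = [m for m in outputs if m['failure'] is None]
    if not ok_models:
        raise ValueError('no model was built successfully')
    return min(ok_models, key=lambda m: m[key])


best = pick_best(outputs)
print(best['name'], best['molpdf'])
```

With a real run the same function could be called as `pick_best(a.outputs)`.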

In [44]:
# show model
nglview.show_structure_file('tr|B2R4C5|B2R4C5_HUMAN.B99990001.pdb')
In [1]:
Image('without_ligand.png')
Out[1]:
[Image: without_ligand.png]

The downloaded sequence did not include the ligand, so we add it:

In [29]:
# check ligand positions in pdb structure 
alignment[1].residues[0:]
Out[29]:
[Residue 1:A (type LYS),
 Residue 2:A (type VAL),
 Residue 3:A (type TYR),
 Residue 4:A (type ASP),
 Residue 5:A (type ARG),
 Residue 6:A (type CYS),
 Residue 7:A (type GLU),
 Residue 8:A (type LEU),
 Residue 9:A (type ALA),
 Residue 10:A (type ARG),
 Residue 11:A (type ALA),
 Residue 12:A (type LEU),
 Residue 13:A (type LYS),
 Residue 14:A (type ALA),
 Residue 15:A (type SER),
 Residue 16:A (type GLY),
 Residue 17:A (type MET),
 Residue 18:A (type ASP),
 Residue 19:A (type GLY),
 Residue 20:A (type TYR),
 Residue 21:A (type ALA),
 Residue 22:A (type GLY),
 Residue 23:A (type ASN),
 Residue 24:A (type SER),
 Residue 25:A (type LEU),
 Residue 26:A (type PRO),
 Residue 27:A (type ASN),
 Residue 28:A (type TRP),
 Residue 29:A (type VAL),
 Residue 30:A (type CYS),
 Residue 31:A (type LEU),
 Residue 32:A (type SER),
 Residue 33:A (type LYS),
 Residue 34:A (type TRP),
 Residue 35:A (type GLU),
 Residue 36:A (type SER),
 Residue 37:A (type SER),
 Residue 38:A (type TYR),
 Residue 39:A (type ASN),
 Residue 40:A (type THR),
 Residue 41:A (type GLN),
 Residue 42:A (type ALA),
 Residue 43:A (type THR),
 Residue 44:A (type ASN),
 Residue 45:A (type ARG),
 Residue 46:A (type ASN),
 Residue 47:A (type THR),
 Residue 48:A (type ASP),
 Residue 49:A (type GLY),
 Residue 50:A (type SER),
 Residue 51:A (type THR),
 Residue 52:A (type ASP),
 Residue 53:A (type TYR),
 Residue 54:A (type GLY),
 Residue 55:A (type ILE),
 Residue 56:A (type PHE),
 Residue 57:A (type GLN),
 Residue 58:A (type ILE),
 Residue 59:A (type ASN),
 Residue 60:A (type SER),
 Residue 61:A (type ARG),
 Residue 62:A (type TYR),
 Residue 63:A (type TRP),
 Residue 64:A (type CYS),
 Residue 65:A (type ASP),
 Residue 66:A (type ASP),
 Residue 67:A (type GLY),
 Residue 68:A (type ARG),
 Residue 69:A (type THR),
 Residue 70:A (type PRO),
 Residue 71:A (type GLY),
 Residue 72:A (type ALA),
 Residue 73:A (type LYS),
 Residue 74:A (type ASN),
 Residue 75:A (type VAL),
 Residue 76:A (type CYS),
 Residue 77:A (type GLY),
 Residue 78:A (type ILE),
 Residue 79:A (type ARG),
 Residue 80:A (type CYS),
 Residue 81:A (type SER),
 Residue 82:A (type GLN),
 Residue 83:A (type LEU),
 Residue 84:A (type LEU),
 Residue 85:A (type THR),
 Residue 86:A (type ASP),
 Residue 87:A (type ASP),
 Residue 88:A (type LEU),
 Residue 89:A (type THR),
 Residue 90:A (type VAL),
 Residue 91:A (type ALA),
 Residue 92:A (type ILE),
 Residue 93:A (type ARG),
 Residue 94:A (type CYS),
 Residue 95:A (type ALA),
 Residue 96:A (type LYS),
 Residue 97:A (type ARG),
 Residue 98:A (type VAL),
 Residue 99:A (type VAL),
 Residue 100:A (type LEU),
 Residue 101:A (type ASP),
 Residue 102:A (type PRO),
 Residue 103:A (type ASN),
 Residue 104:A (type GLY),
 Residue 105:A (type ILE),
 Residue 106:A (type GLY),
 Residue 107:A (type ALA),
 Residue 108:A (type TRP),
 Residue 109:A (type VAL),
 Residue 110:A (type ALA),
 Residue 111:A (type TRP),
 Residue 112:A (type ARG),
 Residue 113:A (type LEU),
 Residue 114:A (type HIS),
 Residue 115:A (type CYS),
 Residue 116:A (type GLN),
 Residue 117:A (type ASN),
 Residue 118:A (type GLN),
 Residue 119:A (type ASP),
 Residue 120:A (type LEU),
 Residue 121:A (type ARG),
 Residue 122:A (type SER),
 Residue 123:A (type TYR),
 Residue 124:A (type VAL),
 Residue 125:A (type ALA),
 Residue 126:A (type GLY),
 Residue 127:A (type CYS),
 Residue 128:A (type GLY),
 Residue 129:A (type VAL),
 Residue 130:A (type NAG),
 Residue 131:A (type NAG),
 Residue 132:A (type NDG)]
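
In this listing residues 1 to 129 are amino acids and the last three (NAG, NAG, NDG) are the sugar ligand. The number of placeholder characters needed in the target sequence can be derived from the residue types instead of being counted by eye. The sketch below works on a plain list of three-letter type names, reproducing only the tail of the listing; how to build such a list from MODELLER residue objects is left out, and the names `STANDARD` and `residue_types` are illustrative.

```python
# The 20 standard amino acids by three-letter code.
STANDARD = {
    'ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'ILE',
    'LEU', 'LYS', 'MET', 'PHE', 'PRO', 'SER', 'THR', 'TRP', 'TYR', 'VAL',
}

# Residues 126-132 of chain A, copied from the listing above.
residue_types = ['GLY', 'CYS', 'GLY', 'VAL', 'NAG', 'NAG', 'NDG']

# Anything that is not a standard amino acid is treated as a HETATM/ligand residue.
ligands = [t for t in residue_types if t not in STANDARD]

# One '.' per ligand residue, to be appended to the target sequence.
placeholder = '.' * len(ligands)

print(ligands)
print(repr(placeholder))
```

This simple test would also flag modified amino acids (for example selenomethionine, MSE) as ligands, so the result is worth checking against the listing.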
In [48]:
# append a copy of the target sequence with one '.' per HETATM residue in the template
alignment.append_sequence(''.join([x.code for x in alignment[0].residues]+['...']))
In [49]:
# align again 
alignment.salign()
alignment.write(file='all_in_one_with_ligand.ali', alignment_format='PIR')

pdb = alignment[1]
s = alignment[2]

# build model with ligand
a = modeller.automodel.automodel(env, alnfile='all_in_one_with_ligand.ali', knowns=pdb.code, sequence=s.code)
a.name='mod'+s.code
a.starting_model = 1
a.ending_model = 1
a.make()
SALIGN_____> adding the next group to the alignment; iteration    1

SALIGN_____> adding the next group to the alignment; iteration    2

SALIGN_____> adding the next group to the alignment; iteration    3
automodel__W> Topology and/or parameter libraries already in memory. These will
                be used instead of the automodel defaults. If this is not what you
                want, clear them before creating the automodel object with
                env.libs.topology.clear() and env.libs.parameters.clear()
fndatmi_285W> Only      129 residues out of      132 contain atoms of type  CA
              (This is usually caused by non-standard residues, such
              as ligands, or by PDB files with missing atoms.)
fndatmi_285W> Only      129 residues out of      132 contain atoms of type  CA
              (This is usually caused by non-standard residues, such
              as ligands, or by PDB files with missing atoms.)

check_ali___> Checking the sequence-structure alignment. 

Implied intrachain target CA(i)-CA(i+1) distances longer than  8.0 angstroms:

ALN_POS  TMPL  RID1  RID2  NAM1  NAM2     DIST
----------------------------------------------
END OF TABLE

getf_______W> RTF restraint not found in the atoms list:
              residue type, indices:    18   148
              atom names           : C     +N
              atom indices         :  1155     0

getf_______W> RTF restraint not found in the atoms list:
              residue type, indices:    18   148
              atom names           : C     CA    +N    O
              atom indices         :  1155  1151     0  1156
fndatmi_285W> Only      129 residues out of      132 contain atoms of type  CA
              (This is usually caused by non-standard residues, such
              as ligands, or by PDB files with missing atoms.)
patch_s_522_> Number of disulfides patched in MODEL:        4
mdtrsr__446W> A potential that relies on one protein is used, yet you have at
              least one known structure available. MDT, not library, potential is used.
iup2crm_280W> No topology library in memory or assigning a BLK residue.
              Default CHARMM atom type assigned:  C1 -->  CT2
              This message is written only for the first such atom.
43 atoms in HETATM/BLK residues constrained
to protein atoms within 2.30 angstroms
and protein CA atoms within 10.00 angstroms
43 atoms in residues without defined topology
constrained to be rigid bodies
condens_443_> Restraints marked for deletion were removed.
              Total number of restraints before, now:    14595    13591


>> ENERGY; Differences between the model's features and restraints:
Number of all residues in MODEL                   :      151
Number of all, selected real atoms                :     1200    1200
Number of all, selected pseudo atoms              :        0       0
Number of all static, selected restraints         :    13591   13591
COVALENT_CYS                                      :        F
NONBONDED_SEL_ATOMS                               :        1
Number of non-bonded pairs (excluding 1-2,1-3,1-4):     2521
Dynamic pairs routine                             : 2, NATM x NATM cell sorting
Atomic shift for contacts update (UPDATE_DYNAMIC) :    0.390
LENNARD_JONES_SWITCH                              :    6.500   7.500
COULOMB_JONES_SWITCH                              :    6.500   7.500
RESIDUE_SPAN_RANGE                                :        0   99999
NLOGN_USE                                         :       15
CONTACT_SHELL                                     :    4.000
DYNAMIC_PAIRS,_SPHERE,_COULOMB,_LENNARD,_MODELLER :        T       T       F       F       F
SPHERE_STDV                                       :    0.050
RADII_FACTOR                                      :    0.820
Current energy                                    :        1021.3351





Summary of the restraint violations: 

   NUM     ... number of restraints.
   NUMVI   ... number of restraints with RVIOL > VIOL_REPORT_CUT[i].
   RVIOL   ... relative difference from the best value.
   NUMVP   ... number of restraints with -Ln(pdf) > VIOL_REPORT_CUT2[i].
   RMS_1   ... RMS(feature, minimally_violated_basis_restraint, NUMB).
   RMS_2   ... RMS(feature, best_value, NUMB).
   MOL.PDF ... scaled contribution to -Ln(Molecular pdf).

 #                     RESTRAINT_GROUP      NUM   NUMVI  NUMVP   RMS_1   RMS_2         MOL.PDF     S_i
------------------------------------------------------------------------------------------------------
 1 Bond length potential              :    1177       0      0   0.006   0.006      13.791       1.000
 2 Bond angle potential               :    1592       0      8   2.210   2.210      157.87       1.000
 3 Stereochemical cosine torsion poten:     738       0     23  46.565  46.565      256.65       1.000
 4 Stereochemical improper torsion pot:     481       0      1   1.555   1.555      25.185       1.000
 5 Soft-sphere overlap restraints     :    2521       1      2   0.008   0.008      17.126       1.000
 6 Lennard-Jones 6-12 potential       :       0       0      0   0.000   0.000      0.0000       1.000
 7 Coulomb point-point electrostatic p:       0       0      0   0.000   0.000      0.0000       1.000
 8 H-bonding potential                :       0       0      0   0.000   0.000      0.0000       1.000
 9 Distance restraints 1 (CA-CA)      :    2405       0      0   0.128   0.128      39.681       1.000
10 Distance restraints 2 (N-O)        :    2564       2      9   0.228   0.228      134.19       1.000
11 Mainchain Phi dihedral restraints  :       0       0      0   0.000   0.000      0.0000       1.000
12 Mainchain Psi dihedral restraints  :       0       0      0   0.000   0.000      0.0000       1.000
13 Mainchain Omega dihedral restraints:     147       0      5   4.853   4.853      40.832       1.000
14 Sidechain Chi_1 dihedral restraints:     120       0      2  68.643  68.643      31.763       1.000
15 Sidechain Chi_2 dihedral restraints:      86       0      0  76.233  76.233      45.443       1.000
16 Sidechain Chi_3 dihedral restraints:      35       0      0  81.209  81.209      23.591       1.000
17 Sidechain Chi_4 dihedral restraints:      20       0      1  95.315  95.315      15.611       1.000
18 Disulfide distance restraints      :       4       0      0   0.015   0.015     0.16106       1.000
19 Disulfide angle restraints         :       8       0      0   2.000   2.000     0.70661       1.000
20 Disulfide dihedral angle restraints:       4       0      0  22.748  22.748      2.0787       1.000
21 Lower bound distance restraints    :       0       0      0   0.000   0.000      0.0000       1.000
22 Upper bound distance restraints    :       0       0      0   0.000   0.000      0.0000       1.000
23 Distance restraints 3 (SDCH-MNCH)  :    1691       0      0   0.465   0.465      51.411       1.000
24 Sidechain Chi_5 dihedral restraints:       0       0      0   0.000   0.000      0.0000       1.000
25 Phi/Psi pair of dihedral restraints:     146      20     21  22.002  62.518      62.808       1.000
26 Distance restraints 4 (SDCH-SDCH)  :     972       0      0   0.776   0.776      89.817       1.000
27 Distance restraints 5 (X-Y)        :    1401       0      0   0.030   0.030      12.620       1.000
28 NMR distance restraints 6 (X-Y)    :       0       0      0   0.000   0.000      0.0000       1.000
29 NMR distance restraints 7 (X-Y)    :       0       0      0   0.000   0.000      0.0000       1.000
30 Minimal distance restraints        :       0       0      0   0.000   0.000      0.0000       1.000
31 Non-bonded restraints              :       0       0      0   0.000   0.000      0.0000       1.000
32 Atomic accessibility restraints    :       0       0      0   0.000   0.000      0.0000       1.000
33 Atomic density restraints          :       0       0      0   0.000   0.000      0.0000       1.000
34 Absolute position restraints       :       0       0      0   0.000   0.000      0.0000       1.000
35 Dihedral angle difference restraint:       0       0      0   0.000   0.000      0.0000       1.000
36 GBSA implicit solvent potential    :       0       0      0   0.000   0.000      0.0000       1.000
37 EM density fitting potential       :       0       0      0   0.000   0.000      0.0000       1.000
38 SAXS restraints                    :       0       0      0   0.000   0.000      0.0000       1.000
39 Symmetry restraints                :       0       0      0   0.000   0.000      0.0000       1.000



# Heavy relative violation of each residue is written to: alignment_with_ligand.V99990001
# The profile is NOT normalized by the number of restraints.
# The profiles are smoothed over a window of residues:    1
# The sum of all numbers in the file:   16173.0352



List of the violated restraints:
   A restraint is violated when the relative difference
   from the best value (RVIOL) is larger than CUTOFF.

   ICSR   ... index of a restraint in the current set.
   RESNO  ... residue numbers of the first two atoms.
   ATM    ... IUPAC atom names of the first two atoms.
   FEAT   ... the value of the feature in the model.
   restr  ... the mean of the basis restraint with the smallest
              difference from the model (local minimum).
   viol   ... difference from the local minimum.
   rviol  ... relative difference from the local minimum.
   RESTR  ... the best value (global minimum).
   VIOL   ... difference from the best value.
   RVIOL  ... relative difference from the best value.


-------------------------------------------------------------------------------------------------

Feature 10                           : Distance restraints 2 (N-O)             
List of the RVIOL violations larger than   :       4.5000

    #   ICSR  RESNO1/2 ATM1/2   INDATM1/2    FEAT   restr    viol   rviol   RESTR    VIOL   RVIOL
    1   7937  65A  89P N   O     505  705   14.13   10.99    3.14    4.52   10.99    3.14    4.52
    2   7973  69S  89P N   O     533  705    8.64    6.49    2.15    4.73    6.49    2.15    4.73

-------------------------------------------------------------------------------------------------

Feature 25                           : Phi/Psi pair of dihedral restraints     
List of the RVIOL violations larger than   :       6.5000

    #   ICSR  RESNO1/2 ATM1/2   INDATM1/2    FEAT   restr    viol   rviol   RESTR    VIOL   RVIOL
    1   4005   1M   2K C   N       7    9  -80.08  -70.20   23.88    1.51  -62.90  157.99   21.50
    1          2K   2K N   CA      9   10  162.14  140.40                  -40.80
    2   4006   2K   3A C   N      16   18  -70.50  -68.20   20.12    1.54  -62.50  154.02   25.64
    2          3A   3A N   CA     18   19  165.29  145.30                  -40.90
    3   4007   3A   4L C   N      21   23  -67.30  -70.70   13.26    1.14  -63.50  164.42   23.05
    3          4L   4L N   CA     23   24  154.42  141.60                  -41.20
    4   4010   6V   7L C   N      44   46  -60.09  -70.70   20.76    1.96  -63.50  159.39   21.94
    4          7L   7L N   CA     46   47  159.45  141.60                  -41.20
    5   4011   7L   8G C   N      52   54  113.73   78.70   56.54    1.04   82.20  144.50    8.55
    5          8G   8G N   CA     54   55  149.52 -166.10                    8.50
    6   4012   8G   9L C   N      56   58  -67.69  -70.70    3.39    0.25  -63.50  178.81   25.07
    6          9L   9L N   CA     58   59  140.04  141.60                  -41.20
    7   4014  10V  11L C   N      71   73 -109.62 -108.50    3.17    0.16  -63.50 -177.41   23.14
    7         11L  11L N   CA     73   74  135.47  132.50                  -41.20
    8   4015  11L  12L C   N      79   81 -110.34 -108.50    3.57    0.20  -63.50  176.96   22.36
    8         12L  12L N   CA     81   82  129.45  132.50                  -41.20
    9   4016  12L  13S C   N      87   89 -155.31 -136.60   43.18    1.72  -64.10  162.82   18.38
    9         13S  13S N   CA     89   90 -169.88  151.20                  -35.00
   10   4017  13S  14V C   N      93   95  -60.47  -62.40    9.49    1.10 -125.40  177.33   10.00
   10         14V  14V N   CA     95   96  -51.69  -42.40                  143.30
   11   4018  14V  15T C   N     100  102  -95.14  -78.10   39.64    1.30  -63.20  136.11   19.59
   11         15T  15T N   CA    102  103 -174.41  149.80                  -42.10
   12   4020  16V  17Q C   N     114  116 -125.37 -121.10   19.98    0.87  -63.80  171.88   28.53
   12         17Q  17Q N   CA    116  117  159.22  139.70                  -40.30
   13   4021  17Q  18G C   N     123  125 -161.22 -167.20    7.22    0.24   82.20 -153.75   15.06
   13         18G  18G N   CA    125  126  178.64  174.60                    8.50
   14   4022  18G  19K C   N     127  129  -71.45  -70.20   21.71    1.60  -62.90  159.75   20.29
   14         19K  19K N   CA    129  130  118.72  140.40                  -40.80
   15   4057  53E  54S C   N     415  417 -133.61  -64.10   74.83    8.18  -64.10   74.83    8.18
   15         54S  54S N   CA    417  418   -7.28  -35.00                  -35.00
   16   4069  65A  66G C   N     508  510 -139.94  -62.40   86.54   13.48   82.20  138.32    8.85
   16         66G  66G N   CA    510  511   -2.77  -41.20                    8.50
   17   4092  88T  89P C   N     697  699  -51.77  -58.70   51.99    3.76  -64.50  131.39   10.49
   17         89P  89P N   CA    699  700  -82.03  -30.50                  147.20
   18   4093  89P  90G C   N     704  706  -65.75  -62.40   28.08    4.04   82.20  149.55   10.75
   18         90G  90G N   CA    706  707  -13.32  -41.20                    8.50
   19   4138 134C 135Q C   N    1043 1045  -57.34  -63.80   14.52    1.88  -73.00  166.72   11.22
   19        135Q 135Q N   CA   1045 1046  -53.31  -40.30                  140.70
   20   4139 135Q 136N C   N    1052 1054 -116.14 -119.90   70.17    3.10   55.90  174.22   18.12
   20        136N 136N N   CA   1054 1055   66.94  137.00                   39.50


report______> Distribution of short non-bonded contacts:


DISTANCE1:  0.00 2.10 2.20 2.30 2.40 2.50 2.60 2.70 2.80 2.90 3.00 3.10 3.20 3.30 3.40
DISTANCE2:  2.10 2.20 2.30 2.40 2.50 2.60 2.70 2.80 2.90 3.00 3.10 3.20 3.30 3.40 3.50
FREQUENCY:     0    0    0    0    1    8    8   52   83  140  111  142  177  187  209


<< end of ENERGY.

>> Summary of successfully produced models:
Filename                          molpdf
----------------------------------------
alignment_with_ligand.B99990001.pdb     1021.33508
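
Only one model is built here because `starting_model` and `ending_model` are both 1. If `ending_model` were raised, the summary table would list several files, and one common choice is to keep the model with the lowest molpdf. A minimal sketch of that selection, assuming the results are available as a list of dictionaries with `'name'`, `'molpdf'` and `'failure'` keys (the shape that, to my understanding, `a.outputs` has after `a.make()`). `best_model` is a hypothetical helper, and the example list simply mirrors the summary line above:

```python
def best_model(outputs, score_key='molpdf'):
    """Return the successfully built model with the lowest score."""
    ok = [o for o in outputs if o.get('failure') is None]
    if not ok:
        raise ValueError('no successfully built models')
    return min(ok, key=lambda o: o[score_key])


# Stand-in for a.outputs, copied from the summary table of this run.
example_outputs = [
    {'name': 'alignment_with_ligand.B99990001.pdb',
     'molpdf': 1021.33508,
     'failure': None},
]

print(best_model(example_outputs)['name'])
```

With the real object this would be called as `best_model(a.outputs)`, assuming the attribute is populated as described.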

In [50]:
# show model structure with ligand
nglview.show_structure_file('alignment_with_ligand.B99990001.pdb')
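
Besides the visual check, the presence of the ligand in the model can be confirmed by reading the PDB file directly. A minimal sketch, assuming the ligand atoms are written as `HETATM` records in the standard fixed-column PDB layout (residue name in columns 18-20, chain in column 22, residue number in columns 23-26). `ligand_residues` and `ligand_residues_in_file` are hypothetical helper names:

```python
def ligand_residues(pdb_lines):
    """Return unique (resName, chainID, resSeq) tuples of HETATM records,
    in order of first appearance."""
    seen = []
    for line in pdb_lines:
        # Skip anything that is not a HETATM record or is too short to parse.
        if not line.startswith('HETATM') or len(line) < 26:
            continue
        key = (line[17:20].strip(), line[21], line[22:26].strip())
        if key not in seen:
            seen.append(key)
    return seen


def ligand_residues_in_file(path):
    """Apply ligand_residues() to a PDB file on disk."""
    with open(path) as handle:
        return ligand_residues(handle)
```

For this run, `ligand_residues_in_file('alignment_with_ligand.B99990001.pdb')` would be expected to return three entries, since the log above reports 132 template residues of which 129 have CA atoms, and 43 atoms in HETATM/BLK residues.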
In [2]:
Image('with_ligand.png')
Out[2]:
In [4]:
%%bash
jupyter nbconvert --to html hw6_filippova.ipynb
[NbConvertApp] Converting notebook hw6_filippova.ipynb to html
[NbConvertApp] Writing 473759 bytes to hw6_filippova.html