Tables

Table I: GeNMR knowledge-based score (SumGA) can distinguish high-resolution X-RAY structures from lower resolution or lower quality NMR models of the same protein.

Protein Name PDB ID SumGA
Interleukin-4 (X-ray) 1RCB -9.31
Interleukin-4 (NMR) 1BCN -4.95
Barnase (X-ray) 1A2P -5.66
Barnase (NMR) 1FW7 -0.63
Calbindin (X-ray) 1IG5 -7.97
Calbindin (NMR) 1CLB -5.96
Ubiquitin (X-ray) 1UBQ -8.92
Ubiquitin (NMR) 1XQQ -4.67
BPTI (X-ray) 6PTI -7.43
BPTI (NMR) 1OA5 -2.92
Ribosomal Protein L25 (X-ray) 1DFU -7.86
Ribosomal Protein L25 (NMR) 1B75 2.72
Ribosomal protein L30E (X-ray) 1W41 -10.82
Ribosomal Protein L30E (NMR) 1GO0 -7.45
Melanoma Inhibitory Activity Protein (X-ray) 1I1J -8.18
Melanoma Inhibitory Activity Protein (NMR) 1K0X -2.22
DnaB Helicase (X-ray) 1B79 -9.37
DnaB Helicase (NMR) 1JWE -6.98
Vts1 SAM Domain (X-ray) 2D3D -9.59
Vts1 SAM Domain (NMR) 2FE9 -6.30


Table II: GeNMR chemical shift scores (SumCS) of high-resolution X-RAY structures and lower resolution/lower quality NMR models of the same protein.

Protein Name PDB ID SumCS
Interleukin-4 (X-ray) 1RCB 6.12
Interleukin-4 (NMR) 1BCN 13.44
Barnase (X-ray) 1A2P -3.19
Barnase (NMR) 1FW7 0.60
Calbindin (X-ray) 1IG5 -0.35
Calbindin (NMR) 1CLB 1.81
Ubiquitin (X-ray) 1UBQ -2.63
Ubiquitin (NMR) 1XQQ 0.91
BPTI (X-ray) 6PTI -0.56
BPTI (NMR) 1OA5 5.22
Ribosomal Protein L25 (X-ray) 1DFU 1.37
Ribosomal Protein L25 (NMR) 1B75 11.26
Ribosomal protein L30E (X-ray) 1W41 -3.87
Ribosomal Protein L30E (NMR) 1GO0 3.87
Melanoma Inhibitory Activity Protein (X-ray) 1I1J 2.59
Melanoma Inhibitory Activity Protein (NMR) 1K0X 8.62
DnaB Helicase (X-ray) 1B79 7.05
DnaB Helicase (NMR) 1JWE 6.64
Vts1 SAM Domain (X-ray) 2D3D 0.79
Vts1 SAM Domain (NMR) 2FE9 7.79


Table III: Changes in RMSD of homology models after refinement with distance restraints and GeNMR scoring function.

Protein Name PDB ID of template Template identity level PDB ID of reference structure RMSD before refinement RMSD after NOE refinement RMSD after refinement with NOEs,SumGA, and SumCS
Ubiquitin 2HJ8 32% 1G6J 2.18 0.77 1.11
Ubiquitin 1OQY 35% 1G6J 2.57 0.83 1.00
Ubiquitin 1IYF 30% 1G6J 3.09 0.88 1.01
Ubiquitin 1WY8 39% 1G6J 4.29 0.84 1.00
Serine Protein Inhibitor 1CIS 82% 3CI2 1.95 1.50 1.79
L25 1DFU 100%
(different ligand state)
1B75 4.25 2.40 3.31
Interleukin-4 2B90 94%
1BCN 1.75 1.49 2.19
DnaB helicase 2Q6T 47%
1JWE 1.22 0.99 1.41


Table IV: Changes in GeNMR scoring function after refinement homology models with distance restraints and GeNMR scoring function.

Protein Name PDB ID of template Template identity level PDB ID of reference structure Total score before refinement Total after NOE refinement Total after refinement with NOEs,SumGA, and SumCS
Ubiquitin 2HJ8 32% 1G6J 12.18 20.94 1.96
Ubiquitin 1OQY 35% 1G6J 12.57 21.00 1.96
Ubiquitin 1IYF 30% 1G6J 13.09 20.96 1.95
Ubiquitin 1WY8 39% 1G6J 14.29 21.01 1.99
Serine Protein Inhibitor 1CIS 82% 3CI2 -19.60 16.76 -17.61
L25 1DFU 100%
(different ligand state)
1B75 -22.90 35.79 2.72
Interleukin-4 2B8X 98%
1BCN -73.59 3.99 -48.66
DnaB helicase 2Q6T 47%
1JWE -3.58 -9.34 -20.98


Figure 1. Restoration of secondary structure in homology models by a refinement with GeNMR scoring function without NOEs


1. Ubiquitin modeled from 2HJ8 after the 450 iteration of refinement with GeNMR scoring function. A. Reference structure (1UBQ). B. Homology model. C. Refined model.

Ubiquitin modeled from 2HJ8



2. Ubiquitin modeled from 1IYF after the 683 iteration of refinement with GeNMR scoring function. A. Reference structure (1UBQ). B. Homology model. C. Refined model.

Ubiquitin modeled from 1IYF



3. Ubiquitin modeled from 1OQY after the 490 iteration of refinement with GeNMR scoring function. A. Reference structure (1UBQ). B. Homology model. C. Refined model.

Ubiquitin modeled from 1OQY



Table V. Performance of GENMR in generating 3D structures for 50 randomly selected BMRB files. The given RMSDs are those of the final structure relative to the known (X-ray or NMR) structure.

BMRB ID

Protein Name

RMSD (Å)

Runtime

(min)

Type

All atoms

C-alpha

BMR2208

Cyclophiliin

0.98

0.39

33

Fragment Assembly +

Homology Modeling

BMR4022

Carbonic Anhydrase

1.58

0.81

48

Fragment Assembly +

Homology Modeling

BMR4031

Ribonuclease A

1.69

0.40

9

Fragment Assembly +

Homology Modeling

BMR4035

Regulatory Protein E2

1.67

0.58

6

Fragment Assembly +

Homology Modeling

BMR4046

Drosophila Heat Shock Factor

3.21

2.02

8

Fragment Assembly +

Homology Modeling

BMR4047

HU Protein

2.24

0.84

7

Fragment Assembly +

Homology Modeling

BMR4053

Staphylococcal Nuclease

1.00

0.22

12

Fragment Assembly +

Homology Modeling

BMR4064

Collagenase

0.97

0.30

13

Fragment Assembly +

Homology Modeling

BMR4070

FimC

2.28

1.73

37

Fragment Assembly +

Homology Modeling

BMR4072

RNA-1 Mod Protein

3.99

3.13

4

Fragment Assembly +

Homology Modeling

BMR4076

Outer Surface Protein

2.21

2.04

19

Fragment Assembly +

Homology Modeling

BMR4077

FK506 Binding Protein

0.93

0.41

7

Fragment Assembly +

Homology Modeling

BMR4081

Interferon Alpha

1.14

0.38

9

Fragment Assembly +

Homology Modeling

BMR4082

Profilin

0.99

0.51

9

Fragment Assembly +

Homology Modeling

BMR4083

CheY

1.18

0.47

8

Fragment Assembly +

Homology Modeling

BMR4084

HnRNP

0.77

0.10

15

Fragment Assembly +

Homology Modeling

BMR4090

Basic FGF

3.34

1.75

6

Fragment Assembly +

Homology Modeling

BMR4091

Basic Fibroblast Growth Factor

1.64

0.41

12

Fragment Assembly +

Homology Modeling

BMR4092

Core Binding Factor

1.84

0.89

9

Fragment Assembly +

Homology Modeling

BMR4094

Interleukin 4

0.64

0.22

10

Fragment Assembly +

Homology Modeling

BMR4098

Fatty Acid Biniding Protein

3.39

1.96

10

Fragment Assembly +

Homology Modeling

BMR4116

Im9 Immunity Protein

0.81

0.23

6

Fragment Assembly +

Homology Modeling

BMR4117

Elongation Factor-1 Beta

2.18

1.35

7

Fragment Assembly +

Homology Modeling

BMR4132

Human UBC9

0.48

0.12

25

Fragment Assembly +

Homology Modeling

BMR4127

DNA-binding Domain of SV40 T Antigen

2.61

1.57

11

Fragment Assembly +

Homology Modeling

BMR4186

CRABPII

1.12

0.55

11

Fragment Assembly +

Homology Modeling

BMR4198

Syntaxin 1A, N-domain

3.48

2.66

9

Fragment Assembly +

Homology Modeling

BMR4285

Apo-S100A1

3.05

1.98

18

Fragment Assembly +

Homology Modeling

BMR4296

Cold Shock protein

1.49

0.57

6

Fragment Assembly +

Homology Modeling

BMR4297

DnaB (24-136)

1.96

0.88

9

Fragment Assembly +

Homology Modeling

BMR4307

Hamster Prion

3.22

1.67

19

Fragment Assembly +

Homology Modeling

BMR4326

DNA polymerase B

2.38

1.74

6

Fragment Assembly +

Homology Modeling

BMR4334

ARID Domain of Dead-Ringer Protein

3.26

2.53

10

Fragment Assembly +

Homology Modeling

BMR4340

Major Urinary Protein

0.75

0.33

14

Fragment Assembly +

Homology Modeling

BMR4354

Maltose Binding Protein

0.99

0.39

49

Fragment Assembly +

Homology Modeling

BMR4380

Epithelial Cadherin. N-domain

2.41

2.10

7

Fragment Assembly +

Homology Modeling

BMR4403

PyJ

2.96

1.96

56

Threading

BMR4405

KH3

1.84

0.96

6

Fragment Assembly +

Homology Modeling

BMR4430

Calcyclin

3.72

2.44

84

Fragment Assembly +

Homology Modeling

BMR4554

Dihydrofolate Reductase

1.13

0.50

12

Threading

BMR4562

Lysozyme

0.84

0.27

9

Fragment Assembly +

Homology Modeling

BMR4577

Ribosomal Protein S4 delta 41

2.35

1.42

22

Fragment Assembly +

Homology Modeling

BMR4675

Foxo4

3.58

2.61

27

Fragment Assembly +

Homology Modeling

BMR4840

Mycobacterium tuberculosis adenylate kinase

2.32

1.38

87

Fragment Assembly +

Homology Modeling

BMR4964

Barnase

0.73

0.31

9

Fragment Assembly +

Homology Modeling

BMR4968

Bovine Pancreatic Trypsin Inhibitor

1.05

0.29

4

Fragment Assembly +

Homology Modeling

BMR4974

CI-2 from Barley Seeds

0.99

0.15

5

Fragment Assembly +

Homology Modeling

BMR5220

Melanoma Inhibitory Activity Protein

2.74

1.27

9

Fragment Assembly +

Homology Modeling

BMR5316

HIV-1 Gag

1.95

1.31

21

Fragment Assembly +

Homology Modeling

BMR5387

Ubiquitin

2.37

1.03

6

Fragment Assembly +

Homology Modeling

BMR547

Calmodulin

3.24

2.81

13

Fragment Assembly +

Homology Modeling

BMR5473

Ovomucoid Third Domain

1.41

0.64

4

Fragment Assembly +

Homology Modeling

BMR5690

Cobra Neurotoxin II

1.60

0.89

5

Fragment Assembly +

Homology Modeling

BMR5899

Spo0f protein

2.58

1.88

9

Fragment Assembly +

Homology Modeling

BMR6338

At1G77540

---

---

---

De Novo (did not converge)

BMR6357

Beta Lactamase

1.25

0.28

25

Fragment Assembly +

Homology Modeling

BMR6367

YhgG

---

---

---

De Novo (did not converge)

BMR6410

Glutaredoxin 1

3.13

2.21

44

Threading

BMR6444

MMP12

1.55

0.64

13

Fragment Assembly +

Homology Modeling

BMR6699

Calbindin D9K

3.02

1.92

5

Fragment Assembly +

Homology Modeling

BMR6821

Superoxide dismutase

1.75

1.28

13

Fragment Assembly +

Homology Modeling

BMR6922

Vts1 SAM domain

2.16

1.16

7

Fragment Assembly +

Homology Modeling

 

 

 

 



Average

 

1.95

1.11

10 (median)




Table IV. Backbone RMSD between GENMR models and corresponding experimental structures for recent entries in BMRB database (as of the first week of April 2008).

Protein name PDB ID BMRB ID RMSDbb (Å) Method
TnpE 2RN7A 15701 Did not converge De Novo
Chromo domain of chromobox 2K28A 15695 1.34 Fragment Assembly + Homology modeling
SeR13 2K1HA 15678 2.31 Fragment Assembly + Homology modeling
Chromo domain of the Chromobox 2K1BA 15674 1.22 Fragment Assembly + Homology modeling
EwR120 2JZAA 15611 1.72 Fragment Assembly + Homology modeling
SgR42 2JZ2A 15604 4.88 Threading
ER541-37-162 2K1G 15603 Did not converge Threading
alpha-RgIA 1CNL 15586 1.69 Fragment Assembly + Homology modeling
Ig domain of human palladin 2DM2A 15512 1.29 Threading
Human NEMO zinc finger 2JT3A 15499 1.81 Fragment Assembly + Homology modeling
NEMO zinc finger 2JVXA 15499 2.85 Fragment Assembly + Homology modeling
Abl kinase 2HYYA 15488 1.76 Fragment Assembly + Homology modeling
YmoA 2JVPA 15486 2.15 Fragment Assembly + Homology modeling
alpha-RgIA 1CNL 15436 1.69 Fragment Assembly + Homology modeling
alpha-RgIA 1CNL 15435 1.69 Fragment Assembly + Homology modeling
apo-LFABP 2JU3A 15429 2.32 Fragment Assembly + Homology modeling
Human Ubiquitin 1UBQ 15410 1.05 Fragment Assembly + Homology modeling
RGD-hirudin 2JOOA 15405 1.32 Fragment Assembly + Homology modeling
MMP-3 2JT6 15396 1.21 Fragment Assembly + Homology modeling
MMP-3 2JT5A 15395 1.17 Fragment Assembly + Homology modeling
F153W cardiac troponin C 2JT3A 15388 3.68 Fragment Assembly + Homology modeling
Four-Alpha-Helix Bundle 2I7UA 15384 1.48 Threading
Huwentoxin-XI 2JOTA 15382 1.51 Fragment Assembly + Homology modeling
VraR DNA Binding Domain 2RNJA 15378 0.93 Fragment Assembly + Homology modeling
Human DESR1 2JR7A 15377 1.81 Fragment Assembly + Homology modelling
Cys2His2 zinc finger 2JSPA 15373 Did not converge De Novo
alpha-RgIA 1CNL 15368 1.69 Fragment Assembly + Homology modeling
alpha-RgIA 1CNL 15367 1.69 Fragment Assembly + Homology modeling
MMP20 2JSDA 15361 1.14 Fragment Assembly + Homology modeling
MEKK3 PB1 domain 2PPHA 15355 1.24 Fragment Assembly + Homology modeling
1st SH3 domain of adaptor Nck 2JS2A 15351 1.57 Fragment Assembly + Homology modeling
2nd SH3 domain of adaptor Nck 2JS0A 15349 2.05 Fragment Assembly + Homology modeling
Manduca sexta moricin 2JR8A 15323 2.66 Fragment Assembly + Homology modeling
CheW 2HO9A 15322 2.66 Threading
Discoidin Domain of DDR2 2Z4FA 15315 1.63 Threading
NusB from Aquifex Aeolicus 2JR0 15312 3.91 Threading
C2 domain of human Smurf2 2JQZA 15306 Did not converge Threading
CTX A3 1I02A 15305 2.20 Fragment Assembly + Homology modeling
Cu(I) human Sco2 2RLIA 11001 2.19 Fragment Assembly + Homology modeling


Table V. Backbone RMSDs between models generated by GENMR sub-program for ab initio folding and corresponding experimental structures.

Protein PDB ID RMSDbb (Å)
HIS15ASP 1CM2 0.98
TM1442 1SBO 2.56
XcR50 1TTZ 1.31
CSPA 1MJC 2.77
Ubiquitin 1UBQ 2.5
Apo-lfabp 1LFO 1.68
DNA Polymerase 1DK3 2.71
OMTKY3 1OMT 1.83
BPTI 5PTI 1.94
Calbindin D9k 1IG5 3.1
Rosetta 2.2.0 was used to provide GENMR with ab initio folding functionality

Table VI: Ubiquitin models built with GENMR using (a) homology modelling, (b) threading, and (c) ab initio folding. RMSDs and sequence identity values were calculated with respect to the native structure (1UBQ). Tables (a) and (b) show the ubiquitin models built as ubiquitin homologs were progressively removed from the database, so that the homolog used in a given row is the best one found by GENMR when the more closely matching homologs above it were removed. (c) Results of ab initio folding of ubiquitin by Rosetta. 50 models were produced in each of 10 runs. Backbone RMSDs of the model with the best GENMR score are shown, with those less than 2.8 Å labeled as successes. Homologous proteins were excluded from Rosetta's fragment generation step.


(a) Homology Modelling

Template RMSD (Å) Sequence Identity (%)
All atoms C-alpha
1UBQA 0.00 0.00 100.00
1UD7A 2.18 0.85 90.79
1C3TA 2.09 0.60 89.47
1SIFA 1.35 0.41 90.14
1BT0A 1.55 0.52 61.64
1NDDA 1.43 0.49 56.76
2DZIA 2.43 0.93 39.47
2JWZA 2.01 1.00 94.74
2BKRB 1.96 0.50 57.89
2NVUJ 2.03 1.07 57.89

(b) Threading

Template RMSD (Å) Sequence Identity (%)
All atoms C-alpha
1UELA 1.99 0.79 31.58
1MG8A 2.43 1.41 31.58
1IYFA 3.17 1.99 30.26
1WY8A 4.47 4.16 39.47
1WX9A 2.52 1.16 34.21
2BWFA 4.36 3.98 32.89
1OQYA 3.68 2.49 35.53
2HJ8A 2.73 1.84 32.00
1WE7A 4.85 4.16 27.63
2FAZA 5.20 4.83 36.84
1WH3A 2.14 0.86 35.53
1Z2MA 2.30 0.96 32.89
1TTNA 2.76 1.60 26.32
1M94A 2.12 0.97 23.29

(c) Ab initio

Run Lowest Score Backbone RMSD (Å) Results
1 2.80 FAILED
2 4.39 FAILED
3 3.36 FAILED
4 2.92 FAILED
5 5.26 FAILED
6 3.99 FAILED
7 2.72 SUCCESS
8 2.55 SUCCESS
9 2.82 FAILED
10 2.78 SUCCESS




Table VII. Comparison of model global accuracy achieved by GENMR and similar programs.



(a). Comparison of backbone RMSDs of models generated by CS-ROSETTA, CHESHIRE, and GENMR.

Protein Name PDB ID BMRB ID CS-ROSETTA RMSDbb (Å) CHESHIRE RMSDbb (Å) GENMR RMSDbb (Å)
TM1442 1SBO 5921 1.22 1.32 1.84
Calbindin 4ICB 390 1.39 1.47 2.41
Hpr 1HDN 2371 1.99 1.83 0.96
Ubiquitin 1UBQA 5387 0.82 1.33 1.34
FF domain 2UCZ 6711 1.46 1.12
HR2106 2HZ5 6210 1.85 2.13
Spo0F 1SRR 5899 1.67 1.88
Apo 1fabp 1LFO 4098 1.72 2.11
CspA 1MJC 4296 1.43 0.75
XcR50 1TTZ 6363 1.67 2.41
Profilin 1PFL 4082 2.26 1.8
GB3 2OED 15283 0.74 0.98


(b) Comparison of backbone RMSDs of models generated by CS-ROSETTA and GENMR for proteins that did not meet CS-ROSETTA convergence criteria.

Protein Name PDB ID BMRB ID CS-Rosetta RMSDbb (Å) GENMR RMSDbb (Å)
HI0719 1J7H 5606 14.31 1.02
MTH1598 1JW3 5165 12.17 Did not converge
HR1958 1TVG 6344 16.29 Did not converge
CcR19 1T17 6120 7.09 4.17
YwIE 1ZGG 6460 9.37 1.37
Flua 1N0S 5756 15.57 1.45
nsp1 2GDT 7014 5.6 Did not converge


(c) Comparison of backbone RMSDs of models generated by CS-ROSETTA and GENMR for structural genomics proteins.

Protein Name PDB ID BMRB ID CS-Rosetta RMSDbb (Å) GENMR RMSDbb (Å)
RpT7 2JTV 15419 0.64 3.79
StR82 2JT1 15386 0.57 3.45
RhR95 2JVM 15341 0.66 2.77
TR80 2JXT 15573 1.27 Did not converge
VfR117 2JVW 15491 0.6 6.79
PsR211 2JVA 15471 2.07 3.49



Figure 2. Dependence of backbone RMSD of GENMR models on template sequence identity for different methods of structure generation

A scatter plot showing the performance of GENMR three approaches to structure generation (subfragment assembly + homology modeling, chemical shift threading and ab initio structure generation) relative to the level of sequence identity of the matching templates or subfragments. Data from Tables III-VI in the GENMR web documentation pages was used to assemble this graph.

Header image

Energy Function

Summary

GENMR uses an energy function consisting of several statistics-based pseudo-potentials:

  • Chemical shift correlation scores for HA, HN, N, CA, CB, and CO chemical shifts. Chemical shifts for the GENMR models are predicted using SHIFTX, and, after subtraction of random coil shifts for CA and CB shifts, correlation coeffecients are determined between these predicted chemical shifts and the experimental chemical shifts submitted by the user.
  • A threading potential.
  • Bump score for backbone atoms plus C-Beta and C-Gamma atoms.
  • Hydrogen bond count.
  • Ramachandran score counting phi and psi torsion angle violations.
  • Omega score counting omega angle violations.
  • Chi score based on expected chi angles for different phi and psi combinations.
  • Radius of gyration, giving a penalty if the calculated radius of gyration deviates from the expected value by more than 10%.
  • Hydrogen bond energy.
  • Disulphide bond count.
  • Secondary structure score, based on the similarity between the secondary structure of the GENMR 3-D model and the expected secondary structure. The expected secondary structure is normally based on that predicted by a sequence homology match to the PPT-DB 2º Structure (Cytoplasmic) database. If, however, this predicted secondary structure is more than 50% different from the secondary structure predicted from the chemical shifts by CSI, the CSI-predicted secondary structure is used as the expected secondary structure.
  • Score Weighting

    Each of the initial terms in the energy function are transformed through a series of steps: (1) They are multiplied so that they behave as energy-like terms, i.e. more negative scores indicate a better structure. (2) They are scaled so that they all have close to the same order of magnitude (about ± 1 to 100). (3) All scores except for the chemical shifts scores and the radius of gyration score are normalized by dividing by the number of residues in the peptide, and then multiplying by 100 to re-scale them. (4) All scores are multiplied by weights that were determined based on training of the energy function on CASP7 decoys and decoys generated by Rosetta.

    The initial score calculations, plus the results of steps (1) and (2) are:

  • Each of the Chemical shift correlation values (0.0 to 1.0) is multiplied by -10.0.
  • The threading potential is multiplied by 0.2.
  • In the initial bump score, for all pairs of atoms that are closer than the allowed threshold distances, the magnitude of the difference between the observed distance and the threshold, in Angstroms, is added to the score. Any distances closer than 70% of the threshold receive an additional penalty of +10. The initial bump score is multiplied by 0.5.
  • The hydrogen bond count (the number of hydrogen bonds) is multiplied by -1.1.
  • The initial Ramachandran score is calculated based on Ramachandran tables with four angle classes: 3 (good), 2 (acceptable), 1 (minimal), 0 (disallowed). For each residue, this class number r is used to generate a Rama score value of (3 - r)², and the sum of all these values over all residues is calculated. To the final score is then added the fraction of residues that did not have an r value of 3.
  • For the omega score, if an omega angle is not within ±12.5° of 180° or within ±15° of 0°, then the magnitude of the difference between the omega angle and these thresholds is added to the omega score.
  • The chi score is based on the fact that chi angles cluster into three main groups for each residue. Each chi angle is assigned to the chi group to which it is closest, and the probability of observing that chi group given the phi and psi angles of the residue is subtracted from the chi score.
  • For the radius of gyration score, if the calculated radius of gyration is not within 10% of the expected radius of gyration, the magnitude of the difference between the real and expected values is used as the intial radius of gyration score. The initial score is multiplied by 2.5.
  • For the hydrogen bond energy score, the sum of all the hydrogen bond energies is calculated. The sum is divided by the number of residues and multiplied by 40.0.
  • The disulphide bond count is multiplied by -3.0.
  • For the secondary structure score, the sum of all secondary structure mismatch scores in the peptide is calculated: +4 for Beta-strand or helix matched with coil, and +8 for Beta-strand residues matched with helix residues.

  • The weightings for step (4) for the above scores, other than the chemical shift scores, were trained on decoys from CASP7 using linear regression-based fitting. The weights for the chemical shift scores were trained separately on Rosetta decoys of structures with known chemical shifts; both ab-initio Rosetta decoys and Rosetta decoys generated from relaxing the native structures. The relative weighting between the chemical shift scores and the other scores was then determined based on linear regression. The weights in step (4) are multiplied by the values in step (3) to obtain the final scores. The step (4) weights are:

    Score name Score weight
    HA 1.22
    HN 0.99
    N 0.71
    CA 0.62
    CB 1.00
    CO 0.41
    Thread 0.46
    Bump 0.07
    Hcount 0.05
    Rama 0.05
    Omega † 0.02
    Chi † 0.02
    RadGyr 0.57
    HBenergy 0.15
    Disulph 0.55
    SecStr 0.23

    † Adjusted manually to a low value because training set it to 0.

    GENMR Results Assessment

    A GENMR results summary for a structure prediction includes the mean chemical shift correlation values before and after energy minimization. The reliability of the final structure is guaged by looking at the final mean chemical shift correlation:
    0.75 - 1.00 = High
    0.65 - 0.75 = Good
    0.55 - 0.65 = Moderate
    0.00 - 0.55 = Poor
    			
    A GENMR results summary also prints information on torsion angle violations. The expected number of torsion angles in the core, allowed, generous and disallowed regions is estimated based on the following percentages:
    
    #res in phi/psi core          90%
    #res in phi/psi allowed        7%
    #res in phi/psi generous       1%
    #res in phi/psi disallowed     0%
    #res in omega allowed         99%
    #res in omega disallowed       1%
    			
    Note: The first and last residues are not use for phi/psi, and the last residue is not used for omega.

    Torsion Angle Mapping for Thrifty

    GENMR performs a homolog search based torsion similarity against a non-redundant database of all cytoplasmic proteins in the PDB. This search uses an updated version of Thrifty that assigns each residue a letter based on its torsion angle class. The torsion angle mappings for each of the 9 letters in this torsion angle alphabet are shown below.

    Torsion Angle Alphabet