siezen, 1997, subtilase
8/10/2019 Siezen, 1997, Subtilase
Protein Science (1997), 6:501-523. Cambridge University Press. Printed in the USA.
Copyright © 1997 The Protein Society
REVIEW
Subtilases:
The superfamily of subtilisin-like serine proteases
ROLAND J. SIEZEN¹ AND JACK A.M. LEUNISSEN²
¹Department of Biophysical Chemistry, NIZO, P.O. Box 20, 6710BA Ede, The Netherlands
²CAOS/CAMM Center, University of Nijmegen, Toernooiveld, 6525ED Nijmegen, The Netherlands
(RECEIVED August 22, 1996; ACCEPTED November 5, 1996)
Abstract
Subtilases are members of the clan (or superfamily) of subtilisin-like serine proteases. Over 200 subtilases are presently
known, and the complete amino acid sequence is available for more than 170 of them. In this update of our previous overview
(Siezen RJ, de Vos WM, Leunissen JAM, Dijkstra BW, 1991, Protein Eng 4:719-731), details of more than 100 new
subtilases discovered in the past five years are summarized, and amino acid sequences of their catalytic domains are
compared in a multiple sequence alignment. Based on sequence homology, a subdivision into six families is proposed.
Highly conserved residues of the catalytic domain are identified, as are large or unusual deletions and insertions.
Predictions have been updated for Ca2+-binding sites, disulfide bonds, and substrate specificity, based on both sequence
alignment and three-dimensional homology modeling.
Keywords: homology modeling; sequence alignment; serine protease; subtilase; subtilisin family
Serine endo- and exo-peptidases are of extremely widespread occurrence and diverse function. Many distinct families of
serine proteases exist; they have been grouped into six clans (Rawlings and Barrett, 1994; Barrett and Rawlings, 1995),
of which the two largest are the (chymo)trypsin-like and subtilisin-like clans. These two clans are distinguished by a
highly similar arrangement of catalytic His, Asp, and Ser residues in radically different β/β (chymotrypsin) and α/β
(subtilisin) protein scaffolds.
In 1991, we presented a review of over 40 members of the subtilisin-like serine proteases, termed subtilases, which
occur in Archaea, Bacteria, fungi, yeasts, and higher eukaryotes (Siezen et al., 1991). The mature enzymes were found to
contain up to 1775 residues, with N-terminal catalytic domains ranging from 268 to 511 residues, and signal and/or
activation-peptides ranging from 27 to 280 residues. Several members contain C-terminal extensions, relative to the
subtilisins, which display additional properties such as sequence repeats, Cys-rich domains, or transmembrane segments.
From four known crystal structures and a multiple alignment of 40 known amino acid sequences, a core structure was
predicted for the catalytic domain of all subtilases, together with the variations that are allowed in the main-chain
length as a result of insertions and deletions (Fig. 1). Nineteen of these core residues were found to be highly
conserved, 10 of which are glycines. Predictions were also made for subtilases of unknown three-dimensional structure
concerning essential conserved residues, allowable substitutions, disulfide bonds, Ca2+-binding sites, substrate-binding
site residues, ionic and aromatic interactions, and surface loops. Based on these predictions, strategies for homology
modeling and protein engineering were developed and implemented, aimed at modulating either stability, catalytic
activity, or substrate specificity (Siezen et al., 1991, 1993, 1994, 1995a).
Reprint requests to: Dr. Roland J. Siezen, Department of Biophysical Chemistry, NIZO, P.O. Box 20, 6710BA Ede, The
Netherlands; e-mail:
Since 1991, more than 100 new subtilases have been discovered,
and these are now included in this updated review. In addition to
many new enzymes from micro-organisms, numerous members of
the subtilase superfamily have now also been identified in various
eukaryotes such as slime molds, plants, insects, nematodes, mol-
luscs, amphibia, fish, mammals, and even in a catfish virus.
Structure-based alignment
The coordinates of subtilisin BPN', subtilisin Carlsberg, thermitase, and proteinase K were used previously (Siezen et
al., 1991) to determine the core of structurally conserved regions (scrs; Greer, 1990) and the common secondary
structure elements, as analyzed with the DSSP program (Kabsch and Sander, 1983). This core of about 190 residues
contains virtually all of the common α-helix and β-strand elements, including the active site residues D32, H64, and
S221 (Siezen et al., 1991). Slight adjustments to these core regions have now been incorporated (core ABC in Fig. 2)
based on a recent spatial superpositioning of seven structures that also included mesentericopeptidase, Savinase, and
Esperase (Heringa et al., 1995); topologically equivalent residues were defined as those that have Cα-atom distances of
less than 2.0 Å. The variable regions (or vrs) nearly always correspond to
connecting loops between helices and strands and generally lie on the external surface of the protein (Fig. 1).
When only the subtilisin BPN', subtilisin Carlsberg, and thermitase structures were superimposed, the number of
structurally equivalent Cα atoms increased to over 230 (or about 85% of all Cα atoms), which we refer to as the extended
core (core AB in Fig. 2). This distinction between core and extended core scrs is of relevance for homology modeling,
because the superfamily of subtilases can be subdivided into several families (see below).
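The distance criterion used to define topologically equivalent residues can be illustrated with a minimal Python sketch. It assumes that two structures have already been superimposed and that a residue pairing is available. The coordinates, residue numbers, and function name below are hypothetical toy values, not data from the structures discussed here; only the 2.0 Å Cα-Cα cutoff is taken from the text.

```python
import math

CUTOFF = 2.0  # Cα-Cα distance cutoff in Å, as quoted in the text


def equivalent_pairs(ca_a, ca_b, pairs, cutoff=CUTOFF):
    """Keep the residue pairs whose superimposed Cα atoms are closer than cutoff."""
    return [(i, j) for i, j in pairs if math.dist(ca_a[i], ca_b[j]) < cutoff]


# Hypothetical Cα coordinates (Å) of two already superimposed structures,
# keyed by residue number.
ca_a = {1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0)}
ca_b = {1: (0.5, 0.0, 0.0), 2: (3.8, 1.0, 0.0), 3: (7.6, 0.0, 3.0)}
pairs = [(1, 1), (2, 2), (3, 3)]  # assumed residue pairing

core = equivalent_pairs(ca_a, ca_b, pairs)
fraction = len(core) / len(pairs)  # share of paired Cα atoms that are equivalent
print(core)  # [(1, 1), (2, 2)]
```

A core shared by more than two structures would presumably require the criterion to hold across all superimposed structures; that extension is not shown here.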
Identification of subtilase superfamily members
An extensive search of scientific literature and databases (EMBL, Genbank, Swiss-Prot) was performed to identify new
subtilisin-like serine proteases, using the programs BLAST (Altschul et al., 1990), TFASTA, and FASTA (Pearson and
Lipman, 1988). Consensus sequence segments of 20-40 residues around the active site residues D32, H64, and S221 were
used for this purpose; different consensus segments were obtained for different subtilase families (see Fig. 2).
Sequences from patent literature and databases are not included because they represent synthetic or mutated genes
encoding engineered subtilases. The main results of these searches are summarized in Tables 1 and 2. Further details,
including reference to 10 crystal structures, can be found in the EMBL/Genbank and PDB databases using codes listed in
the tables.
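As a rough illustration of searching with a short segment around an active-site residue, the Python sketch below slides a query segment along a target sequence without gaps and reports the window with the most identical residues. This is a deliberately simplified stand-in, not the BLAST, TFASTA, or FASTA algorithms used for the actual searches. The query and target strings and the function name are illustrative assumptions, not the consensus segments of Fig. 2.

```python
def best_window(query, target):
    """Slide query along target (no gaps); return (identities, offset) of the best window.

    Returns (0, -1) if no window shares any residue with the query.
    """
    best = (0, -1)
    for offset in range(len(target) - len(query) + 1):
        window = target[offset:offset + len(query)]
        identities = sum(q == t for q, t in zip(query, window))
        if identities > best[0]:  # strict '>' keeps the first of equally good windows
            best = (identities, offset)
    return best


# Illustrative strings only (one-letter amino acid codes).
query = "HGTHVAG"
candidate = "QDNNSHGTHVAGTVAAL"
unrelated = "MKKLLPTAAAGLLLLAAQPAMA"

print(best_window(query, candidate))  # (7, 5)
print(best_window(query, unrelated)[0] < len(query))  # True
```

A real search would also need gapped alignment and a statistical significance estimate, which is what the cited programs provide.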
At present, over 170 complete and several partial amino acid sequences of subtilases are known; most are derived from
the corresponding gene or cDNA sequences. We caution that in many cases it has not been established whether these genes
encode functional proteins or whether the encoded protein is actually a protease. Examples of the latter are the
outer-membrane antigen ssa1 of Pasteurella haemolytica (Lo et al., 1991), and the anti-freeze protein af70 of Picea
abies (EMBL D86598), which were not described as proteases by the authors.
Fig. 1. A: Schematic representation of the secondary structure topology of subtilases, with α-helices shown as
cylinders and β-sheet strands as arrows. Solid lines indicate the conserved regions (scrs) of all subtilases, and dashed
lines the variable regions (vrs). Approximate location is indicated of the main Ca2+-binding sites (by Ca1 and Ca2),
catalytic triad residues D32, H64, and S221 (by *) and substrate-binding region (between strands eI and eIII). B:
Ribbon-plot representation of the secondary and tertiary structure of subtilisin (PDB code 2SNI), made with MOLSCRIPT
(Kraulis, 1991). Side chains of the catalytic residues are shown in ball-and-stick representation.
The majority of the subtilases are synthesized as pre-pro-enzymes, subsequently translocated over a cell membrane via
the pre-peptide (or signal peptide), and finally activated by cleavage of the pro-peptide. A detailed comparison of the
pre-pro sequences and the putative processing sites of these subtilases has identified two main types of pro-peptide
(Siezen et al., 1995b). However, there are numerous exceptions in which the pro-peptides appear to be completely
unrelated or even absent. A small number of subtilases are intracellular (Table 1).
Table 1 shows that the (putative) mature enzymes range in size from 266 to 1775 residues. The catalytic domain or module
is defined as the segment with sequence homology to subtilisins; it is always located at the N-terminal end of the amino
acid sequence directly after the pre-pro region. This review is focussed only on the catalytic domains.
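The domain organization described here (pre-peptide, pro-peptide, then the catalytic domain, optionally followed by a C-terminal extension) can be sketched as a simple slicing of a precursor sequence. The segment lengths, the toy sequence, and all names in this Python sketch are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Precursor(NamedTuple):
    pre: str        # pre-peptide (signal peptide)
    pro: str        # pro-peptide, removed on activation
    catalytic: str  # segment with sequence homology to subtilisins
    extension: str  # optional C-terminal extension (may be empty)


def split_precursor(seq, pre_len, pro_len, cat_len):
    """Slice a precursor sequence into its segments, from N- to C-terminus."""
    if pre_len + pro_len + cat_len > len(seq):
        raise ValueError("segment lengths exceed sequence length")
    a = pre_len
    b = a + pro_len
    c = b + cat_len
    return Precursor(seq[:a], seq[a:b], seq[b:c], seq[c:])


# Toy sequence: the letters only mark which segment each position belongs to.
toy = "s" * 3 + "p" * 5 + "c" * 10 + "x" * 2
parts = split_precursor(toy, pre_len=3, pro_len=5, cat_len=10)
mature = parts.catalytic + parts.extension  # what remains after pre-pro removal
print(len(mature))  # 12
```

In this picture the catalytic domain always starts directly after the pre-pro region, so only the three lengths are needed to locate it.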
Alignment of primary sequences
The multiple sequence alignment of the catalytic domains of over 120 subtilases is shown in Figure 2. Additional variants with 10%
[Fig. 2: multiple sequence alignment of the catalytic domains of subtilases, alignment columns numbered 10 to 40.]
[Fig. 2: multiple sequence alignment of the catalytic domains of subtilases, alignment columns numbered 50 to 130.]
Fig. 2. Continues.
[Fig. 2: multiple sequence alignment of the catalytic domains of subtilases, alignment columns numbered 140 to 200.]
Fig. 2. Continues.
-
8/10/2019 Siezen, 1997, Subtilase
6/23
506 R.J. Siezen
and
J.A.M. Leunissen
basbpn
bs6168
bssdy
bl scax
bsti pl c
b s s p x d
bsapxq
b l s 1 4 7
baa1
kp
bseyab
b s a p r s
bscpr
bsseor
p=ap=p
Ymvapt
p d a lY6
b s f a 1 9
b ~ t . 3 4 1
b p l s p
b s i s p l
bs l akp
b s l s p q
L l l a p
LvLher
b s a k l
t s t a p
hmhl ys
n a h l y s
s y o 5 3 1
bsvpra
dnbpi
dna"pc.
dnavp?
a l a p r l
xrproa
. .c .st
c o r .
AB
"aproa
L r t 4 1 ,
a l a p l 2
L a a q u a
t a pr o t
L a p r a k
Lapro .
bbpr l
p l bs p,
t u ea1p
macdpa
a o c s pr
.CdlD.
a t a r y z
d 0'yZ
i l f C 1 . t
anprta
anpepd
Lhi l rbl
= pep=
s;cprbl
ECYspl
spsepr
y l X pr 2
scyct5
cor-C
efcyla
s ep=pp
b s p a r a
161asp
Geepl p
l l n16 p
1
PC2
ccpcz
bcpr2
hPpc2
dcpcl
hspc l3
l a p r l
bcpc3
hspac4
hspc6
a a t u r
d m u r i
t t t u r
c c t u r 1
a cf u r l
a c t u r 2
1 e t u r 2
mmpc 4
x l t u r R
h s f u l
d m u r 2
ce tu r2
h a k x Z
h s l p c
k l k e xl
ecker2
y l r p r 6
e pk r p l
hvccvp
avprca
asaspa
s1s sp
6c6epT
srnscrp
6msspZ
6mSSpl
ph sea l
bs e pr a
bs a pr b
bsbpf
b s G r
spscpa
1 1pr t p
l d pr t b
1 1 6 ~ 0 9
a gs c r p
l e p69
paat70
cm.Ic"c"
at s crp
hsk l aa
ddtagb
ddt a gc
dmpga 9
hs t pp2
=t PP
P f P F O
smst ab
t s p1 s t
~ ~ N l L S T W I C S ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
- - - - - - - - - - - - N Y A R I I I - S G T S H R S P H I A C L L A Y F V S L Q P S S D S A ~ A V ~ ~ ~ ~ ~ ~ E E L T P A K L K K D I I A I A T E ~ A ~ ~ - - - - - - - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~6 1
~~N I L S T Y I GS - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - ~ - - - - - - - O D I T A T L - S G T S H I S P R Y I C L L T Y F L S L Q ~ C S D S E F F E L G Q ~ ~ ~ D S L T P Q Q L K K ~ ~ ~ ~ ~ ~ T ~ ~ ~ ~ ~ - - - -5 1
~ ~ N l M S T Y I C S ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
~ ~ - ~ - ~ - - R N A T L S L - S G T S M I S P R Y l G l L S Y F L S L Q P R P D S E F F N ~ ~ ~ ~ ~ ~ D A P S P Q E L K E ~ " ~ ~ ~ ~ = ~ " L G
- - O I I S A S Y Q S - - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ D S G T L V Y ~ S G T S M A C P H V A G L A ~ Y Y L ~ l ~ - - - - - - - - - - - - - - E V L T P A Q V E A L I T E S N T G V L P T ~
- N I L S T W I C S ~ ~ - - ~
~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ N T S T N T I - S G T S M A T P H V A G L S A Y Y L ~ L ~ ~ . .
- - -
~ ~ - - - ~ - I \ A S I I S E V X D A I I K M G I H D V L L ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ S I P V G S S
- - EI ESLSHLN~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - ~ Y N D T L I L - S G T S M S T P l V T G V ~ ~ L L ~ K C . . . - - ~ - - - - ~ - ~ - - - - I E P E M I A Q E I E Y L S T R ~ F H R R T L - F F I ( P5 1
- - - m
- - Y Y P T S L V S P L G K A A D F ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ P D ~ Y T L S - F G T S L A T P E V S A A L A ~ l ~ ~ ~ ~ - - - - - - ~ - - - - - - D ~ ~ ~ D S N ~ V ~
- RNSHL KY KEV RI I
~ E I T T M l V A N 7 R L V G K I S D ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ P I G Y T L N - M G N S I A T S Y R S G C N - - -- - -
- I R Y P S I N E I I S L l S r Y O D K E R N L ~ ~ ~ ~ ~
- I E I T K R VI E DE I V
~ - E I I T T I G T D A I W I D F Q F I E N V P R G F I l n - I G T S L I T G L F ~ I ~ - - - - - - - - - - - - - - - - - -- - - S L QR F KS A NF Y
K Q S V L S T S S ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - ~ - ~ ~ - - - - N G R Y I Y Q - S G T S L R ~ P I Y S G I \ L R L E I D I ( Y Q - - ~ ~ - - ~ ~ ~ - - -L ~ O Q P E T A I E L F r K r c l E K E r Y H D R X B Y G N C r L D V Y K L L K E
KDWLFTTAN ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ T G W Y Q Y V ~ Y C N S F A T P K Y S G A L ~ L ~ ~ D K ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
DA GV AT T DL Y - - - - - - -
- - - ~ ~ ~ - ~ ~ ~ ~ - ~ ~ ~ ~ ~ M I C T A S H ~ S ~ T S A A A P E A A C Y P R L R L E A ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ N L T W R D M Q ~ L N L T S K ~ N ~ ~ ~ D ~ - - - ~ ~ - - - - N ~
~ ~ E " L A l D K ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ . ~ " " " " " " " . Q S E I T I Q . S G T S F I \ T P ~ " ~ ~ " ~ ~ L y l E D C E " ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ S I D L D F L R S I ( S E D L G " " " " ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
E T C V A T mL y - ~ ~ ~ ~~............... R C T R S H - S G T S A A R P E A A G \ I F R L A L A L ~ A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ S L T W R D L Q H L N L T ~ ~ ~ N ~ ~ ~ D ~ ~ C ~ F I I ~ l N C S H F E U ~ N G V G L E Y I D M ( L F G
E A C V A T T D L y - ~~ ~
................
N C T L ~ R - S G T S ~ A P E A A C Y F R L A L A L Q A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ N L T W R D ~ Q " L T " L T ~ ~ ~ N ~ ~ ~ ~ ~ V H E - - - - - - - ~ - ~ ~ - W ~ N G V G L E F I D M (
E A G V A T T D L Y ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ G N C T L R H - S G T S A A A P ~ ~ ~ ~ ~ F ~ L ~ L E A N L - - - - - - - - - - ~ - - - - G L T W R D M Q H L T ~ L T ~ K ~ N Q L H D E V H Q - ~ - ~ ~ ~
D P R I T S A D L H ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - N E C T Q T H - T G T S A S A P L A A ~ I F A L A L E Q N P - ~ ~ ~ ~ - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ L T W R D L Q H I V V W T S E F D P L A ~ G - - - - - - - - - - - - -
E C R V T S A D L H ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - G K C R I S R - ~ ~ T ~ ~ A A ~ ~ ~ A ~ L ~ A L L L E S N P ~ - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ N I T W R D A Q ~ ~ ~ A H T S R M E
D Q R I T S A D L H ~ ~ ~ ~ ~ ~ ~ - ~ - ~ ~ - - - - - - - - - - - - - - N D C T E T H - T G T S A S A P L A A G I F A L A L E A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ N L T W R D M ~ H L ~ ~ ~ T S E Y D P L A M I P G ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - - W
D Q K I S S ~ L H ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - - - - - - - - - - - - - - - H E C T D S H - T C ~ S A A A P L A A G ~ L A L A L E A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ N L T W R D V Q ~ L I V W T S E Y D P L S S ~ G ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - -
D K K I I ~ D L R ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - - - - - - - - - - - - - - Q R C ~ I D M ( ~ T C T S A S A P M A A G I I A L A L E A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ F L T W R D V Q H V I V R T S R A G H L N A ~ ~ - ~ ~ ~ ~ - ~ - - ~
E R K I V ~ D L R ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - - - - - - - - - - - - - - Q R C m C H - T G T B V S R P M V A C I I A L A L E A N S ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ Q L T W R D V Q H L L ~ K T S R P A H L K A S D - - - - - - - - - - - - - - ~ ~ ~ N ~ A ~
E K Q V I ~ L H ~ ~ - - - - - - - - - - - - - ~ - - ~ - - - - ~ ~ - H S C T S S H T C T s R S R P L A A G I A A L V L E A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - ~ ~ ~ N L T W R D L Q ~ I V V R T A K ~ G N L K D P T ~ ~ ~ ~ ~ ~ ~ ~~ ~ ~ ~ W S ~ N G Y G R R V S H S F G Y G L M D A A H V I L A O
E K Q V V T ~ L H - - - - - - - - - - - - H S C N S H T C T B R S R P L R R G I A A L V L Q S N Q ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - ~ - ~ N L T W R D L Q ~ I V V R T A K P A N L K D P S ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ S R N G V ~ R R V S H S
E R E I I T S D L H - ~ - - - - - - - - - - - - - - ~ - ~ - ~ - ~ ~ ~ ~ H S C T T Q H ~ T G T S A S A P L A A G I C A L A L E A N K ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ Q L T W R D M Q H I V V R T A R L A N L Q S S D ~ ~ ~ ~ ~ ~ ~ ~ - - - - - - ~ ~ T N ~
E K ~ I L T T D L H ~ ~ ~ ~ ~ ~ ~ ~ - - - - - - - - - - - - - - - - - - H A C ~ H - T G T S A S A P L A A G I V A L A L E A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ N L T W R D L Q H ~ V I R T A K P I N L ~ G D - - - - - - - - ~ - - - - - W T T N G V G R
E K Q I V T ~ L H ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - - - - - - - - - - - - - ~ ~ ~ ~ ~ ~ - ~ ~ T ~ A S A P I V ~ ~ L L A L A L E A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ S L T W R D L Q ~ I I ~ E T A K ~ D ~ L ~ ~ D ~ - - - -
E R Q I A T T D L R ~ ~ ~ ~ - ~ ~ ~ ~ - - - - - - - - - - - - - - - - - Q R C T T I 1 I - T G T S l S A P L A A ~ I ~ A L ~ L E A ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ D L T W R D V Q Y I T L M T S R S D P I ~ D G Q - - - - - - - - - - - ~ - - W I V N G V C R
O P Q I V ~ L H - - ~ - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ H Q C T D K H T G T S A S R P L A A G M I A L ~ L E A N P - - - - - - - - - - - - - - - L L T W R O L Q H L V V R A S R P A Q L Q A E D ~ ~
E K C I A S T D L H ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - ~ ~ ~ - - - - - - - - - - - - E K C T ~ - T G T S I \ S I ~ ~ ~ ~ ~ ~ ~ E - - - - - - - - - - ~ ~ - ~ W V T N C V G R Q V S L R Y G Y
E I ( Q I V T T D L R - - - - - - - - - - - - - - - ~ ~ ~ ~ - ~ ~ ~ ~ ~ ~ Q K C ~ S H ~ T C T S A S A P L A A G I l A L A L E A N K ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - ~ ~ - N L T ~ R D M Q H L V V Q T S N P A G L N A N D ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~
E K Q I V T T D L R - - - - - - - - - - - - - - - " - Q I [ C T E S H T C T E S H ~ T G T S A S A P L A A G I I A L T L E A N K - ~ ~ ~ - ~ ~ ~ ~ ~ - - - - - N L T W R D M Q H L V V Q T S K ~ A H L N A ~ ~ ~ ~ - ~ ~ ~ ~ ~ - ~ ~ - - W A T N G V G R K V S H S Y G Y G L L D
Q P A I V N D V P - - - - - - - - - - - - - - - - - - - - ~ ~ ~ - ~ ~ G G C ~ K H ~ T G T S A S A P L A A C I I A L A L E A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ E L T W R D M Q H L V L R T A N ~ K P L E ~ G - - - - - - - - - - - - - W S R N G V G R M V S N
D K S V A N D H D G S L R P D - - - - - - - - " - H I C ? 1 I E H T C T E R S A P L A A G I C A L A L E A N P ~ ~ ~ ~ ~ ~ ~ ~ - - ~ ~ - ~ ~ E L T W R D M Q Y L V V Y T S R P A P L E ~ E N C ~ ~ ~ - ~ ~ ~ - - - - - - ~ T L N ~ V K R K Y S N K F G Y G L M D A G A
E N ~ H Y ~ L Y - - - - - - - - - - - - - - - - - - - - - - - - ~ - H ~ T E E F ~ K G T S A S A P L A A G I ~ A L T L E A N P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ L L T ~ R D V Q A L I V H T A Q I T S P V D E ~ - - - - - - - - - - - - - - W ~ R N C R C F H F ~
L R S I V T T D W D L Q K G - - - - ~ ~ - - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ T ~ C T E C H ~ T ~ T S A A A P L A A ~ M I A L M L Q V ~ P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - C L T W R D V Q H I I V F T A T R Y E D R R A E - ~ ~ ~ ~ ~ ~ - - - -
- - Y I Y G T D I N A I D D K S R R - - - ~ ~ ~ - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ P R C Q N Q H ~ G G T S A R A P L A A G V F A L A L S V R P - - ~ ~ - ~ ~ ~ - - - ~ - - - D L T W R D M Q Y L A L Y S A V E I N S N D D G ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - - ~ Q D T A S G Q
- - Y I I T T D L D - - - ~ - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - E K C S K I 1 ( - G C T s l U \ R P L A A G I Y T L V L E R N P - - - - - - - - - - - - - - - N L T W R D V Q Y L S I L S S E E I N P H D G K ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - W Q D T ~ G
- - Y ~ H S S D I N ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - ~ ~ - - ~ ~ ~ - - - G R C S N S H - G G T S ~ A P L A A G ~ Y T L ~ L E A N ~ - - - - - - - - - - - - - - - N L ~ W R O Y Q Y L S I L S A V G L E I W A D G D ~ ~ ~
~ ~ S I L ~ P E ~ ~ - - - - - - - - - - - - - - - - - - - - - - - - G T C T R S H - G G T S A A A P L A S A ~ Y A L A L S I R P - - - - - - - - - - - - - ~ ~ D L S W R D I Q H ~ ~ Y S A S P F D S P S Q N A E - - ~ - ~ - - ~ ~ ~ ~ ~ U Q K T P A G F Q F S H H F G F G K L D
~ ~ P N E ? V U Y D ~ - - - - - - - - - - - - - - - - - - - - - - - - - G K C G F I P - S S S S A R P P I L G ~ L L A L I R A H P - - - - - - - - - - - - - - ~ T L T L ~ I Q R I L ~ R A A ~ ~ V ~ T ~ ~ G R G W ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ W L M l V ~ R ~ ~ R N F
T P G I W T R D R T G V - V G Y N S G N L G D Q A - - - - - - - - - G N Y R I ~ ~ - ~ ~ T S ~ A C P ~ - ~ " ~ ~ L I L S ~ N ~ - - - - - - - - ~ - - - - - - ~ ~ ~ ~ D ~ " ~ D I I K R ~ C D R I D P V G G - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - ~ N A E
81
A P A ~ V T ~ L P G C D U G Y M l V D D P S T N R L H M I P Q L D l S C D Y N G ~ ~ ~ - - - - - - - - - - - - - - ~ ~ D L S Y R D L R D L L I ~ N I T R L D A N ~ P V Q I N Y I 9 ) V T G L E C W E R N A A G L W Y S P S Y G F G L V D V N K T Q P C S I I l
- - L I L G T L P - - - - - - - - - - - - - - - - - - - - - - - - - - - G G K ~ G Y M - A G T S M A S P H V A G V A A L l K S ~ P - - - - - - - - - - - - - - - H A S P A M V K A L L Y A ~ A D A T A C T K P Y D l D G D G K V D A V - - - - - - ~ E ~ P K ~ ~ C F Y G ~ G M A D A L D A V T W
- - D I Y S T Y P - - - - - - - - - - - - - - - - - - - - - - - - - G C G Q C T Y P - - - - - - - - - - - - ~ ~ ~ D ~ T P A Q I ~ T R I E ~ T A E R S V N G ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - ~ - - - - - - - - - H D D F V G
0)
~ ~ D I Y S N C R L E S ~ G C A V M ( E A Y N K G E L S L ~ ~ - ~ ~ N P G Y G N K - S G T S M A A P H V T G V A A V L M Q R F P ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ Y ~ S A D Q I S A V I K ~ A T D L G V A - - - - - - - - " G I
- ~ R V Y S S I I E G T S V E N L - - - - - - - - - - - - - - - - - - - T T G Y A K Y - S G T S M A A P H V A G S V A V L M E R F P - - ~ ~ - - - - ~ - - - - - - Y L N G A Q V A E V L K T T A ~ M G A P ~ ~ ~ ~ - - - - - - - - - - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ G I D A L Y G W G M
- - I ( I Y S N R N G S D P ~ - - - - - ~ - ~ - ~ ~ - ~ - - - - - - ~ S D Y G N K - N G T S M A T P H V T G A V A ~ L L Q R F P - - - - - - - - - - - - - - - ~ ~ S S A Q I A D V L K T T A ~ M G A P ~ ~ ~ ~ ~ - - - - - - - - - - ~ - - - ~ ~ ~ ~ ~ ~ C I D A L Y G W G I I I
- - L I G V A D E H K K P - - - - - - - - - - - - - - - - - - - - - - - Q Y G L T K E - ~ T S F S A P A I T A S L A V L K E ~ ~ D - - - - - - - - - - - - - - - ~ ~ T A T Q I R D T L L T T A ~ L G E K - - - - - - - - - - - - ~ - ~ ~ ~ ~ ~ ~ ~ ~
- - O I Y S A R Y F T P L S A L S A Q I L E Y I S P R H - - - - - - - L P Y Y T T F - S G T S M A A P H V A G I l A L M L E ~ ~ - - - - - ~ - - - - - - - - - ~ ~ ~ ~ L E ~ K E I L E G T A l P M E G Y - - - - - - - - - - - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ A l W E T
~ - D I Y S ~ R V L A P L S A L M E I A ~ L l ~ P Q H - ~ - ~ ~ ~ ~ L P Y Y T ~ ~ S G T S M A T P ~ A G l V A L M L E A D P - - - - - - - - - - - - - - - T ~ ~ P D Q V K E I ~ Q H T A ~ P G Y ~ ~ ~ ~ ~ - - - - - - - - - - - - - ~ ~ ~ ~ ~ ~ ~ ~ E A
~ ~ N I R S S V P ~ ~ ~ ~ ~ ~ ~ ~ - ~ ~ ~ ~ ~ ~ - - - - - - - - - - ~ ~ G Q T Y E D G ~ D G T S M A G P ~ V S A V A A L L K Q ~ A - - - - - - - - - ~ - - - - - S L S V D E M E D I L T S T A E P L T D S T ~ ~ ~ - - - - - - - - - - ~ ~ ~ - ~ ~ - ~ ~ P D S
~ ~ N I V S T I P ~ P D H ~ ~ ~ ~ - - - ~ - - - - - - - - - - - - P Y C Y C S I ( Q C T S M A S P H ~ A G A V A V I K Q A K P ~ - ~ ~ ~ - ~ ~ ~ - - - - - - K W S V E Q 1 K A A ~ M ~ A V T L K D S D ~ ~ ~ ~ ~ ~ - - - - - - - - ~ ~ ~ ~ ~ ~ ~
~ ~ D I L S S V A - ~ ~ - - ~ ~ - - - ~ - - - - - - - - - - - - - - - - M I K Y A K L ~ S G T S M S A P L V A G I M G L L Q K Q Y E ~ ~ ~ ~ ~ ~ ~ T Q Y P D M T P S E R L D L A K K ~ L M S S A T A L Y D E D ~ ~ ~ ~ ~ ~ ~ ~ - - - - - - - - - - ~ ~ E ~ Y F
~ ~ N I W S T Q N ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - ~ ~ ~ M I C Y ~ ~ S C T S M A S P ~ l A ~ ~ Q ~ ~ ~ K Q A L M I K M I ~ ~ Y A ~ ~ ~ Q ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ? V
- - ~ l Y S L I \ ~ - - - - - - - - - ~ - ~ ~ - ~ ~ ~ ~ ~ ~ ~ ~ - ~ ~ ~ - D N K Y Q Q M ~ S G T S M A S P F ~ A ~ S E A L ~ L Q G ~ ~ - - - - - - - ~ Q ~ ~ N ~ ~ ~ ~ ~ ~ ~ Q F ~ ~ ~ A ~ ~
- - S I W R A W S S N S T E - - - - - - - - - - - - - - - - - - - - - - G E N F A L ~ - S ~ T S M A T P ~ A G ~ A ~ ~ ~ K Q ~ H P - - - - - - - - - - - - - - - N W S P ~ l A S A l M ~ A Q ~ D ~ ~ ~ L L - ~ ~ ~ ~ ~ ~ ~ ~ ~ A Q Q A T ~ P S T A T
~ ~ L Y L A S W I P N E A T A Q I C R i Y Y L - - - - - - - - - - - - - ~ ~ H ~ ~ ~ - ~ ~ T S M A ~ P H A ~ G V ~ A L L ~ A H P ~ ~ ~ - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ E W S P A R l ~ ~ ~ ~ ~ ~ ~ N ~ ~ ~ N T L N P l ~ ~ ~ ~ ~
~ ~ N I L A A W P T S V D D N ~ - - - - - - - - - - - - - - - - - - K S T F N I I - S G T S M S C P H L S G V R A L L ~ S ~ P ~ ~ ~ ~ ~ ~ ~ ~ ~ - - ~ ~ ~ - D ~ ~ P ~ ~ K S ~ M ~ R D T L N L A N S P I - - - - - - - - - - - - - - L D E R L L P A D ~ Y A I
- -
EI LAAWPSVAPVGGI R
- - - - - - - - - - - - - - - - -
TLFNII-SGTSMSCPHITGIAT~KTY~P ...~ T W S P A A l K S A L M ~ A S P M N ~ - - - ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ - - - - - - - - - F N P Q A E F A Y C S G H V N P L K A V R ? G l
- ~ N I I A A W N P P N Q S D E D T W S E H T - - - - - - - - - - - - P S T F M L L ~ ~ ~ - ~ - - -- - - - T N ~ S D ~ P G T P F D F G A G W N P I C R L P P C o
~ ~ N ~ L A A U T G A R G P T G L A S D S R - - - - - - - - - - - - - - R V E F N I I ~ S G T S M S C P H V S G L A A L L K S V H P - - - - - - - - - - - - - - - ~ ~ ~ P ~ ~ R S A L M ~ A Y K T Y K D G K ~ ~ - - - - - ~ ~ ~ ~ ~ ~ - - L D ~ A T G K P S T P F D H G A G H V S P T T A
- - GVRGSGV~ - ~ - - - ~ ~ ~ ~ ~ ~ ~ ~ - ~ - ~ ~ ~ . ~ ~ ~ - ~ -GGCRAL. ~ G T ~ ~ A ~ ~ ~ A ~ A ~ T L L ~ ~ ~ Q . . . . . . . . . . . . ~ E L ~ N ~ A ~ ~ K Q A L I A ~ ~ ~ ~ ~ ~ ~ . . . ~ ~ ~ ~ ~ ~ - - - - - -
- - Y ~ ? . ~ ~ ~ ~ N ~ E N S T o Q C G D G S L P N - - - - - - - - - - - R N ~ ~ ~ ~ ~ - ~ ~ T S ~ A T P L A T A A T T l L R Q Y L V D G Y F P T G E S V E E N K L ~ P ~ ~ ~ ~ ~ ~ ~ L ~ l M I A Q L L N G T Y F W S A S S - - ~ - ~ ~ T N P S N A ~ F E Q l N C A N
- - Y I T S ~ S N G ~ ~ Q C GD G S L P N - - - - - - - - - - -~ A L L A l - S G T S M A T S F AA A A ~ l L R Q Y L V D G Y Y P TG S l V E S ~ L Q P T G S L L K AL M l M I A Q L L N G T F Q L lT S S S l ~ ~ ~ - T Y P S N Q V FE N F A G A S L V ~ W G A I ~ S NW L H V l l O ~ 2 l
~~A I A S V P QF T ~ ~ ~ ~ . ~ ~ ~ ~ ~ ~ ~ ~ . ~ . ~ . . . ~ . . .S K S Q L M . N G T ~ M - ~ ~ A = A " * ~ ~ I S = L K...
~ ~ ~ ~ ~ ~ ~ ~ N I E ~ ~ ~ ~ S I K R ~ ~ ~ V T A T K L G ~ V
. ~ . . . . . . . . ~ . ~ ~ ~ ~ ~ ~ - - - - - - -
~ ~ A ~ H G L L ~ ~ = ~ ~ ~ H L 1 2
~~AI ASVPNWT ~ ~ ~ ~ ~ ~ ~ ~ . ~ ~~ ~ ~ Q ~ H . N ~ ~ ~ ~ ~ S ~ N A C ~ ~ ~ A L ~ L S ~ L K
. . . . ~ ~ ~ ~ ~ ~ -D ~- ~ V R R A L E ~ A V ~A D N ] ~ . . . . . . . . . . . ~ ~ ~ ~ ~ ~ ~ - - - - - -V F A Q G ~ G l ~ Q V D K A Y D Y L 1 7 3 ~ 1
.. FAGYPQYC......................... R Q ~ M - ~ . N ~ T S ~ S ~ ~ N - G - A C M L ~ G L K. . . . . . . . . .Q ~TPY? VRMALE~AYMLP I . . . ~ ~ ~ ~ ~ ~ ~ - - - - - - - - - - -~ ~ ~ E S F S Q G ~ U l K l A T A Y E K L i ~ l 3 l
. F E U A S ~ T I D C R G Y ~ ~ ~ . . . . . . . . . . . . . . ~ . ~ ~ ~ A Q p D V F - ~ ~ T ~ ~ A T P y T S ~ T ~ A L ~ ~ Q A Y K E - - - - - ~ ~ ~ - ~ V y ~ T p D p ~ T A ~ ~ ~ L K S S A K D I W Y ~ ~ - - - - - - - - - ~ - - ~ ~ ~ ~ ~ ~ - - - - - - ~ P A F S Q C S G R M A L K A R
.. YSSLPMW
.........................
1 G ~ ~ F M . s G T s M ~ T p ~ V S G ~ A L L I s G p K.......... ~ ~ ~ y ~ p D ~ ~ ~ ~ v L E s ~ A T ~ L E G D P ~ ~ ~ ~ ~ ~ ~ - - - - - - - - ~ ~ ~ ~ ~ T G Q K Y T
~~HI~.SSLPLWYTV-S ~ ~ ~ ~ ~ ~ ~ . ~ ~ ~ ~ ~ ~ ~ . . -~ ~ ~ . ~ ~ ~ ~ ~ A ~ ~ ~ ~ ~ ~ ~ ~ A L ~ I ~ ~ A K ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ Q ~ ~ ~ ~ ~ ~ ~ ~ ~ A L ~ L ~ ~ K
I
Fig. 2.
Continued
(see
facing page for
caption).
-
8/10/2019 Siezen, 1997, Subtilase
7/23
Subtilases
sequence difference are not shown in Figure 2 but are listed in
Table 2. Amino acid numbering used throughout this review corresponds
to that of mature subtilisin BPN' (acronym basbpn), our reference
sequence. Residues in inserts relative to this reference sequence are
numbered in square brackets; for instance, residues inserted between
positions 12 and 13 are numbered 12[+1], 12[+2], etc., or 13[-2],
13[-1] if more appropriate.
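The numbering convention above can be illustrated with a short sketch. It assumes the reference sequence is available as an aligned string in which "-" marks columns where other sequences carry an insert; the function name label_columns and the attach parameter are hypothetical and do not come from the original work.

```python
def label_columns(aligned_ref, gap="-", attach="previous"):
    """Label every alignment column using the reference numbering.

    Columns holding a reference residue get its sequential number.
    Columns where the reference has a gap (an insert in other sequences)
    get bracketed labels, counted either forward from the preceding
    residue ("previous") or backward from the following one ("next").
    """
    labels = []
    pos = 0  # number of the last reference residue seen so far
    i = 0
    n = len(aligned_ref)
    while i < n:
        if aligned_ref[i] != gap:
            pos += 1
            labels.append(str(pos))
            i += 1
            continue
        # Find the full run of insert columns starting at i.
        j = i
        while j < n and aligned_ref[j] == gap:
            j += 1
        run = j - i
        if attach == "previous":
            labels.extend(f"{pos}[+{k}]" for k in range(1, run + 1))
        else:
            labels.extend(f"{pos + 1}[-{k}]" for k in range(run, 0, -1))
        i = j
    return labels


print(label_columns("AB--C"))                 # ['1', '2', '2[+1]', '2[+2]', '3']
print(label_columns("AB--C", attach="next"))  # ['1', '2', '3[-2]', '3[-1]', '3']
```

Inserts before the first or after the last reference residue would be labeled relative to position 0 or N+1 in this sketch; a real implementation would need a rule for those terminal extensions.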
The conserved catalytic residues Asp 32, His 64, and Ser 221 are
highlighted in Figure 2, as is the oxyanion-hole residue Asn 155.
Conserved core elements (black bars) and secondary structure are
indicated (Siezen et al., 1991; Heringa et al., 1995). This structural
framework can be used for homology modeling of subtilases of known
primary structure but unknown three-dimensional structure.
In some of the most highly diverged sequences there are regions with
very weak sequence homology, even in the core, which makes the
alignment ambiguous. In those cases, alternative alignments to those
in Figure 2 may need to be considered. These regions are found on the
surface of the molecule and contain numerous solvent-exposed residues,
allowing for greater side-chain variation. Examples are (a) the
exposed regions 43-58 and 182-218, which contain structurally
conserved β-strands and turns; and (b) the exposed amphipathic helices
104-116, 133-144, and 243-252. In the latter case, the sequence
alignment of amphipathic helices is also based on the requirement that
at certain positions non-polar side chains are conserved that point
into the interior of the molecule, while polar residues face outward.
When necessary, correct three-dimensional positioning of Cys residues
to form putative disulfide bonds was used as an aid in proper sequence
alignment.
Sequence homology and family division
In Figure 3, the pairwise sequence identity within the catalytic
domains is plotted graphically for all members of the subtilase
superfamily aligned in Figure 2. It is clear that clustering occurs
into groups or families, in which members show higher sequence
identity to each other.
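A pairwise identity matrix of this kind can be computed directly from an alignment. The sketch below is illustrative only: it assumes identity is counted over columns where both sequences have a residue (the review does not state which denominator was used), and the names percent_identity and identity_matrix are hypothetical.

```python
GAP_CHARS = set("-.~")  # assumed gap symbols


def percent_identity(a, b):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(a, b):
        if x in GAP_CHARS or y in GAP_CHARS:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Return {(name1, name2): percent identity} for every ordered pair."""
    names = list(aligned)
    return {
        (p, q): percent_identity(aligned[p], aligned[q])
        for p in names
        for q in names
    }


toy = {"seq1": "ACD-EF", "seq2": "ACE-EF", "seq3": "A-DGEF"}
matrix = identity_matrix(toy)
print(matrix[("seq1", "seq2")])  # 80.0
print(matrix[("seq1", "seq3")])  # 100.0
```

Other definitions divide by the alignment length or by the shorter sequence length, which gives lower values for sequences with many inserts; the choice should be stated whenever identity thresholds are compared.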
Figure 4 shows the parts of a family tree or cladogram, a measure of
the sequence homology between superfamily members, constructed from
the sequence alignment of the catalytic domains in Figure 2. In our
earlier paper, a less extensive tree identified two main classes and
some subclasses (Siezen et al., 1991). This expanded sequence
information now allows a new subdivision into six families, which are
summarized below. The dendrograms in Figure 4B illustrate the sequence
homology within these families and further subdivision into subgroups
(or subfamilies). Many of these subgroups are also apparent from the
color patterns of sequence identity in Figure 3.
Subtilisin family
Only found in micro-organisms as yet. Includes mainly enzymes from
Bacillus, with subgroups of true subtilisins (>64% identity),
high-alkaline proteases (>55% identity), and intracellular proteases
(>37% identity). Numerous minor variants of true subtilisins and
high-alkaline proteases have been identified (Table 2). Long
C-terminal extensions are rare. Several 3D structures are known
(see Tables 1 and 2).
Thermitase family
Enzymes found only in micro-organisms, including some thermophiles
(>55% identity) and halophiles. The characteristic N-terminal sequence
was also found in several other Bacillus proteases (Table 3). Only one
3D structure is known (thermitase).
Proteinase K family
Large family of secreted endopeptidases found only in fungi, yeasts,
and gram-negative bacteria as yet; the bacterial subgroup has >55%
sequence identity. This family is characterized by a high degree of
sequence similarity (>37% identity), only minor insertions and
deletions, and the absence of the Ca2+-binding loop residues 76-81.
Only a few of these enzymes have a significant C-terminal extension
beyond the catalytic domain. One 3D structure is known (proteinase K).
Lantibiotic peptidase family
A small number of highly specialized enzymes for cleavage of leader
peptides from precursors of lantibiotics, a unique group of
post-translationally modified, antimicrobial peptides (Sahl et al.,
1995). These endopeptidases have only been found in gram-positive
bacteria, and several are intracellular. Only llnisp has a C-terminal
extension, which acts as a membrane anchor. Characterized by low
sequence similarity with each other and other subtilases (Fig. 3), and
by numerous insertions/deletions. The most recently reported protein
bspara from Bacillus subtilis is described as a putative protease
required for plasmid stability; we speculate that it may also play a
role in lantibiotic processing.
A few 3D structures have been predicted by homology modeling
(Siezen et al., 1995a; Booth et al., 1996).
Kexin family
A large group of proprotein convertases (PCs) has been identified, all
involved in activation of peptide hormones, growth factors, viral
proteins, etc. (Barr, 1991; van de Ven et al., 1993). High specificity
is seen for cleavage after dibasic (Lys-Arg or Arg-Arg) or multiple
basic residues. Nearly all are eukaryotic and have high sequence
homology (>40% identity), while two more distant members from
Aeromonas and Anabaena provide links to other subti-
Fig. 2. (facing page) Alignment of amino acid sequences of catalytic domains of subtilases. Multiple sequence alignment was initially
performed using the PILEUP program (Devereux et al., 1984). Next, improvements were made manually by taking into account the
structure-based alignment (Siezen et al., 1991; Heringa et al., 1995). Inserts were judged to occur most likely in turns in external loops.
Families A to F are indicated on the left. Enzyme acronyms are given in Table 1. (*) New entries, and (c) corrected entries since Siezen
et al. (1991). Residue numbering at the top corresponds to that of mature subtilisin BPN' (basbpn). Catalytic residues Asp 32, His 64,
and Ser 221 are in bold (highlighted red), as is the oxyanion-hole residue Asn 155. Green = highly conserved residues from Table 4;
yellow = Cys residues. Structurally conserved regions of the coreABC and extended coreAB are shown as solid bars; common
secondary structure elements are shown as: h = helix, e = extended β-sheet, b = bend and t = β-turn (see also Fig. 1). The number
of additional residues in large inserts in the catalytic domain, and in N- and C-terminal extensions, are shown in brackets. Each
sequence begins at the mature N-terminus; an N-terminus based on the predicted pro-peptide cleavage site is indicated as ().
Residues 146-156 of bspara are from a different reading frame than proposed by the authors.
[Table (pp. 509-513 of the original): rotated table, not recoverable from the extracted text.]
Fig. 3. Pairwise sequence identity matrix. Sequences are plotted vertically and horizontally in the same order as in Figure 2; the
incomplete sequence of hvccvp is not included. Subdivision into families A to F is indicated. A color code bar for percentage sequence
identity is shown.
lase families. A subgroup of yeast enzymes is evident, as are
subgroups of PC1 ( 2 3 % identity), PC2 (>73% identity), and furin
(>55% identity). In catfish herpes virus 1 a related but incomplete
amino acid sequence has been found that is presumed to have been
captured from a host (Rawlings and Barrett, 1994).
Several 3D structures have been predicted by modeling (see below).
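The dibasic cleavage preference described for the kexin family (cleavage after Lys-Arg or Arg-Arg) can be expressed as a simple motif scan. This is a minimal sketch for illustration, not a predictor of real processing sites: it ignores the multibasic motifs, flanking residues, and structural accessibility that also affect specificity, and the name dibasic_cleavage_sites is hypothetical.

```python
import re

# Zero-width lookahead so that overlapping motifs (e.g. "RRR") are all found.
DIBASIC = re.compile(r"(?=[KR]R)")


def dibasic_cleavage_sites(sequence):
    """Return candidate cut positions after Lys-Arg or Arg-Arg.

    Each position is the number of residues on the N-terminal side of the
    candidate scissile bond, i.e. the bond follows that residue (1-based).
    """
    seq = sequence.upper()
    return [m.start() + 2 for m in DIBASIC.finditer(seq)]


# Toy precursor: KR at residues 3-4, RR at 7-8, KR at 9-10.
print(dibasic_cleavage_sites("AAKRGGRRKRS"))  # [4, 8, 10]
```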
Pyrolysin family
Heterogeneous group of enzymes of varied origin and low sequence
conservation (most 37% identity) are distinguished; the former are of
higher eukaryotic origin, but only the human and mouse enzymes have
actually been identified biochemically as tripeptidyl peptidases.
Several 3D structures have been predicted by modeling (see below).
Several other subtilases have been identified for which only the N-terminal or
other partial sequence of the purified enzyme is available; based on sequence
alignment with Figure 2, these subtilases presumably belong to families A, B,
and C (Table 3).
Conserved residues
Highly conserved residues are listed in Table 4 and highlighted in Figure 2.
Only the essential catalytic triad residues D32, H64, and S221 and a single
glycine residue (G219) are totally conserved in all sequences. Four other
glycine residues (34, 65, 83, and 154) are varied only once or twice; G34 and
G154 have main-chain torsion angles that do not allow for amino acid residues
with side chains. At several other positions the variation is limited to two
or three residues, which are usually structurally similar. In general, the
residues of the two internal helices hC and hF are the most highly conserved
in all subtilases. Three amino acid sequences (lslasp, sepepp, and asaspa) are
particularly poorly conserved; although it seems questionable whether these
enzymes are functional, a mutation analysis of the pepP gene suggests that it
indeed encodes a functional protease (Meyer et al., 1995).
Many more residues are totally conserved within each of the six families A to
F, and these can be used to identify new family members. In particular,
families and C are most conserved, with a total of 32 and 41 invariant
residues, respectively, while family E has 63 invariant residues if the two
more divergent sequences (asaspa and avprca) are excluded.
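
Counting conserved positions of this kind can be illustrated with a short
sketch. It assumes that the variability of a position is the number of
distinct residue types found in the corresponding alignment column, and that
gap characters are not counted as residues. The toy alignment and the function
names column_variability and invariant_positions are hypothetical; they are
not taken from the alignment in Figure 2.

```python
from collections import Counter


def column_variability(alignment, gap="-"):
    """Return the number of distinct residue types in each alignment column.

    Gap characters are not counted as residues (an assumption of this sketch).
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*alignment):
        residues = Counter(c for c in column if c != gap)
        result.append(len(residues))
    return result


def invariant_positions(alignment, gap="-"):
    """Return 0-based indices of gap-free columns holding a single residue type."""
    return [
        i
        for i, column in enumerate(zip(*alignment))
        if gap not in column and len(set(column)) == 1
    ]


# Hypothetical toy alignment (not real subtilase sequences).
toy = [
    "DTGHSG",
    "DSGHSG",
    "DTGH-G",
    "DAGHSG",
]

variability = column_variability(toy)
invariant = invariant_positions(toy)
print(variability)
print(invariant)
```

Restricting the input to the sequences of one family would give the
family-specific invariant positions, which is the sense in which such residues
can help to assign a new sequence to a family.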
Residue N155 (in a conserved segment 152-155), which helps to stabilize the
oxyanion generated in the tetrahedral transition state (Carter and Wells,
1990), is not fully conserved. The only accepted substitution here is N155D,
as is found in the PC2 subgroup of the kexin family. The effect of this
substitution on the catalytic efficiency of these proteases has been
investigated by protein engineering (Benjannet et al., 1995; Zhou et al.,
1995).

Fig. 4. Family tree or dendrogram analysis of the subtilase superfamily, based
on sequence alignment of the catalytic domains only (Fig. 2). A: General
layout of the relationship between families A to F. B: Detailed dendrograms of
the individual families, in which branch lengths are in inverse proportion to
the degree of sequence similarity. Not included are members with >90% sequence
identity to one of the listed enzymes (see Table 2). Trees were constructed
using the neighbor-joining method of Saitou and Nei (1987), as implemented in
the programs NEIGHBOR (Felsenstein, 1993) and GROWTREE (Devereux et al.,
1984). The distance matrices that were used as input for the programs were
calculated using DISTANCES (Devereux et al., 1984), PROTDIST (Felsenstein,
1993), and HOMOLOGIES (Leunissen, unpubl. obs.). Positions containing gaps
were ignored, as were the large insertions indicated between brackets in
Figure 2. Whenever appropriate, the distances were corrected for multiple
substitutions (Jukes and Cantor, 1969; Kimura, 1983). All methods used
delivered in principle identical topologies, except for the branch lengths;
these may vary, depending upon the method used to calculate the distances
between the proteins, and correcting for multiple substitutions.
[Panel labels in Fig. 4: Subtilisin (true, high-alkaline, intracellular);
Proteinase K; Lantibiotic peptidase; Kexin; Pyrolysin; gram-negative bacteria;
gram-positive bacteria; plant; tripeptidase; thermophile.]
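
The distance calculation outlined in the legend of Figure 4 (positions with
gaps ignored, distances corrected for multiple substitutions) can be sketched
as follows. This is a minimal illustration and not the procedure implemented
in DISTANCES, PROTDIST, or HOMOLOGIES. The function names and the toy
sequences are hypothetical, the Jukes-Cantor formula is written in its
20-state form for amino acids, and the Kimura correction is given in its
commonly quoted empirical form d = -ln(1 - p - 0.2 p^2).

```python
import math


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # positions containing gaps are ignored
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no gap-free positions to compare")
    return differing / compared


def jukes_cantor_protein(p, states=20):
    """Jukes-Cantor correction generalized to `states` equally likely residue types."""
    b = (states - 1) / states
    if p >= b:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -b * math.log(1.0 - p / b)


def kimura_protein(p):
    """Kimura's empirical correction for protein distances."""
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0.0:
        raise ValueError("p-distance too large for the Kimura correction")
    return -math.log(arg)


# Hypothetical aligned fragments; the final column contains a gap and is skipped.
seq_1 = "ACDEFGHIKLM"
seq_2 = "ACDQFGHVKL-"

p = p_distance(seq_1, seq_2)
d_jc = jukes_cantor_protein(p)
d_kimura = kimura_protein(p)
print(p, d_jc, d_kimura)
```

Both corrections leave identical sequences at distance zero and stretch larger
observed distances, which is consistent with the remark in the legend that
branch lengths, but not topologies, depend on the correction used.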
Homology modeling
The procedure for homology modeling and protein engineering of
the catalytic domain of subtilases of unknown 3D structure based
on known crystal structures was described in our previous review
(Siezen et al., 1991), and can be applied to any of the enzymes
listed in Tables 1 and 2.
Modeling should be based on the known crystal structure of the most related
enzyme, and this will be straightforward for members of the families A-C,
because 3D structures are known in each family. For the families D-F, with no
known 3D structures, modeling will be less straightforward and can be based on
any known structure from families A-C or a combination of these. Problems will
arise where large insertions occur, because these are still impossible to
model reliably. It would be extremely helpful for modeling purposes to
determine the crystal structure of at least one member of each of the D-F
families, preferably those with large inserts.
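
Choosing the most related enzyme of known structure amounts to ranking the
candidate templates by sequence identity to the target. The sketch below
assumes that the sequences are already aligned and of equal length. The
sequences, the template names, and the function names percent_identity and
pick_template are hypothetical and serve only to illustrate the selection
step; real template choice would also weigh insertions, deletions, and
structure quality.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percentage of identical residues over columns without a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no gap-free positions to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


def pick_template(target, templates):
    """Return (name, identity) for the template most similar to the target."""
    if not templates:
        raise ValueError("at least one template is required")
    scored = {name: percent_identity(target, seq) for name, seq in templates.items()}
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical aligned fragments standing in for a target and three templates.
target = "DTGIDSSHPDL"
templates = {
    "template_1": "DTGIDSNHPDL",
    "template_2": "DTGVDYNHPEL",
    "template_3": "DSGI-ASHEDL",
}

best_name, best_identity = pick_template(target, templates)
print(best_name, round(best_identity, 1))
```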
This homology method has since been refined and applied for modeling and
engineering of (a) the cell-envelope proteinase llprtp of Lactococcus lactis
(Siezen et al., 1993; Bruinenberg et al., 1994a, 1994b); (b) the lantibiotic
leader peptidases llnisp of Lactococcus lactis (Van der Meer et al., 1994;
Siezen et al., 1995a), and efcyla of Enterococcus faecalis (Booth et al.,
1996); (c) the kexin family members furin (hsfur: Creemers et al., 1993;
Siezen et al., 1994) and PC2/PC3 (Lipkind et al., 1995); and (d) the
heat-stable proteases pfpyro and tsplst of the hyperthermophiles Pyrococcus
furiosus and Thermococcus stetteri (W. Voorhorst, A. Warner, W. de Vos,
R. Siezen, in prep.). These studies have provided predictions and evidence for
inserted and disposable loops, disulfide bridges, Ca2+-ion binding sites,
surface salt bridges and networks, aromatic surface clusters, and residues
involved in enzyme-substrate interactions. Some examples are discussed below.
Table 3. Incomplete amino acid sequences of subtilases

                                                                               Residues determined
Organism                              Enzyme                         Acronym   N-term.   Other     Family  References

BACTERIA
Gram-positive
Bacillus subtilis A50                 Intracell. serine protease     bsia50    1-54                        Strongin et al., 1978
Bacillus sp. GX6644                   Subtilisin GX                  bssugx    1-16                        Durham, 1993
Bacillus sp. Y                        Protease BYA                   bspbya    1-21                        Shimogaki et al., 1991
Bacillus thuringiensis israelensis    Extracellular serine protease  btisra    1-14      223-243           Chestukhina et al., 1986
Bacillus thuringiensis finitimus      Extracellular serine protease  btfini    1-15                        Chestukhina et al., 1986
Bacillus thuringiensis kurstaki       Extracellular serine protease  btkurs    6-20                        Kunitate et al., 1989
Bacillus cereus                       Extracellular serine protease  bcespr    1-15      223-243           Chestukhina et al., 1986
Bacillus intermedius 3-19             Alkaline serine protease       biprot    1-15                        Balaban et al., 1994
Nocardiopsis dassonvillei (prasina)   Alkaline serine protease       ndapII    1-26                        Tsujibo et al., 1990
Gram-negative
Streptomyces rutgersensis             Proteinase D                   srespd    1-23                C       Lavrenova et al., 1984
Thermus Tok3A1                        Caldolysin                     tscald    1-15                C       Freeman et al., 1993
Vibrio metschnikovii                  Alkaline protease VapK         vmapk     1-36                A       Kwon et al., 1994
Cochliobolus carbonum                 Extracellular protease         ccalp2    1-29                C       Murphy & Walton, 1996

EUKARYA
Fungi
Agaricus bisporus                     Extracellular serine protease  abexpr    1-19                C       Burton et al., 1993
Malbranchea sulfurea                  Thermomycolin                  msthmy    1-28      217-222   C       Gaucher & Stevenson, 1976
Ophiostoma piceae                     Extracellular protease         opexpr    1-18      170-193   C       Abraham & Breuil, 1995
Verticillium chlamydosporium          Extracellular protease VCP1    vcexp1    1-20                C       Segers et al., 1995
Scedosporium apiospermum              Extracellular protease         saalpr    1-13                C       Larcher et al., 1996
Table 4. Highest conserved residues in subtilases (v = variability)

Residue  Context/function
32       Catalytic triad residue
34       Bend; φ, ψ = 99, 179
64       Catalytic triad residue
65       Buried helix, close packing
68       Buried helix, close packing, directly under catalytic triad
69       Buried helix, close packing
70       Buried helix, close packing
83       Helix/turn, close packing
90       Buried β-strand, hydrophobic packing to helix C
125      Bend, directly adjacent to catalytic triad
152      Lines S1 pocket
154      Lines S1 pocket; φ, ψ = 114, 163
155      Oxyanion stabilization
189      Turn at surface, side chain turned into pocket
193      Begin turn
201      Bend at end β-strand, hydrophobic ring stacks with H226
219      Bend between e9 and hF; φ, ψ = 147, 160
220      OD1 H-bonded to backbone NH-154
221      Catalytic triad residue
223      Buried helix, close packing
225      Buried helix, close packing
229      Buried helix, close packing

Entries of the v = 1, v = 2, and v = 3 columns: G, S, G, N

Exception column:
N (lslasp), A (smserp), P (smssp1, smssp2)
del (asaspa)
M (lslasp), I (sepepp, ddtagc)
G (nahlys), T (bsbpf), I (sepepp)
T (smstab), A (smssp1, paaf70)
A (lslasp), T (efcyla)
W (bsbpf), M (seepip)
P (lslasp), C (hakx2), T (acfur1, bcpc2)
M (sepepp), del (bssepr)
D (lspc2, bcpc2, cepc2, hspc2)
del (sepepp), S (smserp), L (bspara)
Y (=pe w), D (dmpga9), T (vmvapt)
I (seepip, smstab)
N (sepepp, llnisp)
G (lslasp), S (sepepp, ddtagc)
T (bssepr)
Large insertions and deletions
The 190 residues that constitute the scrs, as defined from the known crystal
structures (Siezen et al., 1991) and shown in Figure 2, are present in nearly
all the subtilases. Some unusual deletions are found, however, as listed in
Table 5, and this implies that not all of these core residues are essential
for proper folding. Most of these deletions occur in subtilase family D, the
lantibiotic leader peptidases, and include large N- and C-terminal deletions.
All but one of the internal deletions can be readily accommodated by
connecting residues that are spatially adjacent in the 3D structures of
subtilisin/thermitase. Particularly interesting in this respect is the natural
deletion of the Ca1-ion binding loop, residues 74-82, in the Enterococcus
subtilase (efcyla), thereby presumably extending helix C by another four
residues (Booth et al., 1996); this is precisely the loop deletion that was
engineered into subtilisin to
Table 5. Large or unusual deletions and insertions

Unusual deletion
Missing residues  Context                          Family  Enzyme
1-13              N-terminus, hA                           sepepp
65-66             Part hC, adjacent catalytic His          asaspa
74-82             Ca-binding loop + hC extended            efcyla
96-102            Turn, substrate-binding region           smserp
180-189           Turns                                    sepepp
257-275           C-terminus, hH                           lslasp
Large insertion
Inserted residues
Position Number Properties Family Enzyme
vr5
vr6
v r l
vr8
V I 9
v r l l
vr13
vr15
vr16
vr18
vr19
N-term. Up to 98
59
34
vr
1
1 8
vr4 30-33
28-30
26-3 1
23
147-213
30
42
16
51
34
18
22
16-18
134-169
13-15
21
20-22
149
2 1
22
20
19
38
34
25
22-24
21
No homology
Highly charged
Highly charged
Weak homology
High homology
Medium homology, conserved S-S bond
?
High homology
Weak homology, see alignment in Fig.
5
Highly charged (50%)
Highly charged
High homology
Weak homology
Weak homology in central section (Fig.
5 )
High homology
Weak homology
S - S
bond ?
High homology
S-S bond ?
High homology
S-S bond?
E
F
C
C
F
B
F
F
F
F
F
C
F
F
F
F
D
F
F
F
D
A
E
F
F
B
E
E
B
F
F
Most family members
spscpa
scyct5
scyct5
spscpa, Ilprtp, ldprtb
dnbpr, dnavp2, dnavp5, xcproa,
alaprl
llspO9, atserp, cmc ucu, agserp,
lep69, paafl0
smssp l, smssp2
pfpyro, tsplst, dmpga9, hstpp2,
CetPP
PfPYro