Doi:10.1016/j.sbi.2004.03.01
Structure, function and evolution of multidomain proteins
Christine Vogel, Matthew Bashton, Nicola D Kerrison, Cyrus Chothia and Sarah A Teichmann

Proteins are composed of evolutionary units called domains; the majority of proteins consist of at least two domains. These domains and the nature of their interactions determine the function of the protein. The roles that combinations of domains play in the formation of the protein repertoire have been found by analysis of domain assignments to genome sequences. Additional findings on the geometry of domains have been gained from examination of three-dimensional protein structures. Future work will require a domain-centric functional classification scheme and efforts to determine structures of domain combinations.

MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK

Current Opinion in Structural Biology 2004, 14:208–216

This review comes from a themed issue on Theory and simulation
Edited by Joel Janin and Thomas Simonson

0959-440X/$ – see front matter
© 2004 Elsevier Ltd. All rights reserved.

PDB: Protein Data Bank
RMSD: root mean square deviation
SCOP: Structural Classification of Proteins
WHD: Winged helix domain

There are various uses of the word domain with respect to proteins. Here, we define a protein domain as an independent, evolutionary unit that can form a single-domain protein or be part of one or more different multidomain proteins. The domain can either have an independent function or contribute to the function of a multidomain protein in cooperation with other domains. The definition of a domain as an evolutionary unit is used in the Structural Classification of Proteins (SCOP) database.

In SCOP, domains that have a common ancestor based on sequence, structural and functional evidence are grouped into superfamilies. There are more than 1200 domain superfamilies in the current version of the database, though estimates of the total number of superfamilies vary from a few to several thousand. Domains from the superfamilies in SCOP can be assigned to 40–60% of the residues in the proteins of completely sequenced genomes using homology-based methods. These include the profile hidden Markov models in the SUPERFAMILY database, the structural profiles of the 3D-PSSM server, the PSI-BLAST profiles in the Gene3D database or combined approaches. From the assignment of structural domains to genome sequences, it is clear that some two-thirds of proteins consist of two or more domains in prokaryotes and an even larger fraction in eukaryotes.

As most proteins consist of multiple domains, and domains determine the function and evolutionary relationships of proteins, it is important to understand the principles of domain combinations and interactions. In this review, we discuss how domain superfamilies form the repertoire of multidomain proteins via duplication and recombination. We then describe the principles and extent of conservation of the N- to C-terminal order of domains, their three-dimensional geometry and their functional relationships. This will illustrate the importance of domain combinations to an understanding of protein evolution, structure and function, and to target selection in structural genomics.

Proteins are formed by duplication, divergence and recombination of domains

In order to understand how multidomain proteins function, it is useful to know how they are created in evolution and how they are related to each other. Duplication is one of the main sources for creation of new whole genes, and this is also true at the level of domains: at least 58% of the domains in Mycoplasma and 98% of the domains in humans are duplicates. The domains of different superfamilies are duplicated to different extents and this results in a distribution of superfamily sizes in genomes that follows a power law. This means that there are a few highly abundant superfamilies, for example, the P-loop NTP hydrolases, NAD(P)-binding Rossmann domains and certain kinase families. The expansion of superfamilies in a particular phylogenetic group can provide one explanation for the characteristics of the organisms in that group (e.g. the immunoglobulin proteins in metazoa). Once a domain or protein has duplicated, it can evolve a new or modified function either by sequence divergence or by combining with other domains to form a multidomain protein with a new series of domains. The N- to C-terminal series of domains in a protein is its ‘domain architecture’.

We will consider the recombination of domains in order to form different domain architectures in more detail.
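
The power-law distribution of superfamily sizes described above can be illustrated with a short sketch. It assumes a toy list of domain architectures (the superfamily labels and counts are hypothetical, not data from this review), counts how often each superfamily occurs, and estimates a power-law exponent by least squares on log-transformed size frequencies. This is one simple estimator among several and is not claimed to be the method used in the studies discussed here.

```python
import math
from collections import Counter


def superfamily_sizes(architectures):
    """Count occurrences of each superfamily across N- to C-terminal architectures."""
    counts = Counter()
    for arch in architectures:
        counts.update(arch)
    return counts


def size_distribution(counts):
    """Map each superfamily size to the number of superfamilies of that size."""
    return Counter(counts.values())


def fit_power_law_exponent(distribution):
    """Least-squares slope of log(frequency) against log(size).

    For frequency ~ C * size**(-b), the returned slope estimates -b.
    """
    points = [(math.log(s), math.log(f)) for s, f in distribution.items()]
    if len(points) < 2:
        raise ValueError("need at least two distinct sizes")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Hypothetical toy proteome: each tuple is one protein's domain architecture.
toy_architectures = [
    ("P-loop",), ("P-loop", "Rossmann"), ("P-loop", "WHD"),
    ("P-loop",), ("Rossmann",), ("SH3", "SH2"),
]
sizes = superfamily_sizes(toy_architectures)
distribution = size_distribution(sizes)
```

With real assignments, `distribution` would typically be dominated by small sizes with a long tail of a few large superfamilies; a straight line in log-log space is only a rough diagnostic of power-law behaviour.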
Multidomain proteins Vogel et al.
The major molecular mechanism that leads to multidomain proteins and novel combinations is non-homologous recombination, sometimes referred to as ‘domain shuffling’. In eukaryotes, there is evidence that exon boundaries tend to coincide with domain boundaries, which suggests that proteins may be formed by intronic recombination. Another important recombinatorial mechanism is the fusion of genes, which is more common than the splitting or fission of a gene.

The properties of the repertoire of domain combinations

Our knowledge of the domain architectures of proteins stems from the assignment of structural domains to the whole or part of 40–60% of the predicted proteins from completely sequenced genomes. From examination of these proteins, it has become clear that the formation of new domain combinations is an important mechanism in protein evolution. The proteins from more than 100 different organisms contain several thousand different combinations of two superfamilies, but this is far fewer (less than 0.5%) than would be possible given the total number of superfamilies or the number of multidomain proteins per proteome. This number is likely to decrease even more if membrane proteins are included. The limited repertoire of domain combinations that are observed in proteins indicates that all combinations have been under strong selection.

A few domain superfamilies are highly versatile and have neighbouring domains from many superfamilies, whereas most superfamilies are not very versatile. The distribution of the number of partner superfamilies per superfamily follows a power law, like the distribution of superfamily sizes mentioned above. Despite these general principles, each domain superfamily has its own story. Some superfamilies are highly versatile, some are highly abundant and some superfamilies are both. It is the structure and function of the domains and domain combinations that determine why they have been selected.

Figure. The role of domains in protein evolution. Overview of different aspects of multidomain proteins: the repertoire of domain superfamilies and their role in the formation of multidomain proteins by duplication and recombination, and the geometry and functional relationships of domains within these combinations. Domains belonging to the same superfamily are represented as rectangles of the same colour. Supradomains are two- or three-domain combinations that occur in different domain architectures with different N- and C-terminal neighbours, as shown in the second panel. These short series of domains form functional units that are reused in different protein contexts. Panel text: The repertoire of domain superfamilies... duplicates and recombines to form single and multi-domain... The same combination can adopt different geometries… and/or different functions. Legend: Catalytic site or ligand.

The properties observed for single domains are similar to those for combinations of two or more domains. A few two-domain combinations, for example, are highly versatile and occur with many different additional domains, but most two-domain combinations occur in only one or two different protein contexts. Important examples of the reuse of particular domains and domain combinations come from signal transduction. For instance, the combination of SH3 and SH2 domains recurs in several different signal transduction proteins and this versatility of recombination qualifies the SH3–SH2 domain pair as what we have called a ‘supradomain’. Supradomains are two- or three-domain combinations that occur in different domain architectures with
different N- and C-terminal neighbours, a concept illustrated in the figure on the role of domains in protein evolution.

The sequential order of domains is conserved

If the same domain combination is observed in two different proteins, one possibility is that they have evolved by duplication rather than assembled independently by different recombinatorial routes. Most instances of the same two-domain combination or domain architecture have evolved from the same ancestor; there are several lines of evidence that support this. First, three-dimensional structural analyses of individual protein families, such as the Rossmann domains, have shown that proteins with the same domain architecture are related by descent (i.e. evolved from one common ancestor). Unpublished data by Kerrison, Chothia and Teichmann have shown that this is true for most two-domain protein families of known structure in the current databases. Second, with only a small fraction of exceptions (less than 10%), two domains occur in only one N- to C-terminal order in structural assignments to genome sequences. This conservation of domain order is likely to be historical instead of functional, as a very similar interface and functional sites could be formed by two domains in either order, for instance, given a long linker. The conserved order of domains can thus be exploited to improve domain assignments to protein sequences. Last, proteins sharing the same series of domains tend to have the same function, which is rarely the case if domain order is switched (J Gough, personal communication).

The geometry of domain combinations

Above, we discussed how the sequential order of domains within multidomain proteins of the same domain architecture is largely conserved and suggests homology. We will now describe the combinations and interactions of domains on a different level, that is, with respect to their three-dimensional arrangement or geometry. The geometry of Rossmann domains and their partner domains on one protein chain is conserved whenever the partner domains are from the same superfamily. Studies of small numbers of families of protein complexes, for which the interdomain geometry occurs across different polypeptide chains, have also revealed extensive conservation of geometry. These results suggested that, for proteins of unknown structure, their quaternary structure and complex geometry can usually be modelled based on homologous polypeptide(s) of known structure. Impressive examples of such structural models of complexes include the yeast ribosome and exosome.

However, in order to thoroughly understand the evolution of interdomain interfaces and assess the true extent of conservation, a survey of all proteins of known structure is necessary. Aloy, Russell and co-workers found that there is a tendency for the geometry of interaction of protein domains to be more conserved the more similar the domain sequences are. They assessed similarity of domain geometry by comparing the average shift of a group of points in each of two domains. In order to understand the changes in geometry of domain combinations in more detail, Kerrison et al. studied the rotation, shift, interface size and residue contacts of related two-domain combinations (N Kerrison et al., unpublished). They analysed 143 pairs of homologous proteins with two domains, extracted from SCOP version 1.63. They found that, when one pair of homologous domains are superposed, the positions of the two second domains mostly differ by shifts and rotations of less than 5 Å.

Although automatic approaches are useful to gain first insights into general relationships, manual inspection and alignment of the residues at the interface of domains can reveal the precise nature of the changes. This is illustrated by the examples in the figure on the geometry of domains in different transcriptional regulators. As shown in panel (a) of that figure, homodimeric transcriptional repressors of the Iron-dependent repressor protein superfamily consist of a Winged helix domain (WHD) and a dimerisation domain. The two proteins shown have conserved interdomain geometry. In contrast, panel (b) shows two different proteins for which the domain geometry has changed. Both proteins consist of a Homeodomain-like DNA-binding domain and a Tetracyclin repressor-like C-terminal domain. The rotation and shift of the latter domain become clear when the interface residues of the N-terminal domains are aligned, as in panel (c).

Functional relationships of domains in multidomain proteins

In order to delineate the functional relationships of domains in single- and multi-domain proteins, one needs a systematic understanding of the domain functions in different contexts, that is to say, the range of functions of a particular domain depending on its different partner domains. Existing functional classification schemes, such as GenProtEC, GO, MIPS, that used in COGs and the EC (Enzyme Commission) classification, operate at the level of the whole protein and are thus inadequate to describe the contribution of the individual domains to protein function.

In order to understand the molecular roles of individual domains, it is vital to know their three-dimensional structure. Todd et al. and Bartlett et al. have studied domain function and evolution at a detailed structural level, but were primarily concerned with individual superfamilies as opposed to domains in the context of their combinations. The role of ‘ancillary’ or ‘accessory’ domains is alluded to briefly in their work.

Bashton and Chothia (M Bashton, C Chothia, unpublished) have developed a domain-centric scheme that
emphasises domain function in the context of domain neighbours in multidomain proteins, providing functional annotations for a subset of SCOP domains. The annotation is based on detailed examination of the protein structures, which is essential for understanding the precise molecular function of the domain and its contribution to the function of the whole protein. In this domain-centric functional classification scheme, domains are classified into seven categories that encompass catalytic activity, cofactor binding, responsibility for subcellular localisation, protein–protein interaction and so forth.

Two generic principles emerge. First, a domain can perform the same function, but in different protein contexts (i.e. with different partner domains). This is illustrated by panels (a) and (b) of the figure on variation in function of the Winged helix domain. The WHDs in these examples combine with different sensory, regulatory and enzymatic domains, but maintain their function in that they target the protein to a specific sequence. In contrast, the WHD in panel (c) of that figure acts as a substrate specificity pocket and has no DNA-binding activity at all. This domain has diverged and acquired a novel or modified function. In an analogy to linguistics, one can describe the two different fates of a particular domain superfamily as syntactical and semantic shifts.

Figure. Geometry of domains in different transcriptional regulators. (a) Superposition of two chains shows that the geometry of the domain pair is conserved in the two proteins. The two structures are homodimeric transcriptional repressors consisting of a WHD and a dimerisation domain of the Iron-dependent repressor protein superfamily. (b) Two transcriptional repressors composed of a Homeodomain-like domain and a Tetracyclin repressor-like C-terminal domain are shown in such a way that the N-terminal domains (in black with orange and yellow interface peptides) are in the same orientation. The C-terminal domains are clearly rotated relative to each other. In this representation, the difference in geometry is apparent as a downward tilt of the C-terminal domain of TetR (dark blue helices) relative to the QacR C-terminal domain (light blue helices). (c) Residues forming contacts at the interdomain interfaces of the structures in (b). The orange and yellow interface peptides of the N-terminal domains are superposed, and the difference in the positions of the blue C-terminal interface peptides is clear. Again it appears as a downward tilt of the dark blue TetR helices compared with the light blue QacR interface peptides.
Structural information. (a) Iron-dependent regulatory protein IdeR from Mycobacterium tuberculosis (PDB code 1b1b, chain A) structurally aligned with diphtheria toxin repressor DtxR from Corynebacterium diphtheriae (PDB code 1g3t, chain B). The N-terminal WHDs are shown in orange and yellow, and the C-terminal dimerisation domains are in different shades of blue. They have about 80% sequence identity to each other and the interface between the two domains is about 1400 Å² in both structures. (b) Tet repressor D from Escherichia coli (PDB code 2tct) and the Staphylococcus aureus multidrug-binding protein QacR (PDB code 1jt6, chain A). The N-terminal domains are DNA-binding Homeodomain-like domains and are shown in black, with the peptides forming the interdomain interface in orange and yellow. The C-terminal domains are Tetracyclin repressor-like and are shown in grey, with interface peptides in different shades of blue. The chains shown here are both in the ligand-bound state. The two proteins have about 10% sequence identity to each other both across the entire sequence and in the residues that form contacts at the interdomain interfaces. The interface between the domains is around 2100 Å² in both structures. (c) The residues from the N-terminal domains of the structures in (b) that form interface contacts, shown in orange and yellow, were superimposed and are shown in the same orientation. The difference in position of the C-terminal interface residues, shown in shades of blue, is apparent and corresponds to a shift of 10 Å and a rotation of 40°.

Figure. Variation in function of the Winged helix domain in different proteins. These three proteins each contain WHDs and illustrate how a superfamily can undergo syntactical and semantic shifts in protein function in different domain contexts. Many transcription factors are made by combining a WHD with a sensory or regulatory domain, as in FadR shown in (a), and in the proteins shown in panel (a) of the preceding figure. The WHD can also be found in enzymatic proteins, such as restriction endonucleases, where it combines with a catalytic domain that nicks DNA, as in the FokI protein shown in (b). In (a,b), the WHD performs the same role (i.e. it targets the protein to a specific sequence), but the range of functions is achieved by combining the WHD with different partner domains, so it is exhibiting a syntactical shift. (c) A semantic shift is found in human methionine aminopeptidase 2 in which the WHD acts as a substrate specificity pocket with no DNA-binding activity at all.
Structural information. (a) FadR (PDB code 1hw2): for the a chain, the WHD is orange and the oligomerisation/CoA-binding domain (fatty acid responsive transcription factor FadR, C-terminal domain) is dark blue; for the b chain, the colours are yellow and light blue, respectively. (b) Restriction endonuclease FokI (PDB code 1fok): the three WHDs are shown in yellow, orange and red (N- to C-terminal order), and the catalytic domain (Restriction endonuclease-like) is in blue. (c) Human methionine aminopeptidase 2 (PDB code 1boa): the Creatinase/aminopeptidase domain is shown in light blue and the WHD is in orange. The WHDs of FadR and human methionine aminopeptidase 2 superpose with RMSD = 0.995 Å (not shown).
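
The shift, rotation and RMSD values quoted for these structures can be related to atomic coordinates with a small sketch. It assumes that the first domains have already been superposed and that the rotation relating the two second domains is available as a 3×3 matrix; the function names and toy coordinates are hypothetical, and this is not claimed to be the procedure used in the studies discussed here. The rotation angle follows from the matrix trace, the shift is taken here as the distance between domain centroids, and the RMSD is computed over paired atoms.

```python
import math


def centroid(coords):
    """Arithmetic mean position of a list of (x, y, z) points."""
    n = len(coords)
    return tuple(sum(p[i] for p in coords) / n for i in range(3))


def centroid_shift(coords_a, coords_b):
    """Distance between the centroids of two domains (same units as the input)."""
    return math.dist(centroid(coords_a), centroid(coords_b))


def rotation_angle_deg(rot):
    """Angle of a proper 3x3 rotation matrix, from cos(theta) = (trace - 1) / 2."""
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


def rmsd(coords_a, coords_b):
    """Root mean square deviation over paired, already superposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def rotation_about_z(angle_deg):
    """Helper that builds a rotation matrix about the z axis for the toy example."""
    t = math.radians(angle_deg)
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]


# Hypothetical toy coordinates (in Å) for the second domain of two proteins.
domain_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
domain_b = [(x + 10.0, y, z) for x, y, z in domain_a]

shift = centroid_shift(domain_a, domain_b)
angle = rotation_angle_deg(rotation_about_z(40.0))
```

In practice the rotation matrix would come from a least-squares superposition of the second domains, which is outside the scope of this sketch.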
Domain combinations of known and unknown structure

The discussion of domain function above illustrates how knowledge of three-dimensional structures of proteins is key to a detailed understanding of how they work. For this reason, enormous efforts are currently underway in structural genomics projects to obtain complete structural coverage of all domain superfamilies and folds as far as possible. However, beyond the structure determination of single domains, the structure determination of a wide range of domain combinations is crucial for a deeper understanding of protein relationships, functions and interactions, for the reasons we have discussed in the previous sections. Furthermore, for many domain architectures, multidomain proteins of known structure can be used confidently as a template for the domain geometry of homologous proteins of unknown structure (N Kerrison et al., unpublished).

Domain combinations can be prioritised for target selection according to different criteria: their abundance, their distribution across the three kingdoms of life or their versatility with respect to other combination partners. Our concept of versatile domain combinations, or supradomains, introduces another useful filter. As mentioned above, supradomains are two- or three-domain combinations that occur in different domain architectures with different N- and C-terminal neighbours. The 200 most duplicated two-domain supradomains, for example, occur in more than 75 000 sequences (28% of the sequences with domain assignments) from 113 archaeal, bacterial and eukaryotic completely sequenced genomes. Knowledge of their structure can hence provide insights into the function of almost one-third of the sequences in this data set. The table of the most duplicated two-domain combinations lists the ten most abundant combinations of these 200 supradomains that do not have homologues of known structure. Almost all of these combinations occur in biochemically characterised proteins. However, they also occur in many other uncharacterised proteins. One example is the above-mentioned winged helix DNA-binding domain in combination with the periplasmic binding protein II domain. This domain combination alone occurs in almost 2000 sequences of unknown function, many of which could be regulators like the two examples in the table. Exact knowledge of
the properties of these domain combinations will contribute enormously to annotation in terms of protein function and structure.

Table. The most duplicated two-domain combinations – targets for structure determination.

Domain combination: Winged helix DNA-binding domain and periplasmic binding protein-like II
Known proteins with the domain combination (examples): OXYR_ECOLI: OxyR is a positive regulator of hydrogen-peroxide-inducible genes in E. coli and other bacteria, and is homologous to other regulatory proteins. NODD_RHILE: NodD is responsible for activating transcription of the Nod genes in the bacterium in response to plant inducers.

Domain combination: Homeodomain-like and ribonuclease-H-like
Known proteins with the domain combination (examples): TC3A_CAEEL: Transposase in Caenorhabditis elegans.

Domain combination: domain) and nuclear receptor ligand-binding domain
Category (possible): Signal transduction
Known proteins with the domain combination (examples): PRGR_HUMAN: The human progesterone receptor is involved in the regulation of gene expression, and affects cellular proliferation and differentiation in target tissues.

Domain combination: PYP-like sensor domain (PAS domain) and homodimeric domain of signal-transducing histidine kinase
Category (possible): Signal transduction
Known proteins with the domain combination (examples): NTRB_ECOLI: NTRB acts as a signal transducer involved in nitrogen regulation in E. coli. TORS_ECOLI: The TorS sensor protein in E. coli is part of a two-component regulatory system.

Domain combination: ATPase domain of HSP90 chaperone/DNA topoisomerase II/histidine kinase and
Category (possible): Signal transduction
Known proteins with the domain combination (examples): ARCB_ECOLI: ArcB is a member of the two-component regulatory system arcB/arcA. Sensor-regulator protein for anaerobic repression of the arc modulon.

Domain combination: Actin-like ATPase domain and heat shock protein 70 kDa (HSP70), C-terminal substrate-binding fragment
Known proteins with the domain combination (examples): HSCC_ECOLI: Hsc62 is a DnaK homologue of E. coli. HS7F_CAEEL: The protein is a member of the Hsp70 multi-gene family of mitochondrial chaperones in C. elegans.

Domain combination: Calcium ATPase, transmembrane domain M and HAD-like
Known proteins with the domain combination (examples): PMA1_CANAL: Plasma membrane H⁺-ATPase from Candida albicans. The H⁺-pump produces a proton gradient that is used for active nutrient transport.

Domain combination: Growth factor receptor domain and
Category (possible): Signal transduction
Known proteins with the domain combination (examples): MTN3_HUMAN: Matrilin-3 is an extracellular matrix protein and a major component of cartilage. FBL2_HUMAN: Fibulin-2 binds, depending on calcium, to fibronectin and other ligands. EGF_HUMAN: Human epidermal growth factor receptor.

Domain combination: Winged helix DNA-binding domain and phosphosugar isomerase
Known proteins with the domain combination (examples): GATR_ECOLI: A repressor of the GAT operon for galactitol transport and metabolism in E. coli.

Domain combination: TRAF domain-like and POZ domain
Category (possible): Signal transduction
Known proteins with the domain combination (examples): SPOP_HUMAN: Speckle-type POZ protein is an antigen recognised by serum from a scleroderma patient. The domain combination occurs in no other protein in SwissProt.

This table lists those two-domain combinations without homologues of known structure in decreasing order of occurrence in proteins of completely sequenced genomes. Only combinations of two different domains are shown; repetitions of the same domain are not listed. The last column provides one or more examples of biochemically characterised proteins from SwissProt (version 41.20) that contain the particular domain combination. It should be noted that some of these domain pairs that are common in genome sequences could be flexible instead of having a rigid interdomain geometry. If this is the case, structure determination is more difficult and less meaningful in terms of the domain geometry, though the domains are still functionally linked of course. (a) The domain combination occurs in all three kingdoms of life.

We have provided an overview of the role of domain combinations in the formation of the protein repertoire. From the fairly comprehensive domain assignments that are available for completely sequenced genomes, it has become clear that the majority of proteins, even in simple genomes, are multidomain. Though the domain combinations observed are only a small fraction of all possible combinations of the repertoire of protein families, the emergence of new combinations is linked to speciation and specific phylogenetic groups: whereas more than half of all domain superfamilies are common to archaea, bacteria and eukaryotes, this is the case for only about 5% of two-domain combinations. Domain combinations and expansions of domain superfamilies, as well as other processes such as alternative splicing, play important roles in the emergence of more complex organisms.

Proteins that contain the same domain combination or have the same domain architecture tend to have a common ancestor and common functional features. For
Current Opinion in Structural Biology 2004, 14:208–216
Theory and simulation
instance, supradomains are combinations of domains that
Buchan DW, Rison SC, Bray JE, Lee D, Pearl F, Thornton JM,Orengo CA: Gene3D: structural assignments for the
adopt a function that is useful within a variety of different
biologist and bioinformaticist alike. Nucleic Acids Res 2003,
domain architectures. Several domain assignment servers
now offer tools for searches for particular domain combi-
10. Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Barrell D, Bateman
nations and architectures (e.g. SUPERFAMILY
A, Binns D, Biswas M, Bradley P, Bork P et al.: The InterProDatabase, 2003 brings increased coverage and new features.
SMART , Pfam and the Conserved Domain
Nucleic Acids Res 2003, 31:315-318.
Architecture Retrieval Tool [CDART] connected
InterPro is a metaserver that combines several domain assignment
with the Conserved Domain Database [CDD]
servers, including PRINTS, PROSITE, Pfam, ProDom, SMART, TIGR-FAMs and also SUPERFAMILY, and integrates information on proteinfamilies, domains and functional sites.
In order to understand the molecular details of the func-
11. Teichmann SA, Park J, Chothia C: Structural assignments to
tions of domain combinations, the three-dimensional
the Mycoplasma genitalium proteins show extensive geneduplications and domain rearrangements. Proc Natl Acad
structure of the domain architecture is needed. Structural
Sci USA 1998, 95:14658-14663.
genomics projects could therefore have novel combina-
12. Gerstein M: How representative are the known structures of the
tions of domains, in addition to the structures of indivi-
proteins in a complete genome? A comprehensive structural
dual domains, as their aim. Target selection of domain
census. Fold Des 1998, 3:497-512.
combinations could be based on the number of proteins
13. Lynch M, Conery JS: The evolutionary fate and consequences of
containing the domain combination, as well as the versa-
duplicate genes. Science 2000, 290:1151-1155.
tility of the domain combination in terms of occurrence in
14. Muller A, MacCallum RM, Sternberg MJ: Structural
characterization of the human proteome. Genome Res 2002,
different domain architectures.
This large-scale study of the human and three other eukaryote pro-
Knowledge of this kind could be integrated with genome-
teomes, as well as several bacterial and archaeal proteomes, focuseson domain superfamilies rather than whole proteins. The authors describe
scale data of different types and contribute towards a
the duplication and expansion of specific domain superfamilies in
more comprehensive understanding of the evolution of
the human genome compared with other organisms. They also discusstransmembrane and disease-related proteins, and domain superfamilies
the structure and function of the protein repertoire.
in the human genome.
15. Wolf YI, Karev G, Koonin EV: Scale-free networks in biology: new
insights into the fundamentals of evolution? Bioessays 2002,
We are grateful to Jung-Hoon Han for help with CV has a
pre-doctoral fellowship from the Boehringer Ingelheim Fonds.
16. Qian J, Luscombe NM, Gerstein M: Protein family and fold
occurrence in genomes: power-law behaviour and
References and recommended reading
evolutionary model. J Mol Biol 2001, 313:673-681.
Papers of particular interest, published within the annual period of
review, have been highlighted as:
of special interest
of outstanding interest
Murzin AG, Brenner SE, Hubbard T, Chothia C: SCOP - a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 1995, 247:536-540.
Andreeva A, Howorth D, Brenner SE, Hubbard TJ, Chothia C, Murzin AG: SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res 2004,
Chothia C: Proteins - 1000 families for the molecular biologist. Nature 1992, 357:543-544.
Orengo CA, Jones DT, Thornton JM: Protein superfamilies and domain superfolds. Nature 1994, 372:631-634.
Coulson AF, Moult J: A unifold, mesofold, and superfold model of protein fold use. Proteins 2002, 46:61-71.
Gough J, Karplus K, Hughey R, Chothia C: Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J Mol Biol 2001, 313:903-919.
Madera M, Vogel C, Kummerfeld SK, Chothia C, Gough J: The SUPERFAMILY database in 2004: additions and improvements. Nucleic Acids Res 2004, 32:D235-D239.
The SUPERFAMILY database provides assignments of the over 1200 domain superfamilies, as defined in the SCOP database, to proteins using highly sensitive hidden Markov models. Close to 60% of all proteins have at least one match and one half of all residues are covered by assignments. The database is located at and updated twice a year. SUPERFAMILY is now part of InterPro.
Kelley LA, MacCallum RM, Sternberg MJ: Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 2000, 299:499-520.
17. Wuchty S: Scale-free behavior in protein domain networks. Mol Biol Evol 2001, 18:1694-1702.
18. Hill E, Broadbent ID, Chothia C, Pettitt J: Cadherin superfamily proteins in Caenorhabditis elegans and Drosophila melanogaster. J Mol Biol 2001, 305:1011-1024.
19. Vogel C, Teichmann SA, Chothia C: The immunoglobulin superfamily in Drosophila melanogaster and Caenorhabditis elegans and the evolution of complexity. Development 2003,
20. Chervitz SA, Aravind L, Sherlock G, Ball CA, Koonin EV, Dwight SS, Harris MA, Dolinski K, Mohr S, Smith T et al.: Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 1998, 282:2022-2028.
21. Aravind L, Subramanian G: Origin of multicellular eukaryotes - insights from proteome comparisons. Curr Opin Genet Dev 1999, 9:688-694.
22. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W et al.: Initial sequencing and analysis of the human genome. Nature 2001, 409:860-921.
23. Kaessmann H, Zollner S, Nekrutenko A, Li WH: Signatures of domain shuffling in the human genome. Genome Res 2002,
24. Patthy L: Genome evolution and the evolution of exon-shuffling–a review. Gene 1999, 238:103-114.
25. Enright AJ, Iliopoulos I, Kyrpides NC, Ouzounis CA: Protein interaction maps for complete genomes based on gene fusion events. Nature 1999, 402:86-90.
26. Marcotte EM, Pellegrini M, Thompson MJ, Yeates TO, Eisenberg D: A combined algorithm for genome-wide prediction of protein function. Nature 1999, 402:83-86.
27. Snel B, Bork P, Huynen M: Genome evolution. Gene fusion versus gene fission. Trends Genet 2000, 16:9-11.
28. Yanai I, Wolf YI, Koonin EV: Evolution of gene fusions: horizontal transfer versus independent events. Genome Biol 2002,
29. Apic G, Huber W, Teichmann SA: Multi-domain protein families and domain pairs: comparison with known structures and a random model of domain recombination. J Struct Funct Genomics 2003, 4:67-78.
30. Vogel C, Berzuini C, Bashton M, Gough J, Teichmann SA: Supra-domains - evolutionary units larger than single protein domains. J Mol Biol 2004, in press.
31. Apic G, Gough J, Teichmann SA: Domain combinations in archaeal, eubacterial and eukaryotic proteomes. J Mol Biol 2001, 310:311-325.
32. Liu Y, Gerstein M, Engelman DM: Evolutionary use of domain recombination: a distinction between membrane and soluble proteins. Proc Natl Acad Sci USA 2004, in press.
33. Park J, Lappe M, Teichmann SA: Mapping protein family interactions: intramolecular and intermolecular protein family interaction repertoires in the PDB and yeast. J Mol Biol 2001,
34. Harrison SC: Variation on an Src-like theme. Cell 2003,
This review presents a good example of domains that reappear in different domain contexts: the SH3, SH2 and kinase domains recombine with various other domains in signal transduction multidomain proteins.
35. Pawson T, Nash P: Assembly of cell regulatory systems through protein interaction domains. Science 2003,
This review illustrates the modularity of proteins in a colourful manner. It describes the reuse of protein interaction domains, such as SH2 and SH3 domains and others, in the regulation of different cellular processes. The authors focus on the properties of single domains, but also point out the increase in dimensionality of functions and interactions when domains are combined to form multidomain proteins.
36. Chothia C, Gough J, Vogel C, Teichmann SA: Evolution of the protein repertoire. Science 2003, 300:1701-1703.
37. Bashton M, Chothia C: The geometry of domain combination in proteins. J Mol Biol 2002, 315:927-939.
This study presents a detailed analysis of the structures of proteins containing Rossmann fold domains in combination with other domain superfamilies. It demonstrates that, in all the cases studied, the N- to C-terminal order of the domains is conserved because the proteins have descended from a common ancestor. For pairs of proteins in the PDB in which the order is reversed, the interface and functional relationships of the domains are altered.
38. Coin L, Bateman A, Durbin R: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proc Natl Acad Sci USA 2003,
The technique presented in this paper formalises the use of domain associations for the improvement of domain assignment to protein sequences. In analogy to speech recognition methods that use context information to improve recognition of words, assignment of domains is improved using information on their domain combination context in multidomain proteins.
39. Hegyi H, Gerstein M: Annotation transfer for genomics: measuring functional divergence in multi-domain proteins. Genome Res 2001, 11:1632-1640.
40. Aloy P, Russell RB: Interrogating protein interaction networks through structural biology. Proc Natl Acad Sci USA 2002,
The authors describe a method to assess the likelihood of an interface forming between two proteins when the components are modelled on complexes of known structure.
41. Prabu MM, Suguna K, Vijayan M: Variability in quaternary association of proteins with the same tertiary fold: a case study and rationalization involving legume lectins. Proteins 1999,
42. Spahn CM, Beckmann R, Eswar N, Penczek PA, Sali A, Blobel G, Frank J: Structure of the 80S ribosome from Saccharomyces cerevisiae – tRNA-ribosome and subunit-subunit interactions. Cell 2001, 107:373-386.
43. Beckmann R, Spahn CM, Eswar N, Helmers J, Penczek PA, Sali A, Frank J, Blobel G: Architecture of the protein-conducting channel associated with the translating 80S ribosome. Cell 2001, 107:361-372.
44. Aloy P, Ciccarelli FD, Leutwein C, Gavin AC, Superti-Furga G, Bork P, Bottcher B, Russell RB: A complex prediction: three-dimensional model of the yeast exosome. EMBO Rep 2002, 3:628-635.
45. Aloy P, Ceulemans H, Stark A, Russell RB: The relationship between sequence and interaction divergence in proteins. J Mol Biol 2003, 332:989-998.
Using RMSD as a simple measure to compare interactions between domains, the authors found that homologues with more than 30% sequence identity usually conserve the geometry of their interaction.
46. Serres MH, Goswami S, Riley M: GenProtEC: an updated and improved analysis of functions of Escherichia coli K-12 proteins. Nucleic Acids Res 2004, 32:D300-D302.
47. Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, Eilbeck K, Lewis S, Marshall B, Mungall C et al.: The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res 2004, 32:D258-D261.
48. Mewes HW, Amid C, Arnold R, Frishman D, Guldener U, Mannhaupt G, Munsterkotter M, Pagel P, Strack N, Stumpflen V et al.: MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Res 2004, 32:D41-D44.
49. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN et al.: The COG database: an updated version includes eukaryotes. BMC Bioinformatics 2003, 4:41.
50. Bairoch A: The ENZYME database in 2000. Nucleic Acids Res 2000, 28:304-305.
51. Todd AE, Orengo CA, Thornton JM: Evolution of protein function, from a structural perspective. Curr Opin Chem Biol 1999, 3:548-556.
52. Todd AE, Orengo CA, Thornton JM: Evolution of function in protein superfamilies, from a structural perspective. J Mol Biol 2001, 307:1113-1143.
53. Bartlett GJ, Borkakoti N, Thornton JM: Catalysing new reactions during evolution: economy of residues and mechanism. J Mol Biol 2003, 331:829-860.
A detailed analysis of catalytic sites and residues in homologous enzymes of different function reveals the economy of evolution: the residue types, functions and mechanistic steps are frequently conserved.
54. Brenner SE: Target selection for structural genomics. Nat Struct Biol 2000, 7(suppl):967-969.
55. Kriventseva EV, Koch I, Apweiler R, Vingron M, Bork P, Gelfand MS, Sunyaev S: Increase of functional diversity by alternative splicing. Trends Genet 2003, 19:124-128.
56. Lespinet O, Wolf YI, Koonin EV, Aravind L: The role of lineage-specific gene family expansion in the evolution of eukaryotes. Genome Res 2002, 12:1048-1059.
57. Letunic I, Goodstadt L, Dickens NJ, Doerks T, Schultz J, Mott R, Ciccarelli F, Copley RR, Ponting CP, Bork P: Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res 2002, 30:242-244.
58. Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer EL: The Pfam protein families database. Nucleic Acids Res 2002,
59. Geer LY, Domrachev M, Lipman DJ, Bryant SH: CDART: protein homology by domain architecture. Conserved Domain Architecture Retrieval Tool. Genome Res 2003,
60. Marchler-Bauer A, Panchenko AR, Shoemaker BA, Thiessen PA, Geer LY, Bryant SH: CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res 2002, 30:281-283.
61. Pohl E, Holmes RK, Hol WG: Crystal structure of the iron-dependent regulator (IdeR) from Mycobacterium tuberculosis
shows both metal binding sites fully occupied. J Mol Biol 1999,
62. Pohl E, Goranson-Siekierke J, Choi MK, Roosild T, Holmes RK, Hol WG: Structures of three diphtheria toxin repressor (DtxR) variants with decreased repressor activity. Acta Crystallogr 2001, 57:619-627.
63. Kisker C, Hinrichs W, Tovar K, Hillen W, Saenger W: The complex formed between Tet repressor and tetracycline-Mg2+ reveals mechanism of antibiotic resistance. J Mol Biol 1995,
64. Schumacher MA, Miller MC, Grkovic S, Brown MH, Skurray RA, Brennan RG: Structural mechanisms of QacR induction and multidrug recognition. Science 2001, 294:2158-2163.
65. Xu Y, Heath RJ, Li Z, Rock CO, White SW: The FadR·DNA complex. Transcriptional control of fatty acid metabolism in Escherichia coli. J Biol Chem 2001, 276:17373-17379.
66. Wah DA, Hirsch JA, Dorner LF, Schildkraut I, Aggarwal AK: Structure of the multimodular endonuclease FokI bound to DNA. Nature 1997, 388:97-100.
67. Liu S, Widom J, Kemp CW, Crews CM, Clardy J: Structure of human methionine aminopeptidase-2 complexed with fumagillin. Science 1998, 282:1324-1327.
68. Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O'Donovan C, Phan I et al.: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 2003, 31:365-370.