Doi:10.1016/j.sbi.2004.03.01

Structure, function and evolution of multidomain proteins

Christine Vogel, Matthew Bashton, Nicola D Kerrison, Cyrus Chothia and Sarah A Teichmann

Proteins are composed of evolutionary units called domains; the majority of proteins consist of at least two domains. These domains and the nature of their interactions determine the function of the protein. The roles that combinations of domains play in the formation of the protein repertoire have been found by analysis of domain assignments to genome sequences. Additional findings on the geometry of domains have been gained from examination of three-dimensional protein structures. Future work will require a domain-centric functional classification scheme and efforts to determine structures of domain combinations.

MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK

Current Opinion in Structural Biology 2004, 14:208–216

This review comes from a themed issue on Theory and simulation
Edited by Joel Janin and Thomas Simonson

0959-440X/$ – see front matter
© 2004 Elsevier Ltd. All rights reserved.

Abbreviations: PDB, Protein Data Bank; RMSD, root mean square deviation; SCOP, Structural Classification of Proteins; WHD, Winged helix domain.

Introduction

There are various uses of the word domain with respect to proteins. Here, we define a protein domain as an independent, evolutionary unit that can form a single-domain protein or be part of one or more different multidomain proteins. The domain can either have an independent function or contribute to the function of a multidomain protein in cooperation with other domains. The definition of a domain as an evolutionary unit is used in the Structural Classification of Proteins (SCOP) database.

In SCOP, domains that have a common ancestor based on sequence, structural and functional evidence are grouped into superfamilies. There are more than 1200 domain superfamilies in the current version of the database, though estimates of the total number of superfamilies vary from a few to several thousand. Domains from the superfamilies in SCOP can be assigned to 40–60% of the residues in the proteins of completely sequenced genomes using homology-based methods. These include the profile hidden Markov models in the SUPERFAMILY database, the structural profiles of the PSSM server, the PSI-BLAST profiles in the Gene3D database or combined approaches. From the assignment of structural domains to genome sequences, it is clear that some two-thirds of proteins consist of two or more domains in prokaryotes and an even larger fraction in eukaryotes.

As most proteins consist of multiple domains, and domains determine the function and evolutionary relationships of proteins, it is important to understand the principles of domain combinations and interactions. In this review, we discuss how domain superfamilies form the repertoire of multidomain proteins via duplication and recombination. We then describe the principles and extent of conservation of the N- to C-terminal order of domains, their three-dimensional geometry and their functional relationships. This will illustrate the importance of domain combinations to an understanding of protein evolution, structure and function, and to target selection in structural genomics.

Proteins are formed by duplication, divergence and recombination of domains

In order to understand how multidomain proteins function, it is useful to know how they are created in evolution and how they are related to each other. Duplication is one of the main sources for creation of new whole genes and this is also true at the level of domains: at least 58% of the domains in Mycoplasma and 98% of the domains in humans are duplicates. The domains of different superfamilies are duplicated to different extents and this results in a distribution of superfamily sizes in genomes that follows a power law. This means that there are a few highly abundant superfamilies, for example, the P-loop NTP hydrolases, NAD(P)-binding Rossmann domains and certain kinase families. The expansion of superfamilies in a particular phylogenetic group can deliver one explanation for the characteristics of the organisms in that group (e.g. the immunoglobulin proteins in metazoa).

Once a domain or protein has duplicated, it can evolve a new or modified function either by sequence divergence or by combining with other domains to form a multidomain protein with a new series of domains. The N- to C-terminal series of domains in a protein is its ‘domain architecture’.

We will consider the recombination of domains in order to form different domain architectures in more detail.
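As a rough illustration of the duplication statistics discussed in this section, the sketch below counts superfamily sizes in a toy set of domain assignments, derives the fraction of domains that are duplicates and tallies domain architectures. The input format, the superfamily labels and the convention that each superfamily has one founding member (so that duplicates = domains minus superfamilies) are assumptions made for this example, not details given in the text.

```python
from collections import Counter

# Hypothetical domain assignments: protein id -> superfamily labels
# listed in N- to C-terminal order (illustrative data only).
ASSIGNMENTS = {
    "p1": ["P-loop"],
    "p2": ["P-loop", "Rossmann"],
    "p3": ["Rossmann", "WHD"],
    "p4": ["P-loop", "P-loop"],
    "p5": ["Kinase"],
}


def superfamily_sizes(assignments):
    """Count how many domains belong to each superfamily."""
    return Counter(sf for arch in assignments.values() for sf in arch)


def duplicate_fraction(sizes):
    """Fraction of domains that are duplicates.

    Assumption of this sketch: every superfamily has exactly one
    founding member, so duplicates = total domains - superfamilies.
    """
    total = sum(sizes.values())
    return (total - len(sizes)) / total if total else 0.0


def domain_architectures(assignments):
    """Count each N- to C-terminal series of superfamilies."""
    return Counter(tuple(arch) for arch in assignments.values())


sizes = superfamily_sizes(ASSIGNMENTS)
fraction = duplicate_fraction(sizes)
architectures = domain_architectures(ASSIGNMENTS)
```

On real data, a sorted list of the values in `sizes` would be the starting point for checking whether superfamily sizes follow a power law.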
The major molecular mechanism that leads to multidomain proteins and novel combinations is non-homologous recombination, sometimes referred to as ‘domain shuffling’. In eukaryotes, there is evidence that there is a tendency for exon boundaries to coincide with domain boundaries, which suggests that proteins may be formed by intronic recombination. Another important recombinatorial mechanism is the fusion of genes, which is more common than the splitting or fission of a gene.

The properties of the repertoire of domain combinations

Our knowledge of the domain architectures of proteins stems from the assignment of structural domains to the whole or part of 40–60% of the predicted proteins from completely sequenced genomes. From examination of these proteins, it has become clear that the formation of new domain combinations is an important mechanism in protein evolution. The proteins from more than 100 different organisms contain several thousand different combinations of two superfamilies, but this is far fewer (less than 0.5%) than would be possible given the total number of superfamilies or the number of multidomain proteins per proteome. This number is likely to decrease even more if membrane proteins are included. The limited repertoire of domain combinations that are observed in proteins indicates that all combinations have been under strong selection.

A few domain superfamilies are highly versatile and have neighbouring domains from many superfamilies, whereas most superfamilies are little versatile. The distribution of the number of partner superfamilies per superfamily follows a power law, like the distribution of superfamily sizes mentioned above. Despite these general principles, each domain superfamily has its own story. Some superfamilies are highly versatile, some are highly abundant and some superfamilies are both. It is the structure and function of the domains and domain combinations that determine why they have been selected.
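The counting behind these statements can be sketched as follows: collect the pairs of neighbouring superfamilies that occur in a set of domain architectures, compare their number with the number of pairs that would be possible, and count partner superfamilies per superfamily as a measure of versatility. The toy architectures, the use of unordered adjacent pairs of different superfamilies and the normalisation by n(n-1)/2 are assumptions of this sketch; the alternative normalisation by the number of multidomain proteins per proteome is not shown.

```python
from collections import defaultdict

# Hypothetical architectures (superfamily labels, N- to C-terminal order).
ARCHITECTURES = [
    ["SH3", "SH2", "Kinase"],
    ["SH3", "SH2"],
    ["WHD", "PBP-II"],
    ["Rossmann", "WHD"],
    ["P-loop"],
]


def adjacent_pairs(architectures):
    """Unordered pairs of different superfamilies that are sequence neighbours."""
    pairs = set()
    for arch in architectures:
        for left, right in zip(arch, arch[1:]):
            if left != right:  # repeats of the same superfamily are skipped
                pairs.add(frozenset((left, right)))
    return pairs


def observed_fraction(architectures):
    """Observed pairs divided by all possible pairs of distinct superfamilies."""
    superfamilies = {sf for arch in architectures for sf in arch}
    n = len(superfamilies)
    possible = n * (n - 1) // 2
    return len(adjacent_pairs(architectures)) / possible if possible else 0.0


def versatility(architectures):
    """Number of distinct partner superfamilies for each superfamily."""
    partners = defaultdict(set)
    for pair in adjacent_pairs(architectures):
        a, b = tuple(pair)
        partners[a].add(b)
        partners[b].add(a)
    return {sf: len(found) for sf, found in partners.items()}


pairs = adjacent_pairs(ARCHITECTURES)
fraction_observed = observed_fraction(ARCHITECTURES)
partner_counts = versatility(ARCHITECTURES)
```

In this toy set only 4 of the 21 possible pairs are observed; superfamilies that occur only in single-domain proteins, such as the hypothetical `P-loop` entry, have no partners and do not appear in `partner_counts`.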
The properties observed for single domains are similar to those for combinations of two or more domains. A few two-domain combinations, for example, are highly versatile and occur with many different additional domains, but most two-domain combinations occur in only one or two different protein contexts. Important examples of the reuse of particular domains and domain combinations come from signal transduction. For instance, the combination of SH3 and SH2 domains recurs in several different signal transduction proteins and this versatility of recombination qualifies the SH3–SH2 domain pair as what we have called a ‘supradomain’. Supradomains are two- or three-domain combinations that occur in different domain architectures with different N- and C-terminal neighbours, a concept illustrated in the overview figure.

Figure: The role of domains in protein evolution. Overview of different aspects of multidomain proteins: the repertoire of domain superfamilies and their role in the formation of multidomain proteins by duplication and recombination, and the geometry and functional relationships of domains within these combinations. Domains belonging to the same superfamily are represented as rectangles of the same colour. Supradomains are two- or three-domain combinations that occur in different domain architectures with different N- and C-terminal neighbours, as shown in the second panel. These short series of domains form functional units that are reused in different protein contexts.

The sequential order of domains is conserved

If the same domain combination is observed in two different proteins, one possibility is that they have evolved by duplication rather than assembled independently by different recombinatorial routes. Most instances of the same two-domain combination or domain architecture have evolved from the same ancestor; there are several lines of evidence that support this. First, three-dimensional structural analyses of individual protein families, such as the Rossmann domains, have shown that proteins with the same domain architecture are related by descent (i.e. evolved from one common ancestor). Unpublished data by Kerrison, Chothia and Teichmann has shown that this is true for most two-domain protein families of known structure in the current databases. Second, with only a small fraction of exceptions (less than 10%), two domains occur in only one N- to C-terminal order in structural assignments to genome sequences. This conservation of domain order is likely to be historical instead of functional, as a very similar interface and functional sites could be formed by two domains in either order, for instance, given a long linker. The conserved order of domains can thus be exploited to improve domain assignments to protein sequences. Last, proteins sharing the same series of domains tend to have the same function, which is rarely the case if domain order is switched (J Gough, personal communication).

The geometry of domain combinations

Above, we discussed how the sequential order of domains within multidomain proteins of the same domain architecture is largely conserved and suggests homology. We will now describe the combinations and interactions of domains on a different level, that is, with respect to their three-dimensional arrangement or geometry. The geometry of Rossmann domains and their partner domains on one protein chain is conserved whenever the partner domains are from the same superfamily. Studies of small numbers of families of protein complexes, for which the interdomain geometry occurs across different polypeptide chains, have also revealed extensive conservation of geometry. These results suggested that, for proteins of unknown structure, their quaternary structure and complex geometry can usually be modelled based on homologous polypeptide(s) of known structure. Impressive examples of such structural models of complexes include the yeast ribosome and exosome.

However, in order to thoroughly understand the evolution of interdomain interfaces and assess the true extent of conservation, a survey of all proteins of known structure is necessary. Aloy, Russell and co-workers found that there is a tendency for the geometry of interaction of protein domains to be more conserved the more similar the domain sequences are. They assessed similarity of domain geometry by comparing the average shift of a group of points in each of two domains. In order to understand the changes in geometry of domain combinations in more detail, Kerrison et al. studied the rotation, shift, interface size and residue contacts of related two-domain combinations (N Kerrison et al., unpublished). They analysed 143 pairs of homologous proteins with two domains, extracted from SCOP version 1.63. They found that, when one pair of homologous domains are superposed, the positions of the two second domains mostly differ by shifts and rotations of less than 5 Å.

Although automatic approaches are useful to gain first insights into general relationships, manual inspection and alignment of the residues at the interface of domains can reveal the precise nature of the changes. This is illustrated by the examples in the figure on the geometry of domains in transcriptional regulators. As shown in panel (a) of that figure, homodimeric transcriptional repressors of the Iron-dependent repressor protein superfamily consist of a Winged helix domain (WHD) and a dimerisation domain. The two proteins shown have conserved interdomain geometry. In contrast, panel (b) shows two different proteins for which the domain geometry has changed. Both proteins consist of a Homeodomain-like DNA-binding domain and a Tetracyclin repressor-like C-terminal domain. The rotation and shift of the latter domain become clear when the interface residues of the N-terminal domains are aligned.

Functional relationships of domains in multidomain proteins

In order to delineate the functional relationships of domains in single- and multi-domain proteins, one needs a systematic understanding of the domain functions in different contexts, that is to say, the range of functions of a particular domain depending on its different partner domains. Existing functional classification schemes, such as GenProtEC, GO, MIPS, that used in COGs and the EC (Enzyme Commission) classification, operate at the level of the whole protein and are thus inadequate to describe the contribution of the individual domains to protein function.

In order to understand the molecular roles of individual domains, it is vital to know their three-dimensional structure. Todd et al. and Bartlett et al. have studied domain function and evolution at a detailed structural level, but were primarily concerned with individual superfamilies as opposed to domains in the context of their combinations. The role of ‘ancillary’ or ‘accessory’ domains is alluded to briefly in their work.

Bashton and Chothia (M Bashton, C Chothia, unpublished) have developed a domain-centric scheme that emphasises domain function in the context of domain neighbours in multidomain proteins, providing functional annotations for a subset of SCOP domains. The annotation is based on detailed examination of the protein structures, which is essential for understanding the precise molecular function of the domain and its contribution to the function of the whole protein. In this domain-centric functional classification scheme, domains are classified into seven categories that encompass catalytic activity, cofactor binding, responsibility for subcellular localisation, protein–protein interaction and so forth.
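The comparison of interdomain geometry described in the section on the geometry of domain combinations rests on two numbers: how far, and through what angle, the second domain is displaced once the first domains have been superposed. The sketch below is a minimal illustration, assuming the superposition has already been done and that the relative placement of the second domain is available as a 3×3 rotation matrix together with the coordinates of the second domain in each structure. All function names and inputs are hypothetical, and this is not presented as the procedure used by Kerrison et al.

```python
import math


def rotation_angle_deg(rot):
    """Rotation angle in degrees of a 3x3 rotation matrix, from its trace."""
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard rounding
    return math.degrees(math.acos(cos_theta))


def centroid(coords):
    """Mean position of a list of (x, y, z) points."""
    n = len(coords)
    return tuple(sum(p[i] for p in coords) / n for i in range(3))


def centroid_shift(coords_a, coords_b):
    """Distance between the centroids of two coordinate sets."""
    ca, cb = centroid(coords_a), centroid(coords_b)
    return math.sqrt(sum((ca[i] - cb[i]) ** 2 for i in range(3)))


def rotation_about_z(angle_deg):
    """Rotation matrix for a rotation about the z axis (toy helper)."""
    t = math.radians(angle_deg)
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]


def move_domain(rot, coords, translation):
    """Rotate points about their own centroid, then translate them."""
    c = centroid(coords)
    moved = []
    for p in coords:
        d = [p[i] - c[i] for i in range(3)]
        r = [sum(rot[i][j] * d[j] for j in range(3)) for i in range(3)]
        moved.append(tuple(r[i] + c[i] + translation[i] for i in range(3)))
    return moved


# Toy second domain, displaced by a 40 degree rotation and a 10 unit shift.
domain_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
rot = rotation_about_z(40.0)
domain_b = move_domain(rot, domain_a, (10.0, 0.0, 0.0))

angle = rotation_angle_deg(rot)
shift = centroid_shift(domain_a, domain_b)
```

The centroid distance is only one possible definition of the shift; measuring it over interface residues rather than whole domains would be an equally reasonable choice.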
Two generic principles emerge. First, a domain can perform the same function, but in different protein contexts (i.e. with different partner domains). This is illustrated by panels (a) and (b) of the figure on variation in function of the Winged helix domain. The WHDs in these examples combine with different sensory, regulatory and enzymatic domains, but maintain their function in that they target the protein to a specific sequence. In contrast, the WHD in human methionine aminopeptidase 2, shown in panel (c) of the same figure, acts as a substrate specificity pocket and has no DNA-binding activity at all. This domain has diverged and acquired a novel or modified function. In an analogy to linguistics, one can describe the two different fates of a particular domain superfamily as syntactical and semantic shifts.

Figure: Geometry of domains in different transcriptional regulators. (a) Superposition of two chains shows that the geometry of the domain pair is conserved in the two proteins. The two structures are homodimeric transcriptional repressors consisting of a WHD and a dimerisation domain of the Iron-dependent repressor protein superfamily. (b) Two transcriptional repressors composed of a Homeodomain-like domain and a Tetracyclin repressor-like C-terminal domain are shown in such a way that the N-terminal domains (in black with orange and yellow interface peptides) are in the same orientation. The C-terminal domains are clearly rotated relative to each other. In this representation, the difference in geometry is apparent as a downward tilt of the C-terminal domain of TetR (dark blue helices) relative to the QacR C-terminal domain (light blue helices). (c) Residues forming contacts at the interdomain interfaces of the structures in (b). The orange and yellow interface peptides of the N-terminal domains are superposed, and the difference in the positions of the blue C-terminal interface peptides is clear. Again it appears as a downward tilt of the dark blue TetR helices compared with the light blue QacR interface peptides.
Structural information. (a) Iron-dependent regulatory protein IdeR from Mycobacterium tuberculosis (PDB code 1b1b, chain A) structurally aligned with diphtheria toxin repressor DtxR from Corynebacterium diphtheriae (PDB code 1g3t, chain B). The N-terminal WHDs are shown in orange and yellow, and the C-terminal dimerisation domains are in different shades of blue. They have about 80% sequence identity to each other and the interface between the two domains is about 1400 Å² in both structures. (b) Tet repressor D from Escherichia coli (PDB code 2tct) and the Staphylococcus aureus multidrug-binding protein QacR (PDB code 1jt6, chain A). The N-terminal domains are DNA-binding Homeodomain-like domains and are shown in black, with the peptides forming the interdomain interface in orange and yellow. The C-terminal domains are Tetracyclin repressor-like and are shown in grey, with interface peptides in different shades of blue. The chains shown here are both in the ligand-bound state. The two proteins have about 10% sequence identity to each other both across the entire sequence and in the residues that form contacts at the interdomain interfaces. The interface between the domains is around 2100 Å² in both structures. (c) The residues from the N-terminal domains of the structures in (b) that form interface contacts, shown in orange and yellow, were superimposed and are shown in the same orientation. The difference in position of the C-terminal interface residues, shown in shades of blue, is apparent and corresponds to a shift of 10 Å and a rotation of 40°.

Figure: Variation in function of the Winged helix domain in different proteins. These three proteins each contain WHDs and illustrate how a superfamily can undergo syntactical and semantic shifts in protein function in different domain contexts. Many transcription factors are made by combining a WHD with a sensory or regulatory domain, as in FadR shown in (a), and in the proteins shown in panel (a) of the figure on the geometry of domains in transcriptional regulators. The WHD can also be found in enzymatic proteins, such as restriction endonucleases, where it combines with a catalytic domain that nicks DNA, as in the FokI protein shown in (b). In (a,b), the WHD performs the same role (i.e. it targets the protein to a specific sequence), but the range of functions is achieved by combining the WHD with different partner domains, so it is exhibiting a syntactical shift. (c) A semantic shift is found in human methionine aminopeptidase 2, in which the WHD acts as a substrate specificity pocket with no DNA-binding activity at all.
Structural information. (a) FadR (PDB code 1hw2): for the a chain, the WHD is orange and the oligomerisation/CoA-binding domain (fatty acid responsive transcription factor FadR, C-terminal domain) is dark blue; for the b chain, the colours are yellow and light blue, respectively. (b) Restriction endonuclease FokI (PDB code 1fok): the three WHDs are shown in yellow, orange and red (N- to C-terminal order), and the catalytic domain (Restriction endonuclease-like) is in blue. (c) Human methionine aminopeptidase 2 (PDB code 1boa): the Creatinase/aminopeptidase domain is shown in light blue and the WHD is in orange. The WHDs of FadR and human methionine aminopeptidase 2 superpose with RMSD = 0.995 Å (not shown).

Domain combinations of known and unknown structure

The discussion of domain function above illustrates how knowledge of three-dimensional structures of proteins is key to a detailed understanding of how they work. For this reason, enormous efforts are currently underway in structural genomics projects to obtain complete structural coverage of all domain superfamilies and folds as far as possible. However, beyond the structure determination of single domains, the structure determination of a wide range of domain combinations is crucial for a deeper understanding of protein relationships, functions and interactions, for the reasons we have discussed in the previous sections. Furthermore, for many domain architectures, multidomain proteins of known structure can be used confidently as a template for the domain geometry of homologous proteins of unknown structure (N Kerrison et al., unpublished).

Domain combinations can be prioritised for target selection according to different criteria: their abundance, their distribution across the three kingdoms of life or their versatility with respect to other combination partners. Our concept of versatile domain combinations, or supradomains, introduces another useful filter. As mentioned above, supradomains are two- or three-domain combinations that occur in different domain architectures with different N- and C-terminal neighbours. The 200 most duplicated two-domain supradomains, for example, occur in more than 75 000 sequences (28% of the sequences with domain assignments) from 113 archaeal, bacterial and eukaryotic completely sequenced genomes. Knowledge of their structure can hence provide insights into the function of almost one-third of the sequences in this data set.

The table of the most duplicated two-domain combinations lists the ten most abundant combinations of these 200 supradomains that do not have homologues of known structure. Almost all of these combinations occur in biochemically characterised proteins. However, they also occur in many other uncharacterised proteins. One example is the above-mentioned winged helix DNA-binding domain in combination with the periplasmic binding protein II domain. This domain combination alone occurs in almost 2000 sequences of unknown function, many of which could be regulators like the two examples in the table. Exact knowledge of
The most duplicated two-domain combinations – targets for structure determination.
Domain combination Known proteins with the domain combinations (examples) category (possible) Winged helix DNA-binding domain and OXYR_ECOLI: OxyR is a positive regulator of periplasmic binding protein-like II hydrogen-peroxide-inducible genes in E. coli and otherbacteria, and is homologous to other regulatory proteinsNODD_RHILE: NodD is responsible for activatingtranscription of the Nod genes in the bacterium inresponse to plant inducers Homeodomain-like and ribonuclease-H-like TC3A_CAEEL: Transposase in Caenorhabditis elegans Signal transduction PRGR_HUMAN: The human progesterone receptor is domain) and nuclear receptor involved in the regulation of gene expression, and affects ligand-binding domain cellular proliferation and differentiation in target tissues PYP-like sensor domain (PAS domain) Signal transduction NTRB_ECOLI: NTRB acts as a signal transducer involved and homodimeric domain of in nitrogen regulation in E. coli signal-transducing histidine kinase TORS_ECOLI: The TorS sensor protein in E. coli is partof a two-component regulatory system ATPase domain of HSP90 chaperone/DNA Signal transduction ARCB_ECOLI: ArcB is a member of the two-component topoisomerase II/histidine kinase and regulatory system arcB/arcA. Sensor-regulator protein for anaerobic repression of the arc modulon Actin-like ATPase domain and heat HSCC_ECOLI: Hsc62 is a DnaK homologue of E. coli shock protein 70 kDa (HSP70), HS7F_CAEEL: The protein is a member of the Hsp70 C-terminal substrate-binding fragment multi-gene family of mitochondrial chaperones in C. elegans Calcium ATPase, transmembrane PMA1_CANAL: Plasma membrane H-ATPase from domain M and HAD-like Candida albicans. 
The H-pump produces a protongradient that is used for active nutrient transport Growth factor receptor domain and Signal transduction MTN3_HUMAN: Matrilin-3 is an extracellular matrix protein and a major component of cartilageFBL2_HUMAN: Fibulin-2 binds, depending on calcium,to fibronectin and other ligandsEGF_HUMAN: Human epidermal growth factor receptor Winged helix DNA-binding domain and GATR_ECOLI: A repressor of the GAT operon for phosphosugar isomerase galacticol transport and metabolism in E. coli TRAF domain-like and POZ domain Signal transduction SPOP_HUMAN: Speckle-type POZ protein is an antigenrecognised by serum from a scleroderma patientThe domain combination occurs in no other protein inSwissProt This table lists those two-domain combinations without homologues of known structure in decreasing order of occurrence in proteins of completelysequenced genomes. Only combinations of two different domains are shown; repetitions of the same domain are not listed. The last columnprovides one or more examples of biochemically characterised proteins from SwissProt (version 41.20 ) that contain the particular domaincombination. It should be noted that some of these domain pairs that are common in genome sequences could be flexible instead of having a rigidinterdomain geometry. If this is the case, structure determination is more difficult and less meaningful in terms of the domain geometry, though thedomains are still functionally linked of course. aThe domain combination occurs in all three kingdoms of life.
the properties of these domain combinations will con- emergence of new combinations is linked to speciation tribute enormously to annotation in terms of protein and specific phylogenetic groups: whereas more than half function and structure.
of all domain superfamilies are common to archaea,bacteria and eukaryotes, this is the case for only about 5% of two-domain combinations . Domain com- We have provided an overview of the role of domain binations and expansions of domain superfamilies, as well combinations in the formation of the protein repertoire.
as other processes such as alternative splicing play From the fairly comprehensive domain assignments that important roles in the emergence of more complex organ- are available for completely sequenced genomes, it has become clear that the majority of proteins, even in simplegenomes, are multidomain. Though the domain combi- Proteins that contain the same domain combination or nations observed are only a small fraction of all possible have the same domain architecture tend to have a combinations of the repertoire of protein families, the common ancestor and common functional features. For Current Opinion in Structural Biology 2004, 14:208–216 Theory and simulation instance, supradomains are combinations of domains that Buchan DW, Rison SC, Bray JE, Lee D, Pearl F, Thornton JM,Orengo CA: Gene3D: structural assignments for the adopt a function that is useful within a variety of different biologist and bioinformaticist alike. Nucleic Acids Res 2003, domain architectures. Several domain assignment servers now offer tools for searches for particular domain combi- 10. Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Barrell D, Bateman nations and architectures (e.g. SUPERFAMILY A, Binns D, Biswas M, Bradley P, Bork P et al.: The InterProDatabase, 2003 brings increased coverage and new features.
SMART , Pfam and the Conserved Domain Nucleic Acids Res 2003, 31:315-318.
Architecture Retrieval Tool [CDART] connected InterPro is a metaserver that combines several domain assignment with the Conserved Domain Database [CDD] servers, including PRINTS, PROSITE, Pfam, ProDom, SMART, TIGR-FAMs and also SUPERFAMILY, and integrates information on proteinfamilies, domains and functional sites.
In order to understand the molecular details of the func- 11. Teichmann SA, Park J, Chothia C: Structural assignments to tions of domain combinations, the three-dimensional the Mycoplasma genitalium proteins show extensive geneduplications and domain rearrangements. Proc Natl Acad structure of the domain architecture is needed. Structural Sci USA 1998, 95:14658-14663.
genomics projects could therefore have novel combina- 12. Gerstein M: How representative are the known structures of the tions of domains, in addition to the structures of indivi- proteins in a complete genome? A comprehensive structural dual domains, as their aim. Target selection of domain census. Fold Des 1998, 3:497-512.
combinations could be based on the number of proteins 13. Lynch M, Conery JS: The evolutionary fate and consequences of containing the domain combination, as well as the versa- duplicate genes. Science 2000, 290:1151-1155.
tility of the domain combination in terms of occurrence in 14. Muller A, MacCallum RM, Sternberg MJ: Structural characterization of the human proteome. Genome Res 2002, different domain architectures.
This large-scale study of the human and three other eukaryote pro- Knowledge of this kind could be integrated with genome- teomes, as well as several bacterial and archaeal proteomes, focuseson domain superfamilies rather than whole proteins. The authors describe scale data of different types and contribute towards a the duplication and expansion of specific domain superfamilies in more comprehensive understanding of the evolution of the human genome compared with other organisms. They also discusstransmembrane and disease-related proteins, and domain superfamilies the structure and function of the protein repertoire.
in the human genome.
15. Wolf YI, Karev G, Koonin EV: Scale-free networks in biology: new insights into the fundamentals of evolution? Bioessays 2002, We are grateful to Jung-Hoon Han for help with CV has a pre-doctoral fellowship from the Boehringer Ingelheim Fonds.
16. Qian J, Luscombe NM, Gerstein M: Protein family and fold occurrence in genomes: power-law behaviour and References and recommended reading evolutionary model. J Mol Biol 2001, 313:673-681.
Papers of particular interest, published within the annual period of review, have been highlighted as:
of special interest
of outstanding interest

Murzin AG, Brenner SE, Hubbard T, Chothia C: SCOP - a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 1995, 247:536-540.

Andreeva A, Howorth D, Brenner SE, Hubbard TJ, Chothia C, Murzin AG: SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acids Res 2004,

Chothia C: Proteins - 1000 families for the molecular biologist. Nature 1992, 357:543-544.

Orengo CA, Jones DT, Thornton JM: Protein superfamilies and domain superfolds. Nature 1994, 372:631-634.

Coulson AF, Moult J: A unifold, mesofold, and superfold model of protein fold use. Proteins 2002, 46:61-71.

Gough J, Karplus K, Hughey R, Chothia C: Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J Mol Biol 2001, 313:903-919.

Madera M, Vogel C, Kummerfeld SK, Chothia C, Gough J: The SUPERFAMILY database in 2004: additions and improvements. Nucleic Acids Res 2004, 32:D235-D239.
The SUPERFAMILY database provides assignments of the over 1200 domain superfamilies, as defined in the SCOP database, to proteins using highly sensitive hidden Markov models. Close to 60% of all proteins have at least one match and one half of all residues are covered by assignments. The database is located at and updated twice a year. SUPERFAMILY is now part of InterPro.

Kelley LA, MacCallum RM, Sternberg MJ: Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 2000, 299:499-520.

17. Wuchty S: Scale-free behavior in protein domain networks. Mol Biol Evol 2001, 18:1694-1702.

18. Hill E, Broadbent ID, Chothia C, Pettitt J: Cadherin superfamily proteins in Caenorhabditis elegans and Drosophila melanogaster. J Mol Biol 2001, 305:1011-1024.

19. Vogel C, Teichmann SA, Chothia C: The immunoglobulin superfamily in Drosophila melanogaster and Caenorhabditis elegans and the evolution of complexity. Development 2003,

20. Chervitz SA, Aravind L, Sherlock G, Ball CA, Koonin EV, Dwight SS, Harris MA, Dolinski K, Mohr S, Smith T et al.: Comparison of the complete protein sets of worm and yeast: orthology and divergence. Science 1998, 282:2022-2028.

21. Aravind L, Subramanian G: Origin of multicellular eukaryotes - insights from proteome comparisons. Curr Opin Genet Dev 1999, 9:688-694.

22. Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, Devon K, Dewar K, Doyle M, FitzHugh W et al.: Initial sequencing and analysis of the human genome. Nature 2001, 409:860-921.

23. Kaessmann H, Zollner S, Nekrutenko A, Li WH: Signatures of domain shuffling in the human genome. Genome Res 2002,

24. Patthy L: Genome evolution and the evolution of exon-shuffling–a review. Gene 1999, 238:103-114.

25. Enright AJ, Iliopoulos I, Kyrpides NC, Ouzounis CA: Protein interaction maps for complete genomes based on gene fusion events. Nature 1999, 402:86-90.

26. Marcotte EM, Pellegrini M, Thompson MJ, Yeates TO, Eisenberg D: A combined algorithm for genome-wide prediction of protein function. Nature 1999, 402:83-86.

27. Snel B, Bork P, Huynen M: Genome evolution. Gene fusion versus gene fission. Trends Genet 2000, 16:9-11.
28. Yanai I, Wolf YI, Koonin EV: Evolution of gene fusions: horizontal transfer versus independent events. Genome Biol 2002,

29. Apic G, Huber W, Teichmann SA: Multi-domain protein families and domain pairs: comparison with known structures and a random model of domain recombination. J Struct Funct Genomics 2003, 4:67-78.

30. Vogel C, Berzuini C, Bashton M, Gough J, Teichmann SA: Supra-domains - evolutionary units larger than single protein domains. J Mol Biol 2004, in press.

31. Apic G, Gough J, Teichmann SA: Domain combinations in archaeal, eubacterial and eukaryotic proteomes. J Mol Biol 2001, 310:311-325.

32. Liu Y, Gerstein M, Engelman DM: Evolutionary use of domain recombination: a distinction between membrane and soluble proteins. Proc Natl Acad Sci USA 2004, in press.

33. Park J, Lappe M, Teichmann SA: Mapping protein family interactions: intramolecular and intermolecular protein family interaction repertoires in the PDB and yeast. J Mol Biol 2001,

34. Harrison SC: Variation on an Src-like theme. Cell 2003,
This review presents a good example of domains that reappear in different domain contexts: the SH3, SH2 and kinase domains recombine with various other domains in signal transduction multidomain proteins.

35. Pawson T, Nash P: Assembly of cell regulatory systems through protein interaction domains. Science 2003,
This review illustrates the modularity of proteins in a colourful manner. It describes the reuse of protein interaction domains, such as SH2 and SH3 domains and others, in the regulation of different cellular processes. The authors focus on the properties of single domains, but also point out the increase in dimensionality of functions and interactions when domains are combined to form multidomain proteins.

36. Chothia C, Gough J, Vogel C, Teichmann SA: Evolution of the protein repertoire. Science 2003, 300:1701-1703.

37. Bashton M, Chothia C: The geometry of domain combination in proteins. J Mol Biol 2002, 315:927-939.
This study presents a detailed analysis of the structures of proteins containing Rossmann fold domains in combination with other domain superfamilies. It demonstrates that, in all the cases studied, the N- to C-terminal order of the domains is conserved because the proteins have descended from a common ancestor. For pairs of proteins in the PDB in which the order is reversed, the interface and functional relationships of the domains are altered.

38. Coin L, Bateman A, Durbin R: Enhanced protein domain discovery by using language modeling techniques from speech recognition. Proc Natl Acad Sci USA 2003,
The technique presented in this paper formalises the use of domain associations for the improvement of domain assignment to protein sequences. In analogy to speech recognition methods that use context information to improve recognition of words, assignment of domains is improved using information on their domain combination context in multidomain proteins.

39. Hegyi H, Gerstein M: Annotation transfer for genomics: measuring functional divergence in multi-domain proteins. Genome Res 2001, 11:1632-1640.

40. Aloy P, Russell RB: Interrogating protein interaction networks through structural biology. Proc Natl Acad Sci USA 2002,
The authors describe a method to assess the likelihood of an interface forming between two proteins when the components are modelled on complexes of known structure.

41. Prabu MM, Suguna K, Vijayan M: Variability in quaternary association of proteins with the same tertiary fold: a case study and rationalization involving legume lectins. Proteins 1999,

42. Spahn CM, Beckmann R, Eswar N, Penczek PA, Sali A, Blobel G, Frank J: Structure of the 80S ribosome from Saccharomyces cerevisiae–tRNA-ribosome and subunit-subunit interactions. Cell 2001, 107:373-386.

43. Beckmann R, Spahn CM, Eswar N, Helmers J, Penczek PA, Sali A, Frank J, Blobel G: Architecture of the protein-conducting channel associated with the translating 80S ribosome. Cell 2001, 107:361-372.

44. Aloy P, Ciccarelli FD, Leutwein C, Gavin AC, Superti-Furga G, Bork P, Bottcher B, Russell RB: A complex prediction: three-dimensional model of the yeast exosome. EMBO Rep 2002, 3:628-635.

45. Aloy P, Ceulemans H, Stark A, Russell RB: The relationship between sequence and interaction divergence in proteins. J Mol Biol 2003, 332:989-998.
Using RMSD as a simple measure to compare interactions between domains, the authors found that homologues with more than 30% sequence identity usually conserve the geometry of their interaction.

46. Serres MH, Goswami S, Riley M: GenProtEC: an updated and improved analysis of functions of Escherichia coli K-12 proteins. Nucleic Acids Res 2004, 32:D300-D302.

47. Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, Eilbeck K, Lewis S, Marshall B, Mungall C et al.: The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res 2004, 32:D258-D261.

48. Mewes HW, Amid C, Arnold R, Frishman D, Guldener U, Mannhaupt G, Munsterkotter M, Pagel P, Strack N, Stumpflen V et al.: MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Res 2004, 32:D41-D44.

49. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN et al.: The COG database: an updated version includes eukaryotes. BMC Bioinformatics 2003, 4:41.

50. Bairoch A: The ENZYME database in 2000. Nucleic Acids Res 2000, 28:304-305.

51. Todd AE, Orengo CA, Thornton JM: Evolution of protein function, from a structural perspective. Curr Opin Chem Biol 1999, 3:548-556.

52. Todd AE, Orengo CA, Thornton JM: Evolution of function in protein superfamilies, from a structural perspective. J Mol Biol 2001, 307:1113-1143.

53. Bartlett GJ, Borkakoti N, Thornton JM: Catalysing new reactions during evolution: economy of residues and mechanism. J Mol Biol 2003, 331:829-860.
A detailed analysis of catalytic sites and residues in homologous enzymes of different function reveals the economy of evolution: the residue types, functions and mechanistic steps are frequently conserved.

54. Brenner SE: Target selection for structural genomics. Nat Struct Biol 2000, 7(suppl):967-969.

55. Kriventseva EV, Koch I, Apweiler R, Vingron M, Bork P, Gelfand MS, Sunyaev S: Increase of functional diversity by alternative splicing. Trends Genet 2003, 19:124-128.

56. Lespinet O, Wolf YI, Koonin EV, Aravind L: The role of lineage-specific gene family expansion in the evolution of eukaryotes. Genome Res 2002, 12:1048-1059.

57. Letunic I, Goodstadt L, Dickens NJ, Doerks T, Schultz J, Mott R, Ciccarelli F, Copley RR, Ponting CP, Bork P: Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res 2002, 30:242-244.

58. Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Eddy SR, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer EL: The Pfam protein families database. Nucleic Acids Res 2002,

59. Geer LY, Domrachev M, Lipman DJ, Bryant SH: CDART: protein homology by domain architecture. Conserved Domain Architecture Retrieval Tool. Genome Res 2003,

60. Marchler-Bauer A, Panchenko AR, Shoemaker BA, Thiessen PA, Geer LY, Bryant SH: CDD: a database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res 2002, 30:281-283.

61. Pohl E, Holmes RK, Hol WG: Crystal structure of the iron-dependent regulator (IdeR) from Mycobacterium tuberculosis shows both metal binding sites fully occupied. J Mol Biol 1999,

62. Pohl E, Goranson-Siekierke J, Choi MK, Roosild T, Holmes RK, Hol WG: Structures of three diphtheria toxin repressor (DtxR) variants with decreased repressor activity. Acta Crystallogr 2001, 57:619-627.

63. Kisker C, Hinrichs W, Tovar K, Hillen W, Saenger W: The complex formed between Tet repressor and tetracycline-Mg2+ reveals mechanism of antibiotic resistance. J Mol Biol 1995,

64. Schumacher MA, Miller MC, Grkovic S, Brown MH, Skurray RA, Brennan RG: Structural mechanisms of QacR induction and multidrug recognition. Science 2001, 294:2158-2163.

65. Xu Y, Heath RJ, Li Z, Rock CO, White SW: The FadR.DNA complex. Transcriptional control of fatty acid metabolism in Escherichia coli. J Biol Chem 2001, 276:17373-17379.

66. Wah DA, Hirsch JA, Dorner LF, Schildkraut I, Aggarwal AK: Structure of the multimodular endonuclease FokI bound to DNA. Nature 1997, 388:97-100.

67. Liu S, Widom J, Kemp CW, Crews CM, Clardy J: Structure of human methionine aminopeptidase-2 complexed with fumagillin. Science 1998, 282:1324-1327.

68. Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, Martin MJ, Michoud K, O'Donovan C, Phan I et al.: The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 2003, 31:365-370.

Source: http://people.unica.it/elisabettasoro/files/2012/04/Struttura_proteine.pdf
