
Systems biology approaches and pathway tools for investigating cardiovascular disease†

Craig E. Wheelock,*ab Åsa M. Wheelock,bcd Shuichi Kawashima,e Diego Diez,b Minoru Kanehisa,be Marjan van Erk,f Robert Kleemann,g Jesper Z. Haeggström a and Susumu Goto b

Received 4th February 2009, Accepted 26th March 2009
First published as an Advance Article on the web 27th April 2009
DOI: 10.1039/b902356a

Systems biology aims to understand the nonlinear interactions of multiple biomolecular components that characterize a living organism. One important aspect of systems biology approaches is to identify the biological pathways or networks that connect the differing elements of a system, and examine how they evolve with temporal and environmental changes. The utility of this method becomes clear when applied to multifactorial diseases with complex etiologies, such as inflammatory-related diseases, herein exemplified by atherosclerosis. In this paper, the initial studies in this discipline are reviewed and examined within the context of the development of the field. In addition, several different software tools are briefly described and a novel application for the KEGG database suite called KegArray is presented. This tool is designed for mapping the results of high-throughput omics studies, including transcriptomics, proteomics and metabolomics data, onto interactive KEGG metabolic pathways. The utility of KegArray is demonstrated using a combined transcriptomics and lipidomics dataset from a published study designed to examine the potential of cholesterol in the diet to influence the inflammatory component in the development of atherosclerosis. These data were mapped onto the KEGG PATHWAY database, with a low cholesterol diet affecting 60 distinct biochemical pathways and a high cholesterol exposure affecting 76 biochemical pathways. A total of 77 pathways were differentially affected between low and high cholesterol diets. The KEGG pathways “Biosynthesis of unsaturated fatty acids” and “Sphingolipid metabolism” evidenced multiple changes in gene/lipid levels between low and high cholesterol treatment, and are discussed in detail. Taken together, this paper provides a brief introduction to systems biology and the applications of pathway mapping to the study of cardiovascular disease, as well as a summary of available tools. Current limitations and future visions of this emerging field are discussed, with the conclusion that combining knowledge from biological pathways and high-throughput omics data will move clinical medicine one step further to individualize medical diagnosis and treatment.
a Department of Medical Biochemistry and Biophysics, Division of Physiological Chemistry II, Karolinska Institutet, S-171 77, Stockholm, Sweden. E-mail: [email protected]; Fax: +46-8-736-0439; Tel: +46-8-5248-7630
b Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, 611-0011, Japan
c Lung Research Lab L4:01, Respiratory Medicine Unit, Department of Medicine, Karolinska Institutet, 171 76, Stockholm, Sweden
d Karolinska Biomics Center Z5:02, Karolinska University Hospital, 171 76, Stockholm, Sweden
e Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo, 108-8639, Tokyo
f Department of Physiological Genomics, TNO-Quality of Life, BioSciences, Utrechtseweg 48, 3704 HE, Zeist, The Netherlands
g Department of Vascular and Metabolic Disease, TNO-Quality of Life, BioSciences, Gaubius Laboratory, Zernikedreef 9, 2333 CK, Leiden, The Netherlands
† Electronic supplementary information (ESI) available: Complete list of all KEGG biochemical pathways identified by KegArray as being affected by low cholesterol treatment, high cholesterol treatment, and differentially affected between low and high cholesterol treatment. See DOI: 10.1039/b902356a

Mol. BioSyst., 2009, 5, 588–602. © The Royal Society of Chemistry 2009

Associate Professor Craig E. Wheelock heads a research group at the Karolinska Institute that examines the role of bioactive lipid mediators in inflammatory diseases, with a focus on cardiovascular disease. He is broadly interested in the development of bioinformatics tools for probing inflammatory diseases at the systems level.

Assistant Professor Åsa M. Wheelock heads a research group at the Karolinska Institute that investigates pneumotoxicants and inflammatory lung diseases, as well as gel-based quantitative […]

Assistant Professor Shuichi Kawashima is a researcher at the Human Genome Center in the Institute of Medical Science at the University of Tokyo who is broadly interested in the development of genome databases, bioinformatics web services and the biology of eukaryotic genomes.

Dr. Diego Diez is a postdoctoral researcher at the Kyoto University Bioinformatics Center working on applying systems biology approaches to cardiovascular disease.

Professor Minoru Kanehisa is the Director of the Bioinformatics Center in the Institute for Chemical Research at Kyoto University and a professor at the Human Genome Center in the Institute of Medical Science at the University of Tokyo. His research involves deciphering systemic biological functions by integrated analysis of genomic and chemical […]

Dr. Marjan van Erk is a researcher at TNO Quality of Life who is interested in developing bioinformatical systems biology tools for metabolic and cardiovascular diseases.

Dr. Robert Kleemann heads a research unit at TNO Quality of Life that investigates the role of inflammation in cardiovascular disease and metabolic disorders and has particular interest in gene regulation and drug intervention.

Professor Jesper Z. Haeggström heads a research group at the Karolinska Institute that examines the role of bioactive lipid mediators in inflammatory disease.

Associate Professor Susumu Goto is interested in the development of databases for molecular interaction networks and network analysis using the KEGG database suite. His work also involves in silico metabolic reconstruction.

An organism is an individual living system capable of reacting to stimuli, reproducing and maintaining a stable structure over time. Organisms are composed of multiple individual components, e.g. cells and their corresponding genes, proteins, metabolites, etc., which are all governed by an intricate network of interactions. This network is not static, and the various components evolve and adapt dynamically to internal and environmental changes. The study of this complex system as a single entity is a challenge that has been traditionally addressed by studying different components of the system in isolation. Although such approaches have produced a significant amount of knowledge and understanding, they are limited in their ability to predict the effects of alterations in single or multiple components upon the dynamics of the whole system. This limitation may reflect why in some cases, significant research advances do not translate, for example, into improved therapeutics or a “cure” for the disease under study. The discipline of systems biology attempts to shift the way in which an organism is perceived to address the complexity of living systems. Multiple definitions for systems biology exist, one of which describes it as a new field of study that aims to understand the living cell as a complete system.1,2 In other words, systems biology seeks to understand how system properties emerge from the nonlinear interactions of multiple components.3,4

The applications of systems biology approaches are increasing dramatically; however, the exact nature of what a “systems approach” entails remains diffuse in the literature. The fundamental theme of systems biology is integration […] disciplines.5 However, it should be noted that systems science is not novel and has been advocated for many years in a number of research fields. At the simplest level, a systems approach signifies a study based upon examining the entire “system” simultaneously, as opposed to a reductionist approach that focuses on a single gene, metabolite, pathway, etc. In other words, a systems biology approach does not focus on identifying a single target or mechanism for an observed phenotype (e.g. disease). Systems biology instead seeks to identify the biological networks or pathways that connect the differing elements of a system, and in the process describe the characteristics that define a shift in equilibrium, such as metabolic fluxes or altered protein activities, which may cause a shift from a healthy to a diseased state. The hypothesis then becomes that those components of the network that are […] and potentially descriptive of the disease, and accordingly represent potential targets for intervention to return the system to its original state (i.e. a healthy state). However, it is important to realize that the concept of equilibrium may not be as static as previously thought. It is more likely that equilibrium is a steady state that represents a range of fluctuations in the biological network that varies on an interindividual basis. The normal or control state is more appropriately categorized as one of dynamic stability in which our concept of homeostasis is more correctly defined as homeodynamics.3 Accordingly, by defining the parameters of the network that determine disease from healthy state, interventions or treatments can be derived that are tailored for the individual variability of the parameters for this steady state; in other words, personalized medicine.

The era of personalized medicine has been heralded for a number of years, and systems biology is a key component of this new paradigm.6–8 The intent is to identify disease before pathogenic manifestation, thereby initiating therapeutic intervention prior to significant adverse effects. Current medical practice is a reductionist approach that involves treating each problem or symptom in isolation. By these standards, the relief of symptoms as determined by clinical evaluations following a treatment regimen embodies the definition of a cured or maintained patient. A corresponding “limited” systems biology approach, where a multitude of clinical and biochemical variables are combined with multivariate statistical analyses often reveals that the patient indeed has been removed from the disease group following treatment, but not necessarily back towards a healthy state as is often assumed. Instead, the treated patient belongs to a novel biological status, distinctly different from both healthy individuals and peers in the disease group. This novel pharmacological state is generally not discernable in classical medicine, as the patient per definition is classified as belonging to the “healthy” group as soon as the symptoms that define the disease are no longer detectable. More importantly, the classical reductionist approach does not reveal the novel pharmaceutical state that the treatment regimen has induced, and consequently implications on the patient's future health cannot be predicted. In contrast, a true systems biology approach offers the ability to distinguish between multiple disease, healthy, or pharmacological states, as well as causative and adaptive responses and variables. However, in
order to make conclusions regarding causative relationships, it is necessary to have a sufficient number of variables and observations. In addition, the quantitative quality and source of the data, as well as the choice of multivariate statistical tools both in the experimental design and the post-experimental analyses, are vital for interpretation.

The increase in systems biology applications is a reflection of a “perfect storm” of advances in analytical methodology, computing power and data acquisition. The completion of the human genome sequencing project heralded the age of large-scale biology and data acquisition. This paradigm shift coupled to commensurate developments in technology and experimental techniques that can simultaneously interrogate many elements of a system (e.g., microarrays, mass spectrometry, computational power and the Internet) has led to a veritable explosion in “omics” science and systems biology related research. The challenge for systems biology is to integrate the disparate disciplines of biology, chemistry, statistics, computer science and engineering into a cohesive science. Towards this end, it is necessary to develop common platforms for the analysis, presentation and archiving of data to ensure inter-laboratory and cross-disciplinary compatibility and accessibility of data sets. Significant steps have already been taken in this direction, and it is not our aim to review the status of the technological platforms or compatibility of data formats, as these aspects have been covered in detail elsewhere.9–17 In contrast, this review focuses on the integration of different types of data sets, and aims to summarize the current state of systems biology research into cardiovascular disease as well as present a number of different pathway mapping tools that have been developed. In addition, an example of a pathway analysis of atherosclerosis is presented using a novel tool for mapping of omics data to the KEGG database suite.

Networks in a nutshell

One of the recurrent concepts in systems biology is that of the network. Much of the early work in networks focused on simple model organisms including bacteria, yeast and nematodes;18–24 however, this work is expanding to the understanding of human diseases.25–28 A network type of representation formalizes the interaction of different components of a system utilizing the infrastructure of a branch of mathematics called graph theory. In the network paradigm, nodes represent elements of the system while relations are symbolized by edges. For example, in a metabolic network, enzymes and compounds are nodes, and reactions are edges. In a protein–protein interaction network, two nodes connected by an edge represent interacting proteins. This formalism enables the study of living systems in a way never thought possible before. The individual elements are integrated in a network whose properties can be analyzed globally: the number of edges per node, the degree distribution (the probability that a node has a specific number of edges), the cluster coefficient, etc. Barabasi and Oltvai have reviewed these concepts in detail, and provided a comprehensive review of the terminology and concepts associated with network analysis.2 This new terminology is increasingly prevalent in the biological literature, requiring the life scientist to become familiar with this research field. These technical properties provide information regarding the global behavior of the network and therefore of the biological system under study. For example, one important finding was the scale-free topology of biological networks. In this type of network, most nodes have few links, whereas a few nodes have many links (called hubs or nexus nodes). One of the translations of this characteristic into a biological context is the hypothesis that hub nodes perform key functions in the network. Accordingly, many fundamental genes, proteins, enzymes and compounds have been identified as hubs in their respective biological networks. Another consequence derived from this finding is that because of the sparse nature of scale-free networks (i.e. most nodes having a few edges), they are very robust to environmental alterations.

However, although network analysis can help us understand the behavior of the system as a whole, the importance of individual elements is not lost in this global view. For example, the study of biological networks shows that complex networks are constructed of recurrent simple motifs.29 Initially described in simple bacteria, these motifs are also found in the regulatory networks of higher eukaryotes and are fundamental to understanding the behavior of complex networks, including biological networks. Moreover, the mathematical models used to generate the network itself can be used to predict the behavior of the network when specific elements are altered. For example, what are the effects if a specific node of a gene regulatory network is removed by a knockout mutation? How does this change affect the global stability and robustness of the network, and eventually, the phenotype of the studied system? Systems biology seeks to answer these and other questions by modeling the relationship between the […]

One critical step is how the network is constructed from the raw data (transcriptomics, proteomics, metabolomics, etc.). This is accomplished by using different mathematical techniques, ranging from simple Pearson correlations to the use of ordinary differential equations, Boolean networks, etc. (reviewed in refs. 31 and 32). Through this modeling, fundamental concepts in the understanding of biological systems, like robustness, modularity, emergence, etc. are incorporated. Unfortunately, not all of these questions are easily answered, even within the context of the systems biology paradigm. Whereas most studies currently focus on individual networks (e.g. a transcription network or a protein–protein interaction network), in reality these different networks function as a connected system. Therefore, a change in the gene regulatory network may have a corresponding effect in the protein–protein interaction network, the metabolic network, etc., which collectively may manifest changes in the observed phenotype. To understand the whole system, it is critical to integrate knowledge from different studies. However, the crosstalk between different networks is not yet well understood and although some progress has been made,33,34 the integration of different types of data is still in its infancy.12 Through the generation of mathematical models that integrate different types of data (e.g. transcriptomic, metabolomic, and protein–protein interactions),2 we can explain the observed phenotype, and hopefully make predictions regarding how the phenotype is altered when the network itself is modified through the alteration of internal or environmental factors.

Data processing and statistical analysis

The pre-processing of data is crucial in network applications, as well as other systems level analyses. It is important to recognize that the nature of large scale omics data is very different from that of reductionist approaches, and other statistical methods should be utilized. The majority of the univariate methods that have dominated biological sciences for centuries (e.g. Student's t-test) are not well-suited for a number of reasons. For example, univariate statistical methods employ repeated testing to evaluate whether the null hypothesis for a certain variable can be rejected, i.e. if it is significantly altered compared to the control group. Given the cumulative nature of the error in repeated testing, these methods are prone to high false positive rates, which become particularly pronounced in omics analyses where a large number of variables are tested simultaneously. Even though a range of approaches have been developed to correct for the resulting large false positive rates, most notably Bonferroni35,36 and false discovery rate (FDR) corrections,37 the use of univariate methods remains a compromise. The fact that univariate methods are very sensitive to missing data points further decreases the robustness of network analyses based solely on traditional statistical pre-processing of data.

Multivariate analysis (MVA) is a more suitable option for these “short and fat” data sets that are typical for omics studies (i.e. a large number of variables with few observations). Instead of repeated testing of single variables, MVA aims to create a model that reduces the complexity of multidimensional data to a few latent variables that express the majority of the variance of the data set. Exemplified by principal component analysis (PCA), the most utilized unsupervised method in omics applications, the model is structured so that the first principal component (PC1) is oriented so that it describes the largest possible portion of the variance in the data set that can be described by a linear vector. Accordingly, each subsequent PC contains a smaller portion of the variance in the data set than the previous component. Given that the MVA is based on all individual variable data points for all observations, the resulting model is robust both against false positives and missing data points. Furthermore, a confidence interval representing all of the variables is obtained, in contrast to univariate methods where each variable is analyzed as a separate unit, and consequently only confidence intervals for individual variables can be obtained. MVA can also be utilized to perform regression analysis between large data sets, most commonly through partial least squares between latent structures (PLS). These types of analyses are referred to as supervised methods, since the user defines which variables belong to the X dataset (dictating variables) and which belong to the Y dataset […]

While useful, multivariate statistical methods are not without their own weaknesses. A major pitfall in MVA relates to overfitting of the model to the data. If a sufficient number of components is utilized, it is possible to build a model that can describe any data set with a perfect correlation (i.e. R² = 1.0; Fig. 1). A comparison of the correlation coefficient to the predictive power of the model is therefore essential. The predictive power (Q²) can be calculated through the use of a training set and a test set, or if the data set is too small to allow this, through a cross-validation approach. A good rule of thumb is to remove all components that do not contribute to an increased predictive power of the model. If the data set is sufficiently large, Q² can be used as a measure to evaluate the robustness of the model in relation to the whole population.

Fig. 1 Overfitting of data represents one of the main pitfalls associated with multivariate analyses. With a sufficient number of components, a model that explains 100% of the variance (R² = 1.0) can be built for any data set. In the above example, the simplest (linear) model represents the most representative model for the data, demonstrating that the simplest model provides optimal prediction, even though the correlation coefficient is lower.

Another concern when utilizing MVA is that of strong outliers. One should be cautious of any observation that is located on either end of the axis of the first component (strong outliers), as it is likely that characteristics that are unique for this individual are influencing the entire model.

Interpretability represents another concern in MVA. MVA summarizes the entire data set in a few latent variables, which cannot be directly connected to the original measured variables. As such, it can be difficult for the untrained eye to interpret which variables are important or “significant” in driving the separation of the different study groups. This becomes particularly pronounced in more complex analyses such as PLS. A recent addition to this group of analysis, orthogonal PLS (OPLS), greatly simplifies the interpretability by separating the variance in the data set according to the correlation to the selected Y matrix (e.g. disease group).38 In contrast, the “orthogonal” component pulls out the variance that is not correlated to the Y-variables of interest, and thus represents internal variance in the X-matrix. While this approach is well-suited for motivating variable selection, it should be used cautiously in this aspect, given that the back-drop of the method is a supervised selection of the Y-variables that determine the separation. When in doubt, it is generally better to include all of the variables in subsequent
analyses. Taken together, this section emphasized the point that it is vital to employ the correct statistical analysis in both experimental design as well as data processing. These approaches require an in-depth knowledge of MVA in order to correctly interpret the output of statistical models, prevent overfitting of the data, apply multitest corrections, and achieve an appropriate balance of false positives and power.

Systems biology in cardiovascular disease

The utility of systems biology becomes clear when applied to multifactorial diseases whose etiology is complex. For example, the etiology of inflammatory diseases such as atherosclerosis and asthma has proven recalcitrant to elucidation with reductionist approaches. It is possible that part of the difficulty in identifying new therapeutics lies in the inability of current approaches to visualize the complexity of these biological systems.39 The development of lead drug candidates would also benefit from a systems approach. For example, drugs such as torcetrapib, statin + ezetimibe and rimonabant have been withdrawn from the market because of side effects that were not predicted with reductionistic thinking. Diseases and disorders such as cardiovascular disease, diabetes, metabolic syndrome, asthma and chronic […] complicated developments that resist efforts to identify a single gene or pathway responsible for disease onset and progression. Numerous therapeutics have been successfully developed that intervene in different stages of the disease; however, we are still far from developing a true cure for any of these pathologies.

The cellular complexity of many of the affected organs represents a major obstacle in the elucidation of the systems biology behind these pathologies. The lung, for example, consists of more than 40 different cell phenotypes, all of which may elicit different responses to up- or down-regulation of a certain factor. Add to that the spatial and temporal aspects of the cellular response, and we are starting to approach the true complexity of biological systems. Accordingly, while beyond the scope of this review, sampling design and strategy can have significant effects upon experimental observations. Given the heterogeneity of many tissue types, it is challenging to reproducibly sample tissue in such a way as to enable intra- and interlab comparisons. The obstacles involved in this area are not trivial and need to be addressed by the research community.

Cardiovascular disease is the major cause of premature death in Europe, resulting in >4 million deaths in the year 2000.40 In the United States, cardiovascular disease was responsible for one of every five deaths in 2004, with an average of one death every 37 seconds.41 The rapidly increasing incidence of obesity and commensurate health effects including atherosclerosis, metabolic syndrome and diabetes is of epidemic proportions, with the potential for significant increases in developing countries. It is anticipated that the “BRIC” countries (Brazil, Russia, India and China) will significantly contribute to the global cardiovascular disease burden such that by 2020 an additional ~4% of deaths in the world will be due to ischemic heart disease.42 The complexities […] syndrome are recalcitrant to current interventions and challenge the ability of the pharmaceutical industry to produce effective and inexpensive therapies. For example, in cardiovascular disease, each known risk factor is addressed individually, whether it be hyperlipidemia or hypertension.3 However, given the complex etiology of this disease, it is likely that multiple factors are responsible for the observed pathology, resulting in a need for holistic treatment approaches that address the underlying problems. Accordingly, these diseases are logical targets for systems biology approaches to understanding disease mechanism, progression and pathogenesis.

[…] and linked to other systemic disorders,43,44 and the role of inflammation in the development of atherosclerosis and cardiovascular disease is firmly established.45,46 The onset and development of cardiovascular disease has been shown to involve multiple factors including lifestyle, diet, body mass index, (epi)genetics, dyslipidemia, hypertension, and inflammation among others. However, the current paradigm of patient treatment involves addressing these individual risk factors in isolation, even though they are known to concomitantly contribute to disease pathogenesis. While effective in many cases, this approach has not provided a cure or even a full understanding of the disease, which remains a major source of mortality and morbidity worldwide.

A number of studies have begun to address the issues outlined above in a comprehensive fashion, and active research is being performed to develop systems biology approaches to cardiovascular disease.47 We present a few of these studies in chronological order, but stress that this list is not comprehensive. Many of the early studies that performed systems biology-related investigations into cardiovascular disease focused on a single omics profiling method (e.g., transcriptomics or metabolomics) and then included clinical parameters using multivariate statistics to develop models of disease. It is only recently that unifying systems biology models employing multiple analytical platforms linked with bioinformatics analyses have been produced. One of the earliest attempts to bring systems biology to cardiovascular function involved mapping important cardiovascular phenotypes onto the human genome. Stoll et al. studied 239 cardiovascular and renal phenotypes in 113 male rats. They identified and mapped a total of 81 cardiovascular phenotypes from an F2 intercross onto the human genome using correlation patterns (“physiological profiles”) and comparative genomics.25 The resulting genomic-systems biology map was applicable for gene hunting and mechanism-based physiological studies of cardiovascular function. For example, the authors presented a correlation matrix with phenotypic ordering of 125 likely determinants of arterial blood pressure, which could be used to assess the impact of allelic substitutions on each of the traits in either the parental or F2 generation of the intercross. The phenotypes were grouped into functionally related clusters (vascular, heart, renal, endocrine […] blood pressure, and ordered within the clusters by known physiological relationships. All of the results of the linkage analyses and the phenotypic physiological profiles for each […]

A more diagnostic application was presented by Brindle et al. who employed a supervised partial least squares discriminant analysis (PLS-DA) approach to analyze ¹H NMR spectra of human serum to diagnose the presence, as well as the severity of coronary heart disease.48 The PLS-DA model predicted the presence of coronary heart disease with a sensitivity of 92% and a specificity of 93% based on a 99% confidence limit. The major driving factor for the observed separation in severe coronary heart disease patients (triple vessel disease, TVD) was the presence of lipids, particularly LDL and VLDL, whereas the most influential loadings for the angiographically normal coronary arteries (NVA) were HDL-associated (e.g., fatty acid chains and phosphatidylcholine). Of particular importance is the fact that the authors confirmed that the method was able to diagnose coronary heart disease independently of the inevitable associated gender bias. However, work by Kirschenlohr et al. concluded that plasma-based ¹H NMR analysis is a weak predictor of coronary heart disease.49 They found that the predictive power was significantly weaker, with NVA and coronary heart disease groups identified 80.3% correctly for patients not receiving statin therapy and 61.3% for patients treated with statins. The main reason postulated for the observed study discrepancy was the inclusion of additional variables in the Kirschenlohr et al. study, including drug treatment regimen. Statins significantly affect LDL levels, which was a discriminating factor in the PLS-DA model. Accordingly, as the most significant loadings associated with diagnosis in both studies were related to lipid species, it is not surprising that treatments affecting lipid levels influenced the observed separation power of the model. In other words, statin treatment partially resolves the incidence of coronary artery disease, thus reducing the biomarker signal in these patients. It would be interesting to further examine these patients to determine if they were truly moving towards a “healthy” phenotype or were instead representative of a third pharmacological state as discussed above. This point demonstrates one of the main challenges in developing diagnostic markers of complex disease in that in many cases patients will present distinct genotypes as well as personal therapeutic treatment regimens that can potentially confound the use of biomarkers, as reported by Brindle et al. At the very least, these studies demonstrate the importance of including as much patient metadata in the analyses as possible. The work of both groups supports further research into exploring the potential of applying metabolomics methods to identify plasma (i.e., non-invasive) biomarkers of coronary heart disease.

While useful for identifying potential markers of disease, the previous studies do not represent a systems methodology. One of the first comprehensive systems biology approaches involving the integration of multiple omics platforms (transcriptomics, proteomics and metabolomics) examined […] (ApoE*3Leiden) mouse model (a commonly used model of atherosclerosis50). The authors integrated gene transcripts, and protein and lipid data along with their putative relationships to gain insight into the early onset of disease.51,52 As is common with many systems approaches, the authors developed a number of their methods for data processing and network analysis in-house, demonstrating a significant obstacle in the advance of systems biology. It is challenging to integrate bioanalytical results from multiple platforms and between different research groups, making it difficult to standardize results.12 The ApoE knockout mouse was used in another investigation into atherosclerosis mechanisms involving conjugated linoleic acids (CLAs) to determine how individual CLA isomers differently affected pathways involved in atherosclerosis.53 ApoE knockout mice were fed a diet supplemented with 1% cis9, trans11-CLA, 1% trans10, cis12-CLA or 1% linoleic acid for twelve weeks. The effects upon lipid and glucose metabolism were measured, as well as the regulation of hepatic proteins. Correlation analysis between physiological and protein data identified two clusters associated with glucose metabolism. The results showed that cis9, trans11-CLA specifically increased expression of the anti-inflammatory HSP 70, as well as decreased expression of the pro-inflammatory macrophage migration inhibitory factor, suggesting that consumption of cis9, trans11-CLA could protect against the development of atherosclerosis.

A systems biology approach to elucidating biological pathways in coronary atherosclerosis was published by King et al. who performed custom microarray analysis of coronary artery segments.54 A number of clinical variables were examined, and diabetic states provided the most interesting results, with 653 upregulated genes in the no diabetes class and 37 upregulated genes in the diabetes class, with an FDR of 0.08%. The top gene upregulated in the diabetes class was IGF-1, followed by the IL-1 receptor and IL-2 receptor-α, indicating that there were changes in cytokine-induced immune and inflammatory responses. These results suggest that inflammation is more prominent in diabetic than […] expression profiles were then used to construct a novel pathway based upon gene connectivity as determined by language parsing of the published literature, and ranking as
anti-inflammatory HSP 70, as well as decreased expression Accordingly, as the most significant loadings associated with of the pro-inflammatory macrophage migration inhibitory diagnosis in both studies were related to lipid species, it is not factor, suggesting that consumption of cis9, trans11-CLA surprising that treatments affecting lipid levels influenced the could protect against the development of atherosclerosis.
observed separation power of the model. In other words, statin A systems biology approach to elucidating biological treatment partially resolves the incidence of coronary artery pathways in coronary atherosclerosis was published by King disease, thus reducing the biomarker signal in these patients. It et al. who performed custom microarray analysis of coronary would be interesting to further examine these patients to artery segments.54 A number of clinical variables were determine if they were truly moving towards a ‘‘healthy'' examined, and diabetic states provided the most interesting phenotype or were instead representative of a third pharma- results, with 653 upregulated genes in the no diabetes class and cological state as discussed above. This point demonstrates 37 upregulated genes in the diabetes class, with an FDR of one of the main challenges in developing diagnostic markers of 0.08%. The top gene upregulated in the diabetes class was complex disease in that in many cases patients will present IGF-1, followed by the IL-1 receptor and IL-2 receptor-a, distinct genotypes as well as personal therapeutic treatment indicating that there were changes in cytokine-induced regimens that can potentially confound the use of biomarkers, immune and inflammatory responses. These results suggest as reported by Brindle et al. At the very least, these studies that inflammation is more prominent in diabetic than demonstrate the importance of including as much patient metadata in the analyses as possible. The work of both expression profiles were then used to construct a novel groups supports further research into exploring the potential pathway based upon gene connectivity as determined by of applying metabolomics methods to identify plasma language parsing of the published literature, and ranking as (i.e., non-invasive) biomarkers of coronary heart disease. 
It determined by the significance of differentially regulated genes is possible that biomarkers could be identified in a study with in the network. The resulting gene subnets were visualized with increased cohort size composed of the myriad of clinical Cytoscape, an open-source bioinformatics resource (discussed and interindividual variables. An important aspect of these in more detail below55), to identify nexus genes in disease metabolomic analyses is that in order to correctly classify severity. Results indicated that the key process in the individuals with coronary heart disease, it is not necessary progression of atherosclerosis relates to smooth muscle cell to fully understand the complex molecular differences dedifferentiation, suggesting a focus on changes in the smooth that underlie disease etiology.48 This methodology is an muscle phenotype as a target for atherosclerosis. The results important first step towards being able to identify individuals also provided insight into the severe form of coronary artery at risk of disease development or in the early stages of disease associated with diabetes, reporting an overabundance disease onset.
of immune and inflammatory signals in diabetics. This method for querying multiple search engines and/or databases combined with parsing of the retrieved results (documents) for biological associations is extremely powerful for generating networks, and is used extensively in multiple software applications for network generation.
Lipopolysaccharide (LPS) is a critical inducer of sepsis, which is characterized by systemic inflammation, hypotension and multiple organ failure.56 Tseng et al.57 examined the molecular effects of late-phase LPS stimulation on primary rat endothelial cells in an attempt to develop diagnostic markers of inflammatory disease. A combination of cDNA microarray, 2-DE and MALDI-TOF MS/MS, as well as cytokine protein arrays were analyzed using custom bioinformatics applications. Differentially expressed genes and proteins were mapped onto their corresponding biological pathways using BioCarta or KEGG, and the results were ordered using the BGSSJ software (bulk gene search system for Java) followed by analysis with ArrayXPath.58 The results showed significant effects (p < 0.05) on the BioCarta pathways “LDL pathway during atherogenesis”, “MSP/RON receptor signaling pathway” (MSP, macrophage-stimulating protein; RON, tyrosine kinase/receptor d'origine nantais), “signal transduction through IL-1R”, and “IL-5 signaling pathway”, demonstrating that inflammatory pathways were significantly affected by LPS treatment, as would be expected. Overall, this study used a systems biology approach to show that NF-κB-associated responses in endothelial cells affected pathways involved in proliferation, atherogenesis, inflammation and apoptosis, thereby providing information on multiple pathways simultaneously. However, it should be stressed that it is necessary to differentiate protein concentrations from protein activities in order to make meaningful deductions. Several studies using “focused” arrays to analyze [...] confirmed that short-term LPS exposure results in vivid upregulation of a spectrum of proinflammatory genes including IL-1β, IL-15, interferon-induced genes, and a series of TNF superfamily members.59–62
Statins are an important therapeutic in the control of hyperlipidemia, with demonstrated efficacy in lowering cholesterol levels. However, there are concerns regarding the development of statin-induced myopathy following aggressive treatment. Laaksonen et al. employed a systems biology approach to probe the cellular mechanisms leading to myopathy and identify potential biomarkers.63 Muscle biopsies were analyzed for whole genome expression and plasma samples were profiled using a lipidomics approach. The microarray analysis revealed modest changes in the atorvastatin treatment group (five altered genes), but 111 genes were affected in the simvastatin group. The differences in response are not necessarily unexpected given that the two statins differ in their hydrophobicity/lipophilicity, and thus in the extent that they affect the vasculature. The lipidomics profiling identified 132 unique lipid molecular species (however, this method does not allow for the unequivocal identification of fatty acid substitution position on lipid head groups). The gene expression data and the lipidomics data were combined following gene set enrichment analysis (GSEA) and further analyzed with PLS-DA to look for a plasma-based biomarker of myopathy. The results showed that the arachidonate 5-lipoxygenase activating protein gene (ALOX5AP) had high positive regression coefficients with plasma levels of phosphatidylethanolamine(42:6) and negative regression coefficients for cholesterol ester ChoE(18:0). These results were particularly intriguing as the ALOX5 gene has been previously shown to predispose humans to atherosclerosis.64,65 This systems biology approach successfully identified potential plasma-based markers of the effects of statin treatment and showed that observed effects upon pathways were statin-specific. In particular it also provided mechanistic insight into the development of atherosclerosis, demonstrating the utility of a systems approach.
A similar method was employed by Pietiläinen et al. who examined obesity in [...] obesity to be associated with deleterious alterations in lipid metabolism pathways known to promote atherogenesis, inflammation and insulin resistance.66 Intriguingly, they reported that obesity primarily related to increases in lyso-phosphatidylcholines and decreases in ether phospholipids. Nikkilä et al.67 used this method to examine the gender-dependent progression of systemic metabolic states in early childhood. They were able to categorize children in terms of metabolic state at a very young age (from birth to 4 years old). Using lipidomics profiling methodology and hidden Markov models, they found that the major developmental state differences between girls and boys can be attributed to sphingolipids. They also found multiple previously unknown age- and gender-related metabolome changes of potential medical significance. In addition, they demonstrated the feasibility of state-based alignment of personal metabolic trajectories, which is an important proof-of-principle step for applications of metabolomics towards systems biology and personalized medicine. Children were shown to have different development rates at the level of the metabolome and thus the state-based approach may be advantageous when applying metabolome profiling in search of markers for subtle (patho)physiological changes.
[...] plasma lipoproteins upon plaque formation using the Ldlr−/−Apo100/100Mttpflox/floxMx1-Cre mouse model, which has a plasma lipoprotein profile similar to that of familial hypercholesterolemia and a genetic switch to block the hepatic synthesis of lipoproteins.68 Transcriptional profiling of atherosclerosis-prone mice with human-like hypercholesterolemia and reverse engineering of whole-genome expression data provided a network of cholesterol-response atherosclerosis target genes. This regulatory gene network appeared to control foam cell formation, suggesting that these genes could potentially serve as drug targets to prevent the transformation of early lesions into advanced, clinically significant plaques.
Kleemann et al. employed a systems approach to examine the effects of dietary cholesterol upon atherosclerosis.69 Of particular interest in this study is the focus on the effects of dietary cholesterol upon inflammation. The role of inflammation in cardiovascular disease and atherosclerosis in particular has been established;70 however, the source of inflammation and the exact mechanisms of how inflammation is evoked and contributes to disease development and progression are still unclear. The data of Kleemann et al. demonstrated that the liver is capable of absorbing moderate cholesterol-induced stress (up to about 0.5% w/w in the diet), but a further increase evoked the expression of hepatic pro-inflammatory genes including a number of pro-atherosclerotic candidate genes. These data also showed that dietary cholesterol can be a trigger of hepatic inflammation (as reflected by elevated plasma levels of acute phase genes) and that it may be involved in the development of the inflammatory component of atherosclerosis by switching on four distinct inflammatory [...] pathways). Furthermore, the authors used a network analysis approach to demonstrate that lipid metabolism and inflammatory pathways are closely linked via specific transcriptional regulators. They confirmed that targeting of a prototype transcription factor of the inflammatory response (NF-κB) affected plasma lipid levels and lowered plasma [...] demonstrated the strength of a systems approach in that multiple analytical platforms were combined to build an overall model of disease, which provided mechanistic information across multiple biological pathways that suggest potential new strategies for therapeutic interventions affecting inflammation, as well as plasma lipids, in a beneficial way. The results of this study are examined in greater detail using the KegArray tool discussed below.
An expanding toolbox
An important bottleneck in the development of systems approaches is the need for software capable of analyzing collected omics data from multiple platforms. There are many software packages and web resources available, which are too numerous to describe in this review (see ref. 71 for a comprehensive list of >150 resources for systems biology). A few resources worth briefly mentioning here include KEGG,72 PathVisio,73 pSTIING,74 MetaCore™,75 Cytoscape,55 VANTED,76 Pathway-Express,77 Ingenuity® Systems and a plethora of SBML applications78 (Table 1). Some of this software is designed to map the results from omics experiments onto existing pathway databases such as KEGG or pSTIING. These types of tools enable the visualization of the results integrated with the information provided in these databases. Other tools enable the generation of networks that are inferred from omics data, such as Cytoscape (through several plugins), VANTED, some of the R/Bioconductor packages79 and many of the commercial software packages. Most of these tools can also be used to analyze and manipulate networks. However, to date there is no perfect solution and substantial effort is needed to integrate multiple datasets in a comprehensive fashion. Herein we provide a brief overview of some of the diverse options.
Table 1 Network and pathway mapping software, including tools for network visualization/manipulation and network inference from high-throughput data (a)
Surviving table entries: Various (plugins); Ingenuity® Systems; KEGG (Kyoto Encyclopedia of Genes and Genomes); Same as Cytoscape
a This list is non-exhaustive and is solely provided to give an example of some of the available resources. See Ng et al. for a more comprehensive list.71
b Systems biology markup language (see [...])
c Affinity purification-mass spectrometry.
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is a web-based resource that contains a series of databases of biological systems, consisting of genetic building blocks of genes and proteins (KEGG GENES), chemical building blocks of both endogenous and exogenous substances (KEGG LIGAND), molecular wiring diagrams of interaction and reaction networks (KEGG PATHWAY), and hierarchies and relationships of various biological objects (KEGG BRITE). KEGG provides a reference knowledge base for linking genomes to biological systems, and also to environments, by the processes of PATHWAY mapping and BRITE mapping. The visualization objects in the KEGG suite are consistent, with the nodes of a pathway map shown as rectangles that represent gene products, usually proteins, and small circles representing chemical compounds and other molecules. A large oval represents a link to another pathway map, and a cluster of rectangles represents a protein complex. Aoki and Kanehisa provide a comprehensive tutorial on KEGG for interested readers.80
The Systems Biology Markup Language (SBML) is a [...] biochemical reaction networks in software. It is oriented towards describing systems of biochemical reactions, including cell signaling pathways, metabolic pathways, biochemical reactions and gene regulation.78 The SBML project has produced a KEGG2SBML tool that is useful for converting KEGG-based metabolic pathways into SBML format. The pSTIING resource consists of a web-based application containing metabolic pathways, protein–protein, protein–lipid
transcriptional regulatory associations. It is focused on regulatory networks relevant to chronic inflammation, cell migration and cancer, therefore making it a useful resource for inflammatory-related applications. The pSTIING web site also features a tool for inferring networks (Cladist). VANTED is a multiplatform tool for the manipulation of graphs that represent either biological pathways or functional hierarchies. It also allows the mapping of experimental data into the network and is capable of processing flux data. Graph information is loaded in SBML format, but it also has a KEGG interface.81 Cytoscape is an open source platform for visualizing molecular interaction networks and biological pathways. One of its most useful features is the ability to accept custom plugins to perform specific tasks, extending the number of initial features. A number of useful plugins are already available, including MONET,82 a method for inferring gene regulatory networks from gene expression data, and the AgilentLiteratureSearch plugin,83 which enables the generation of association networks from literature mining (see below). R and Bioconductor are a platform extensively used for the analysis of high-throughput data.84 In addition, there are several free resources available related to the analysis of networks, including packages such as GeneNet,85 apComplex86 and Rgraphviz87 (for creating and visualizing networks). The package Gaggle88 enables interaction between Cytoscape and R.
The two main commercial packages are MetaCore™ and Ingenuity® Systems. MetaCore™ (GeneGo, Inc.) is an integrated suite of software applications that is designed for functional analysis of experimental data, including omics data, CGH arrays, SNPs, SAGE gene expression and pathway analysis. MetaCore™ is based on a proprietary manually curated database of human protein–protein, protein–DNA and protein–compound interactions, metabolic and signaling pathways, and the effects of bioactive molecules on gene expression. GeneGo is also in the process of creating a systems biology and pathway analysis platform specific for cardiovascular diseases (MetaMiner Cardiac Consortium). Ingenuity Pathways Analysis (IPA) enables researchers to model and analyze biological and chemical systems. The IPA suite contains a series of modules including IPA-Biomarker™ [...] Analysis. IPA-Biomarker™ identifies the most promising and relevant biomarker candidates within experimental data. IPA-Tox™ delivers a focused toxicity and safety assessment of candidate compounds, elucidates toxicity mechanisms and identifies potential markers of toxicity, with a focus on cardiovascular toxicity, nephrotoxicity, and hepatotoxicity. IPA-Metabolomics™ analyzes metabolomics data in the context of metabolic and signaling pathways. This module can integrate transcriptomics, proteomics and metabolomics data in a systems biology approach to biomarker discovery, molecular toxicology, and mechanism of action studies.
Multiple efforts are currently under way to synchronize the data being collected by research groups around the world. In order to advance the field, it is therefore necessary to develop databases with defined metrics for evaluating the quality of the global data sets. This area is beyond the scope of this review, but interested readers are suggested to examine work by the Institute for Systems Biology SBEAMS (Systems Biology [...]), a framework for collecting, storing, and accessing data produced by these and other experiments.89 Other efforts in this area include the Biological Networks server, which is a systems biology software platform with multiple visualization and analysis functions including: visualization of molecular interaction networks, sequence and 3D structure information, integration with other graph-structured data such as ontologies (e.g., gene ontology) and taxonomies (e.g., enzyme classification system), integration of interactions with experimental data (e.g., gene expression), and extraction of biologically meaningful relations, as well as [...] Networks server provides querying services and an information management framework over PathSys, which is a graph-based system for creating a combined database of biological pathways, gene regulatory networks and protein interaction maps, which integrates over 14 curated and publicly contributed data sources for eight representative organisms.91 There is also currently a significant amount of effort to determine standards for storing microarray data (MAGE-OM/ML, GeneX, [...] and metabolomics standards initiatives.93 Data-integration techniques for omics data sets have been reviewed in detail by Joyce and Palsson,12 and references therein.
One of the long-range goals of systems biology approaches is to develop models capable of predicting clinical phenotypes, as well as patient treatment regimens and associated outcomes. However, the complexity of cardiovascular disease and other inflammatory-related diseases makes model development challenging. A number of different groups are working on developing in silico models of inflammation, with the majority of efforts focused on the acute inflammatory response.94–97 However, it is likely that these models can eventually be adapted for diseases of chronic inflammation. Recent reviews have addressed the status of cardiac systems biology, with a number of promising developments.5,47,98–100 These models represent the logical extension of the systems biology tools discussed above and as the amount of data increases, our ability to develop interactive models of individual pathologies will increase. This translational systems biology approach will make it feasible to develop patient-specific modeling based upon known disease mechanisms.97 These models will be useful in clinical settings to predict and optimize the outcome from surgery and non-interventional therapy.101
To address the need for software capable of analyzing data from multiple omics platforms, KEGG has recently introduced a new application called KegArray that is designed to map omics data onto the KEGG suite of databases. KegArray is a Java application that provides an environment for analyzing transcriptomics or proteomics (expression profiles) and metabolomics data (compound profiles) individually or simultaneously. The application is tightly integrated with the KEGG database, and maps input data to KEGG resources including PATHWAY, BRITE and genome maps. KegArray is available for running in Mac, Windows or Linux environments and can be downloaded freely from the KEGG [...]
The KegArray tool is designed to facilitate integrated mapping of omics results onto a KEGG application of choice. The statistical evaluation of systems biology data is a complex and highly debated subject (see Data Processing and Statistical Analysis). As such, the KegArray tool itself does not impose any statistical evaluation on the inputted data, but is rather intended as a link between processed data and the interactive KEGG environment. This conceptual solution allows the user to have full control over the choice of statistical methods, data transformation and data selection prior to mapping onto the KEGG tool of choice. KegArray allows full flexibility in determining the significance or cut-off levels, as well as the corresponding color coding for the mapping. KegArray can thus be described as a visualization tool, but with the added advantage of a sustained interactive environment with the vast KEGG database. It is not necessary to pre-select the pathways of interest and the output is formatted as a list of links to all affected pathways, organized in the order of highest number of mapped genes/proteins/compounds per pathway. KegArray can be configured to display any combination [...] genes/proteins/compounds. In this case, the ranking represents how well the respective pathways have been covered by the experimental analyses. Subsequently, by only including the up- and down-regulated entries in the mapping, a ranking based on biological effects on the pathway can be achieved.
Table 4 Metabolic pathways significantly affected in high cholesterol exposure relative to low cholesterol exposure (a)
KEGG ID    Pathway name
mmu01040   Biosynthesis of unsaturated fatty acids
mmu03320   PPAR signaling pathway
mmu00564   Glycerophospholipid metabolism
mmu00071   Fatty acid metabolism
mmu04920   Adipocytokine signaling pathway
mmu00565   Ether lipid metabolism
mmu00590   Arachidonic acid metabolism
mmu00100   Biosynthesis of steroids
mmu00120   Bile acid biosynthesis
mmu00561   Glycerolipid metabolism
mmu00600   Sphingolipid metabolism
mmu00591   Linoleic acid metabolism
mmu00592   alpha-Linolenic acid metabolism
a Data are from a KegArray-based analysis of quantified lipid and transcriptomics data from Kleemann et al.69 Pathways are from KEGG PATHWAY and are listed with pathway name and KEGG ID number (e.g. mmu for mouse). The pathways are ranked in order of greatest number of components significantly affected in the pathway. A total of 77 different pathways were affected, of which the top 13 are shown here. A complete list of all 77 affected pathways is provided in Table S3. In addition, those pathways significantly affected by low and high cholesterol exposure are provided in Table S1 and S2, respectively. It is not possible to state whether an entire pathway is positively or negatively affected, but these individual pathways can be visualized following mapping to KEGG and inspected for specific fluctuations in the data. Examples of this are shown in Fig. 3 and Fig. 4.
Table 2 An example for expression ratios between two channels for the input of transcriptomics data into KegArray (a)
a Data are the high cholesterol (HC) treatment shown in Fig. 2.
Table 3 KegArray input format for metabolomics data (a)
a Data are the high cholesterol (HC) treatment shown in Fig. 2.
Fig. 2 Venn diagram displaying the number of metabolic pathways significantly affected following treatment with either low cholesterol (LC) or high cholesterol (HC) relative to control in an ApoE*3Leiden mouse model of atherosclerosis. In addition, the changes between HC and LC were compared, evidencing five pathways that were specifically affected between these two treatments (mmu00010 glycolysis/gluconeogenesis, mmu00641 3-chloroacrylic acid degradation, mmu00680 methane metabolism, mmu00980 metabolism of xenobiotics by cytochrome P450, and mmu00982 drug metabolism-cytochrome P450). Data are from a KegArray-based analysis of quantified lipid and transcriptomics data from Kleemann et al.69 A complete list of all pathways affected is provided in the ESI, Tables S1–S3.†
The expected mapping format is that of ratios between e.g. a treated and control group, and a specific tab-delimited format to facilitate the automatic calculation of ratios from raw data is available (KEGG EXPRESSION format). However, in order to increase the versatility of the tool, an additional generic file input format has also been constructed (RATIO format) to allow other aspects of the data to be evaluated through the KegArray tool (e.g. weighting according to statistical significance, ranking etc.). Both formats, described in detail in the ReadMe file available for download with KegArray ([...]), can be used for the input of transcriptomics or proteomics data. Organism-specific mapping of the results is facilitated by the organism information provided on the first line of the input file, in the format '#organism:' followed by the three- or four-letter organism identifier code used in KEGG (e.g., 'hsa' for human and 'mmu' for mouse). If organism-specific mapping is not desirable, the abbreviation for the all-inclusive generic pathway can be used ('map'). Since the interactive environment of KEGG is maintained, it is easy to scroll between the many different organism-specific pathways available. Additional information regarding experimental descriptions, reference information, etc., can also be included in the input file by simply adding the '#' character at the beginning of the line, which will result in that line being skipped by KegArray (other than the '#organism:' or '#source:' line).
Fig. 3 Results of KegArray-based analysis of quantified lipid and transcriptomics data from Kleemann et al.69 The KEGG metabolic pathway “Biosynthesis of unsaturated fatty acids” (map01040) was the pathway that evidenced the greatest number of changes between low and high cholesterol treatment. KegArray was run with a 1.1-fold threshold, with red and orange indicating a 10% and 5% increase, respectively, yellow indicating no change (grey indicates that the enzyme/metabolite is present in the organism), and light green and dark green indicating a 5% and 10% decrease, respectively. Table 4 provides a list of the top 13 pathways that differed between low and high cholesterol treatment.
The lines in tab-delimited format below the '#'-delimited section contain omics profiling data. The first column must contain the KEGG GENES ID, which is the unique identifier of the organism-specific gene. The second and third columns are intended for entering X- and Y-coordinates, e.g. those derived from a microarray experiment, to facilitate a schematic view of the microarray through the “ArrayViewer” application. If the data are from a proteomics experiment, the second and third columns can be left blank. Accordingly, it is not necessary to input the microarray coordinate information, and the KEGG ID and data columns are sufficient. If the RATIO file format is utilized, the fourth column contains the data value of interest, as exemplified by the ratios between control channel and target channel in Table 2. In contrast, if the EXPRESSION file format is utilized, the fourth through
seventh columns contain the total signal from the treated/diseased sample, background signal from the treated sample, total signal from control sample, and background signal from the control sample in the indicated order. KegArray then performs the background subtraction and calculates the ratio between treated and control sample upon submission of the […]

The data format for metabolomics data is similar to the gene/protein data; however, only the ratio format can be used. All metabolites (compounds) must be assigned KEGG COMPOUND ID numbers in order to be recognized by KegArray. In the data file, the first column contains the KEGG COMPOUND ID (e.g., C00219 for arachidonic acid) and the second column contains the pre-processed data value of interest, e.g. ratios of the target compound relative to the control (Table 3).

Because entry IDs must be in KEGG GENES ID format, an ID converter has also been created. Currently, only the following external databases are supported: NCBI GI, NCBI Entrez Gene, GenBank, UniGene, UniProt and IPI. When using KegArray, a number of parameters can be customized, including the threshold, normalization and color scheme. The output can be viewed as significantly either up-regulated, down-regulated or all data that were input into KegArray. These data are then visualized onto interactive KEGG PATHWAY maps as well as KEGG BRITE and KEGG DAS for further analysis. These data can also be mapped onto the KEGG DISEASE pathways.

In order to demonstrate the utility of KegArray, we have applied it to a dataset of gene and metabolite data taken from Kleemann et al.69 This study was designed to examine the potential of increasing doses of dietary cholesterol to evoke the inflammatory component that is necessary for the onset of atherosclerosis. Towards this end, ApoE*3Leiden mice were fed either a control diet (cholesterol-free), low cholesterol (LC, 0.25% w/w) or high cholesterol (HC, 1.0% w/w) diet for ten weeks (to achieve early mild atherosclerotic plaques), with the amount of cholesterol being the only dietary variable in the study. At the end of the study, the mice were sacrificed, scored for atherosclerosis and profiled using microarray analysis (livers) and lipidomics quantification (liver and plasma). The results showed that the HC diet evoked hepatic inflammation and induced […] observed with the LC diet). A total of 264 genes involved in lipid metabolism were measured, with 23 genes differentially expressed in the LC diet, and 64 in the HC diet. In addition, a range of intrahepatic fatty acids were quantified, of which 27 free fatty acids were mapped along with the gene data onto the KEGG database using KegArray.

Fig. 4 Results of KegArray-based analysis of quantified lipid and transcriptomics data from Kleemann et al.69 The KEGG metabolic pathway "Sphingolipid metabolism" (map00600) evidenced a number of changes between low and high cholesterol treatment. KegArray was run with a 1.1-fold threshold, with red and orange indicating a 10% and 5% increase, respectively, yellow indicating no change (grey indicates that the enzyme/metabolite is present in the organism), and light green and dark green indicating a 5% and 10% decrease, respectively. Table 4 provides a list of the top 13 pathways that differed between low and high cholesterol treatment.
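The input handling and colour scheme described above can be illustrated with a short sketch. It is not KegArray's own code: the function names, the tab-separated layout, the example value 0.82 and the exact bin boundaries (1.10, 1.05, 0.95, 0.90) are assumptions chosen to match the description of background subtraction, the two-column compound file and the colours in the Fig. 4 caption. Mapping an uncomputable ratio to grey is likewise an assumption.

```python
import re
from typing import Dict, Optional

# KEGG COMPOUND IDs are the letter C followed by five digits (e.g. C00219).
COMPOUND_ID = re.compile(r"^C\d{5}$")


def background_subtracted_ratio(treated_signal: float, treated_background: float,
                                control_signal: float, control_background: float
                                ) -> Optional[float]:
    """Return treated/control after background subtraction, or None if undefined."""
    treated = treated_signal - treated_background
    control = control_signal - control_background
    if treated <= 0 or control <= 0:
        return None  # assumption: non-positive net signals are treated as missing
    return treated / control


def parse_compound_table(text: str) -> Dict[str, float]:
    """Parse a two-column table: KEGG COMPOUND ID, then a pre-processed ratio.

    A tab-separated layout is assumed here for illustration.
    """
    ratios = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        compound_id, value = line.split("\t")[:2]
        if not COMPOUND_ID.match(compound_id):
            raise ValueError(f"not a KEGG COMPOUND ID: {compound_id!r}")
        ratios[compound_id] = float(value)
    return ratios


def colour_for_ratio(ratio: Optional[float]) -> str:
    """Bin a ratio into the colours named in the Fig. 4 caption (boundaries assumed)."""
    if ratio is None:
        return "grey"          # assumption: no usable measurement
    if ratio >= 1.10:
        return "red"           # at least a 10% increase
    if ratio >= 1.05:
        return "orange"        # at least a 5% increase
    if ratio <= 0.90:
        return "dark green"    # at least a 10% decrease
    if ratio <= 0.95:
        return "light green"   # at least a 5% decrease
    return "yellow"            # no change


if __name__ == "__main__":
    # Hypothetical value for arachidonic acid (C00219), for illustration only.
    for cid, r in parse_compound_table("C00219\t0.82\n").items():
        print(cid, r, colour_for_ratio(r))
```

Keeping ratio calculation and colour binning in separate functions mirrors the text: gene/protein data may arrive as raw intensities that need background subtraction, whereas metabolite data are supplied as ratios and go straight to binning.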
The KegArray parameters were set to display a 1.1-fold difference and non-affected pathways were excluded. For the LC exposure, 60 biochemical pathways were affected (ESI, Table S1†) as opposed to 76 pathways for the HC exposure (ESI, Table S2†), which included all 60 pathways from the LC dosing. This suggests that already with LC, a very pronounced adaptation of liver lipid metabolism occurs. With these adaptations, the liver is capable of dealing with cholesterol as there is very little development of early atherosclerotic lesions and there is no significant inflammation. However, when the dose of dietary cholesterol is increased (HC condition), 16 additional lipid pathways are activated. These data suggest that a very low dose of cholesterol affects a significant part of the pathways involved in lipid handling. It appears that with HC, the quality of the lipids changes and an increased number of unsaturated or proatherogenic lipids such as sphingomyelin are significantly impacted.

Of particular interest was the difference in affected pathways between LC and HC diets. A total of 77 pathways were differentially affected (ESI, Table S3†), of which the top 13 pathways affected are provided in Table 4. These differences are shown on a treatment-specific basis in Fig. 2. A total of 59 pathways were affected in both LC and HC treatment, as well as between treatments. Of particular interest are the five pathways that differ between LC and HC treatment, but did not evidence changes in LC or HC […] mmu00641 3-chloroacrylic acid degradation, mmu00680 methane metabolism, mmu00980 metabolism of xenobiotics by cytochrome P450, and mmu00982 drug metabolism-cytochrome P450). Examples of affected metabolic pathways are shown for the biosynthesis of unsaturated fatty acids (Fig. 3) and sphingolipid metabolism (Fig. 4).

Kleemann et al.69 reported that with increasing cholesterol uptake, the liver switched from an adaptive state to an inflammatory pro-atherosclerotic state (with LC there is primarily an adaptive response of key metabolic pathways required to cope with lipids). At the gene expression level, there is clearly a further adaptation of the pathways switched on/off with LC when animals receive HC. These effects were in accordance with the metabolite levels, with significant (p < 0.05) decreases in myristic, palmitic, stearic, arachidonic, docosapentaenoic and docosahexaenoic acids. This finding is supported by the observation that the biosynthesis of unsaturated fatty acids was the metabolic pathway with the greatest number of changes between LC and HC treatment. Specific decreases were observed in unsaturated fatty acids in the HC treatment: a decrease in arachidonic acid was observed at p < 0.05 and docosahexaenoic acid (DHA) at p < 0.07. This pathway is a potential source of the unsaturated fatty acid substrates for many of the pro-inflammatory lipids involved in the development of atherosclerosis (e.g., observed reductions in arachidonic acid levels). Accordingly, mapping of these data to KEGG was a rapid method for providing information on which biochemical pathways were most affected by cholesterol treatment and provided a mechanistic insight into the disease process. This new tool for the KEGG suite will be a useful complement to existing strategies for network analysis and pathway […]

One of the main current obstacles in systems biology is the heterogeneity of available datasets. The field requires the creation of legacy databases of omics data that are formatted to enable inter-study comparison. Many existing methodologies […] manipulation and analysis. In order to increase the utility and availability of these tools, it is necessary to either develop simplified web-based applications that are equally useable for cross-disciplinary users and/or shift the educational paradigm to place increased emphasis on the acquisition of computer skills. Future advances in understanding complex medical problems are highly dependent on methodological advances and integration of the computational systems biology community with biologists and clinicians.97

Although commercial tools are more complete in terms of features, they are often closed platforms that do not allow for the development and interchange of analysis tools and data beyond their supported applications. In addition, these tools can be expensive, which can be prohibitive for the academic and/or clinical settings. It is desirable that developments in these fields be based upon open standards that allow the easy interchange of multiple types of data and the subsequent analyses. The adoption of standard file formats should reduce the difficulties in the integration of data derived from different analysis tools.

The ultimate goal for translational systems biology approaches is to bring forth an understanding of the pathogenesis and disease etiology at the organism level that goes beyond what traditional minimalistic approaches have to offer. Such an in-depth understanding of the differences between the healthy and diseased states can help solve crucial clinical issues, and provide markers and insights that aid clinicians in making prognostic and diagnostic evaluations. In terms of atherosclerosis, one of the most important clinical dilemmas is determining if and when a patient is at risk of developing symptomatic disease. A systems biology approach could potentially identify alterations in molecular pathways and targets that precede plaque instability, and thus assist in developing molecular tools that can substitute for imaging modalities such as MRI or PET CT in the more accurate identification of vulnerable lesions. Accordingly, systems biology tools can be utilized to develop concrete clinical applications that will help improve patient selection, monitoring of stroke-preventive intervention, and other needs of the medical […]

The advent of systems biology is bringing forth a change in the philosophy of medicine, and is rapidly changing the way we view the disease process. However, in order to realize the promise of systems biology, i.e. the understanding of the organism as a whole, the next major challenge is to facilitate integrated analysis of data from multiple sources.102 Without the integration of individual networks and biochemical pathways into the entire system, the observed effects of individual components remain without meaning and context, and cannot provide understanding of pathological processes at the systems level. Some steps in the direction of integrated analyses have already been made,33 but increased integration of heterogeneous data and networks is non-trivial. The potential of combining the knowledge from multiple networks with high-throughput data, as exemplified herein by the KegArray tool and the KEGG database, will move us one step further towards a true understanding of the living organism. The rapid advances in computer sciences and high-throughput technologies, coupled with paradigm shifts in the way clinical and pre-clinical researchers perceive science, hold the key to understanding the intricate systems that dictate the switch from healthy to diseased, and represent the path that will lead us to true personalized medicine.

Acknowledgements

This research was supported by the Åke Wibergs Stiftelse, the Fredrik and Ingrid Thurings Stiftelse, The Royal Swedish Academy of Sciences, the Swedish Heart-Lung Foundation and the Japanese Society for the Promotion of Science (JSPS). C.E.W was supported by a Center for Allergy Research Fellowship. R.K. and M.v.E. received support from the TNO Research Program VP9 Personalized Health.

References

1 H. Kitano, Science, 2002, 295, 1662–1664.
2 A. L. Barabasi and Z. N. Oltvai, Nat. Rev. Genet., 2004, 5, 101–113.
3 A. C. Ahn, M. Tewari, C. S. Poon and R. S. Phillips, PLoS Med., 2006, 3, e208.
4 A. C. Ahn, M. Tewari, C. S. Poon and R. S. Phillips, PLoS Med., 2006, 3, e209.
5 A. D. McCulloch and G. Paternostro, Ann. N. Y. Acad. Sci., 2005, 1047, 283–295.
6 A. D. Weston and L. Hood, J. Proteome Res., 2004, 3, 179–196.
7 J. van der Greef, T. Hankemeier and R. N. McBurney, Pharmacogenomics, 2006, 7, 1087–1094.
8 J. van der Greef, S. Martin, P. Juhasz, A. Adourian, T. Plasterer, E. R. Verheij and R. N. McBurney, J. Proteome Res., 2007, 6, […]
9 D. J. Lockhart and E. A. Winzeler, Nature, 2000, 405, 827–836.
10 R. Aebersold and M. Mann, Nature, 2003, 422, 198–207.
11 B. Domon and R. Aebersold, Science, 2006, 312, 212–217.
12 A. R. Joyce and B. O. Palsson, Nat. Rev. Mol. Cell Biol., 2006, 7, […]
13 J. C. Smith and D. Figeys, Mol. BioSyst., 2006, 2, 364–370.
14 B. F. Cravatt, G. M. Simon and J. R. Yates, 3rd, Nature, 2007, 450, 991–1000.
15 K. Dettmer, P. A. Aronov and B. D. Hammock, Mass Spectrom. Rev., 2007, 26, 51–78.
16 X. Han, A. Aslanian and J. R. Yates 3rd, Curr. Opin. Chem. Biol., 2008, 12, 483–490.
17 J. Zaia, Chem. Biol., 2008, 15, 881–892.
18 H. Jeong, B. Tombor, R. Albert, Z. N. Oltvai and A. L. Barabasi, Nature, 2000, 407, 651–654.
19 S. S. Shen-Orr, R. Milo, S. Mangan and U. Alon, Nat. Genet., 2002, 31, 64–68.
20 E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai and A. L. Barabasi, Science, 2002, 297, 1551–1555.
21 R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii and U. Alon, Science, 2002, 298, 824–827.
22 […] C. D. Maranas, Genome Res., 2004, 14, 301–312.
23 E. V. Nikolaev, A. P. Burgard and C. D. Maranas, Biophys. J., 2005, 88, 37–49.
24 V. Vermeirssen, M. I. Barrasa, C. A. Hidalgo, J. A. Babon, […] A. J. Walhout, Genome Res., 2007, 17, 1061–1071.
25 M. Stoll, A. W. Cowley, Jr, P. J. Tonellato, A. S. Greene, M. L. Kaldunski, R. J. Roman, P. Dumas, N. J. Schork, Z. Wang and H. J. Jacob, Science, 2001, 294, 1723–1726.
26 S. E. Calvano, W. Xiao, D. R. Richards, R. M. Felciano, H. V. Baker, R. J. Cho, R. O. Chen, B. H. Brownstein, J. P. Cobb, S. K. Tschoeke, C. Miller-Graziano, L. L. Moldawer, M. N. Mindrinos, R. W. Davis, R. G. Tompkins and S. F. Lowry, Nature, 2005, 437, 1032–1037.
27 K. I. Goh, M. E. Cusick, D. Valle, B. Childs, M. Vidal and A. L. Barabasi, Proc. Natl. Acad. Sci. U. S. A., 2007, 104, 8685–8690.
28 X. Wu, R. Jiang, M. Q. Zhang and S. Li, Mol. Syst. Biol., 2008, 4, 189.
29 U. Alon, Nat. Rev. Genet., 2007, 8, 450–461.
30 M. Isalan, C. Lemerle, K. Michalodimitrakis, C. Horn, P. Beltrao, E. Raineri, M. Garriga-Canut and L. Serrano, Nature, 2008, 452, 840–845.
31 F. Markowetz and R. Spang, BMC Bioinf., 2007, 8(Suppl 6), S5.
32 T. Schlitt and A. Brazma, BMC Bioinf., 2007, 8(Suppl 6), S9.
33 N. Ishii, K. Nakahigashi, T. Baba, M. Robert, T. Soga, A. Kanai, T. Hirasawa, M. Naba, K. Hirai, A. Hoque, P. Y. Ho, Y. Kakazu, K. Sugawara, S. Igarashi, S. Harada, T. Masuda, N. Sugiyama, T. Togashi, M. Hasegawa, Y. Takai, K. Yugi, K. Arakawa, N. Iwata, Y. Toya, Y. Nakayama, T. Nishioka, K. Shimizu, H. Mori and M. Tomita, Science, 2007, 316, 593–597.
34 J. Zhu, B. Zhang, E. N. Smith, B. Drees, R. B. Brem, L. Kruglyak, R. E. Bumgarner and E. E. Schadt, Nat. Genet., 2008, 40, 854–861.
35 C. Bonferroni, Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze, 1936, vol. 8, pp. 3–62.
36 R. G. Miller, Simultaneous Statistical Inference, Springer Verlag, New York, 1981, pp. 6–8.
37 Y. Benjamini and Y. Hochberg, J. R. Stat. Soc. Ser. B (Methodological), 1995, 289–300.
38 J. Trygg and S. Wold, J. Chemom., 2002, 16, 119–128.
39 L. Hood and R. M. Perlmutter, Nat. Biotechnol., 2004, 22, 1215–1217.
40 I. Graham, D. Atar, K. Borch-Johnsen, G. Boysen, G. Burell, R. Cifkova, J. Dallongeville, G. De Backer, S. Ebrahim, B. Gjelsvik, C. Herrmann-Lingen, A. Hoes, S. Humphries, M. Knapton, J. Perk, S. G. Priori, K. Pyorala, Z. Reiner, L. Ruilope, S. […] P. Weissberg, D. Wood, J. Yarnell, J. L. Zamorano, E. Walma, T. Fitzgerald, M. T. Cooney, A. Dudina, A. Vahanian, J. Camm, R. De Caterina, V. Dean, K. Dickstein, C. Funck-Brentano, G. Filippatos, I. Hellemans, S. D. Kristensen, K. McGregor, U. Sechtem, S. Silber, M. Tendera, P. Widimsky, J. L. Zamorano, I. Hellemans, A. Altiner, E. Bonora, P. N. Durrington, R. Fagard, S. Giampaoli, H. Hemingway, J. Hakansson, S. E. Kjeldsen, M. L. Larsen, G. Mancia, A. J. Manolis, […] L. Tokgozoglu, O. Wiklund and A. Zampelas, Eur. Heart J., 2007, 28, 2375–2414.
41 W. Rosamond, K. Flegal, K. Furie, A. Go, K. Greenlund, N. Haase, S. M. Hailpern, M. Ho, V. Howard, B. Kissela, S. Kittner, D. Lloyd-Jones, M. McDermott, J. Meigs, C. Moy, G. Nichol, C. O'Donnell, V. Roger, P. Sorlie, J. Steinberger, T. Thom, M. Wilson and Y. Hong, Circulation, 2008, 117, […]
42 D. B. Mark, F. J. Van de Werf, R. J. Simes, H. D. White, L. C. Wallentin, R. M. Califf and P. W. Armstrong, Eur. Heart J., 2007, 28, 2678–2684.
43 A. J. Lusis, J. Lipid Res., 2006, 47, 1887–1890.
44 A. J. Lusis, Nature, 2000, 407, 233–241.
45 G. K. Hansson, N. Engl. J. Med., 2005, 352, 1685–1695.
46 G. K. Hansson and J. Nilsson, J. Intern. Med., 2008, 263, […]
47 P. K. Shreenivasaiah, S. H. Rho, T. Kim and H. Kim do, J. Mol. Cell. Cardiol., 2008, 44, 460–469.
48 J. T. Brindle, H. Antti, E. Holmes, G. Tranter, J. K. Nicholson, H. W. Bethell, S. Clarke, P. M. Schofield, E. McKilligin, D. E. Mosedale and D. J. Grainger, Nat. Med., 2002, 8, […]
49 H. L. Kirschenlohr, J. L. Griffin, S. C. Clarke, R. Rhydwen, A. A. Grace, P. M. Schofield, K. M. Brindle and J. C. Metcalfe, Nat. Med., 2006, 12, 705–710.
50 A. M. van den Maagdenberg, M. H. Hofker, P. J. Krimpenfort, I. de Bruijn, B. van Vlijmen, H. van der Boom, L. M. Havekes and R. R. Frants, J. Biol. Chem., 1993, 268, 10540–10545.
51 C. B. Clish, E. Davidov, M. Oresic, T. N. Plasterer, G. Lavine, T. Londo, M. Meys, P. Snell, W. Stochaj, A. Adourian, X. Zhang, N. Morel, E. Neumann, E. Verheij, J. T. Vogels, L. M. Havekes, N. Afeyan, F. Regnier, J. van der Greef and S. Naylor, Omics, 2004, 8, 3–13.
52 M. Oresic, C. B. Clish, E. J. Davidov, E. Verheij, J. Vogels, L. M. Havekes, E. Neumann, A. Adourian, S. Naylor, J. van der Greef and T. Plasterer, Appl. Bioinf., 2004, 3, 205–217.
53 B. de Roos, G. Rucklidge, M. Reid, K. Ross, G. Duncan, M. A. Navarro, J. M. Arbones-Mainar, M. A. Guzman-Garcia, J. Osada, J. Browne, C. E. Loscher and H. M. Roche, FASEB J., 2005, 19, 1746–1748.
54 J. Y. King, R. Ferrara, R. Tabibiazar, J. M. Spin, M. M. Chen, A. Kuchinsky, A. Vailaya, R. Kincaid, A. Tsalenko, D. X. Deng, A. Connolly, P. Zhang, E. Yang, C. Watt, Z. Yakhini, A. Ben-Dor, A. Adler, L. Bruhn, P. Tsao, T. Quertermous and E. A. Ashley, Physiol. Genomics, 2005, 23, 103–118.
55 P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang, D. Ramage, N. Amin, B. Schwikowski and T. Ideker, Genome Res., 2003, 13, 2498–2504.
56 J. Cohen, Nature, 2002, 420, 885–891.
57 H. W. Tseng, H. F. Juan, H. C. Huang, J. Y. Lin, S. Sinchaikul, T. C. Lai, C. F. Chen, S. T. Chen and G. J. Wang, Proteomics, 2006, 6, 5915–5928.
58 H. J. Chung, M. Kim, C. H. Park, J. Kim and J. H. Kim, Nucleic Acids Res., 2004, 32, W460–464.
59 D. M. Wuttge, A. Sirsjo, P. Eriksson and S. Stemme, Mol. Med., 2001, 7, 383–392.
60 K. Jatta, D. Wagsater, L. Norgren, B. Stenberg and A. Sirsjo, J. Vasc. Res., 2005, 42, 266–271.
61 P. S. Olofsson, K. Jatta, D. Wagsater, S. Gredmark, U. Hedin, G. Paulsson-Berne, C. Soderberg-Naucler, G. K. Hansson and A. Sirsjo, Arterioscler. Thromb. Vasc. Biol., 2005, 25, e113–116.
62 P. S. Olofsson, L. A. Soderstrom, D. Wagsater, Y. Sheikine, P. Ocaya, F. Lang, C. Rabu, L. Chen, M. Rudling, P. Aukrust, U. Hedin, G. Paulsson-Berne, A. Sirsjo and G. K. Hansson, Circulation, 2008, 117, 1292–1301.
63 R. Laaksonen, M. Katajamaa, H. Paiva, M. Sysi-Aho, L. Saarinen, P. Junni, D. Lutjohann, J. Smet, R. Van Coster, T. Seppanen-Laakso, T. Lehtimaki, J. Soini and M. Oresic, PLoS One, 2006, 1, e97.
64 J. H. Dwyer, H. Allayee, K. M. Dwyer, J. Fan, H. Wu, R. Mar, A. J. Lusis and M. Mehrabian, N. Engl. J. Med., 2004, 350, 29–37.
65 H. Qiu, A. Gabrielsen, H. E. Agardh, M. Wan, A. Wetterholm, C. H. Wong, U. Hedin, J. Swedenborg, G. K. Hansson, B. Samuelsson, G. Paulsson-Berne and J. Z. Haeggstrom, Proc. Natl. Acad. Sci. U. S. A., 2006, 103, 8161–8166.
66 K. H. Pietiläinen, M. Sysi-Aho, A. Rissanen, T. Seppänen-Laakso, H. Yki-Järvinen, J. Kaprio and M. Oresic, PLoS One, 2007, 2, e218.
67 J. Nikkila, M. Sysi-Aho, A. Ermolov, T. Seppanen-Laakso, O. Simell, S. Kaski and M. Oresic, Mol. Syst. Biol., 2008, 4, 197.
68 J. Skogsberg, J. Lundström, A. Kovacs, R. Nilsson, P. Noori, S. Maleki, M. Köhler, A. Hamsten, J. Tegner and J. Björkegren, PLoS Genetics, 2008, 4, e1000036.
69 R. Kleemann, L. Verschuren, M. J. van Erk, Y. Nikolsky, N. H. Cnubben, E. R. Verheij, A. K. Smilde, H. F. Hendriks, S. Zadelaar, G. J. Smith, V. Kaznacheev, T. Nikolskaya, A. Melnikov, E. Hurt-Camejo, J. van der Greef, B. van Ommen and T. Kooistra, Genome Biol., 2007, 8, R200.
70 P. Libby, Nature, 2002, 420, 868–874.
71 A. Ng, B. Bursteinas, Q. Gao, E. Mollison and M. Zvelebil, Briefings Bioinf., 2006, 7, 318–330.
72 M. Kanehisa, M. Araki, S. Goto, M. Hattori, M. Hirakawa, M. Itoh, T. Katayama, S. Kawashima, S. Okuda, T. Tokimatsu and Y. Yamanishi, Nucleic Acids Res., 2008, 36, D480–484.
73 M. P. van Iersel, T. Kelder, A. R. Pico, K. Hanspers, S. Coort, B. R. Conklin and C. Evelo, BMC Bioinf., 2008, 9, 399.
74 A. Ng, B. Bursteinas, Q. Gao, E. Mollison and M. Zvelebil, Nucleic Acids Res., 2006, 34, D527–534.
75 S. Ekins, Y. Nikolsky, A. Bugrim, E. Kirillov and T. Nikolskaya, Methods Mol. Biol., 2007, 356, 319–350.
76 B. H. Junker, C. Klukas and F. Schreiber, BMC Bioinf., 2006, 7, […]
77 S. Draghici, P. Khatri, A. L. Tarca, K. Amin, A. Done, C. Voichita, C. Georgescu and R. Romero, Genome Res., 2007, 17, 1537–1545.
78 M. Hucka, A. Finney, H. M. Sauro, H. Bolouri, J. C. Doyle, H. Kitano, A. P. Arkin, B. J. Bornstein, D. Bray, A. Cornish-Bowden, A. A. Cuellar, S. Dronov, E. D. Gilles, M. Ginkel, V. Gor, Goryanin, II, W. J. Hedley, T. C. Hodgman, J. H. Hofmeyr, P. J. Hunter, N. S. Juty, J. L. Kasberger, A. Kremling, U. Kummer, N. Le Novere, L. M. Loew, […] Y. Nakayama, M. R. Nelson, P. F. Nielsen, T. Sakurada, J. C. Schaff, B. E. Shapiro, T. S. Shimizu, H. D. Spence, J. Stelling, K. Takahashi, M. Tomita, J. Wagner and J. Wang, Bioinformatics, 2003, 19, 524–531.
79 R. C. Gentleman, V. J. Carey, D. M. Bates, B. Bolstad, M. Dettling, S. Dudoit, B. Ellis, L. Gautier, Y. Ge, J. Gentry, K. Hornik, T. Hothorn, W. Huber, S. Iacus, R. Irizarry, F. Leisch, C. Li, M. Maechler, A. J. Rossini, G. Sawitzki, C. Smith, G. Smyth, L. Tierney, J. Y. Yang and J. Zhang, Genome Biol., 2004, 5, R80.
80 K. Aoki and M. Kanehisa, Current Protocols in Bioinformatics, 2005, chapter 1, unit 1.12.
81 C. Klukas and F. Schreiber, Bioinformatics, 2007, 23, 344–350.
82 P. H. Lee and D. Lee, Bioinformatics, 2005, 21, 2739–2747.
83 A. Vailaya, P. Bluvas, R. Kincaid, A. Kuchinsky, M. Creech and A. Adler, Bioinformatics, 2005, 21, 430–438.
84 M. Reimers and V. J. Carey, Methods Enzymol., 2006, 411, 119–134.
85 J. Schafer and K. Strimmer, Bioinformatics, 2005, 21, 754–764.
86 D. Scholtens, M. Vidal and R. Gentleman, Bioinformatics, 2005, 21, 3548–3557.
87 V. J. Carey, J. Gentry, E. Whalen and R. Gentleman, Bioinformatics, 2005, 21, 135–136.
88 P. T. Shannon, D. J. Reiss, R. Bonneau and N. S. Baliga, BMC Bioinf., 2006, 7, 176.
89 […] M. H. Johnson and T. Galitski, BMC Bioinf., 2006, 7, 286.
90 M. Baitaluk, M. Sedova, A. Ray and A. Gupta, Nucleic Acids Res., 2006, 34, W466–471.
91 M. Baitaluk, X. Qian, S. Godbole, A. Raval, A. Ray and A. Gupta, BMC Bioinf., 2006, 7, 55.
92 C. F. Taylor, N. W. Paton, K. S. Lilley, P. A. Binz, R. K. Julian Jr, A. R. Jones, W. Zhu, R. Apweiler, R. Aebersold, E. W. Deutsch, M. J. Dunn, A. J. Heck, A. Leitner, M. Macht, M. Mann, L. Martens, T. A. Neubert, S. D. Patterson, P. Ping, S. L. Seymour, P. Souda, A. Tsugita, J. Vandekerckhove, T. M. Vondriska, J. P. Whitelegge, M. R. Wilkins, I. Xenarios, J. R. Yates, 3rd and H. Hermjakob, Nat. Biotechnol., 2007, 25, […]
93 S. A. Sansone, T. Fan, R. Goodacre, J. L. Griffin, N. W. Hardy, R. Kaddurah-Daouk, B. S. Kristal, J. Lindon, P. Mendes, N. Morrison, B. Nikolau, D. Robertson, L. W. Sumner, C. Taylor, M. van der Werf, B. van Ommen and O. Fiehn, Nat. Biotechnol., 2007, 25, 846–848.
94 G. An, J. Crit. Care, 2006, 21, 105–110; discussion 110–101.
95 Y. Vodovotz, Immunol. Res., 2006, 36, 237–245.
96 Y. Vodovotz, C. C. Chow, J. Bartels, C. Lagoa, J. M. Prince, R. M. Levy, R. Kumar, J. Day, J. Rubin, G. Constantine, T. R. Billiar, M. P. Fink and G. Clermont, Shock, 2006, 26, […]
97 Y. Vodovotz, M. Csete, J. Bartels, S. Chang and G. An, PLoS Comput. Biol., 2008, 4, e1000014.
98 D. Noble, Science, 2002, 295, 1678–1682.
99 B. J. Bennett, C. E. Romanoski and A. J. Lusis, Expert Rev. Cardiovasc. Ther., 2007, 5, 1095–1103.
100 S. Y. Shin, S. M. Choo, S. H. Woo and K. H. Cho, Adv. Biochem. Eng. Biotechnol., 2008, 110, 25–45.
101 R. C. Kerckhoffs, S. M. Narayan, J. H. Omens, L. J. Mulligan and A. D. McCulloch, Heart Fail Clin., 2008, 4, 371–378.
102 U. Sauer, M. Heinemann and N. Zamboni, Science, 2007, 316, […]

