Semantic Data Platform for Healthcare
D3.1 Sketch of system architecture specification
WP3 – Architecture and Requirements
Lead beneficiary: MUG
Date: 31/03/2014
Nature: Report
Dissemination level: PU

D3.1 – Sketch of system architecture specification
WP3: Architecture and Requirements
Dissemination level: Public
Authors: Philipp Daumke, Carla Haid, Luke Mertens (Averbis), Stefan Schulz (MUG)
Version: 1.0 Final

TABLE OF CONTENTS
DOCUMENT INFORMATION
DOCUMENT HISTORY
DEFINITIONS
EXECUTIVE SUMMARY
KEY WORDS (WORDLE STYLE)
INTRODUCTION
ABOUT SEMCARE
MOTIVATION AND BACKGROUND
PROJECT DESCRIPTION
ABOUT THIS DOCUMENT
AIM OF THIS DOCUMENT
DOCUMENT STRUCTURE
APPLICATION SCENARIO / REQUIREMENTS
USE CASE
BACKGROUND AND MOTIVATION
APPROACH
TOPICS OF INTEREST AND THEIR (TEXTUAL) REPRESENTATION IN EHRS
REQUIREMENTS
FUNCTIONAL REQUIREMENTS
NON-FUNCTIONAL REQUIREMENTS
ARCHITECTURE
OVERVIEW
INTERFACES
DATA MODELS
INPUT DATA
TERMINOLOGIES
Copyright 2014-2015 SEMCARE Consortium. This project has received funding from the European Union's
Seventh Programme for research, technological development and demonstration under grant agreement No 611388.

I2B2 STAR SCHEMA
SEMCARE PATIENT RECORD SOLR DOCUMENT
SEMCARE DATA LOADING FLOW
MODULES & FUNCTIONAL VIEW
SEMCARE DATA IMPORTER
SOLR
AVERBIS TEXT ANALYTICS (AEP)
SEMCARE PORTAL WEB APPLICATION
I2B2 APPLICATIONS
THIRD PARTY TOOLS AND APPLICATIONS
SCALABILITY
USERS & ROLES
OPEN POINTS
DATA PRIVACY / TECHNICAL AND ORGANIZATIONAL SECURITY PROCEDURES
DATA PROCESSING
DATA TRANSFER AND DATA LOCATION
ROLE CONCEPT
AVAILABILITY CONTROL
DATA SEPARATION CONTROL
TABLE OF FIGURES
FIGURE 1: SYSTEMS INVOLVED IN THE SEMCARE ARCHITECTURE
FIGURE 2: ARCHITECTURE SKETCH
FIGURE 3: ARCHITECTURE LAYERING
FIGURE 4: AVERBIS SEARCH REST API
FIGURE 5: REFINEMENT PROCESS FOR CRITERIA
FIGURE 6: I2B2 STAR SCHEMA
FIGURE 7: I2B2 CUSTOM_META TABLE
FIGURE 8: CUSTOM METADATA IN I2B2 TERM NAVIGATOR
FIGURE 9: SOLR TO I2B2 MAPPING
FIGURE 10: MAPPING OF SOLR DOCUMENTS TO I2B2 DATABASE
FIGURE 11: SEMCARE DATA LOADING FLOW
FIGURE 12: TALEND OPEN STUDIO DATA IMPORTER
FIGURE 13: SEMCARE COMPONENTS

DOCUMENT INFORMATION
Grant agreement number: ICT-611388
Acronym: SEMCARE
Full title: Semantic Data Platform for Healthcare
EU Project officer: Saila Rinne
Deliverable: D3.1 Sketch of system architecture specification
Work package: WP3 Architecture and Requirements
Date of delivery (contractual): 31.03.2014
Version: V1.0 Final
Dissemination level: Public
Authors (Partner): Philipp Daumke, Carla Haid, Luke Mertens (Averbis), Stefan Schulz (MUG)
Responsible author: Stefan Schulz, Partner: MUG, Phone: +43 699 150 96 270

DOCUMENT HISTORY
Description of changes:
• Initial Creation
• Corrections, comments, additions
• Corrections, additions
• Corrections, additions
• Corrections, additions
• Internal formal review (A. Honrado, E. Chavarría)

DEFINITIONS
• Partners of the SEMCARE Consortium are referred to herein according to the following codes:
  AVERBIS - Averbis GmbH (Germany) – Coordinator
  EMC - Erasmus Universitair Medisch Centrum Rotterdam (Netherlands) – Beneficiary
  MUG - Medical University of Graz (Austria) – Beneficiary
  SGUL - Saint George's University of London (UK) – Beneficiary
  SYNAPSE - Synapse Research Management Partners S.L. (Spain) – Beneficiary
• Project: The sum of all activities carried out in the framework of the Grant Agreement.
• AEP: Averbis Extraction Platform; text analysis tool to extract information units such as facts and relations from unstructured text
• CUI: Concept unique identifier in the Unified Medical Language System (UMLS)
• EHR: Electronic health record; clinical data record of a patient
• ETL: extract, transform, load; process in data warehousing that is often used to integrate data from multiple sources. A common ETL tool is Talend Open Studio.
• GUI: Graphical user interface of the application
• Graph DB: Database using graph structures with nodes, edges, and properties to represent and store data. Compared to a relational database it can be faster and scale better for large, highly connected data sets.
• HL7v2 format: Health Level Seven; universal standard for the exchange of electronic health
• i2b2: Informatics for Integrating Biology and the Bedside; scalable informatics framework for
• REST: Representational State Transfer; architectural style for communication between two components, here using JSON (JavaScript Object Notation) messages
• Solr: Open source search platform built on Apache Lucene, with the Java client SolrJ
• Terminology: General term for information artefacts that provide controlled terms for a domain, identifiers of meaning and semantic relations, e.g. SNOMED, ICD-10, MeSH
• Term Browser: Tool provided by Averbis to load, view, modify and export terminologies. It can also be used to create new terminologies.
• UIMA: Unstructured Information Management Architecture; framework by Apache enabling the generation of analysis pipelines for arbitrary content such as text, image or video data

EXECUTIVE SUMMARY

The initial task in work package 3 is the agreement on a generic architecture for the semantic data platform SEMCARE. This document gives a first overview of the planned system architecture for the project. The considerations about the architecture are a fundamental step in the development of such a data platform and therefore constitute an essential task right from the beginning of the project.

The architectural design decisions are driven by several dimensions. First, the use cases covered by the project must be defined in order to evaluate the resulting requirements. As the SEMCARE software will be installed within the different partner hospitals, integration into the clinical IT landscape should be simple. Furthermore, data governance, privacy and security have to be kept in mind when developing the system architecture. Another important requirement is the scalability of the system, so that large data sets can be processed. Finally, the SEMCARE architecture should be constructed in a way that enables seamless integration of other platforms and applications, which is also called an 'Open Architecture'. This can be achieved by using standard components.

The main goal of the architecture is to provide a framework to extract meaningful information out of a broad range of structured and unstructured information from the Electronic Health Record. To this end, several systems and resources have to be integrated within a common framework. Some of these components are brought in and adapted by the partners, such as an extraction platform and a terminology browser. Others are available as free software, such as indexing tools and semantic repositories.

Domain terminology resources constitute another cornerstone of this framework. Whereas the coverage of existing terminologies is already very good for English, the two other languages addressed by SEMCARE, viz. Dutch and German, are less well served. Filling these terminology gaps will require a combination of automated and manual term acquisition.

KEY WORDS (Wordle style)

[Figure: word cloud of the key words, generated with http://www.wordle.net/]

1. Introduction

1.1. About SEMCARE

1.1.1. Motivation and Background
The exploitation of medical data from clinical trials, and thus the monitoring and improvement of healthcare delivery, is of increasing interest. However, up to 80% of clinical trials fail to meet their patient enrolment quotas on time. This recruitment delay currently causes up to $8 million per day in lost revenue for the pharmaceutical industry. The SEMCARE project will provide a more efficient way of recruiting patients, which will help to prevent recruitment delays.

SEMCARE also addresses another challenge in health care: the identification of rare diseases. Such diseases are often hard for doctors to diagnose because they are hardly known, which results in a number of undiagnosed or even wrongly diagnosed patients. SEMCARE will use available clinical patient data to combine signs and symptoms, thereby detecting undiagnosed patients suffering from rare diseases. This will help to speed up research on this group of diseases. For pharmaceutical companies, every newly diagnosed patient is of great interest, as each one generates up to $300,000 in drug revenue per year.

1.1.2. Project Description
The two-year research project SEMCARE 'Semantic Data Platform for Healthcare' is funded by the European Commission's Seventh Framework Programme. The aim of the project is the development of a software platform that facilitates the diagnosis of rare diseases in various health care contexts and supports the selection of appropriate patients for clinical studies, based on the automated, contextual evaluation of existing patient data. SEMCARE will combine current text-mining technologies with multi-lingual terminologies in order to develop solutions for typical problems that arise when interpreting medical narratives, e.g. ambiguities, abbreviations, spelling variations or typos. Testing and optimization of the analysis software for routine medical data in clinics will be performed in leading European health centres in Great Britain, the Netherlands and Austria.

1.2. About this document

1.2.1. Aim of this document
A fundamental task in work package 3 is the agreement on a generic architecture for the semantic data platform SEMCARE. This document gives a first overview of the planned system architecture for the project. The considerations about the architecture are a crucial step in the development of such a data platform, which makes them essential right from the beginning of the project. Before the architecture can be designed, it must be defined which use cases are covered in the project and which requirements result from them. This deliverable contains only the basic requirements. More specific, user-defined requirements related to the prototype will be provided in D3.2.

1.2.2. Document Structure
This document is structured into four main parts. Following an introduction to the SEMCARE project and the document, the use case that the project will focus on is described. The definition of the use case is necessary for identifying the requirements and the demands that are made on the platform and the underlying architecture. As a third step, we show the generic design of the SEMCARE architecture and describe the different modules and how they interact. Finally, the document includes information about the technical and organizational procedures performed at the hospitals with regard to data privacy and security in the context of the SEMCARE systems and

2. Application Scenario / Requirements

2.1. Use Case

The three participating European health centres have agreed on one first general use case on which they will focus during the project. The use case is called 'Risk Stratification and Differential Diagnosis of Patients suffering from transient loss of consciousness'. This use case is described in detail in the following subsections.

2.1.1. Background and motivation
Cardiovascular disease is the cause of 47% of all deaths in Europe, the majority of which are related to underlying coronary artery disease [2]. Sudden cardiac death accounts for approximately half of coronary artery disease related deaths [3] and also occurs in those with non-coronary artery disease related cardiovascular diseases such as cardiomyopathies and inherited channelopathies. Sudden death is also more prevalent in patients with epilepsy and is often unexplained, in which case it is known as SUDEP (Sudden Unexplained Death in Epilepsy). We are currently unable to determine those who are at most risk from SUDEP.

The symptom of transient loss of consciousness (T-LOC) occurs in up to 50% of the general population and leads to 1% of all hospital admissions [4,5,6]. A wide range of conditions can lead to T-LOC. Causes of T-LOC can be broadly categorized as cardiac (such as arrhythmia, in which case it is known as syncope) or non-cardiac (such as epilepsy). Cardiac syncope carries a much more sinister prognosis as it is associated with sudden cardiac death. Fortunately, effective treatments, such as anti-ischemic and heart failure medication and implantation of an implantable cardioverter-defibrillator (ICD), can dramatically improve outcomes. Unfortunately, the clinically assigned aetiology and prognosis of T-LOC is frequently incorrect [4], predominantly due to an inability to differentiate between cardiac syncope and epilepsy and a lack of appreciation of high-risk markers such as exertional T-LOC, T-LOC with palpitation and function and/or pre-existent coronary and/or structural heart disease.

[2] European Cardiovascular Disease Statistics, 2012 edition. European Heart Network and European Society of Cardiology.
[3] Myerburg RJ, Junttila MJ. Sudden cardiac death caused by coronary heart disease. Circulation. 2012 Feb 28;125(8):1043-52.
[4] Fitzpatrick AP, Cooper P. Diagnosis and management of patients with blackouts. Heart. 2006 Apr;92(4):559-68.
[5] Petkar S, Jackson M, Fitzpatrick A. Management of blackouts and misdiagnosis of epilepsy and falls. Clin Med. 2005 Sep-Oct;5(5):514-20.
[6] Brignole M et al. A new management of syncope: prospective systematic guideline-based evaluation of patients referred urgently to general hospitals. Eur Heart J. 2006 Jan;27(1):76-82. Epub 2005 Nov 4.

2.1.2. Approach
A number of phenotypic features can help risk stratify patients, most of which are available from routine assessment and investigations. Using a semantic data platform, we seek to identify high-risk patient cohorts based on patient-level criteria scattered across heterogeneous clinical data contained in electronic healthcare records (EHRs). Subjects belonging to a universal set of interest will have their electronic medical records processed for natural language expressions that denote, often in detail, the patients' clinical history and the procedures or investigations planned or carried out.

In our specific use case, the cases of interest are patients with prior myocardial infarction (MI), syncope of presumed cardiac origin or seizure disorder. The universal set of interest is defined as patients with Transient Loss of Consciousness and/or Sudden Cardiac Arrest and/or sustained Ventricular Arrhythmia and/or Cardiomyopathy and/or Ischemic Heart Disease and/or Seizure Disorder. However, the information extraction methodology we develop and describe is generic and could be adapted to any patient cohort and medical inquiry.

2.1.3. Topics of interest and their (textual) representation in EHRs
The following lists the medical topics of interest, such as procedures and investigations, but also information about the patients' medication and history, that will be used to identify subjects belonging to the universal set of interest described in the use case above. Only the most frequent topics of interest are listed in the table below. Contents of electronic medical records will be processed for typical phrases for topics and attributes. The values of the attributes are assumed to be numeric or Boolean and are therefore not of terminological interest. For example, "normal ECG" would be represented by the attribute "ECG normal" and the value "true", and "QRS interval 0.12 s" would be represented by the attribute "QRS interval in seconds" and the numeric value "0.12". For each topic of interest, some examples of indicative phrases and related attributes are listed in the table below.
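The attribute/value representation described above can be illustrated with a minimal Python sketch. The two rule lists and the function name `extract_attribute` are hypothetical and exist only for this illustration; they are not the SEMCARE or AEP rule set, and a real extraction pipeline would rely on terminologies rather than a handful of regular expressions.

```python
import re
from typing import Optional, Tuple, Union

Value = Union[bool, float]

# Hypothetical rules for illustration only (not the SEMCARE rule set).
# Numeric rules: a pattern with one capture group for the number, plus the attribute name.
NUMERIC_RULES = [
    (re.compile(r"QRS interval\s+(\d+(?:\.\d+)?)\s*s\b", re.IGNORECASE),
     "QRS interval in seconds"),
]

# Boolean rules: a pattern, the attribute name, and the value the phrase implies.
BOOLEAN_RULES = [
    (re.compile(r"\babnormal ECG\b", re.IGNORECASE), "ECG normal", False),
    (re.compile(r"\bnormal ECG\b", re.IGNORECASE), "ECG normal", True),
]


def extract_attribute(phrase: str) -> Optional[Tuple[str, Value]]:
    """Map a phrase to an (attribute, value) pair, or None if no rule matches."""
    for pattern, attribute in NUMERIC_RULES:
        match = pattern.search(phrase)
        if match:
            return attribute, float(match.group(1))
    for pattern, attribute, value in BOOLEAN_RULES:
        if pattern.search(phrase):
            return attribute, value
    return None
```

With these assumed rules, `extract_attribute("normal ECG")` yields the attribute "ECG normal" with the Boolean value true, and `extract_attribute("QRS interval 0.12 s")` yields "QRS interval in seconds" with the numeric value 0.12, mirroring the two examples in the text.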
Table: Topics of interest with indicative phrases for topics and indicative phrases for related attributes (selected examples in English, Dutch and German)

Topic of Interest: Electrocardiogram
Indicative phrases for topics: "electrocardiogram", "elektrocardiogram", "Elektrokardiogram"
Indicative phrases for related attributes: "normal", "normaal", "abnormal ECG", "abnormal electrocardiogram", "PR Interval Duration", "atrioventricular", "atrioventriculaire", "AV", "QRS Interval Duration", "T wave inversion", "T wave abnormality", "ST segment depression", "ST segment elevation", "Bundle Branch Block", "RBBB", "LBBB", "pathological Q Waves", "Atrial fibrillation"

Topic of Interest: Exercise Tolerance
Indicative phrases for topics: "exercise tolerance test", "ETT"
Indicative phrases for related attributes: "normal", "ischemic", "ST segment depression", "T wave inversion", "blood pressure response", "ventricular tachycardia", "VT", "ventricular ectopics", "VEs", "ectopics present", "ectopics absent", "couplets", "triplets", "salvos", "PVCs", "premature ventricular contractions"

Indicative phrases for topics: "holter monitoring", "holter", "24 hour tape", "48 hour tape", "event
Indicative phrases for related attributes: "non sustained VT", "non sustained ventricular tachycardia", "nsVT", "ventricular tachycardia", "ventricular ectopics", "VEs", "ectopics present", "couplets", "triplets", "salvos", "PVCs", "premature ventricular contractions",

Indicative phrases for topics: "cardiac catherisation", "catherization", "cath", "angiogram", "coronary angiogram"
Indicative phrases for related attributes: "normal", "unobstructed", "normal coronaries", "normal coronary arteries", "normal coronary angiography", "smooth coronary arteries", "stenosis", "stenoses", "obstruction"

Indicative phrases for topics: "echocardiogram", "echocardiografie", "echo", "TTE", "heart scan", "Echokardiogramm"
Indicative phrases for related attributes: "normal heart", "no cardiomyopathy", "normal, "ejection fraction", "ventricular function", "ventricular dysfunction", "poor ventricular function", "impaired LV", "impaired left ventricular", "aortic stenosis", "mitral stenosis", "pulmonary hypertension"

Indicative phrases for topics: "CMR", "CMRI", "Cardiac MRI", "MRI - Cardiac", "Kardiales MRI", "Herz-MRI"
Indicative phrases for related attributes: "normal", "ejection fraction", "ventricular function", "late gadolinium enhancement", "Scar", "regional wall motion abnormality"

Indicative phrases for topics: "Blood Tests", "Bloods", "Biochemistry", "Full Blood Count", "FBC", "Troponin", "Toxicology", "Blutbild"
Indicative phrases for related attributes: "normal", "abnormal", "elevated", "raised", "low"

Indicative phrases for topics: "age", "DOB", "date of birth", "Geburtsdatum", "Alter", "geboortedatum"
Indicative phrases for related attributes: "alter"

Indicative phrases for topics: "medications", "meds", "drugs History", "is on", "Medikamente"
Indicative phrases for related attributes: "drug name", "substance name", "dose", "Furosemide", "Frusemide", "Metolazone", "Eplerenone", "Spironolactone", "Dosis"

Indicative phrases for topics: "family history of", "sudden cardiac arrest", "unexplained death", "brother died suddenly", "cousin died
Indicative phrases for related attributes: "degree of relative", "first", "second", "mother", "father", "brother", "sister", "aunt", "uncle", "son", "daughter", "Vater", "Mutter", "Onkel", "vader", "broer", "zuster", "tante", "oom", "zoon", "dochter"

Indicative phrases for topics: "VT", "VF", "ventricular tachycardia", "polymorphic VT", "ventricular fibrillation", "torsades", "resuscitated sudden death", "resuscitated SCD", "Arrest", "Cardiac arrest", "VF arrest", "Plötzlicher Herztod", "Sekundentod"
Indicative phrases for related attributes: in context of ventricular fibrillation: "idiopathic", "no cause", "no aetiology", "idiopathisch", "ohne erkennbare Ursache"

Indicative phrases for topics: "syncope", "near syncope", "pre-syncope", "presyncope", "blackout", "black-out", "collapse", "faint", "loss of consciousness", "LOC", "TLOC", "T-LOC", "pass out", "passing out", "passed out"
Indicative phrases for related attributes: "on exertion", "exertional", "on exercise", "exercise related", "exercise induced", "stress related", "catecholamine related", "emotion induced", "while running", "whilst running", "mid-stride", "in Verbindung mit Stress", "prolonged standing", "prodromal symptoms", "coughing", "micturition", "passing water", "urinating", "swallowing"

Indicative phrases for topics: "heart failure", "HF", "CCF", "cardiomyopathy", "breathlessness", "NYHA II", "NYHA III", "NYHA IV", "Herzversagen", "Herzinsuffizienz"
Indicative phrases for related attributes: "Severe", "Gross", "Moderate", "Mild", "NYHA Class I", "NYHA Class II", "NYHA Class III", "NYHA Class IV", "NYHA I"

Indicative phrases for topics: "Myocardial infarction", "acute coronary syndrome", "ACS", "ischaemic heart disease", "IHD", "CAD", "Angina", "Previous stents"
Indicative phrases for related attributes: "STEMI", "nonSTEMI", "non-STEMI", "NSTEMI", "Unstable Angina", "Troponin rise", "Previous stents", "PCI", "angioplasty", "CABG"

Indicative phrases for topics: "seizure disorder", "epilepsy", "fitting", "fits", "status epilepticus", "Krämpfe", "Epilepsie", "Anfall"
Indicative phrases for related attributes: "Type", "Petit Mal", "Status Epilepticus", "limb jerking"
2.2. Requirements

This section describes the basic requirements arising from the use case. A more detailed description of the user-specific requirements will be provided after the first prototype has been developed. This description will be part of D3.2.

2.2.1. Functional Requirements
In order to identify candidates matching the aforementioned criteria, arbitrary types of free-text documents in patient records have to be gathered, pre-processed and analysed. Hence, in a first stage, interfaces to existing clinical IT systems have to be established to consolidate the data from each relevant resource. This stage generally also includes a data transformation process that maps, for instance, HL7 encoded data to the target schema of a central knowledge store. These kinds of tasks are well supported by ETL (extraction, transformation, loading) tools such as Talend Open Studio [7] or Pipeline Pilot [8].

Furthermore, the identification of use case specific criteria (e.g. 'loss of consciousness whilst running') within clinical narratives requires an information extraction system that can handle the variety of isosemantic lexical and syntactic variants found in the texts. Consequently, for each criterion and attribute of interest, numerous synonymous expressions have to be considered in order to achieve a high recall of relevant candidates. To handle this complexity we will use a Solr search engine combined with several domain terminologies like SNOMED CT, ICD-10 or MeSH.

One main focus of the SEMCARE platform is end-user support in the criteria refinement process. This is not trivial, as it will require a dialogue with the users in order to acquire custom expressions that enhance the terminological coverage. Details on this refinement process are described in section 3 below.

Another key aspect is the language of the document. Text processing tools have to consider the particular syntax and grammar of a language, and the terminology has to be specific to that language as well. Furthermore, regional particularities such as punctuation have to be accounted for. Examples are the decimal point in English, as opposed to the decimal comma in German and Dutch, or different units of measurement used for the same laboratory observations.

[7] http://talend.com/products/talend-open-studio
[8] http://accelrys.com/products/pipeline-pilot
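To make the decimal separator issue concrete, the following Python sketch reads the first number in a text fragment according to the language of the document. The mapping `DECIMAL_SEPARATOR`, the function name `parse_measurement` and the sample strings are assumptions made for this illustration; thousands separators and unit conversion are deliberately left out.

```python
import re
from typing import Optional

# Assumed mapping from document language to decimal separator (illustration only).
DECIMAL_SEPARATOR = {"en": ".", "de": ",", "nl": ","}


def parse_measurement(text: str, language: str) -> Optional[float]:
    """Return the first number found in text, read with the language's decimal separator."""
    separator = DECIMAL_SEPARATOR[language]
    # Digits, optionally followed by the language-specific separator and more digits.
    pattern = re.compile(r"\d+(?:" + re.escape(separator) + r"\d+)?")
    match = pattern.search(text)
    if match is None:
        return None
    # Normalise to the dot notation expected by float().
    return float(match.group(0).replace(separator, "."))
```

Under these assumptions, an English fragment such as "QRS interval 0.12 s" and a German fragment written with a decimal comma ("0,12 s") both resolve to the same numeric value, so downstream numeric criteria do not depend on the language of the source document.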
2.2.2. Non-Functional Requirements

The non-functional requirements elaborate the performance characteristics of the SEMCARE system.

• The SEMCARE graphical user interface (GUI) should be easy and intuitive to use.
• Transparent ranking: The ranking of the results after submission of a user query should be transparent and traceable. Users should be able to understand how they can refine their query in order to get better results.
• Compatibility of GUI for browsers in use: The web-based GUI of the system must be compatible with the browser versions used in the hospitals.
• Low response time: The response times while using the SEMCARE platform should be short in order to provide a user-friendly service. The performance of the system depends on several parameters such as:
  b) size of the index and main storage
  c) number of parallel requests
  d) strategy of authorization
• Each component of the SEMCARE architecture is platform independent, as Java will be used for the implementation.
• Security / privacy: It must be guaranteed that only authorized people can access the clinical data.

3. Architecture

3.1. Overview

3.1.1. Involved Systems

Figure 1 shows an overview of the different systems involved in the SEMCARE architecture and how data is transferred from one system to another.

Figure 1: Systems involved in the SEMCARE architecture

Each of the systems is briefly described below.

Production data system

The production data system contains the hospital production data, which may be structured or unstructured and is spread over different sources. Possible components of the system are:
• Multiple components that constitute a HIS (hospital information system)

Staging data system

The staging data system is a copy of the hospital production data used for feeding the SEMCARE staging system. The reason for copying the hospital data is that it is usually not allowed to operate directly on the live data. By operating on a copy of the data, potential damage to the live system is avoided. The staging data system has the same components as the production system:
• HIS (hospital information system)

SEMCARE staging system
The SEMCARE staging system reads the data from the hospital staging data system. This is done via an ETL process that aggregates data from different data sources into one data store. A common tool for such an ETL process is Talend Open Studio. Once the data is loaded, patient data of interest is analysed, and the resulting data populates the SEMCARE staging databases as well as the Solr index. The different components are:
• Relational database: SEMCARE data store where the aggregated clinical data is stored.
• Database importer process: An ETL process that loads data from the staging data system into the SEMCARE staging system.
• Solr server and index: Indexes documents and serves search queries against the index.
• Graph database: Stores concept hierarchies and relations between documents and concepts. For now, this is an experimental extension to the system. Whether it adds value to the SEMCARE platform will be evaluated further.
• Averbis text analysis pipeline (AEP): Analyses text in order to extract structured data.
• SEMCARE portal for testing: Provides capability for configuring and testing the staging system.
SEMCARE production system
The SEMCARE production system contains the structured data exported from the SEMCARE staging system. It is the system that is used by the end users to perform search queries and view reports. The system contains the following components:
• Relational database: SEMCARE data store where the aggregated clinical data is stored.
• Solr server and index: Indexes documents and searches indexes.
• Graph database: Stores concept hierarchies and relations between documents and concepts. For now, this is an experimental extension to the system. Whether it adds value to the SEMCARE platform will be evaluated further.
• Averbis text analysis pipeline (AEP): Analyses text in order to extract structured data.
• SEMCARE portal for end users: The portal for building queries and searching the system.
3.1.2. Architecture
Figure 2 shows an overview of the complete architecture planned for the semantic analysis platform SEMCARE. The individual components have been described in section 3.1.1 above. Furthermore, the figure shows that it will be possible to apply third party tools to the data store of the SEMCARE production system in order to perform further analytics such as visualisation or statistics. This will be enabled by providing a common data model (the i2b2 star schema) that can easily be used by third party applications (e.g. tranSMART, QlikView, RapidMiner). As a consequence, hospitals can install third party tools if they want to use them on the SEMCARE data.

Figure 2: Architecture sketch

Architecture Layering
The SEMCARE system can be divided into three layers, which are described in the following paragraphs from bottom to top and shown graphically in Figure 3 below.

The bottom layer contains the data sources, which consist of different types of patient data arising in a hospital, for example unstructured data like discharge summaries or findings reports, and structured data like lab results or other routine data acquired and structured for health care, research and quality assessment. Coded data, which is mainly used for reimbursement, may also be available. The data is scattered over different databases or stored in files, which can be of different formats (e.g. Word, XML, Text, and PDF). Data may also be available as messages, generally in HL7v2 format as a universal health care messaging standard.

The second layer is the semantic middleware. First, it contains tools for information extraction, ETL and text mining as an interface to the data sources. The loaded and analysed data is then stored in a unifying semantic database. This layer also includes terminologies and texts stored in a graph database and Solr index. The third part of the middleware is the communication between the SEMCARE data store and the topmost layer, which is the presentation layer.

The presentation layer is the highest level and represents the interface to the user who could be a
researcher, clinician or administrator. Possible components of the presentation layer are:
• the terminology editor
• a search interface including a query generator
• dashboards and analytics

Figure 3: Architecture layering

3.2. Interfaces
The following interfaces between components of the SEMCARE system have been identified:
• Staging data to data importer: Imports data from the hospital information system as documents or messages. Formats to be expected are XML, HL7, plain text, possibly also JPEG or other formats for scanned documents, and DICOM.
• Data importer to staging Solr: The SolrJ (Solr Java client) API is used for sending patient record information to Solr to be analysed.
• SEMCARE staging portal to staging Solr: The Averbis search REST API will be used for the communication between the two components. This API uses JSON messages to communicate with Solr. Example message definitions are shown in Figure 4.
• SEMCARE production portal to production Solr: The Averbis search REST API will be
used for the communication between the two components. Example message definitions are shown in Figure 4.

    // Request message sent to the Averbis search REST API
    public class Request {
        private String query;
        private Integer rows;
        private Integer start;
        private String highlightQuery;
        private List<SortField> sortFields;
        private Boolean facetHighlighting;
        private Integer facetLimit;
        private String facetPrefix;
        private String facetSort;
        private List<Facet> facets;
        private List<Field> fields;
        private List<Param> params;
        private User user;
    }

    // Result message returned by the Averbis search REST API
    public class Result {
        private String query;
        private Integer start;
        private String highlightQuery;
        private Integer numFound;
        private List<Facet> facets;
        private List<Document> documents;
        private String didYouMean;
    }
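As a rough illustration of how a portal might serialise a few of the Request fields into a JSON message, consider the following sketch. It assumes that the JSON keys are identical to the Java field names, which this document does not confirm. The class SearchRequestJson and its methods are hypothetical helpers and not part of the Averbis search REST API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SearchRequestJson {

    /** Wraps a string in double quotes and escapes quotes, backslashes and common control characters. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Serialises a subset of the Request fields; null values are left out (an assumption). */
    static String toJson(String query, Integer rows, Integer start, String facetPrefix) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (query != null)       fields.put("query", quote(query));
        if (rows != null)        fields.put("rows", rows.toString());
        if (start != null)       fields.put("start", start.toString());
        if (facetPrefix != null) fields.put("facetPrefix", quote(facetPrefix));

        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) sb.append(',');
            sb.append(quote(e.getKey())).append(':').append(e.getValue());
            first = false;
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        // Example: first page of ten hits for a free-text query.
        System.out.println(toJson("diabetes", 10, 0, null));
    }
}
```

In a real deployment a JSON library would normally take over the serialisation; the manual version is used here only to keep the sketch free of external dependencies.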
Figure 4: Averbis search REST API

3.3. Data Models
SEMCARE employs a number of different data formats and systems. These include unstructured input data, relational databases, terminologies, and Solr indexes.

3.3.1. Input Data
The input data for the SEMCARE project may vary with regard to the data source and the data format. For each treatment episode, several sources are of interest:
• documents, either original ones (e.g. findings reports) or aggregated ones (discharge letters)
• messages, e.g. HL7v2 messages
• raw data, e.g. images, measurement data (e.g. ECG)
• database entries

The input data may exhibit different degrees of structure, such as
• unstructured, e.g. free text, images
• semi-structured, e.g. free text with standardized organizing patterns (e.g. headings)
• structured, e.g. tables of lab values
• coded, e.g. LOINC-coded lab values, ICD-10 coded diseases

The SEMCARE system will import these different formats from the various data sources with an ETL process.

3.3.2. Terminologies
Medical terminologies provide meaning identifiers (codes) for terms or groups of synonymous terms, the latter generally referred to as concepts. In SEMCARE, terminologies will enrich the search process with knowledge about the meaning of domain terms, their groupings into concepts, and certain relations between concepts such as broader / narrower. In addition, SEMCARE will enable users to add new concepts and terms to the existing terminology where needed, e.g. when an important synonym is missing. As some terminologies support several languages, they will also allow for multilingual text analysis by grouping terms from different languages into the same concept. The continual process for refining terminologies is described in this section.

Figure 5: Refinement process for criteria
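The grouping of synonymous terms from several languages into one concept, together with a broader / narrower relation, can be sketched as a small in-memory structure. This is only an illustration under stated assumptions: the class TerminologySketch, its methods and the concept identifiers c0001 and c0002 are hypothetical and do not come from any real terminology.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

final class TerminologySketch {
    private final Map<String, String> termToConcept = new HashMap<>();  // term (lower case) -> concept ID
    private final Map<String, String> broaderOf = new HashMap<>();      // concept ID -> broader concept ID

    /** Registers a term (in any language) as a synonym of the given concept. */
    void addTerm(String conceptId, String term) {
        termToConcept.put(term.toLowerCase(Locale.ROOT), conceptId);
    }

    /** Declares that broaderId is the broader concept of conceptId. */
    void setBroader(String conceptId, String broaderId) {
        broaderOf.put(conceptId, broaderId);
    }

    /** Case-insensitive lookup of the concept a term belongs to. */
    Optional<String> lookup(String term) {
        return Optional.ofNullable(termToConcept.get(term.toLowerCase(Locale.ROOT)));
    }

    /** Returns all broader concepts, nearest first; stops if a cycle is detected. */
    List<String> broaderChain(String conceptId) {
        List<String> chain = new ArrayList<>();
        String current = broaderOf.get(conceptId);
        while (current != null && !chain.contains(current)) {
            chain.add(current);
            current = broaderOf.get(current);
        }
        return chain;
    }

    public static void main(String[] args) {
        TerminologySketch t = new TerminologySketch();
        t.addTerm("c0001", "heart attack");   // English
        t.addTerm("c0001", "Herzinfarkt");    // German
        t.addTerm("c0001", "hartinfarct");    // Dutch
        t.addTerm("c0002", "heart disease");
        t.setBroader("c0001", "c0002");
        System.out.println(t.lookup("HERZINFARKT") + " " + t.broaderChain("c0001"));
    }
}
```

Because all three language variants resolve to the same identifier, a query entered in one language can match documents written in another, which is the effect described above.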
As shown in Figure 5 above, the SEMCARE platform will provide a term browser and a dictionary creator for users to view and edit their terminology. The term browser will be able to import standard terminologies such as SNOMED CT, ICD-10 or MeSH and store them in a relational database (RDB). The users can then build their own terminology by enhancing, merging, or modifying existing terminologies. The most important medical terminologies are contained in the UMLS Metathesaurus, which is a rich source of synonyms in different languages that also groups concepts into top-level categories via the UMLS Semantic Network. We will make use of all of this by enhancing the user interface of the term browser, so that non-English terms can also be used to search for concepts.

In all stages of the terminology creation process, the terminology can be exported to the AEP analysis pipeline. The terminology can then be used to index and search documents via the SEMCARE search.

When the users build their search query, they may find that their terminology needs to be modified in order to produce better search results. They can then go back to the term browser to make changes to the terminology. This refinement process is crucial for optimizing the SEMCARE platform. Users should be able to quickly see how terminology changes affect search facets and results.

Whereas there is a certain preference for SNOMED CT, ICD-10, and MeSH, a final decision on which terminologies to use for annotation will have to be made at the start of the work in WP2. Another decision to be made is how the known vocabulary gap for Dutch and German will be filled. One possible strategy is the use of machine translation, together with human review of the terms generated by this method. Manual additions to the terminologies, mainly driven by the use case, will be the option of choice wherever queries have to be fine-tuned.

3.3.3. I2B2 Star Schema
In order to use a standard schema for the data storage and to ensure that we provide a common data model that is also widely used by third party providers (e.g. tranSMART), the i2b2 star schema will be used in SEMCARE to store the data. In the i2b2 star schema, observations or, more precisely, factoid (fact-like) statements are stored in the observation_fact table and linked to four so-called "dimension" tables for patient, visit, provider and concept details. These dimension tables contain descriptive information about factoid statements. Figure 6 below shows an overview of the i2b2 star schema.

Figure 6: i2b2 star schema

I2b2 also uses metadata tables to define terminologies. SEMCARE terminologies can be stored in the i2b2 custom_meta table (Figure 7). This table stores hierarchical terminologies that are used to build queries in the i2b2 query and analysis tool. The c_fullname column is used to store the full path of each term, with the '\' character delimiting the hierarchical levels. After the custom_meta table is filled with SEMCARE terminologies via an import process, concept_dimension entries can be created that link to the custom_meta terms.

Figure 7: i2b2 custom_meta table (column listing; legible entries include c_visualattributes, c_facttablecolumn and c_columndatatype, typed as character varying, and a timestamp without time zone column)
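To illustrate how a backslash-delimited c_fullname path encodes the hierarchy of a term, the following sketch splits such a path into its levels and derives the paths of all enclosing levels. The class FullNamePath and the example path are hypothetical; the sketch only assumes that paths start and end with the '\' delimiter.

```java
import java.util.ArrayList;
import java.util.List;

final class FullNamePath {

    /** Splits a c_fullname-style path into its hierarchical levels, ignoring empty segments. */
    static List<String> levels(String cFullName) {
        List<String> out = new ArrayList<>();
        for (String part : cFullName.split("\\\\")) {   // the regex "\\\\" matches one literal backslash
            if (!part.isEmpty()) {
                out.add(part);
            }
        }
        return out;
    }

    /** Returns the path of every level from the root down to the term itself. */
    static List<String> ancestors(String cFullName) {
        List<String> out = new ArrayList<>();
        StringBuilder prefix = new StringBuilder("\\");
        for (String level : levels(cFullName)) {
            prefix.append(level).append('\\');
            out.add(prefix.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String path = "\\SEMCARE\\Medication\\Analgesics\\";   // hypothetical terminology path
        System.out.println(levels(path));
        System.out.println(ancestors(path));
    }
}
```

Prefix paths of this kind are what make hierarchical queries simple: every term below a given node shares that node's path as a prefix.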
An example of how a custom terminology may look in the i2b2 term navigator is shown in Figure 8.

Figure 8: Custom metadata in i2b2 term navigator

In addition to the standard i2b2 tables, a new table will be created to map i2b2 records to Solr documents. This table will contain the encounter_num key, the original unstructured record, the Solr document and ID, and a copy of the CAS (Common Analysis System) object from the text analysis. Figure 9 shows this additional SEMCARE record table and its relation to the existing i2b2 tables.

Figure 9: Solr to i2b2 mapping

SEMCARE Patient Record Solr Document
Solr documents will be used to store patient record information for text search. Each Solr document will contain IDs that map the Solr document to corresponding records in the i2b2 database (see also Figure 10 below). With this linkage, only data required for search indexing will be stored in the Solr document, and additional information can be pulled from the i2b2 database if needed. Dynamic fields can be used in the Solr document to store multiple concepts. References to terminology codes or concepts are stored in Solr as CUIs (concept unique identifiers) to enable multilingual searches. Preferred terms and synonyms will not be stored in Solr because all documents and queries will be processed by the AEP, which replaces synonyms and preferred terms with CUIs before the query is sent to Solr.

The Solr system will provide a faceted search, which means that the search results are organized according to a faceted classification system, thus allowing the user to explore a collection of information by applying multiple filters. Facets correspond to properties of the search result. Solr will store multiple dynamic fields for each concept:
• a list of all the types of concepts used for faceting (Note that this field is the set of all concept types in the document and has no linkage to the relational database. Only individual concepts are linked to the database.)
• a value for searching
• an ID to link to the relational database
• a path for hierarchical faceting

For example, for a medication with the CUI a1234, Solr would store the following fields:
• concept_medications="a1234,b5678,c2313"
• concept_med_val_a1234=50
• concept_med_id_a1234=123456
• concept_med_path_a1234=/c1000/b1023/a1234

Figure 10: Mapping of Solr documents to i2b2 database
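The field naming pattern of the medication example can be sketched as a small helper that derives the per-concept dynamic fields from a concept type, a CUI, a value, a database ID and the list of broader concepts. The class ConceptFields, its method names and its parameters are hypothetical; only the resulting field names and values follow the example given in this section.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ConceptFields {

    /** Builds the value, ID and path fields for one concept, e.g. concept_med_val_a1234. */
    static Map<String, String> forConcept(String typeKey, String cui, String value,
                                          long databaseId, List<String> broaderCuis) {
        Map<String, String> fields = new LinkedHashMap<>();
        String prefix = "concept_" + typeKey + "_";

        fields.put(prefix + "val_" + cui, value);                      // value for searching
        fields.put(prefix + "id_" + cui, Long.toString(databaseId));   // link to the relational database

        // Path for hierarchical faceting: broader concepts first, the concept itself last.
        StringBuilder path = new StringBuilder();
        for (String broader : broaderCuis) {
            path.append('/').append(broader);
        }
        path.append('/').append(cui);
        fields.put(prefix + "path_" + cui, path.toString());
        return fields;
    }

    /** Builds the comma-separated list field that holds all concepts of one type. */
    static String listValue(List<String> cuis) {
        return String.join(",", cuis);
    }

    public static void main(String[] args) {
        System.out.println(forConcept("med", "a1234", "50", 123456L,
                java.util.Arrays.asList("c1000", "b1023")));
        System.out.println(listValue(java.util.Arrays.asList("a1234", "b5678", "c2313")));
    }
}
```

Keeping the CUI in the field name means that a new concept does not require a schema change, provided the Solr schema declares a matching dynamic field pattern.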

SEMCARE Data Loading Flow
The data loading flow begins when the data importer ETL process loads unstructured data. The unstructured data is stored in the relational database and then sent to Solr to be analysed and indexed. The Solr process and the text analysis pipeline store data in a graph database, e.g. Neo4j [9], and build the Solr index. Finally, the structured data from the analysis is added to the relational database to enhance the unstructured data. A diagram of the data import flow is shown in Figure 11.

Figure 11: SEMCARE data loading flow

The data importer process could be created with an ETL tool such as Talend Open Studio. Figure 12 below shows an example Talend job that reads a directory of plain text files and commits them to Solr and a PostgreSQL database.

[9] http://www.neo4j.org/
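The order of the loading steps described in this section can be sketched with in-memory stand-ins for the relational database, the graph database and the Solr index. Everything in this sketch is an assumption made for illustration: the class LoadingFlowSketch, the toy dictionary-based analyse method (which stands in for the AEP) and the concept identifier a1234 are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class LoadingFlowSketch {
    // In-memory stand-ins for the real stores (all hypothetical).
    final Map<String, String> rawRecords = new LinkedHashMap<>();             // relational DB: unstructured text
    final Map<String, List<String>> structuredFacts = new LinkedHashMap<>();  // relational DB: extracted concepts
    final Map<String, List<String>> searchIndex = new LinkedHashMap<>();      // search index: concept -> document IDs
    final List<String> graphEdges = new ArrayList<>();                        // graph DB: document -> concept edges

    /** Toy text analysis: returns the concept of every (lower-case) dictionary term found in the text. */
    static List<String> analyse(String text, Map<String, String> termToConcept) {
        String lower = text.toLowerCase(Locale.ROOT);
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, String> e : termToConcept.entrySet()) {
            if (lower.contains(e.getKey()) && !found.contains(e.getValue())) {
                found.add(e.getValue());
            }
        }
        return found;
    }

    /** Runs the loading steps for one document in the order described in the text. */
    void load(String docId, String text, Map<String, String> termToConcept) {
        rawRecords.put(docId, text);                             // 1. store the unstructured record
        List<String> concepts = analyse(text, termToConcept);    // 2. analyse the text
        for (String concept : concepts) {
            graphEdges.add(docId + "->" + concept);              // 3. store the document-concept relation
            searchIndex.computeIfAbsent(concept, k -> new ArrayList<>()).add(docId);  // 4. build the index
        }
        structuredFacts.put(docId, concepts);                    // 5. add the structured data to the database
    }

    public static void main(String[] args) {
        Map<String, String> dictionary = new LinkedHashMap<>();
        dictionary.put("paracetamol", "a1234");
        LoadingFlowSketch flow = new LoadingFlowSketch();
        flow.load("doc1", "Patient received Paracetamol 500 mg.", dictionary);
        System.out.println(flow.structuredFacts + " " + flow.searchIndex + " " + flow.graphEdges);
    }
}
```

The point of the sketch is the ordering: the original text is persisted before any analysis takes place, so the structured data can always be regenerated from it.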
Figure 12: Talend Open Studio data importer

3.4. Modules & Functional View
The SEMCARE system contains the following modules and components, as shown in Figure 13 below and described in this section.

Figure 13: SEMCARE components

3.4.1. SEMCARE Data Importer
The SEMCARE data importer is the entry point for health care data in the SEMCARE system. It could be an ETL process defined by a tool such as Talend Open Studio, or a custom coded software process. When it receives data, the data importer will write the unstructured data to the database and then send the unstructured data to Solr for analysis.

3.4.2. Solr
Solr is an open source search platform from the Apache Lucene project. In the SEMCARE project it is used to index and search patient record data. Solr will use the Averbis text analytics tools to create structured data from unstructured text. After the text is analysed, Solr will write the structured data to the relational database.

Averbis Text Analytics (AEP)
The Averbis Extraction Platform (AEP) is a text analysis tool that can be applied to arbitrary information extraction scenarios. It provides solutions for extracting the individual information units that are most relevant to a user, such as facts and relations, from unstructured text. The AEP consists of a number of modular text analysis components, so-called Analysis Engines (AEs), which are plugged together in the Apache UIMA [10] framework to build an overall solution for different use cases. Depending on the requirements, rule-based methods, statistical methods or a combination of both are used to reveal the semantics of the content. Annotations are exchanged between AEs using an object named Common Analysis System (CAS). The CAS is UIMA's object-based data structure that allows memory-based storage and exchange of annotations with respect to pre-defined type systems of hierarchically organized annotations. This data structure provides a common base for analysing unstructured text.

SEMCARE Portal Web Application
The SEMCARE portal provides a graphical user interface, which allows users to build queries on the clinical data and to manage the system. The users will get immediate feedback from a search, which helps them to decide how to refine their query in order to get better results. The portal will also provide users with an interface for defining and refining terminologies. More specific requirements and details about the user interface will be provided in D3.2.

[10] http://opennlp.apache.org

3.4.5. I2b2 Applications
I2b2 tools and components such as the i2b2 query and analysis tool can be installed in the system if needed. I2b2 runs on the JBoss application server.

Third Party Tools and Applications
Third party tools can also be installed in the system as required. These tools could possibly interface with the i2b2 database or the Solr server, but because of the varying requirements and functionality of third party applications, they are not shown in Figure 13 or described in detail here.

3.4.7. Scalability
All of the components in the SEMCARE system can be deployed across multiple machines to support the processing of large data sets if needed. Multiple data importer processes can be launched to read input data. SolrCloud can be used to distribute Solr indexes and search processing across multiple machines. The Averbis text analysis pipeline can also be deployed as a distributed system. By adding more machines and distributing SEMCARE components, the SEMCARE system can scale to meet the processing requirements of large data sets.

3.5. Users & Roles
In the context of the SEMCARE project, different types of users can be distinguished. Their roles are briefly described below:

Production Database Administrator
The production database administrator manages the copying of production patient data into the staging data system. He/she also manages the subsequent export to the SEMCARE staging system via the ETL process. The production database administrator is located at the hospital site.

SEMCARE Administrator
The SEMCARE administrator is responsible for managing the SEMCARE databases, the Solr configuration and the SEMCARE portal. He/she configures the terminology and the text analysis. The administrator manages the copying of data from the SEMCARE staging environment to the SEMCARE production environment and creates custom dashboards, scripts and third party integrations.

SEMCARE User
Typically, SEMCARE users will be researchers and clinicians who use the SEMCARE portal for search and analytics.

The role concept will be further verified during the project and refined if needed. Furthermore, it must be guaranteed that all roles have the access rights to the data to be analysed at the level of the hospital information system.

3.6. Open points
A few points that are still open and need further clarification in the course of the project are listed below. More specific details about these points will be given in deliverable D3.2.
• One challenge for the SEMCARE platform is the search for constellations of symptoms that are spread over several documents. A strategy will be developed in order to cover this.
• As the SEMCARE system will be installed within the hospital, a further analysis of the IT landscape within the hospitals will have to be performed. The interfaces need to be defined and the interchange formats specified.
• Another point to think about is a possible weighting of criteria for a specific use case. For example, it should be possible to define mandatory and optional criteria when creating a query.

Data Privacy / Technical and organizational security procedures

4.1. Data processing
The data processing within the scope of the SEMCARE project takes place entirely within each participating hospital. The project integrates into the existing IT landscape of the hospital with regard to admission (physical access), computer access, and data access control for the IT components used (servers and network components). This also affects the security of particularly sensitive health care data arising in a hospital. The architectural design of the SEMCARE platform permits data processing and storage on separate hard drives if needed because of the involvement of different departments and appropriate user rights.

4.2. Data transfer and data location
In the scope of the SEMCARE project, patient data will not leave the hospitals at any time. Patient data may, however, be shared between different departments of each hospital. In these cases, already installed (pseudo-)anonymisation processes will be applied. The de-identification procedures for each of the three participating hospitals are explained in detail in deliverable D1.1.

Regarding test data, SGUL will prepare anonymised data to be used by Averbis GmbH for the development of algorithms, interfaces and the final product. The legal basis for the transfer of such test data is section 251 of the NHS Act 2006. Transferred test data will be encrypted either at rest or in transit. The hospitals EMC and MUG will not provide any data to Averbis or to any other clinical partner.

Both data processing and the operation of the data platform will be performed within a dedicated server infrastructure in the hospital.
It will be ensured that no project-related data is stored in locations to which unauthorised persons have access. Furthermore, additional encryption of the data that is stored, for example, in the Solr index is possible using TrueCrypt [11].

[11] http://www.truecrypt.org/

4.3. Role concept
A role concept will be applied that ensures that only authorised users can access the data related to the SEMCARE project.

Task                            Role
Data upload, query generation   SEMCARE Administrator
Create, edit, delete users      SEMCARE Administrator
System maintenance              Local system administrator

A connection to the local LDAP (Lightweight Directory Access Protocol) directory can be implemented in order to take over existing access rights. Activities will be logged in order to be able to examine whether personal data has been entered, changed or deleted, and by whom. Only allocated and defined personnel will have access to the SEMCARE system components and applications.

4.4. Availability control
Measures will be considered in order to protect personal data against accidental destruction or loss. For example, the SEMCARE systems will not work directly on the hospital live data but on a copy (staging system) to ensure that no real patient data is affected in any way. High availability of the SEMCARE platform is not a priority as the application is not crucial for patient care.

4.5. Data separation control
It must be assured that data from different scenarios or different departments are separated from each other. The SEMCARE architecture allows this separation if needed, e.g. by using different Solr indexes.

The SEMCARE systems will only be run locally and queries will only be performed on relevant patient data. Other information that is not relevant for the defined use case will not be extracted from the hospital systems. A development system and a production system will be provided separately.

Source: http://semcare.eu/wp-content/uploads/2015/01/SEMCARE_D3.1-Architecture_v1_FINAL1.pdf
