4 resultados para EHR
em Universidad Politécnica de Madrid
Resumo:
Over the last few decades, the ever-increasing output of scientific publications has led to new challenges to keep up to date with the literature. In the biomedical area, this growth has introduced new requirements for professionals, e.g., physicians, who have to locate the exact papers that they need for their clinical and research work amongst a huge number of publications. Against this backdrop, novel information retrieval methods are even more necessary. While web search engines are widespread in many areas, facilitating access to all kinds of information, additional tools are required to automatically link information retrieved from these engines to specific biomedical applications. In the case of clinical environments, this also means considering aspects such as patient data security and confidentiality or structured contents, e.g., electronic health records (EHRs). In this scenario, we have developed a new tool to facilitate query building to retrieve scientific literature related to EHRs. Results: We have developed CDAPubMed, an open-source web browser extension to integrate EHR features in biomedical literature retrieval approaches. Clinical users can use CDAPubMed to: (i) load patient clinical documents, i.e., EHRs based on the Health Level 7-Clinical Document Architecture Standard (HL7-CDA), (ii) identify relevant terms for scientific literature search in these documents, i.e., Medical Subject Headings (MeSH), automatically driven by the CDAPubMed configuration, which advanced users can optimize to adapt to each specific situation, and (iii) generate and launch literature search queries to a major search engine, i.e., PubMed, to retrieve citations related to the EHR under examination. Conclusions: CDAPubMed is a platform-independent tool designed to facilitate literature searching using keywords contained in specific EHRs. CDAPubMed is visually integrated, as an extension of a widespread web browser, within the standard PubMed interface. It has been tested on a public dataset of HL7-CDA documents, returning significantly fewer citations since queries are focused on characteristics identified within the EHR. For instance, compared with more than 200,000 citations retrieved by breast neoplasm, fewer than ten citations were retrieved when ten patient features were added using CDAPubMed. This is an open source tool that can be freely used for non-profit purposes and integrated with other existing systems.
Resumo:
An important objective of the INTEGRATE project1 is to build tools that support the efficient execution of post-genomic multi-centric clinical trials in breast cancer, which includes the automatic assessment of the eligibility of patients for available trials. The population suited to be enrolled in a trial is described by a set of free-text eligibility criteria that are both syntactically and semantically complex. At the same time, the assessment of the eligibility of a patient for a trial requires the (machineprocessable) understanding of the semantics of the eligibility criteria in order to further evaluate if the patient data available for example in the hospital EHR satisfies these criteria. This paper presents an analysis of the semantics of the clinical trial eligibility criteria based on relevant medical ontologies in the clinical research domain: SNOMED-CT, LOINC, MedDRA. We detect subsets of these widely-adopted ontologies that characterize the semantics of the eligibility criteria of trials in various clinical domains and compare these sets. Next, we evaluate the occurrence frequency of the concepts in the concrete case of breast cancer (which is our first application domain) in order to provide meaningful priorities for the task of binding/mapping these ontology concepts to the actual patient data. We further assess the effort required to extend our approach to new domains in terms of additional semantic mappings that need to be developed.
Resumo:
The availability of electronic health data favors scientific advance through the creation of repositories for secondary use. Data anonymization is a mandatory step to comply with current legislation. A service for the pseudonymization of electronic healthcare record (EHR) extracts aimed at facilitating the exchange of clinical information for secondary use in compliance with legislation on data protection is presented. According to ISO/TS 25237, pseudonymization is a particular type of anonymization. This tool performs the anonymizations by maintaining three quasi-identifiers (gender, date of birth and place of residence) with a degree of specification selected by the user. The developed system is based on the ISO/EN 13606 norm using its characteristics specifically favorable for anonymization. The service is made up of two independent modules: the demographic server and the pseudonymizing module. The demographic server supports the permanent storage of the demographic entities and the management of the identifiers. The pseudonymizing module anonymizes the ISO/EN 13606 extracts. The pseudonymizing process consists of four phases: the storage of the demographic information included in the extract, the substitution of the identifiers, the elimination of the demographic information of the extract and the elimination of key data in free-text fields. The described pseudonymizing system was used in three Telemedicine research projects with satisfactory results. A problem was detected with the type of data in a demographic data field and a proposal for modification was prepared for the group in charge of the drawing up and revision of the ISO/EN 13606 norm.
Resumo:
El presente Trabajo Fin de Grado (TFG) surge de la necesidad de disponer de tecnologías que faciliten el Procesamiento de Lenguaje Natural (NLP) en español dentro del sector de la medicina. Centrado concretamente en la extracción de conocimiento de las historias clínicas electrónicas (HCE), que recogen toda la información relacionada con la salud del paciente y en particular, de los documentos recogidos en dichas historias, pretende la obtención de todos los términos relacionados con la medicina. El Procesamiento de Lenguaje Natural permite la obtención de datos estructurados a partir de información no estructurada. Estas técnicas permiten un análisis de texto que genera etiquetas aportando significado semántico a las palabras para la manipulación de información. A partir de la investigación realizada del estado del arte en NLP y de las tecnologías existentes para otras lenguas, se propone como solución un módulo de anotación de términos médicos extraídos de documentos clínicos. Como términos médicos se han considerado síntomas, enfermedades, partes del cuerpo o tratamientos obtenidos de UMLS, una ontología categorizada que agrega distintas fuentes de datos médicos. Se ha realizado el diseño y la implementación del módulo así como el análisis de los resultados obtenidos realizando una evaluación con treinta y dos documentos que contenían 1372 menciones de terminología médica y que han dado un resultado medio de Precisión: 70,4%, Recall: 36,2%, Accuracy: 31,4% y F-Measure: 47,2%.---ABSTRACT---This Final Thesis arises from the need for technologies that facilitate the Natural Language Processing (NLP) in Spanish in the medical sector. Specifically it is focused on extracting knowledge from Electronic Health Records (EHR), which contain all the information related to the patient's health and, in particular, it expects to obtain all the terms related to medicine from the documents contained in these records. Natural Language Processing allows us to obtain structured information from unstructured data. These techniques enable analysis of text generating labels providing semantic meaning to words for handling information. From the investigation of the state of the art in NLP and existing technologies in other languages, an annotation module of medical terms extracted from clinical documents is proposed as a solution. Symptoms, diseases, body parts or treatments are considered part of the medical terms contained in UMLS ontology which is categorized joining different sources of medical data. This project has completed the design and implementation of a module and the analysis of the results have been obtained. Thirty two documents which contain 1372 mentions of medical terminology have been evaluated and the average results obtained are: Precision: 70.4% Recall: 36.2% Accuracy: 31.4% and F-Measure: 47.2%.