3 results for NLP (Natural Language Processing)
in DigitalCommons@The Texas Medical Center
Abstract:
Clinical text understanding (CTU) is of interest to health informatics because critical clinical information is frequently represented as unconstrained text in electronic health records. Human experts use this text extensively to guide clinical practice and decision making and to document delivery of care, but it is largely unusable by information systems for queries and computations. Recent initiatives advocating for translational research call for technologies that can integrate structured clinical data with unstructured data, provide a unified interface to all data, and contextualize clinical information for reuse in the multidisciplinary and collaborative environment envisioned by the CTSA program. This implies that technologies for the processing and interpretation of clinical text should be evaluated not only in terms of their validity and reliability in their intended environment, but also in light of their interoperability and their ability to support information integration and contextualization in a distributed and dynamic environment. This vision adds a new layer of information representation requirements that needs to be accounted for when conceptualizing the implementation or acquisition of clinical text processing tools and technologies for multidisciplinary research. On the other hand, electronic health records frequently contain unconstrained clinical text with high variability in the use of terms and documentation practices, and without commitment to the grammatical or syntactic structure of the language (e.g., triage notes, physician and nurse notes, chief complaints). This hinders the performance of natural language processing technologies, which typically rely heavily on the syntax of the language and the grammatical structure of the text.
This document introduces our method to transform unconstrained clinical text found in electronic health information systems into a formal (computationally understandable) representation that is suitable for querying, integration, contextualization and reuse, and is resilient to the grammatical and syntactic irregularities of clinical text. We present our design rationale, method, and the results of an evaluation in processing chief complaints and triage notes from eight different emergency departments in Houston, Texas. Finally, we discuss the significance of our contribution in enabling the use of clinical text in a practical bio-surveillance setting.
Abstract:
Clinical Research Data Quality Literature Review and Pooled Analysis
We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 errors per 10,000 files to 5019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70–5019 errors per 10,000 fields). Data entered and processed at healthcare facilities had error rates comparable to those of data processed at central data processing centers. Error rates for data processed with single entry in the presence of on-screen checks were comparable to those of double-entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist.

Defining Data Quality for Clinical Research: A Concept Analysis
Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality, which builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization for data quality in clinical research. While the results are helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions.

Medical Record Abstractor’s Perceptions of Factors Impacting the Accuracy of Abstracted Data
Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses. Factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractors’ perceptions of these factors. The Delphi process identified 9 factors that were not found in the literature, and differed from the literature by 5 factors in the top 25%. The Delphi results refuted seven factors reported in the literature as impacting the quality of abstracted data. The results provide insight into, and indicate content validity of, a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and those of registry and quality improvement abstractors.

Distributed Cognition Artifacts on Clinical Research Data Collection Forms
Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms. We show that the cognitive load required for abstraction in 61% of the sampled data elements was high, exceedingly so in 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping or calculation. The representational analysis used here can be used to identify data elements with high cognitive demands.
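The error rates quoted in the pooled analysis above are normalized to errors per 10,000 fields. As a minimal sketch of that arithmetic, the Python below converts raw counts to this unit and pools several studies by summing their counts. The function names and the sample counts are hypothetical, and pooling by summed counts is an assumption; the abstract does not state which pooling method the authors used.

```python
def errors_per_10k(errors: int, fields_inspected: int) -> float:
    """Normalize a raw error count to errors per 10,000 fields."""
    if fields_inspected <= 0:
        raise ValueError("fields_inspected must be positive")
    return errors * 10_000 / fields_inspected


def pooled_errors_per_10k(studies: list[tuple[int, int]]) -> float:
    """Pool (errors, fields_inspected) pairs by summing counts.

    Assumption: a simple count-weighted pool, used here only for illustration.
    """
    total_errors = sum(errors for errors, _ in studies)
    total_fields = sum(fields for _, fields in studies)
    return errors_per_10k(total_errors, total_fields)


if __name__ == "__main__":
    # Hypothetical counts, not taken from the reviewed papers.
    sample_studies = [(7, 1_000), (50, 1_000)]
    print(errors_per_10k(7, 1_000))
    print(pooled_errors_per_10k(sample_studies))
```

Expressing every study in the same unit is what makes rates from differently sized audits comparable before any pooling is attempted.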
Abstract:
Background. Among Hispanics, the HPV vaccine has the potential to eliminate disparities in cervical cancer incidence and mortality, but only if optimal rates of vaccination are achieved. Media can be an important information source for increasing HPV knowledge and awareness of the vaccine. Very little is known about how media use among Hispanics affects their HPV knowledge and vaccine awareness. Even less is known about what differences exist in media use and information processing among English- and Spanish-speaking Hispanics.

Aims. Examine the relationships between three health communication variables (media exposure, HPV-specific information scanning and seeking) and three HPV outcomes (knowledge, vaccine awareness and initiation) among English- and Spanish-speaking Hispanics.

Methods. Cross-sectional data from a survey administered to Hispanic mothers in Dallas, Texas were used for univariate and multivariate logistic regression analyses. The sample used for analysis included 288 mothers of females aged 8-22 recruited from clinics and community events. Dependent variables of interest were HPV knowledge, HPV vaccine awareness and initiation. Independent variables were media exposure, HPV-specific information scanning and seeking. Language was tested as an effect modifier on the relationship between health communication variables and HPV outcomes.

Results. English-speaking mothers reported more media exposure, HPV-specific information scanning and seeking than Spanish-speakers. Scanning for HPV information was associated with more HPV knowledge (OR = 4.26, 95% CI = 2.41 - 7.51), vaccine awareness (OR = 10.01, 95% CI = 5.43 - 18.47) and vaccine initiation (OR = 2.54, 95% CI = 1.09 - 5.91). Seeking HPV-specific information was associated with more knowledge (OR = 2.27, 95% CI = 1.23 - 4.16), awareness (OR = 6.60, 95% CI = 2.74 - 15.91) and initiation (OR = 4.93, 95% CI = 2.64 - 9.20). Language moderated the effect of information scanning and seeking on vaccine awareness.

Discussion. Differences in information scanning and seeking behaviors among Hispanic subgroups have the potential to lead to disparities in vaccine awareness.

Conclusion. Findings from this study underscore health communication differences among Hispanics and emphasize the need to target Spanish language media as well as English language media aimed at Hispanics to improve knowledge and awareness.
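The results above are reported as odds ratios with 95% confidence intervals. As a rough illustration of how such a figure is formed in the simplest unadjusted case, the Python sketch below computes an odds ratio and a Wald interval from a 2x2 table. The function name and the counts are hypothetical; the study itself used logistic regression, whose estimates may be adjusted for other variables, so this is not a reconstruction of the authors' analysis.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed, outcome present      b: exposed, outcome absent
    c: unexposed, outcome present    d: unexposed, outcome absent
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this sketch")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from the survey data.
    or_, lo, hi = odds_ratio_wald_ci(20, 10, 10, 20)
    print(f"OR = {or_:.2f}, 95% CI = {lo:.2f} - {hi:.2f}")
```

An interval that excludes 1, as all of the intervals quoted in the abstract do, indicates an association that is statistically significant at roughly the 5% level.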