105 results for Medical data
in DigitalCommons@The Texas Medical Center
Abstract:
Nuclear morphometry (NM) uses image analysis to measure features of the cell nucleus, which are classified as bulk properties, shape or form, and DNA distribution. Studies have used these measurements as diagnostic and prognostic indicators of disease, with inconclusive results. The distributional properties of these variables have not been systematically investigated, although much medical data exhibits nonnormal distributions. Measurements are made on several hundred cells per patient, so summary measures reflecting the underlying distribution are needed.

Distributional characteristics of 34 NM variables from prostate cancer cells were investigated using graphical and analytical techniques. Cells per sample ranged from 52 to 458. A small sample of patients with benign prostatic hyperplasia (BPH), representing non-cancer cells, was used for general comparison with the cancer cells.

Data transformations such as log, square root, and 1/x did not yield normality as measured by the Shapiro-Wilk test. A modulus transformation, used for distributions with abnormal kurtosis values, also did not produce normality.

Kernel density histograms of the 34 variables exhibited non-normality, and 18 variables also exhibited bimodality. A bimodality coefficient was calculated, and three variables (DNA concentration, shape, and elongation) showed the strongest evidence of bimodality and were studied further.

Two analytical approaches were used to obtain a summary measure for each variable for each patient: cluster analysis to determine significant clusters, and a mixture model analysis using a two-component Gaussian model with equal variances. The mixture component parameters were used to bootstrap the log-likelihood ratio to determine the significant number of components (1 or 2). These summary measures were used as predictors of disease severity in several proportional odds logistic regression models. The disease severity scale had 5 levels and was constructed from 3 components: extracapsular penetration (ECP), lymph node involvement (LN+), and seminal vesicle involvement (SV+), which serve as surrogate measures of prognosis. The summary measures were not strong predictors of disease severity. There was some indication from the mixture model results of changes in the mean levels and proportions of the components at the lower severity levels.
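To make the mixture-model step concrete, the following is a minimal sketch (not the study's actual code) of a parametric bootstrap of the log-likelihood ratio to choose between one and two equal-variance Gaussian components; the function names, the bootstrap size B=200, and the use of scikit-learn are assumptions.

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def lrt_stat(x):
        # 2 * (loglik of 2-component model - loglik of 1-component model)
        x = x.reshape(-1, 1)
        g1 = GaussianMixture(n_components=1).fit(x)
        # covariance_type="tied" forces equal variances across components
        g2 = GaussianMixture(n_components=2, covariance_type="tied").fit(x)
        return 2.0 * len(x) * (g2.score(x) - g1.score(x))  # score() is mean loglik

    def n_components(x, B=200, alpha=0.05, seed=0):
        # Parametric bootstrap under the single-Gaussian null: simulate from
        # the fitted 1-component model, recompute the LRT, and compare.
        rng = np.random.default_rng(seed)
        observed = lrt_stat(x)
        mu, sd = x.mean(), x.std(ddof=1)
        null = [lrt_stat(rng.normal(mu, sd, len(x))) for _ in range(B)]
        p = np.mean([s >= observed for s in null])
        return 2 if p < alpha else 1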
Abstract:
Infant mortality as a problematic situation has been recognized for some 130 years in one form or another. It has undergone various changes in its empirical dimensions: whom we study within the population, what we study (low birth-weight vs. pre-term births), and how we study it (demographically or medically). An analysis of the process by which the condition was raised by claims makers as an intolerable situation among America's urban residents reveals that demographic and medical data were sparse. Nonetheless, a judgment about the meaning and significance of the condition was made, and that interpretation led to the promulgation of systems to both document and address the condition as it has come to be defined.

This investigation depicts the historical context and natural history of infant mortality as one of a number of social problems that came to be defined through the interplay among groups and individuals making claims, and how their claims came to the public policy agenda as worthy of collective resources: who won, who lost, and why. The process of social definition focuses attention on the claims makers and the ways they contrast the meaning, origins, and remedies for this troubling condition. The historical context becomes the frame of reference for understanding the actions of the claims makers and the meaning and significance they attached to the problem.

We contend that "context" provides a closer reality than disjointed "value free" accounts. Context provides the evidence for the definition: who participated in the process, why, and by what means.

The role of women in the definitional process reveals the differences in approach between the women of the settlement house reform movement and African-American women working at the grass roots. Much of the work done by these two groups provided options for the problem's remedy; however, their differences paved the way to our current (principally medically oriented) definition and its inherent limitations.
Abstract:
Identifying accurate numbers of soldiers determined to be medically not ready after completing soldier readiness processing may help inform Army leadership about ongoing pressures on a military engaged in long conflict with regular deployment. Among Army soldiers screened with the SRP checklist for deployment, what is the prevalence of soldiers determined to be medically not ready?

Study group. 15,289 soldiers screened at all 25 Army deployment platform sites with the eSRP checklist over a 4-month period (June 20, 2009 to October 20, 2009). The data analyzed included age, rank, component, gender, and final deployment medical readiness status from the MEDPROS database.

Methods. This information was compiled, and univariate analysis using chi-square tests was conducted for each of the key variables by medical readiness status.

Results. Descriptive epidemiology: Of the total sample, 1,548 (9.7%) were female and 14,319 (90.2%) were male. Enlisted soldiers made up 13,543 (88.6%) of the sample and officers 1,746 (11.4%). In the sample, 1,533 (10.0%) were soldiers over the age of 40 and 13,756 (90.0%) were age 18-40. Reserve, National Guard, and Active Duty made up 1,931 (12.6%), 2,942 (19.2%), and 10,416 (68.1%), respectively.

Univariate analysis: Overall, 1,226 (8.0%) of the soldiers screened were determined to be medically not ready for deployment. The strongest predictive factor was female gender (OR 2.8; 95% CI 2.57-3.28; p<0.001), followed by enlisted rank (OR 2.01; 1.60-2.53; p<0.001), Reserve component (OR 1.33; 1.16-1.53; p<0.001), and National Guard (OR 0.37; 0.30-0.46; p<0.001). Age over 40 demonstrated an OR of 1.2 (1.09-1.50; p<0.003).

Overall, the results underscore that there may be key demographic groups relevant to medical readiness that can be targeted with programs and funding to improve overall military medical readiness.
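As an illustration of how odds ratios like those above are computed, here is a minimal sketch of an odds ratio with a Wald 95% confidence interval from a 2x2 table; the counts in the example call are invented, not the study's data.

    import math

    def odds_ratio_ci(a, b, c, d, z=1.96):
        # a, b: not-ready / ready counts in the exposed group (e.g., female)
        # c, d: not-ready / ready counts in the reference group (e.g., male)
        or_ = (a * d) / (b * c)
        se = math.sqrt(1/a + 1/b + 1/c + 1/d)  # standard error of log(OR)
        log_or = math.log(or_)
        lo = math.exp(log_or - z * se)
        hi = math.exp(log_or + z * se)
        return or_, lo, hi

    # Invented counts for illustration only:
    print(odds_ratio_ci(260, 1288, 966, 12775))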
Abstract:
Clinical Research Data Quality Literature Review and Pooled Analysis. We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods used. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 errors per 10,000 fields to 5,019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70-5,019 errors per 10,000 fields). Data entered and processed at healthcare facilities had error rates comparable to data processed at central data processing centers. Error rates for data processed with single entry in the presence of on-screen checks were comparable to those for double-entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist.

Defining Data Quality for Clinical Research: A Concept Analysis. Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality, which builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization of data quality in clinical research. While the results are helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions.

Medical Record Abstractors' Perceptions of Factors Impacting the Accuracy of Abstracted Data. Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses. Factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractors' perceptions of these factors. The Delphi process identified 9 factors not found in the literature and differed from the literature on 5 factors in the top 25%. The Delphi results refuted 7 factors reported in the literature as impacting the quality of abstracted data. The results provide insight into, and indicate content validity of, a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and those of registry and quality improvement abstractors.

Distributed Cognition Artifacts on Clinical Research Data Collection Forms. Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support provided by a sample of clinical research data collection forms. We show that the cognitive load required for abstraction was high for 61% of the sampled data elements, and exceedingly so for 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping, or calculation. The representational analysis used here can be applied to identify data elements with high cognitive demands.
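For concreteness, here is a minimal sketch of pooling arithmetic on the errors-per-10,000-fields scale used in the pooled analysis above; it assumes a simple fields-weighted pool, and the study counts in the example are invented.

    def pooled_error_rate(studies):
        # studies: list of (errors_found, fields_inspected) pairs
        errors = sum(e for e, _ in studies)
        fields = sum(f for _, f in studies)
        return 10_000 * errors / fields

    # Invented counts for illustration only:
    print(pooled_error_rate([(35, 12_000), (210, 48_000)]))  # errors per 10,000 fields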
Abstract:
A rigorous between-subjects methodology employing independent random samples and having broad clinical applicability was designed and implemented to evaluate the effectiveness of back safety and patient transfer training interventions for both hospital nurses and nursing assistants. Effects upon self-efficacy, cognitive, and affective measures are assessed for each of three back safety procedures. The design solves the problem of obtaining randomly assigned independent controls where all experimental subjects must participate in the training interventions.
Abstract:
A management information system (MIS) provides a means for collecting, reporting, and analyzing data from all segments of an organization. Such systems are common in business but rare in libraries. The Houston Academy of Medicine-Texas Medical Center Library developed an MIS that operates on a system of networked IBM PCs and Paradox, a commercial database software package. The data collected in the system include monthly reports, client profile information, and data collected at the time of service requests. The MIS assists with enforcement of library policies, ensures that correct information is recorded, and provides reports for library managers. It also can be used to help answer a variety of ad hoc questions. Future plans call for the development of an MIS that could be adapted to other libraries' needs, and a decision-support interface that would facilitate access to the data contained in the MIS databases.
Abstract:
Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of patients surviving more than 5 years after diagnosis. Until recently, brain tumor prognosis was based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. To evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (Affymetrix U133 gene expression microarrays) along with clinical information.

First, a predictive Random Forest model is built for binary outcomes (i.e., short- vs. long-term survival), and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Because of the large heterogeneity observed within the prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death to the gene expression profile directly.

We propose a Bayesian ensemble model for survival prediction that is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy by using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting. Our proposed models also provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by the Random Forest and Bayesian ensemble methods from biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.
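A minimal sketch of the first modeling step (a Random Forest on a binary short- vs. long-term outcome with importance-based gene selection) follows; the matrix shapes, hyperparameters, and random data are placeholders, not the study's data or settings.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 500))    # placeholder expression matrix (patients x probes)
    y = rng.integers(0, 2, size=120)   # 0 = short-term, 1 = long-term survival

    rf = RandomForestClassifier(n_estimators=500, random_state=0)
    print(cross_val_score(rf, X, y, cv=5).mean())  # out-of-sample accuracy estimate

    # Rank probes by importance to select a small predictive gene subset.
    rf.fit(X, y)
    top_probes = np.argsort(rf.feature_importances_)[::-1][:20]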
Abstract:
The current state of health and biomedicine includes an enormous number of heterogeneous data 'silos', collected for different purposes and represented differently, that are presently impossible to share or analyze in toto. The greatest challenge for large-scale and meaningful analyses of health-related data is to achieve a uniform data representation for data extracted from heterogeneous source representations. Based upon an analysis and categorization of heterogeneities, a process for achieving comparable data content through a uniform terminological representation is developed. This process addresses the types of representational heterogeneities that commonly arise in healthcare data integration problems. Specifically, it uses a reference terminology and associated "maps" to transform heterogeneous data to a standard representation for comparability and secondary use. Capturing the quality and precision of the "maps" between local terms and reference terminology concepts enhances the meaning of the aggregated data, empowering end users with better-informed queries for subsequent analyses. A data integration case study in the domain of pediatric asthma illustrates the development and use of a reference terminology for creating comparable data from heterogeneous source representations. The contribution of this research is a generalized process for the integration of data from heterogeneous source representations; this process can be applied and extended to other problems where heterogeneous data need to be merged.
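A minimal sketch of the map-based transformation described above, with map quality recorded alongside each reference concept, is given below; every site name, local term, and concept identifier here is invented for illustration.

    from dataclasses import dataclass

    @dataclass
    class TermMap:
        concept_id: str   # reference terminology concept
        quality: str      # e.g., "exact", "broader", "narrower"

    # Curated maps from (site, local term) to the reference terminology.
    MAPS = {
        ("siteA", "asthma, mild"):    TermMap("REF:0001", "exact"),
        ("siteB", "ASTHMA-M"):        TermMap("REF:0001", "exact"),
        ("siteC", "reactive airway"): TermMap("REF:0001", "broader"),
    }

    def to_reference(site, local_term):
        m = MAPS.get((site, local_term))
        return (m.concept_id, m.quality) if m else (None, "unmapped")

    # Downstream queries can filter on map quality, e.g., keep only "exact" maps.
    print(to_reference("siteC", "reactive airway"))  # ('REF:0001', 'broader')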
Abstract:
OBJECTIVE: To determine whether algorithms developed for the World Wide Web can be applied to the biomedical literature in order to identify articles that are important as well as relevant.

DESIGN AND MEASUREMENTS: A direct comparison of eight algorithms: simple PubMed queries, clinical queries (sensitive and specific versions), vector cosine comparison, citation count, journal impact factor, PageRank, and machine learning based on polynomial support vector machines. The objective was to prioritize important articles, defined as those included in a pre-existing bibliography of important literature in surgical oncology.

RESULTS: Citation-based algorithms were more effective than noncitation-based algorithms at identifying important articles. The most effective strategies were simple citation count and PageRank, which on average identified over six important articles in the first 100 results, compared to 0.85 for the best noncitation-based algorithm (p < 0.001). The authors saw similar differences between citation-based and noncitation-based algorithms at 10, 20, 50, 200, 500, and 1,000 results (p < 0.001). Citation lag affects the performance of PageRank more than that of simple citation count; however, in spite of citation lag, citation-based algorithms remain more effective than noncitation-based algorithms.

CONCLUSION: Algorithms that have proved successful on the World Wide Web can be applied to biomedical information retrieval. Citation-based algorithms can help identify important articles within large sets of relevant results. Further studies are needed to determine whether citation-based algorithms can effectively meet actual user information needs.
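Below is a minimal sketch of PageRank by power iteration over a citation graph, the kind of citation-based strategy compared above; the toy graph and damping factor are illustrative only, not the study's implementation.

    def pagerank(links, d=0.85, iters=50):
        # links: dict mapping each article to the list of articles it cites
        nodes = set(links) | {v for vs in links.values() for v in vs}
        n = len(nodes)
        rank = {u: 1.0 / n for u in nodes}
        for _ in range(iters):
            new = {u: (1 - d) / n for u in nodes}
            for u in nodes:
                outs = links.get(u, [])
                if outs:
                    for v in outs:
                        new[v] += d * rank[u] / len(outs)
                else:  # dangling article: spread its rank uniformly
                    for v in nodes:
                        new[v] += d * rank[u] / n
            rank = new
        return rank

    toy = {"A": ["C"], "B": ["C"], "C": [], "D": ["A", "C"]}
    print(sorted(pagerank(toy).items(), key=lambda kv: -kv[1]))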
Abstract:
Information overload is a significant problem for modern medicine. Searching MEDLINE for common topics often retrieves more relevant documents than users can review. Therefore, we must identify documents that are not only relevant, but also important. Our system ranks articles using citation counts and the PageRank algorithm, incorporating data from the Science Citation Index. However, citation data is usually incomplete. Therefore, we explore the relationship between the quantity of citation information available to the system and the quality of the result ranking. Specifically, we test the ability of citation count and PageRank to identify "important articles" as defined by experts from large result sets with decreasing citation information. We found that PageRank performs better than simple citation counts, but both algorithms are surprisingly robust to information loss. We conclude that even an incomplete citation database is likely to be effective for importance ranking.
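A minimal sketch of the information-loss experiment described above: randomly drop a fraction of citation links and measure how much of the top-k citation-count ranking survives. The graph representation, k, and loss fraction are assumptions, not the study's setup.

    import random
    from collections import Counter

    def topk_by_citations(edges, k):
        # edges: list of (citing_article, cited_article) pairs
        counts = Counter(dst for _, dst in edges)
        return {article for article, _ in counts.most_common(k)}

    def topk_overlap_after_loss(edges, frac_lost, k=10, seed=0):
        random.seed(seed)
        kept = [e for e in edges if random.random() > frac_lost]
        full = topk_by_citations(edges, k)
        degraded = topk_by_citations(kept, k)
        return len(full & degraded) / k   # 1.0 = top-k ranking fully preserved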