120 resultados para Medical lab data


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e. short vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Following, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Due to a large heterogeneity observed within prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death with gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model which is flexible to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects which decreases computation time for model fitting. Also, our proposed models provides a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods under the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The current state of health and biomedicine includes an enormity of heterogeneous data ‘silos’, collected for different purposes and represented differently, that are presently impossible to share or analyze in toto. The greatest challenge for large-scale and meaningful analyses of health-related data is to achieve a uniform data representation for data extracted from heterogeneous source representations. Based upon an analysis and categorization of heterogeneities, a process for achieving comparable data content by using a uniform terminological representation is developed. This process addresses the types of representational heterogeneities that commonly arise in healthcare data integration problems. Specifically, this process uses a reference terminology, and associated "maps" to transform heterogeneous data to a standard representation for comparability and secondary use. The capture of quality and precision of the “maps” between local terms and reference terminology concepts enhances the meaning of the aggregated data, empowering end users with better-informed queries for subsequent analyses. A data integration case study in the domain of pediatric asthma illustrates the development and use of a reference terminology for creating comparable data from heterogeneous source representations. The contribution of this research is a generalized process for the integration of data from heterogeneous source representations, and this process can be applied and extended to other problems where heterogeneous data needs to be merged.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

OBJECTIVE: To determine whether algorithms developed for the World Wide Web can be applied to the biomedical literature in order to identify articles that are important as well as relevant. DESIGN AND MEASUREMENTS A direct comparison of eight algorithms: simple PubMed queries, clinical queries (sensitive and specific versions), vector cosine comparison, citation count, journal impact factor, PageRank, and machine learning based on polynomial support vector machines. The objective was to prioritize important articles, defined as being included in a pre-existing bibliography of important literature in surgical oncology. RESULTS Citation-based algorithms were more effective than noncitation-based algorithms at identifying important articles. The most effective strategies were simple citation count and PageRank, which on average identified over six important articles in the first 100 results compared to 0.85 for the best noncitation-based algorithm (p < 0.001). The authors saw similar differences between citation-based and noncitation-based algorithms at 10, 20, 50, 200, 500, and 1,000 results (p < 0.001). Citation lag affects performance of PageRank more than simple citation count. However, in spite of citation lag, citation-based algorithms remain more effective than noncitation-based algorithms. CONCLUSION Algorithms that have proved successful on the World Wide Web can be applied to biomedical information retrieval. Citation-based algorithms can help identify important articles within large sets of relevant results. Further studies are needed to determine whether citation-based algorithms can effectively meet actual user information needs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Information overload is a significant problem for modern medicine. Searching MEDLINE for common topics often retrieves more relevant documents than users can review. Therefore, we must identify documents that are not only relevant, but also important. Our system ranks articles using citation counts and the PageRank algorithm, incorporating data from the Science Citation Index. However, citation data is usually incomplete. Therefore, we explore the relationship between the quantity of citation information available to the system and the quality of the result ranking. Specifically, we test the ability of citation count and PageRank to identify "important articles" as defined by experts from large result sets with decreasing citation information. We found that PageRank performs better than simple citation counts, but both algorithms are surprisingly robust to information loss. We conclude that even an incomplete citation database is likely to be effective for importance ranking.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Information overload is a significant problem for modern medicine. Searching MEDLINE for common topics often retrieves more relevant documents than users can review. Therefore, we must identify documents that are not only relevant, but also important. Our system ranks articles using citation counts and the PageRank algorithm, incorporating data from the Science Citation Index. However, citation data is usually incomplete. Therefore, we explore the relationship between the quantity of citation information available to the system and the quality of the result ranking. Specifically, we test the ability of citation count and PageRank to identify "important articles" as defined by experts from large result sets with decreasing citation information. We found that PageRank performs better than simple citation counts, but both algorithms are surprisingly robust to information loss. We conclude that even an incomplete citation database is likely to be effective for importance ranking.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is becoming clear that if we are to impact the rate of medical errors it will have to be done at the practicing physician level. The purpose of this project was to survey the attitude of physicians in Alabama concerning their perception of medical error, and to obtain their thoughts and desires for medical education in the area of medical errors. The information will be used in the development of a physician education program.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

People often use tools to search for information. In order to improve the quality of an information search, it is important to understand how internal information, which is stored in user’s mind, and external information, represented by the interface of tools interact with each other. How information is distributed between internal and external representations significantly affects information search performance. However, few studies have examined the relationship between types of interface and types of search task in the context of information search. For a distributed information search task, how data are distributed, represented, and formatted significantly affects the user search performance in terms of response time and accuracy. Guided by UFuRT (User, Function, Representation, Task), a human-centered process, I propose a search model, task taxonomy. The model defines its relationship with other existing information models. The taxonomy clarifies the legitimate operations for each type of search task of relation data. Based on the model and taxonomy, I have also developed prototypes of interface for the search tasks of relational data. These prototypes were used for experiments. The experiments described in this study are of a within-subject design with a sample of 24 participants recruited from the graduate schools located in the Texas Medical Center. Participants performed one-dimensional nominal search tasks over nominal, ordinal, and ratio displays, and searched one-dimensional nominal, ordinal, interval, and ratio tasks over table and graph displays. Participants also performed the same task and display combination for twodimensional searches. Distributed cognition theory has been adopted as a theoretical framework for analyzing and predicting the search performance of relational data. It has been shown that the representation dimensions and data scales, as well as the search task types, are main factors in determining search efficiency and effectiveness. In particular, the more external representations used, the better search task performance, and the results suggest the ideal search performance occurs when the question type and corresponding data scale representation match. The implications of the study lie in contributing to the effective design of search interface for relational data, especially laboratory results, which are often used in healthcare activities.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

High-throughput assays, such as yeast two-hybrid system, have generated a huge amount of protein-protein interaction (PPI) data in the past decade. This tremendously increases the need for developing reliable methods to systematically and automatically suggest protein functions and relationships between them. With the available PPI data, it is now possible to study the functions and relationships in the context of a large-scale network. To data, several network-based schemes have been provided to effectively annotate protein functions on a large scale. However, due to those inherent noises in high-throughput data generation, new methods and algorithms should be developed to increase the reliability of functional annotations. Previous work in a yeast PPI network (Samanta and Liang, 2003) has shown that the local connection topology, particularly for two proteins sharing an unusually large number of neighbors, can predict functional associations between proteins, and hence suggest their functions. One advantage of the work is that their algorithm is not sensitive to noises (false positives) in high-throughput PPI data. In this study, we improved their prediction scheme by developing a new algorithm and new methods which we applied on a human PPI network to make a genome-wide functional inference. We used the new algorithm to measure and reduce the influence of hub proteins on detecting functionally associated proteins. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as independent and unbiased benchmarks to evaluate our algorithms and methods within the human PPI network. We showed that, compared with the previous work from Samanta and Liang, our algorithm and methods developed in this study improved the overall quality of functional inferences for human proteins. By applying the algorithms to the human PPI network, we obtained 4,233 significant functional associations among 1,754 proteins. Further comparisons of their KEGG and GO annotations allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and made pathway analysis to identify several subclusters that are highly enriched in certain signaling pathways. Particularly, we performed a detailed analysis on a subcluster enriched in the transforming growth factor β signaling pathway (P<10-50) which is important in cell proliferation and tumorigenesis. Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigations. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotations in this post-genomic era.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The implications of the new research presented in Volume 2, Issue 1 (Human Trafficking) of the Journal of Applied Research on Children are explored, calling attention to the need for increased awareness, greater availability of data, and proactive policy solutions to combat child trafficking.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Enterococcus faecalis is a Gram-positive bacterium that lives as a commensal organism in the mammalian gastrointestinal tract, but can behave as an opportunistic pathogen. Our lab discovered that mutation of the eutK gene attenuates virulence of E. faecalis in the C. elegans model host. eutK is part of the ethanolamine metabolic pathway which was previously unknown in E. faecalis. I discovered the presence of two unique posttranscriptional regulatory features that control expression of eut locus genes. The first feature I found is an AdoCBL riboswitch, a cis-acting RNA regulatory element that acts as a positive regulator of gene expression. The second feature I discovered is a unique two-component system, EutVW. The EutV response regulator contains an ANTAR family domain, which binds RNA to trigger transcriptional antitermination. I determined that induction of expression of several genes in the eut locus is dependent on ethanolamine, AdoCBL and the two-component system. AdoCBL and ethanolamine are both required for induction of eut locus gene expression. Additionally, I discovered eutG is regulated by a unique mechanism of antitermination. Both the AdoCBL riboswitch and EutV response regulator control the expression of the downstream gene eutG. EutV potentially acts through a novel antitermination mechanism in which a dimer of EutV binds to a pair of mRNA stem loops forming an antitermination complex. My data show a unique mechanism by which two environmental signals are integrated by two different posttranscriptional regulators to regulate a single locus.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Increasing amounts of clinical research data are collected by manual data entry into electronic source systems and directly from research subjects. For this manual entered source data, common methods of data cleaning such as post-entry identification and resolution of discrepancies and double data entry are not feasible. However data accuracy rates achieved without these mechanisms may be higher than desired for a particular research use. We evaluated a heuristic usability method for utility as a tool to independently and prospectively identify data collection form questions associated with data errors. The method evaluated had a promising sensitivity of 64% and a specificity of 67%. The method was used as described in the literature for usability with no further adaptations or specialization for predicting data errors. We conclude that usability evaluation methodology should be further investigated for use in data quality assurance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissecting the Interaction of p53 and TRIM24 Aundrietta DeVan Duncan Supervisory Professor, Michelle Barton, Ph.D. p53, the “guardian of the genome”, plays an important role in multiple biological processes including cell cycle, angiogenesis, DNA repair and apoptosis. Because it is mutated in over 50% of cancers, p53 has been widely studied in established cancer cell lines. However, little is known about the function of p53 in a normal cell. We focused on characterizing p53 in normal cells and during differentiation. Our lab recently identified a novel binding partner of p53, Tripartite Motif 24 protein (TRIM24). TRIM24 is a member of the TRIM family of proteins, defined by their conserved RING, B-box, and coiled coil domains. Specifically, TRIM24 is a member of the TIF1 subfamily, which is characterized by PHD and Bromo domains in the C-terminus. Between the Coiled-coil and PHD domain is a linker region, 437 amino acids in length. This linker region houses important functions of TRIM24 including it’s site of interaction with nuclear receptors. TRIM24 is an E3-ubiquitin ligase, recently discovered to negatively regulate p53 by targeting it for degradation. Though it is known that Trim24 and p53 interact, it is not known if the interaction is direct and what effect this interaction has on the function of TRIM24 and p53. My study aims to elucidate the specific interaction domains of p53 and TRIM24. To determine the specific domains of p53 required for interaction with TRIM24, we performed co-immuoprecipitation (Co-IP) with recombinant full-length Flag-tagged TRIM24 protein and various deletion constructs of in vitro translated GST-p53, as well as the reverse. I found that TRIM24 binds both the carboxy terminus and DNA binding domain of p53. Furthermore, my results show that binding is altered when post-translational modifications of p53 are present, suggesting that the interaction between p53 and TRIM24 may be affected by these post-translational modifications. To determine the specific domains of TRIM24 required for p53 interaction, we performed GST pull-downs with in vitro translated, Flag-TRIM24 protein constructs and recombinant GST-p53 protein purified from E. coli. We found that the Linker region is sufficient for interaction of p53 and TRIM24. Taken together, these data indicate that the interaction between p53 and TRIM24 does occur in vitro and that interaction may be influenced by post-translational modifications of the proteins.