11 results for in-domain data requirement
in DigitalCommons@The Texas Medical Center
Abstract:
Microarray technology is a high-throughput method for genotyping and gene expression profiling. Limited sensitivity and specificity are among the central problems of this technology. Most existing methods of microarray data analysis share an apparent limitation: they deal only with the numerical part of microarray data and make little use of gene sequence information. Because it is the gene sequences that precisely define the physical objects being measured by a microarray, it is natural to make the gene sequences an essential part of the data analysis. This dissertation focused on the development of free-energy models to integrate sequence information into microarray data analysis. The models were used to characterize the mechanism of hybridization on microarrays and to enhance the sensitivity and specificity of microarray measurements.

Cross-hybridization is a major obstacle to the sensitivity and specificity of microarray measurements. In this dissertation, we evaluated the scope of the cross-hybridization problem on short-oligo microarrays. The results showed that cross-hybridization on arrays is mostly caused by oligo fragments with a run of 10 to 16 nucleotides complementary to the probes. Furthermore, a free-energy-based model was proposed to quantify the amount of cross-hybridization signal on each probe. This model treats cross-hybridization as an integral effect of the interactions between a probe and various off-target oligo fragments. Using public spike-in datasets, the model showed high accuracy in predicting the cross-hybridization signals on those probes whose intended targets are absent from the sample.

Several prospective models were proposed to improve the Positional-Dependent Nearest-Neighbor (PDNN) model for better quantification of gene expression and cross-hybridization.

The problem addressed in this dissertation is fundamental to microarray technology. We expect that this study will help us understand the detailed mechanism that determines sensitivity and specificity on microarrays. Consequently, this research will have a wide impact on how microarrays are designed and how the data are interpreted.
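To make the free-energy approach concrete, here is a minimal sketch of a PDNN-style calculation in which a probe's binding strength is modeled as a position-weighted sum of nearest-neighbor stacking energies; the stacking values and the triangular positional profile are illustrative assumptions, not the fitted parameters of the dissertation's model.

```python
# Minimal sketch of a PDNN-style free-energy score for a short-oligo probe.
# Stacking energies and positional weights are illustrative placeholders,
# NOT the fitted parameters of the dissertation's model.

# Hypothetical nearest-neighbor stacking energies (arbitrary units).
STACKING_DG = {
    "AA": -1.0, "AC": -1.4, "AG": -1.3, "AT": -0.9,
    "CA": -1.4, "CC": -1.8, "CG": -2.2, "CT": -1.3,
    "GA": -1.3, "GC": -2.2, "GG": -1.8, "GT": -1.4,
    "TA": -0.6, "TC": -1.3, "TG": -1.4, "TT": -1.0,
}

def positional_weight(k, n):
    """Down-weight stacking pairs near the probe ends; interactions near
    the middle of a 25-mer typically contribute more (simple assumption)."""
    center = (n - 1) / 2.0
    return 1.0 - abs(k - center) / n

def duplex_free_energy(probe):
    """Sum position-weighted nearest-neighbor energies along the probe."""
    n = len(probe)
    return sum(positional_weight(k, n) * STACKING_DG[probe[k:k + 2]]
               for k in range(n - 1))

probe = "ATCGGCTAAGCTTACGGATCCATGC"  # example 25-mer
print(f"weighted duplex dG: {duplex_free_energy(probe):.2f}")
```

The same scoring idea extends to cross-hybridization by summing contributions from the complementary runs of 10 to 16 nucleotides between a probe and off-target fragments.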
Abstract:
This research examines the prevalence of alcohol and illicit substance use in the United States and Mexico and associated socio-demographic characteristics. The sources of data for this study are public-domain data from the U.S. National Household Survey of Drug Abuse, 1988 (n = 8,814), and the Mexican National Survey of Addictions, 1988 (n = 12,579). In addition, this study discusses methodologic issues in the cross-cultural and cross-national comparison of behavioral and epidemiologic data from population-based samples. The extent to which patterns of substance abuse vary among subgroups of the U.S. and Mexican populations is assessed, as well as the comparability and equivalence of measures of alcohol and drug use in these national samples.

The prevalence of alcohol use was somewhat similar in the two countries for all three measures of use, lifetime, past year, and past-year heavy use (85.0%, 68.1%, and 39.6% for the U.S. and 72.6%, 47.7%, and 45.8% for Mexico, respectively). The use of illegal substances varied widely between countries, with U.S. respondents reporting significantly higher levels of use than their Mexican counterparts. For example, reported use of any illicit substance in lifetime and past year was 34.2% and 11.6% for the U.S., and 3.3% and 0.6% for Mexico. Despite these differences in prevalence, two demographic characteristics, gender and age, were important correlates of use in both countries. Men in both countries were more likely to report use of alcohol and illicit substances than women. Generally speaking, a greater proportion of respondents in both countries 18 years of age or older reported use of alcohol for all three measures than younger respondents, and a greater proportion of respondents between the ages of 18 and 34 years reported use of illicit substances during lifetime and past year than any other age group.

Additional substantive research investigating population-based samples and at-risk subgroups is needed to understand the underlying mechanisms of these associations. Further development of cross-culturally meaningful survey methods is warranted to validate comparisons of substance use across countries and societies.
Abstract:
Clinical Research Data Quality Literature Review and Pooled Analysis
We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 to 5,019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70–5,019 errors per 10,000 fields). Data entered and processed at healthcare facilities had error rates comparable to data processed at central data processing centers. Error rates for data processed with single entry in the presence of on-screen checks were comparable to those for double-entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist.

Defining Data Quality for Clinical Research: A Concept Analysis
Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This current lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality, which builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization of data quality in clinical research. While the results are helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions.

Medical Record Abstractor's Perceptions of Factors Impacting the Accuracy of Abstracted Data
Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses. Factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractors' perceptions of these factors. The Delphi process identified 9 factors that were not found in the literature and differed from the literature on 5 factors in the top 25%. The Delphi results refuted 7 factors reported in the literature as impacting the quality of abstracted data. The results provide insight into, and indicate content validity of, a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and registry and quality improvement abstractors.

Distributed Cognition Artifacts on Clinical Research Data Collection Forms
Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms. We show that the cognitive load required for abstraction was high for 61% of the sampled data elements, and exceedingly so for 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working-memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping, or calculation. The representational analysis used here can be applied to identify data elements with high cognitive demands.
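To illustrate the errors-per-10,000-fields convention used throughout the pooled analysis, here is a small sketch; the study labels and counts are hypothetical examples, not figures drawn from the 93 reviewed papers.

```python
# Sketch of normalizing and pooling data-accuracy figures across studies.
# The labels and counts are hypothetical examples, not reviewed-study data.

studies = [
    # (label, errors found, fields inspected)
    ("medical record abstraction", 412, 8_000),
    ("single entry + on-screen checks", 21, 52_000),
    ("double entry", 15, 60_000),
]

for label, errors, fields in studies:
    rate = errors / fields * 10_000  # errors per 10,000 fields
    print(f"{label}: {rate:.1f} errors per 10,000 fields")

# A simple pooled rate weights each study by the number of fields inspected.
total_errors = sum(e for _, e, _ in studies)
total_fields = sum(f for _, _, f in studies)
print(f"pooled: {total_errors / total_fields * 10_000:.1f} errors per 10,000 fields")
```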
Abstract:
The current state of health and biomedicine includes an enormous number of heterogeneous data "silos", collected for different purposes and represented differently, that are presently impossible to share or analyze in toto. The greatest challenge for large-scale and meaningful analyses of health-related data is to achieve a uniform data representation for data extracted from heterogeneous source representations. Based upon an analysis and categorization of heterogeneities, a process for achieving comparable data content by using a uniform terminological representation is developed. This process addresses the types of representational heterogeneities that commonly arise in healthcare data integration problems. Specifically, this process uses a reference terminology and associated "maps" to transform heterogeneous data to a standard representation for comparability and secondary use. Capturing the quality and precision of the "maps" between local terms and reference terminology concepts enhances the meaning of the aggregated data, empowering end users with better-informed queries for subsequent analyses. A data integration case study in the domain of pediatric asthma illustrates the development and use of a reference terminology for creating comparable data from heterogeneous source representations. The contribution of this research is a generalized process for the integration of data from heterogeneous source representations, and this process can be applied and extended to other problems where heterogeneous data need to be merged.
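The following is a minimal sketch of the transformation step this process describes: local source terms are looked up in curated "maps" that carry both the reference concept and a quality rating. The concept codes, site names, and quality labels are hypothetical; a production system would use a standard reference terminology and curated mappings.

```python
# Minimal sketch of mapping heterogeneous local terms to a uniform
# reference-terminology representation, retaining map quality so that
# downstream queries can weigh approximate matches appropriately.
# All codes, site names, and quality labels here are hypothetical.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Mapping:
    concept_id: str  # reference terminology concept identifier
    quality: str     # e.g., "exact", "broader", "approximate"

# Two local vocabularies mapped onto one hypothetical asthma concept.
TERM_MAPS = {
    ("site_a", "asthma, unspecified"): Mapping("REF:0001", "exact"),
    ("site_b", "reactive airway disease"): Mapping("REF:0001", "approximate"),
}

def to_reference(source: str, local_term: str) -> Optional[Mapping]:
    """Look up the reference concept for a local term; None if unmapped."""
    return TERM_MAPS.get((source, local_term.lower()))

record = to_reference("site_b", "Reactive Airway Disease")
if record:
    print(record.concept_id, "- map quality:", record.quality)
```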
Abstract:
The task of encoding and processing complex sensory input requires many types of transsynaptic signals. This requirement is served in part by an extensive group of neurotransmitter substances, which may include thirty or more different compounds. At the next level of information processing, the existence of multiple receptors for a given neurotransmitter appears to be a widely used mechanism to generate multiple responses to a given first messenger (Snyder and Goodman, 1980). Despite the wealth of published data on GABA receptors, the existence of more than one GABA receptor was in doubt until the mid-1980s. Presently there is still disagreement on the number of types of GABA receptors, estimates for which range from two to four (DeFeudis, 1983; Johnston, 1985). Part of the problem in evaluating data concerning multiple receptor types is the lack of information on the number of gene products and their subsequent supramolecular organization in different neurons. In order to evaluate the question concerning the diversity of GABA receptors in the nervous system, we must rely on indirect information derived from a wide variety of experimental techniques. These include pharmacological binding studies on membrane fractions, electrophysiological studies, localization studies, purification studies, and functional assays. Almost all parts of the central and peripheral nervous system use GABA as a neurotransmitter, and these experimental techniques have therefore been applied to many different parts of the nervous system for the analysis of GABA receptor characteristics. We are left with a large amount of data from a wide variety of techniques derived from many parts of the nervous system. When this project was initiated in 1983, there were only a handful of pharmacological tools to assess the question of multiple GABA receptors. The approach adopted was to focus on a single model system, using a variety of experimental techniques, in order to evaluate the existence of multiple forms of GABA receptors. Using the in vitro rabbit retina, a combination of pharmacological binding studies, functional release studies, and partial purification studies was undertaken to examine the GABA receptor composition of this tissue. Three types of GABA receptors were observed: A1 receptors coupled to benzodiazepine and barbiturate modulation, A2 or uncoupled GABA-A receptors, and GABA-B receptors. These results are evaluated and discussed in light of recent findings by others concerning the number and subtypes of GABA receptors in the nervous system.
Abstract:
Many studies in biostatistics deal with binary data. Some of these studies involve correlated observations, which can complicate the analysis of the resulting data. Studies of this kind typically arise when a high degree of commonality exists between test subjects. If there exists a natural hierarchy in the data, multilevel analysis is an appropriate tool for the analysis. Two examples are measurements on identical twins and studies of symmetrical organs or appendages, as in the case of ophthalmic studies. Although this type of matching appears ideal for the purposes of comparison, analysis of the resulting data while ignoring the effect of intra-cluster correlation has been shown to produce biased results.

This paper will explore the use of multilevel modeling of simulated binary data with predetermined levels of correlation. Data will be generated using the beta-binomial method with varying degrees of correlation between the lower-level observations. The data will be analyzed using the multilevel software package MLwiN (Woodhouse et al., 1995). Comparisons between the specified intra-cluster correlation of these data and the correlations estimated using multilevel analysis will be used to examine the accuracy of this technique in analyzing this type of data.
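As a concrete illustration of the generation step, the following is a minimal sketch of the beta-binomial method under the standard parameterization in which the intra-cluster correlation of a Beta(a, b) mixture is rho = 1/(a + b + 1); the function name and example values are ours, and the MLwiN analysis itself is not shown.

```python
# Minimal sketch of generating clustered binary data with a predetermined
# intra-cluster correlation via the beta-binomial method. For a Beta(a, b)
# mixing distribution, rho = 1 / (a + b + 1), so a and b can be solved from
# a target mean pi and a target correlation rho.

import random

def simulate_clusters(n_clusters, cluster_size, pi, rho, seed=42):
    """Return a list of clusters, each a list of 0/1 outcomes."""
    rng = random.Random(seed)
    a = pi * (1 - rho) / rho        # shape parameters implied by pi and rho
    b = (1 - pi) * (1 - rho) / rho
    data = []
    for _ in range(n_clusters):
        p = rng.betavariate(a, b)   # cluster-specific success probability
        data.append([1 if rng.random() < p else 0 for _ in range(cluster_size)])
    return data

# Example: 500 "twin pairs" with marginal prevalence 0.3 and rho = 0.2.
clusters = simulate_clusters(500, 2, pi=0.3, rho=0.2)
print(sum(map(sum, clusters)) / (500 * 2))  # empirical prevalence, near 0.3
```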
Abstract:
The difficulty of detecting differential gene expression in microarray data has existed for many years. Several correction procedures address the multiple-comparison problem: the Bonferroni and Sidak single-step p-value adjustments and Holm's step-down method control the family-wise error rate, while Benjamini and Hochberg's procedure controls the false discovery rate (FDR). Each multiple-comparison technique has its advantages and weaknesses. We studied each method through numerical studies (simulations) and applied the methods to real exploratory DNA microarray data collected to detect molecular signatures in papillary thyroid cancer (PTC) patients. According to the results of our simulation studies, the Benjamini and Hochberg step-up FDR-controlling procedure is the best among these multiple-comparison methods; applying it to the PTC microarray data, we discovered 1,277 potential biomarkers among 54,675 probe sets.
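For readers unfamiliar with the step-up procedure, here is a minimal sketch of the Benjamini-Hochberg method contrasted with Bonferroni; the p-values are toy values of our own, not results from the PTC data, and the function is a generic illustration rather than the study's analysis code.

```python
# Minimal sketch of the Benjamini-Hochberg (BH) step-up FDR procedure,
# contrasted with the Bonferroni correction. Toy p-values, not PTC data.

def benjamini_hochberg(pvals, q=0.05):
    """Return indices of hypotheses rejected at FDR level q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # ranks by ascending p
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:  # step-up criterion: p_(k) <= k*q/m
            k_max = rank
    return sorted(order[:k_max])      # reject the k_max smallest p-values

pvals = [0.0004, 0.009, 0.012, 0.041, 0.20, 0.74]
print("BH rejects:", benjamini_hochberg(pvals))  # indices 0, 1, 2
print("Bonferroni:", [i for i, p in enumerate(pvals) if p <= 0.05 / len(pvals)])  # index 0
```

On these toy values BH rejects three hypotheses where Bonferroni rejects one, which is the extra power that makes the FDR approach attractive when testing tens of thousands of probe sets simultaneously.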
Abstract:
Objective: Interruptions are known to have a negative impact on activity performance. Understanding how an interruption contributes to human error is limited because there is no standard method for analyzing and classifying interruptions. Qualitative data are typically analyzed by either a deductive or an inductive method; both methods have limitations. In this paper, a hybrid method was developed that integrates deductive and inductive methods for the categorization of activities and interruptions recorded during an ethnographic study of physicians and registered nurses in a Level One Trauma Center. Understanding the effects of interruptions is important for designing and evaluating informatics tools in particular and for improving healthcare quality and patient safety in general.

Method: The hybrid method was developed using a deductive a priori classification framework with the provision of adding new categories discovered inductively in the data. The inductive process utilized line-by-line coding and constant comparison as described in Grounded Theory.

Results: The categories of activities and interruptions were organized into a three-tiered hierarchy of activity. Validity and reliability of the categories were tested by categorizing a medical error case external to the study. No new categories of interruptions were identified during analysis of the medical error case.

Conclusions: Findings from this study provide evidence that the hybrid model of categorization is more complete than either a deductive or an inductive method alone. The hybrid method developed in this study provides methodological support for understanding, analyzing, and managing interruptions and workflow.
Abstract:
OBJECTIVE: Interruptions are known to have a negative impact on activity performance. Understanding how an interruption contributes to human error is limited because there is no standard method for analyzing and classifying interruptions. Qualitative data are typically analyzed by either a deductive or an inductive method; both methods have limitations. In this paper, a hybrid method was developed that integrates deductive and inductive methods for the categorization of activities and interruptions recorded during an ethnographic study of physicians and registered nurses in a Level One Trauma Center. Understanding the effects of interruptions is important for designing and evaluating informatics tools in particular as well as improving healthcare quality and patient safety in general.

METHOD: The hybrid method was developed using a deductive a priori classification framework with the provision of adding new categories discovered inductively in the data. The inductive process utilized line-by-line coding and constant comparison as described in Grounded Theory.

RESULTS: The categories of activities and interruptions were organized into a three-tiered hierarchy of activity. Validity and reliability of the categories were tested by categorizing a medical error case external to the study. No new categories of interruptions were identified during analysis of the medical error case.

CONCLUSIONS: Findings from this study provide evidence that the hybrid model of categorization is more complete than either a deductive or an inductive method alone. The hybrid method developed in this study provides methodological support for understanding, analyzing, and managing interruptions and workflow.
Abstract:
Adenosine has been implicated in the pathogenesis of chronic lung diseases such as asthma and chronic obstructive pulmonary disease. In vitro studies suggest that activation of the A2B adenosine receptor (A2BAR) results in proinflammatory and profibrotic effects relevant to the progression of lung diseases; however, in vivo data supporting these observations are lacking. Adenosine deaminase-deficient (ADA-deficient) mice develop pulmonary inflammation and injury that are dependent on increased lung adenosine levels. To investigate the role of the A2BAR in vivo, ADA-deficient mice were treated with the selective A2BAR antagonist CVT-6883, and pulmonary inflammation, fibrosis, and airspace integrity were assessed. Untreated and vehicle-treated ADA-deficient mice developed pulmonary inflammation, fibrosis, and enlargement of alveolar airspaces; in contrast, CVT-6883-treated ADA-deficient mice showed less pulmonary inflammation, fibrosis, and alveolar airspace enlargement. A2BAR antagonism significantly reduced elevations in proinflammatory cytokines and chemokines as well as mediators of fibrosis and airway destruction. In addition, treatment with CVT-6883 attenuated pulmonary inflammation and fibrosis in wild-type mice subjected to bleomycin-induced lung injury. These findings suggest that A2BAR signaling influences pathways critical for pulmonary inflammation and injury in vivo. Thus, in chronic lung diseases associated with increased adenosine, antagonism of A2BAR-mediated responses may prove to be a beneficial therapy.
Abstract:
A micro-electrospray interface was developed specifically for the neurobiological applications described in this dissertation. Incorporation of a unique nano-flow liquid chromatography micro-electrospray "needle" into the micro-electrospray interface (micro-ES/MS) increased the sensitivity of the mass spectrometric assay by ~1000-fold and thus permitted the first analysis of specific neuroactive compounds in brain extracellular fluid collected by in vivo microdialysis (Md).

The initial in vivo data presented deal with the pharmacodynamics of a novel GABA-B antagonist and the availability of the compound in its parent (unmetabolized) form to the brain of the anesthetized rat. Next, the first structurally specific endogenous release of (Met)⁵-enkephalin was demonstrated in unanesthetized, freely moving animals (release of ~6.5 fmol of (Met)⁵-enkephalin into the dialysate by direct neuronal depolarization). The Md/micro-ES/MS system was used to test the acute effects of drugs of abuse on the endogenous release of (Met)⁵-enkephalin from the globus pallidus/ventral pallidum brain region in rats. Four drugs known to be abused by humans (morphine, cocaine, methamphetamine, and diazepam) were tested. Morphine and cocaine both elicited a two-fold or greater increase in the release of (Met)⁵-enkephalin over vehicle controls. Diazepam elicited a small decrease in (Met)⁵-enkephalin levels, and methamphetamine showed no significant effect on (Met)⁵-enkephalin. These results imply that (Met)⁵-enkephalin may be involved in the reward pathway of certain drugs of abuse.