11 resultados para error rates

em DigitalCommons@The Texas Medical Center


Relevância:

70.00% 70.00%

Publicador:

Resumo:

The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process. ^ For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Despite current enthusiasm for investigation of gene-gene interactions and gene-environment interactions, the essential issue of how to define and detect gene-environment interactions remains unresolved. In this report, we define gene-environment interactions as a stochastic dependence in the context of the effects of the genetic and environmental risk factors on the cause of phenotypic variation among individuals. We use mutual information that is widely used in communication and complex system analysis to measure gene-environment interactions. We investigate how gene-environment interactions generate the large difference in the information measure of gene-environment interactions between the general population and a diseased population, which motives us to develop mutual information-based statistics for testing gene-environment interactions. We validated the null distribution and calculated the type 1 error rates for the mutual information-based statistics to test gene-environment interactions using extensive simulation studies. We found that the new test statistics were more powerful than the traditional logistic regression under several disease models. Finally, in order to further evaluate the performance of our new method, we applied the mutual information-based statistics to three real examples. Our results showed that P-values for the mutual information-based statistics were much smaller than that obtained by other approaches including logistic regression models.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Linkage disequilibrium methods can be used to find genes influencing quantitative trait variation in humans. Linkage disequilibrium methods can require smaller sample sizes than linkage equilibrium methods, such as the variance component approach to find loci with a specific effect size. The increase in power is at the expense of requiring more markers to be typed to scan the entire genome. This thesis compares different linkage disequilibrium methods to determine which factors influence the power to detect disequilibrium. The costs of disequilibrium and equilibrium tests were compared to determine whether the savings in phenotyping costs when using disequilibrium methods outweigh the additional genotyping costs.^ Nine linkage disequilibrium tests were examined by simulation. Five tests involve selecting isolated unrelated individuals while four involved the selection of parent child trios (TDT). All nine tests were found to be able to identify disequilibrium with the correct significance level in Hardy-Weinberg populations. Increasing linked genetic variance and trait allele frequency were found to increase the power to detect disequilibrium, while increasing the number of generations and distance between marker and trait loci decreased the power to detect disequilibrium. Discordant sampling was used for several of the tests. It was found that the more stringent the sampling, the greater the power to detect disequilibrium in a sample of given size. The power to detect disequilibrium was not affected by the presence of polygenic effects.^ When the trait locus had more than two trait alleles, the power of the tests maximized to less than one. For the simulation methods used here, when there were more than two-trait alleles there was a probability equal to 1-heterozygosity of the marker locus that both trait alleles were in disequilibrium with the same marker allele, resulting in the marker being uninformative for disequilibrium.^ The five tests using isolated unrelated individuals were found to have excess error rates when there was disequilibrium due to population admixture. Increased error rates also resulted from increased unlinked major gene effects, discordant trait allele frequency, and increased disequilibrium. Polygenic effects did not affect the error rates. The TDT, Transmission Disequilibrium Test, based tests were not liable to any increase in error rates.^ For all sample ascertainment costs, for recent mutations ($<$100 generations) linkage disequilibrium tests were less expensive than the variance component test to carry out. Candidate gene scans saved even more money. The use of recently admixed populations also decreased the cost of performing a linkage disequilibrium test. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Linkage and association studies are major analytical tools to search for susceptibility genes for complex diseases. With the availability of large collection of single nucleotide polymorphisms (SNPs) and the rapid progresses for high throughput genotyping technologies, together with the ambitious goals of the International HapMap Project, genetic markers covering the whole genome will be available for genome-wide linkage and association studies. In order not to inflate the type I error rate in performing genome-wide linkage and association studies, multiple adjustment for the significant level for each independent linkage and/or association test is required, and this has led to the suggestion of genome-wide significant cut-off as low as 5 × 10 −7. Almost no linkage and/or association study can meet such a stringent threshold by the standard statistical methods. Developing new statistics with high power is urgently needed to tackle this problem. This dissertation proposes and explores a class of novel test statistics that can be used in both population-based and family-based genetic data by employing a completely new strategy, which uses nonlinear transformation of the sample means to construct test statistics for linkage and association studies. Extensive simulation studies are used to illustrate the properties of the nonlinear test statistics. Power calculations are performed using both analytical and empirical methods. Finally, real data sets are analyzed with the nonlinear test statistics. Results show that the nonlinear test statistics have correct type I error rates, and most of the studied nonlinear test statistics have higher power than the standard chi-square test. This dissertation introduces a new idea to design novel test statistics with high power and might open new ways to mapping susceptibility genes for complex diseases. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Research studies on the association between exposures to air contaminants and disease frequently use worn dosimeters to measure the concentration of the contaminant of interest. But investigation of exposure determinants requires additional knowledge beyond concentration, i.e., knowledge about personal activity such as whether the exposure occurred in a building or outdoors. Current studies frequently depend upon manual activity logging to record location. This study's purpose was to evaluate the use of a worn data logger recording three environmental parameters—temperature, humidity, and light intensity—as well as time of day, to determine indoor or outdoor location, with an ultimate aim of eliminating the need to manually log location or at least providing a method to verify such logs. For this study, data collection was limited to a single geographical area (Houston, Texas metropolitan area) during a single season (winter) using a HOBO H8 four-channel data logger. Data for development of a Location Model were collected using the logger for deliberate sampling of programmed activities in outdoor, building, and vehicle locations at various times of day. The Model was developed by analyzing the distributions of environmental parameters by location and time to establish a prioritized set of cut points for assessing locations. The final Model consisted of four "processors" that varied these priorities and cut points. Data to evaluate the Model were collected by wearing the logger during "typical days" while maintaining a location log. The Model was tested by feeding the typical day data into each processor and generating assessed locations for each record. These assessed locations were then compared with true locations recorded in the manual log to determine accurate versus erroneous assessments. The utility of each processor was evaluated by calculating overall error rates across all times of day, and calculating individual error rates by time of day. Unfortunately, the error rates were large, such that there would be no benefit in using the Model. Another analysis in which assessed locations were classified as either indoor (including both building and vehicle) or outdoor yielded slightly lower error rates that still precluded any benefit of the Model's use.^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and is emerging to be a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies will suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited for detecting the association of common variants, but are less suitable to rare variants. This raises great challenge for sequence-based genetic studies of complex diseases.^ This research dissertation utilized genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools for developing novel and powerful statistical methods for next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, which finally lead to shifting the paradigm of association analysis from the current locus-by-locus analysis to collectively analyzing genome regions.^ In this project, the functional principal component (FPC) methods coupled with high-dimensional data reduction techniques will be used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of genome or a gene regardless of whether the variants are common or rare.^ The classical quantitative genetics suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their applications, the functional linear models were applied to five quantitative traits in Framingham heart studies. ^ This project proposed a novel concept of gene-gene co-association in which a gene or a genomic region is taken as a unit of association analysis and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets which led to discovery of networks significantly associated with psoriasis.^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The central objective of this dissertation was to determine the feasibility of self-completed advance directives (AD) in older persons suffering from mild and moderate stages of dementia. This was accomplished by identifying differences in ability to complete AD among elderly subjects with increasing degrees of dementia and cognitive incompetence. Secondary objectives were to describe and compare advance directives completed by elders and identified proxy decision makers. Secondary objectives were accomplished by measuring the agreement between advance directives completed by proxy and elder, and comparing that agreement across groups defined by the elder's cognitive status. This cross-sectional study employed a structured interview to elicit AD, followed by a similar interview with a proxy decision maker identified by the elder. A stratified sampling scheme recruited elders with normal cognition, mild, and moderate forms of dementia using the Mini Mental-State Exam (MMSE). The Hopkins Competency Assessment Test (HCAT) was used for evaluation of competency to make medical decisions. Analysis was conducted on "between group" (non-demented $\leftrightarrow$ mild dementia $\leftrightarrow$ moderate dementia, and competent $\leftrightarrow$ incompetent) and "within group" (elder $\leftrightarrow$ family member) variation.^ The 118 elderly subjects interviewed were generally male, Caucasian, and of low socioeconomic status. Mean age was 77. Overall, elders preferred a "trial of therapy" regarding AD rather than to "always receive the therapy". No intervention was refused outright more often than it was accepted. A test-retest of elders' AD revealed stable responses. Eleven logic checks measured appropriateness of AD responses independent of preference. No difference was found in logic error rates between elders grouped by MMSE or HCAT. Agreement between proxy and elder responses showed significant dissimilarity, indicating that proxies were not making the same medical decisions as the elders.^ Conclusions based on these data are: (1) Self reporting AD is feasible among elders showing signs of cognitive impairment and they should be given all opportunities to complete advance directives, (2) variation in preferences for advance directives in cognitively impaired elders should not be assumed to be the effects of their impairment alone, (3) proxies do not appear to forego life-prolonging interventions in the face of increasing impairment in their ward, however, their advance directives choices are frequently not those of the elder they represent. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Errors in the administration of medication represent a significant loss of medical resources and pose life altering or life threatening risks to patients. This paper considered the question, what impact do Computerized Physician Order Entry (CPOE) systems have on medication errors in the hospital inpatient environment? Previous reviews have examined evidence of the impact of CPOE on medication errors, but have come to ambiguous conclusions as to the impact of CPOE and decision support systems (DSS). Forty-three papers were identified. Thirty-one demonstrated a significant reduction in prescribing error rates for all or some drug types; decreases in minor errors were most often reported. Several studies reported increases in the rate of duplicate orders and failures to remove contraindicated drugs, often attributed to inappropriate design or to an inability to operate the system properly. The evidence on the effectiveness of CPOE to reduce errors in medication administration is compelling though it is limited by modest study sample sizes and designs. ^

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Clinical Research Data Quality Literature Review and Pooled Analysis We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 errors per 10,000 files to 5019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70–5019 errors per 10,000 fields). Data entered and processed at healthcare facilities had comparable error rates to data processed at central data processing centers. Error rates for data processed with single entry in the presence of on-screen checks were comparable to double entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist. Defining Data Quality for Clinical Research: A Concept Analysis Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This current lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality, which builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization for data quality in clinical research. While the results are helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions. Medical Record Abstractor’s Perceptions of Factors Impacting the Accuracy of Abstracted Data Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses. Factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractor’s perceptions about the factors. The Delphi process identified 9 factors that were not found in the literature, and differed with the literature by 5 factors in the top 25%. The Delphi results refuted seven factors reported in the literature as impacting the quality of abstracted data. The results provide insight into and indicate content validity of a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and registry and quality improvement abstractors. Distributed Cognition Artifacts on Clinical Research Data Collection Forms Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms. We show that the cognitive load required for abstraction in 61% of the sampled data elements was high, exceedingly so in 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping or calculation. The representational analysis used here can be used to identify data elements with high cognitive demands.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Research suggests women respond to the aggression-inducing effects of alcohol in a manner similar to men. Highly aggressive men are more prone to alcohol-induced aggression, but this relationship is less clear for women. This study examined whether alcohol consumption would differentially affect laboratory-measured aggression in a sample of aggressive and non-aggressive women and how those differences might be related to components of impulsive behavior. In 39 women recruited from the community (two groups: with and without histories of physical fighting) ages 21–40, laboratory aggressive behavior was assessed following placebo and 0.80 g/kg alcohol consumption (all women experienced both conditions). Baseline laboratory impulsive behavior of three impulsivity models was later assessed in the same women. In the aggression model (PSAP), participants were provoked by periodic subtractions of money, which were blamed on a fictitious partner. Aggression was operationalized as the responses the participant made to subtract money from that partner. The three components of impulsivity that were tested included: (1) response initiation (IMT/DMT), premature responses made prior to the completion of stimulus processing, (2) response inhibition (GoStop), a failure to inhibit an already initiated response, and (3) consequence sensitivity (SKIP and TCIP), the choice for a smaller-sooner reward over a larger-later reward. I hypothesized that, compared to women with no history of physical fighting, women with a history of physical fighting would exhibit higher rates of alcohol-induced laboratory aggression and higher rates of baseline impulsive responding (particularly for the IMT/DMT), which would also be related to the alcohol-induced increases aggression. Consistent with studies in men, the aggressive women showed strong associations between laboratory aggression and self-report measures, while the non-aggressive women did not. However, unlike men, following alcohol consumption it was the non-aggressive women's laboratory aggression that was related to their self-reports of aggression and impulsivity. Additionally, response initiation measures of impulsivity distinguished the two groups, while response inhibition and consequence sensitivity measures did not; commission error rates on the IMT/DMT were higher in the aggressive women compared to the non-aggressive women. Regression analyses of the behavioral measures showed no relationship between the aggression and impulsivity performance of the two groups. These results suggest that the behavioral (and potentially biological) mechanism underlying aggressive behavior of women is different than that of men. ^

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Statement of the problem and public health significance. Hospitals were designed to be a safe haven and respite from disease and illness. However, a large body of evidence points to preventable errors in hospitals as the eighth leading cause of death among Americans. Twelve percent of Americans, or over 33.8 million people, are hospitalized each year. This population represents a significant portion of at risk citizens exposed to hospital medical errors. Since the number of annual deaths due to hospital medical errors is estimated to exceed 44,000, the magnitude of this tragedy makes it a significant public health problem. ^ Specific aims. The specific aims of this study were threefold. First, this study aimed to analyze the state of the states' mandatory hospital medical error reporting six years after the release of the influential IOM report, "To Err is Human." The second aim was to identify barriers to reporting of medical errors by hospital personnel. The third aim was to identify hospital safety measures implemented to reduce medical errors and enhance patient safety. ^ Methods. A descriptive, longitudinal, retrospective design was used to address the first stated objective. The study data came from the twenty-one states with mandatory hospital reporting programs which report aggregate hospital error data that is accessible to the public by way of states' websites. The data analysis included calculations of expected number of medical errors for each state according to IOM rates. Where possible, a comparison was made between state reported data and the calculated IOM expected number of errors. A literature review was performed to achieve the second study aim, identifying barriers to reporting medical errors. The final aim was accomplished by telephone interviews of principal patient safety/quality officers from five Texas hospitals with more than 700 beds. ^ Results. The state medical error data suggests vast underreporting of hospital medical errors to the states. The telephone interviews suggest that hospitals are working at reducing medical errors and creating safer environments for patients. The literature review suggests the underreporting of medical errors at the state level stems from underreporting of errors at the delivery level. ^