10 results for Error correction methods
in DigitalCommons@The Texas Medical Center
Abstract:
Next-generation sequencing (NGS) technology has become a prominent tool in biological and biomedical research. However, NGS data analysis, such as de novo assembly, mapping and variant detection, is far from mature, and the high sequencing error rate is one of the major problems. To minimize the impact of sequencing errors, we developed a highly robust and efficient method, MTM, to correct the errors in NGS reads. We demonstrated the effectiveness of MTM on both single-cell data with highly non-uniform coverage and normal data with uniformly high coverage, reflecting that MTM’s performance does not rely on the coverage of the sequencing reads. MTM was also compared with Hammer and Quake, the best methods for correcting non-uniform and uniform data respectively. For non-uniform data, MTM outperformed both Hammer and Quake. For uniform data, MTM showed better performance than Quake and comparable results to Hammer. Better error correction with MTM improved the quality of downstream analysis, such as mapping and SNP detection. SNP calling is a major application of NGS technologies. However, the existence of sequencing errors complicates this process, especially for the low coverage (
Abstract:
The difficulty of detecting differential gene expression in microarray data has existed for many years. Several correction procedures try to control error rates in the multiple comparison process, including the Bonferroni and Sidak single-step p-value adjustments, Holm's step-down correction method, and Benjamini and Hochberg's false discovery rate (FDR) correction procedure. Each multiple comparison technique has its advantages and weaknesses. We studied each multiple comparison method through numerical studies (simulations) and applied the methods to real exploratory DNA microarray data aimed at detecting molecular signatures in papillary thyroid cancer (PTC) patients. According to the results of our simulation studies, the Benjamini and Hochberg step-up FDR controlling procedure is the best among these multiple comparison methods, and we discovered 1277 potential biomarkers among 54675 probe sets after applying the Benjamini and Hochberg method to the PTC microarray data.
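The four adjustments named in this abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions of the Bonferroni, Sidak, Holm and Benjamini-Hochberg adjusted p-values; the function names and the example p-values are assumptions made for illustration and are not taken from the study.

```python
def bonferroni(pvals):
    """Single-step Bonferroni: multiply each p-value by the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def sidak(pvals):
    """Single-step Sidak: 1 - (1 - p)^m."""
    m = len(pvals)
    return [1.0 - (1.0 - p) ** m for p in pvals]


def holm(pvals):
    """Holm step-down: the k-th smallest p-value (k = 0, 1, ...) is
    multiplied by (m - k); a running maximum keeps the result monotone."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up: the p-value of rank r (1-based) is
    multiplied by m / r; a running minimum from the largest p-value down
    keeps the result monotone."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        running_min = min(running_min, min(1.0, pvals[i] * m / (rank + 1)))
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    example = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values
    for fn in (bonferroni, sidak, holm, benjamini_hochberg):
        print(fn.__name__, [round(p, 4) for p in fn(example)])
```

A probe set would be declared significant when its adjusted p-value falls below the chosen level; the Benjamini-Hochberg values are never larger than the Holm values, which is why the FDR procedure typically flags more probe sets.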
Abstract:
BACKGROUND: Quantitative myocardial PET perfusion imaging requires partial volume corrections. METHODS: Patients underwent ECG-gated, rest-dipyridamole, myocardial perfusion PET using Rb-82 decay corrected in Bq/cc for diastolic, systolic, and combined whole cycle ungated images. Diastolic partial volume correction relative to systole was determined from the systolic/diastolic activity ratio, systolic partial volume correction from phantom dimensions comparable to systolic LV wall thicknesses and whole heart cycle partial volume correction for ungated images from fractional systolic-diastolic duration for systolic and diastolic partial volume corrections. RESULTS: For 264 PET perfusion images from 159 patients (105 rest-stress image pairs, 54 individual rest or stress images), average resting diastolic partial volume correction relative to systole was 1.14 ± 0.04, independent of heart rate and within ±1.8% of stress images (1.16 ± 0.04). Diastolic partial volume corrections combined with those for phantom dimensions comparable to systolic LV wall thickness gave an average whole heart cycle partial volume correction for ungated images of 1.23 for Rb-82 compared to 1.14 if positron range were negligible as for F-18. CONCLUSION: Quantitative myocardial PET perfusion imaging requires partial volume correction, herein demonstrated clinically from systolic/diastolic absolute activity ratios combined with phantom data accounting for Rb-82 positron range.
Abstract:
Calcium levels in spines play a significant role in determining the sign and magnitude of synaptic plasticity. The magnitude of calcium influx into spines is highly dependent on influx through N-methyl D-aspartate (NMDA) receptors, and therefore depends on the number of postsynaptic NMDA receptors in each spine. We have calculated previously how the number of postsynaptic NMDA receptors determines the mean and variance of calcium transients in the postsynaptic density, and how this alters the shape of plasticity curves. However, the number of postsynaptic NMDA receptors in the postsynaptic density is not well known. Anatomical methods for estimating the number of NMDA receptors produce estimates that are very different than those produced by physiological techniques. The physiological techniques are based on the statistics of synaptic transmission and it is difficult to experimentally estimate their precision. In this paper we use stochastic simulations in order to test the validity of a physiological estimation technique based on failure analysis. We find that the method is likely to underestimate the number of postsynaptic NMDA receptors, explain the source of the error, and re-derive a more precise estimation technique. We also show that the original failure analysis as well as our improved formulas are not robust to small estimation errors in key parameters.
Abstract:
Linkage disequilibrium methods can be used to find genes influencing quantitative trait variation in humans. Linkage disequilibrium methods can require smaller sample sizes than linkage equilibrium methods, such as the variance component approach, to find loci with a specific effect size. The increase in power is at the expense of requiring more markers to be typed to scan the entire genome. This thesis compares different linkage disequilibrium methods to determine which factors influence the power to detect disequilibrium. The costs of disequilibrium and equilibrium tests were compared to determine whether the savings in phenotyping costs when using disequilibrium methods outweigh the additional genotyping costs. Nine linkage disequilibrium tests were examined by simulation. Five tests involved selecting isolated unrelated individuals, while four involved the selection of parent-child trios (TDT). All nine tests were found to be able to identify disequilibrium with the correct significance level in Hardy-Weinberg populations. Increasing linked genetic variance and trait allele frequency were found to increase the power to detect disequilibrium, while increasing the number of generations and distance between marker and trait loci decreased the power to detect disequilibrium. Discordant sampling was used for several of the tests. It was found that the more stringent the sampling, the greater the power to detect disequilibrium in a sample of given size. The power to detect disequilibrium was not affected by the presence of polygenic effects. When the trait locus had more than two trait alleles, the maximum power of the tests was less than one.
For the simulation methods used here, when there were more than two trait alleles there was a probability equal to (1 - heterozygosity) of the marker locus that both trait alleles were in disequilibrium with the same marker allele, resulting in the marker being uninformative for disequilibrium. The five tests using isolated unrelated individuals were found to have excess error rates when there was disequilibrium due to population admixture. Increased error rates also resulted from increased unlinked major gene effects, discordant trait allele frequency, and increased disequilibrium. Polygenic effects did not affect the error rates. The tests based on the Transmission Disequilibrium Test (TDT) were not liable to any increase in error rates. For all sample ascertainment costs, for recent mutations (<100 generations) linkage disequilibrium tests were less expensive to carry out than the variance component test. Candidate gene scans saved even more money. The use of recently admixed populations also decreased the cost of performing a linkage disequilibrium test.
Abstract:
In numerous intervention studies and education field trials, random assignment to treatment occurs in clusters rather than at the level of observation. This departure from random assignment of units may be due to logistics, political feasibility, or ecological validity. Data within the same cluster or grouping are often correlated. Application of traditional regression techniques, which assume independence between observations, to clustered data produces consistent parameter estimates. However, such estimators are often inefficient compared to methods which incorporate the clustered nature of the data into the estimation procedure (Neuhaus 1993). Multilevel models, also known as random effects or random components models, can be used to account for the clustering of data by estimating higher level, or group, as well as lower level, or individual, variation. Designing a study in which the unit of observation is nested within higher level groupings requires the determination of sample sizes at each level. This study investigates the design and analysis of various sampling strategies for a 3-level repeated measures design and their effect on the parameter estimates when the outcome variable of interest follows a Poisson distribution. Results suggest that second order PQL estimation produces the least biased estimates in the 3-level multilevel Poisson model, followed by first order PQL and then second and first order MQL. The MQL estimates of both fixed and random parameters are generally satisfactory when the level 2 and level 3 variation is less than 0.10. However, as the higher level error variance increases, the MQL estimates become increasingly biased. If convergence of the estimation algorithm is not obtained by the PQL procedure and the higher level error variance is large, the estimates may be significantly biased. In this case bias correction techniques such as bootstrapping should be considered as an alternative procedure.
For larger sample sizes, those structures with 20 or more units sampled at levels with normally distributed random errors produced more stable estimates with less sampling variance than structures with an increased number of level 1 units. For small sample sizes, sampling fewer units at the level with Poisson variation produces less sampling variation; however, this criterion is no longer important when sample sizes are large. Reference: Neuhaus J (1993). “Estimation Efficiency and Tests of Covariate Effects with Clustered Binary Data”. Biometrics, 49, 989–996.
Abstract:
Statement of the problem and public health significance. Hospitals were designed to be a safe haven and respite from disease and illness. However, a large body of evidence points to preventable errors in hospitals as the eighth leading cause of death among Americans. Twelve percent of Americans, or over 33.8 million people, are hospitalized each year. This population represents a significant portion of at-risk citizens exposed to hospital medical errors. Since the number of annual deaths due to hospital medical errors is estimated to exceed 44,000, the magnitude of this tragedy makes it a significant public health problem. Specific aims. The specific aims of this study were threefold. First, this study aimed to analyze the state of the states' mandatory hospital medical error reporting six years after the release of the influential IOM report, "To Err is Human." The second aim was to identify barriers to reporting of medical errors by hospital personnel. The third aim was to identify hospital safety measures implemented to reduce medical errors and enhance patient safety. Methods. A descriptive, longitudinal, retrospective design was used to address the first stated objective. The study data came from the twenty-one states with mandatory hospital reporting programs which report aggregate hospital error data that is accessible to the public by way of states' websites. The data analysis included calculations of the expected number of medical errors for each state according to IOM rates. Where possible, a comparison was made between state reported data and the calculated IOM expected number of errors. A literature review was performed to achieve the second study aim, identifying barriers to reporting medical errors. The final aim was accomplished by telephone interviews of principal patient safety/quality officers from five Texas hospitals with more than 700 beds. Results.
The state medical error data suggest vast underreporting of hospital medical errors to the states. The telephone interviews suggest that hospitals are working at reducing medical errors and creating safer environments for patients. The literature review suggests the underreporting of medical errors at the state level stems from underreporting of errors at the delivery level.
Abstract:
Medication errors, one of the most frequent types of medical errors, are a common cause of patient harm in hospital systems today. Nurses at the bedside are in a position to encounter many of these errors since they are there at the start of the process (ordering/prescribing) and the end of the process (administration). One of the recommendations from the IOM (Institute of Medicine) report, "To Err is Human," was for organizations to identify and learn from medical errors through event reporting systems. While many organizations have reporting systems in place, research studies report a significant amount of underreporting by nurses. A systematic review of the literature was performed to identify contributing factors related to the reporting and not reporting of medication errors by nurses at the bedside. Articles included in the literature review were primary or secondary studies, dated January 1, 2000 – July 2009, related to nursing medication error reporting. All 634 articles were reviewed with an algorithm developed to standardize the review process and help filter out those that did not meet the study criteria. In addition, 142 article bibliographies were reviewed to find additional studies that were not found in the original literature search. After reviewing the 634 articles and the additional 108 articles discovered in the bibliography review, 41 articles met the study criteria and were used in the systematic literature review results. Fear of punitive reactions to medication errors was a frequent barrier to error reporting. Nurses fear reactions from their leadership, peers, patients and their families, nursing boards, and the media. Anonymous reporting systems and departments/organizations with a strong safety culture in place helped to encourage the reporting of medication errors by nursing staff. Many of the studies included in this literature review do not allow results that can be generalized.
The majority of them took place in single institutions/organizations with limited sample sizes. Stronger studies with larger sample sizes need to be performed, utilizing data collection methods that have been validated, to determine stronger correlations between safety cultures and nurse error reporting.
Abstract:
Background. Over 39.9% of the adult population forty or older in the United States has refractive error, yet little is known about the etiology of this condition, its associated risk factors and their mechanisms, owing to the paucity of data on changes in refractive error in the adult population over time. Aim. To evaluate risk factors for refractive error changes over a long-term, 5-year period among persons 43 or older by testing the hypothesis that age, gender, systemic diseases, nuclear sclerosis and baseline refractive errors are all significantly associated with refractive error changes in patients at a Dallas, Texas private optometric office. Methods. A retrospective chart review of subjective refraction, eye health, and self-reported health history was done on patients at a private optometric office who were 43 or older in 2000 and who had eye examinations in both 2000 and 2005. Aphakic and pseudophakic eyes were excluded, as well as eyes with best corrected Snellen visual acuity of 20/40 or worse. After exclusions, refraction was obtained on 114 right eyes and 114 left eyes. Spherical equivalent (sum of sphere + ½ cylinder) was used as the measure of refractive error. Results. Similar changes in refractive error were observed for the two eyes. The 5-year change in spherical power was in a hyperopic direction for younger age groups and in a myopic direction for older subjects, P<0.0001. The gender-adjusted mean change in refractive error in right eyes of persons aged 43 to 54, 55 to 64, 65 to 74, and 75 or older at baseline was +0.43 D, +0.46 D, -0.09 D, and -0.23 D, respectively. Refractive change was strongly related to baseline nuclear cataract severity; grades 4 to 5 were associated with a myopic shift (-0.38 D, P<0.0001). The mean age-adjusted change in refraction was +0.27 D for hyperopic eyes, +0.56 D for emmetropic eyes, and +0.26 D for myopic eyes. Conclusions.
This report has documented refractive error changes in an older population and confirmed reported trends of a hyperopic shift before age 65 and a myopic shift thereafter associated with the development of nuclear cataract.
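The refractive error measure used in this abstract (spherical equivalent = sphere + ½ cylinder) and the sign convention of the reported changes can be written out as a small sketch. The function names and the example prescriptions are hypothetical; only the formula and the reading of positive change as hyperopic and negative change as myopic come from the abstract.

```python
def spherical_equivalent(sphere, cylinder):
    """Spherical equivalent in diopters: sphere plus half the cylinder."""
    return sphere + 0.5 * cylinder


def five_year_change(baseline, follow_up):
    """Change in spherical equivalent between two (sphere, cylinder) exams.

    A positive result is a shift in the hyperopic direction,
    a negative result a shift in the myopic direction.
    """
    return spherical_equivalent(*follow_up) - spherical_equivalent(*baseline)


if __name__ == "__main__":
    # Hypothetical eye: -2.00 D sphere, -1.00 D cylinder at baseline,
    # -1.75 D sphere, -0.50 D cylinder at follow-up.
    print(spherical_equivalent(-2.00, -1.00))
    print(five_year_change((-2.00, -1.00), (-1.75, -0.50)))
```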
New methods for quantification and analysis of quantitative real-time polymerase chain reaction data
Abstract:
Quantitative real-time polymerase chain reaction (qPCR) is a sensitive gene quantitation method that has been widely used in the biological and biomedical fields. The currently used methods for PCR data analysis, including the threshold cycle (CT) method, linear and non-linear model fitting methods, all require subtracting background fluorescence. However, the removal of background fluorescence is usually inaccurate, and therefore can distort results. Here, we propose a new method, the taking-difference linear regression method, to overcome this limitation. Briefly, for each two consecutive PCR cycles, we subtracted the fluorescence in the former cycle from that in the later cycle, transforming the n cycle raw data into n-1 cycle data. Then linear regression was applied to the natural logarithm of the transformed data. Finally, amplification efficiencies and the initial DNA molecular numbers were calculated for each PCR run. To evaluate this new method, we compared it in terms of accuracy and precision with the original linear regression method with three background corrections, being the mean of cycles 1-3, the mean of cycles 3-7, and the minimum. Three criteria, including threshold identification, max R2, and max slope, were employed to search for target data points. Considering that PCR data are time series data, we also applied linear mixed models. Collectively, when the threshold identification criterion was applied and when the linear mixed model was adopted, the taking-difference linear regression method was superior as it gave an accurate estimation of initial DNA amount and a reasonable estimation of PCR amplification efficiencies. When the criteria of max R2 and max slope were used, the original linear regression method gave an accurate estimation of initial DNA amount. Overall, the taking-difference linear regression method avoids the error in subtracting an unknown background and thus it is theoretically more accurate and reliable. 
This method is easy to perform and the taking-difference strategy can be extended to all current methods for qPCR data analysis.
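The taking-difference idea described in this abstract can be sketched as follows. The sketch assumes, as an illustration only, that fluorescence in the exponential phase follows F_k = B + F0 * E^k with a constant unknown background B and a per-cycle amplification factor E; the abstract does not state this model explicitly. Under that assumption the background cancels in F_(k+1) - F_k = F0 * (E - 1) * E^k, so the natural logarithm of the differences is linear in k with slope ln E and intercept ln(F0 * (E - 1)). The function name and the synthetic data are hypothetical, and F0 here is an initial fluorescence taken as proportional to the initial DNA amount.

```python
import math


def taking_difference_fit(fluorescence):
    """Estimate (E, F0) from raw readings of consecutive cycles 1..n
    without subtracting a background first."""
    # n raw readings become n-1 differences; the constant background cancels.
    diffs = [later - former
             for former, later in zip(fluorescence, fluorescence[1:])]
    if len(diffs) < 2:
        raise ValueError("need at least three cycles")
    if any(d <= 0 for d in diffs):
        raise ValueError("differences must be positive to take logarithms")

    x = list(range(1, len(diffs) + 1))  # index of the former cycle
    y = [math.log(d) for d in diffs]

    # Ordinary least squares for y = intercept + slope * x.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    amplification = math.exp(slope)                  # E, between 1 and 2
    f0 = math.exp(intercept) / (amplification - 1)   # initial fluorescence
    return amplification, f0


if __name__ == "__main__":
    # Synthetic exponential-phase data with an unknown background of 5.0.
    readings = [5.0 + 0.001 * 1.9 ** k for k in range(1, 11)]
    E, f0 = taking_difference_fit(readings)
    print(f"amplification factor {E:.3f}, efficiency {E - 1:.3f}, F0 {f0:.5f}")
```

In practice only cycles in the exponential phase would be passed in, which is where the selection criteria mentioned in the abstract (threshold identification, max R2, max slope) come into play.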