985 results for "error rates"


Relevance: 100.00%

Abstract:

Background: Genome-wide association studies (GWAS) are becoming the approach of choice for identifying genetic determinants of complex phenotypes and common diseases. The astonishing amount of data generated and the use of distinct genotyping platforms with variable genomic coverage remain analytical challenges. Imputation algorithms combine information from directly genotyped markers with the haplotypic structure of the population of interest to infer poorly genotyped or missing markers, and are considered a near-zero-cost approach for comparing and combining data generated in different studies. Several reports have stated that imputed markers have acceptable overall accuracy, but no published report has performed a pairwise comparison of imputed and empirical association statistics for a complete set of GWAS markers. Results: In this report we identified a total of 73 imputed markers that yielded a nominally statistically significant association at P < 10^-5 for type 2 diabetes mellitus and compared them with results obtained from empirical allelic frequencies. Interestingly, despite their overall high correlation, association statistics based on imputed frequencies were discordant for 35 of the 73 (47%) associated markers, considerably inflating the type I error rate of imputed markers. We comprehensively tested several quality thresholds, the haplotypic structure underlying imputed markers, and the use of flanking markers as predictors of inaccurate association statistics derived from imputed markers. Conclusions: Our results suggest that association statistics from imputed markers falling within specific minor allele frequency (MAF) ranges, located in weak linkage disequilibrium blocks, or strongly deviating from local patterns of association are prone to inflated false-positive association signals. The present study highlights the potential of imputation procedures and proposes simple procedures for selecting the best imputed markers for follow-up genotyping studies.
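
As a concrete illustration of the pairwise check described above, the short sketch below flags imputed hits whose empirical P-value fails to replicate at the same threshold. The input file and column names are hypothetical; this is a sketch of the idea, not the authors' pipeline.

```python
import pandas as pd

ALPHA = 1e-5  # nominal significance threshold used in the abstract

def flag_discordant(results: pd.DataFrame) -> pd.DataFrame:
    """Markers significant on imputed data whose empirical P-value
    does not replicate the call at the same threshold."""
    hits = results[results["p_imputed"] < ALPHA].copy()
    hits["discordant"] = hits["p_empirical"] >= ALPHA
    return hits

# hypothetical per-marker table with columns: snp, p_imputed, p_empirical
results = pd.read_csv("gwas_imputed_vs_empirical.csv")
hits = flag_discordant(results)
print(f"{hits['discordant'].mean():.0%} of imputed hits are discordant")
```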

Relevance: 100.00%

Abstract:

Next-generation sequencing (NGS) is a valuable tool for the detection and quantification of HIV-1 variants in vivo. However, these technologies require detailed characterization and control of artificially induced errors to be applicable for accurate haplotype reconstruction. To investigate the occurrence of substitutions, insertions, and deletions at the individual steps of RT-PCR and NGS, 454 pyrosequencing was performed on amplified and non-amplified HIV-1 genomes. Artificial recombination was explored by mixing five different HIV-1 clonal strains (5-virus-mix) and applying different RT-PCR conditions followed by 454 pyrosequencing. Error rates ranged from 0.04% to 0.66% and were similar in amplified and non-amplified samples. Discrepancies were observed between forward and reverse reads, indicating that most errors were introduced during the pyrosequencing step. Using the 5-virus-mix, non-optimized, standard RT-PCR conditions introduced artificial recombinants in at least 30% of the reads, which subsequently led to an underestimation of true haplotype frequencies. We minimized the fraction of recombinants to 0.9-2.6% with optimized, artifact-reducing RT-PCR conditions. This approach enabled correct haplotype reconstruction and frequency estimations consistent with reference data obtained by single-genome amplification. RT-PCR conditions are crucial for correct frequency estimation and analysis of haplotypes in heterogeneous virus populations. We developed an RT-PCR procedure to generate NGS data useful for reliable haplotype reconstruction and quantification.
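
A minimal sketch of the error-rate bookkeeping involved: given reads already aligned to a clonal reference, count substitutions, insertions, and deletions per aligned base. The toy string-based input format (with '-' for gaps) is an assumption; real 454 data would come from an alignment tool.

```python
def error_rate(ref: str, reads: list[str]) -> dict[str, float]:
    """Per-base error rates from pairwise-aligned read/reference strings."""
    subs = ins = dels = bases = 0
    for read in reads:
        for r, q in zip(ref, read):
            bases += 1
            if q == "-":        # gap in read relative to reference
                dels += 1
            elif r == "-":      # gap in reference relative to read
                ins += 1
            elif q != r:        # mismatched base
                subs += 1
    return {k: v / bases for k, v in
            [("substitution", subs), ("insertion", ins), ("deletion", dels)]}

print(error_rate("ACGTACGT", ["ACGTACGT", "ACGAAC-T"]))  # one sub, one del
```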

Relevance: 100.00%

Abstract:

The use of presence/absence data in wildlife management and biological surveys is widespread, and there is growing interest in quantifying the sources of error associated with these data. Using simulated data, we show that false-negative errors (failure to record a species when in fact it is present) can have a significant impact on the statistical estimation of habitat models. We then introduce an extension of logistic modeling, the zero-inflated binomial (ZIB) model, which permits estimation of the false-negative error rate and correction of estimates of the probability of occurrence by using repeated visits to the same site. Our simulations show that even relatively low rates of false negatives bias statistical estimates of habitat effects. With three repeated visits the method eliminates the bias, but estimates are relatively imprecise. Six repeated visits improve the precision of estimates to levels comparable to those achieved with conventional statistics in the absence of false-negative errors. In general, when error rates are ≤50%, greater efficiency is gained by adding more sites, whereas when error rates are >50% it is better to increase the number of repeated visits. We highlight the flexibility of the method with three case studies, clearly demonstrating the effect of false-negative errors for a range of commonly used survey methods.
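
The ZIB likelihood is simple enough to state and fit directly. The sketch below, which omits the covariate (habitat) effects of the paper's logistic extension, estimates occupancy psi and detection probability p from K repeated visits per site; it is a minimal illustration, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import binom

def negloglik(theta, y, K):
    """ZIB likelihood: occupied sites (prob psi) yield Binomial(K, p)
    detections; unoccupied sites always yield zero."""
    psi, p = 1 / (1 + np.exp(-np.asarray(theta)))   # logit-scale parameters
    lik = psi * binom.pmf(y, K, p) + (1 - psi) * (y == 0)
    return -np.sum(np.log(lik))

rng = np.random.default_rng(0)
K, n, psi_true, p_true = 3, 500, 0.6, 0.4           # three visits per site
occupied = rng.random(n) < psi_true
y = rng.binomial(K, p_true, size=n) * occupied      # zeros where unoccupied

fit = minimize(negloglik, x0=[0.0, 0.0], args=(y, K))
psi_hat, p_hat = 1 / (1 + np.exp(-fit.x))
print(f"estimated psi = {psi_hat:.2f}, detection p = {p_hat:.2f}")
```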

Relevance: 70.00%

Abstract:

Hantaviruses are rodent-borne bunyaviruses that infect the Arvicolinae, Murinae, and Sigmodontinae subfamilies of Muridae. The rate of molecular evolution in the hantaviruses has previously been estimated at approximately 10^-7 nucleotide substitutions per site per year (substitutions/site/year), based on the assumption of codivergence and hence shared divergence times with their rodent hosts. If substantiated, this would make the hantaviruses among the slowest evolving of all RNA viruses. However, as hantaviruses replicate with an RNA-dependent RNA polymerase, with error rates in the region of one mutation per genome replication, this low rate of nucleotide substitution is anomalous. Here, we use a Bayesian coalescent approach to estimate the rate of nucleotide substitution from serially sampled gene sequence data for hantaviruses known to infect each of the three rodent subfamilies: Araraquara virus (Sigmodontinae), Dobrava virus (Murinae), Puumala virus (Arvicolinae), and Tula virus (Arvicolinae). Our results reveal that hantaviruses exhibit short-term substitution rates of 10^-2 to 10^-4 substitutions/site/year and so fall within the range exhibited by other RNA viruses. The disparity between this substitution rate and that estimated assuming rodent-hantavirus codivergence suggests that the codivergence hypothesis may need to be reevaluated.
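
The paper's estimates come from a Bayesian coalescent analysis; as a much simpler illustration of how serially sampled sequences constrain a substitution rate, a root-to-tip regression recovers the rate as the slope of genetic distance against sampling date. The distances and dates below are made-up toy values.

```python
import numpy as np

# toy serially sampled data: sampling year and root-to-tip genetic distance
years = np.array([1985, 1992, 1998, 2003, 2007, 2012], dtype=float)
root_to_tip = np.array([0.021, 0.035, 0.048, 0.057, 0.066, 0.078])  # subs/site

# slope of distance vs. time gives a crude substitution-rate estimate
slope, intercept = np.polyfit(years, root_to_tip, 1)
print(f"rate ~ {slope:.1e} substitutions/site/year")  # ~2e-3, far above 1e-7
```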

Relevance: 70.00%

Abstract:

Restriction site-associated DNA sequencing (RADseq) provides researchers with the ability to record genetic polymorphism across thousands of loci for nonmodel organisms, potentially revolutionizing the field of molecular ecology. However, as with other genotyping methods, RADseq is prone to a number of sources of error that may have consequential effects for population genetic inferences, and these have received only limited attention in terms of the estimation and reporting of genotyping error rates. Here we use individual sample replicates, under the expectation of identical genotypes, to quantify genotyping error in the absence of a reference genome. We then use sample replicates to (i) optimize de novo assembly parameters within the program Stacks, by minimizing error and maximizing the retrieval of informative loci; and (ii) quantify error rates for loci, alleles and single-nucleotide polymorphisms. As an empirical example, we use a double-digest RAD data set of a nonmodel plant species, Berberis alpina, collected from high-altitude mountains in Mexico.
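
The replicate-based error estimate reduces to straightforward bookkeeping: two replicates of the same individual should yield identical genotypes, so mismatches at loci called in both copies count as errors. The genotype encoding below is an assumption, and the sketch illustrates the idea rather than the Stacks-based workflow.

```python
def replicate_error_rate(rep1: dict, rep2: dict) -> float:
    """SNP genotyping error rate over loci called in both replicates."""
    shared = [loc for loc in rep1 if loc in rep2
              if rep1[loc] is not None and rep2[loc] is not None]
    mismatches = sum(rep1[loc] != rep2[loc] for loc in shared)
    return mismatches / len(shared)

# hypothetical genotype calls; None marks a locus missing in that replicate
rep1 = {"locus1": "A/G", "locus2": "C/C", "locus3": "T/T"}
rep2 = {"locus1": "A/G", "locus2": "C/T", "locus3": None}
print(f"{replicate_error_rate(rep1, rep2):.2f}")  # 1 mismatch / 2 shared loci
```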

Relevance: 70.00%

Abstract:

While channel coding is a standard method of improving a system's energy efficiency in digital communications, its practice does not extend to high-speed links. Increasing demands on network speeds place a large burden on the energy efficiency of high-speed links and make the benefit of channel coding for these systems a timely subject. The low error rates of interest and the presence of residual intersymbol interference (ISI) caused by hardware constraints impede the analysis and simulation of coded high-speed links. Focusing on the residual ISI and combined noise as the dominant error mechanisms, this paper analyses error correlation through the concepts of error region, channel signature, and correlation distance. This framework provides deeper insight into joint error behaviours in high-speed links, extends the range of statistical simulation for coded high-speed links, and provides a case against the use of biased Monte Carlo methods in this setting.
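
To make the notion of error correlation concrete, the sketch below pushes random bits through a toy link with one residual post-cursor ISI tap plus Gaussian noise and measures how strongly error events cluster as a function of their separation; the tap weight and noise level are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000_000
bits = rng.integers(0, 2, n)
sym = 2.0 * bits - 1.0                                # BPSK-like signaling
rx = sym + 0.45 * np.roll(sym, 1) + rng.normal(0, 0.35, n)  # residual ISI
errors = (rx > 0).astype(int) != bits                 # error indicator per bit

p = errors.mean()
for d in (1, 2, 5, 20):                               # correlation vs. distance
    joint = np.mean(errors[:-d] & errors[d:])         # P(errors d bits apart)
    print(f"d={d:>2}: P(joint)/P(err)^2 = {joint / p**2:.1f}")
```

Under independence the printed ratio would be 1.0; values well above 1.0 at small d show the joint error behaviour that complicates coded-link simulation.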

Relevance: 70.00%

Abstract:

Background: Medication errors are an important cause of morbidity and mortality in primary care. The aims of this study are to determine the effectiveness, cost-effectiveness and acceptability of a pharmacist-led, information-technology-based complex intervention, compared with simple feedback, in reducing the proportion of patients at risk from potentially hazardous prescribing and medicines management in general (family) practice. Methods: Research subject group: "at-risk" patients registered with computerised general practices in two geographical regions in England. Design: parallel-group pragmatic cluster randomised trial. Interventions: practices will be randomised to either (i) computer-generated feedback, or (ii) a pharmacist-led intervention comprising computer-generated feedback, educational outreach and dedicated support. Primary outcome measures: the proportion of patients in each practice at six and 12 months post-intervention:
- with a computer-recorded history of peptic ulcer being prescribed non-selective non-steroidal anti-inflammatory drugs;
- with a computer-recorded diagnosis of asthma being prescribed beta-blockers;
- aged 75 years and older receiving long-term prescriptions for angiotensin-converting enzyme inhibitors or loop diuretics without a recorded assessment of renal function and electrolytes in the preceding 15 months.
Secondary outcome measures relate to a number of other examples of potentially hazardous prescribing and medicines management. Economic analysis: an economic evaluation of the cost per error avoided will be conducted from the perspective of the UK National Health Service (NHS), comparing the pharmacist-led intervention with simple feedback. Qualitative analysis: a qualitative study will be conducted to explore the views and experiences of health care professionals and NHS managers concerning the interventions, and to investigate possible reasons why the interventions prove effective or ineffective. Sample size: 34 practices in each of the two treatment arms would provide at least 80% power (two-tailed alpha of 0.05) to demonstrate a 50% reduction in error rates for each of the three primary outcome measures in the pharmacist-led intervention arm, compared with an 11% reduction in the simple feedback arm. Discussion: At the time of submission of this article, 72 general practices have been recruited (36 in each arm of the trial) and the interventions have been delivered. Analysis has not yet been undertaken.
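
The sample-size statement can be sanity-checked with a standard two-proportion power calculation inflated by a design effect for clustering by practice. The baseline error rate, cluster size and intracluster correlation below are illustrative assumptions, not figures from the protocol.

```python
from scipy.stats import norm

p0, m, icc = 0.10, 60, 0.05                  # assumed baseline rate, patients
deff = 1 + (m - 1) * icc                     # per practice, and assumed ICC
p1, p2 = p0 * (1 - 0.50), p0 * (1 - 0.11)    # intervention vs. feedback arms
alpha, clusters_per_arm = 0.05, 34

n_eff = clusters_per_arm * m / deff          # effective patients per arm
se = ((p1 * (1 - p1) + p2 * (1 - p2)) / n_eff) ** 0.5
z = abs(p1 - p2) / se - norm.ppf(1 - alpha / 2)
print(f"approximate power: {norm.cdf(z):.0%}")
```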

Relevance: 70.00%

Abstract:

In order to examine metacognitive accuracy (i.e., the relationship between metacognitive judgment and memory performance), researchers often rely on by-participant analysis, where metacognitive accuracy (e.g., resolution, as measured by the gamma coefficient or signal detection measures) is computed for each participant and the computed values are entered into group-level statistical tests such as the t-test. In the current work, we argue that by-participant analysis, regardless of the accuracy measure used, produces a substantial inflation of Type-1 error rates when a random item effect is present. A mixed-effects model is proposed as a way to address the issue effectively, and our simulation studies examining Type-1 error rates indeed showed the superior performance of mixed-effects model analysis compared with the conventional by-participant analysis. We also present real-data applications to illustrate further strengths of mixed-effects model analysis. Our findings imply that caution is needed when using by-participant analysis, and we recommend mixed-effects model analysis instead.
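
For reference, the by-participant resolution measure at issue is the Goodman-Kruskal gamma between a participant's judgments and memory outcomes, computed from concordant and discordant pairs as sketched below. The mixed-effects alternative the authors advocate, a model with crossed participant and item random effects, is not shown here.

```python
def gamma(judgments, outcomes):
    """Goodman-Kruskal gamma: (concordant - discordant) / (conc + disc),
    skipping pairs tied on either variable."""
    conc = disc = 0
    n = len(judgments)
    for i in range(n):
        for j in range(i + 1, n):
            d = (judgments[i] - judgments[j]) * (outcomes[i] - outcomes[j])
            if d > 0:
                conc += 1
            elif d < 0:
                disc += 1
    return (conc - disc) / (conc + disc)

# one participant: judgments of learning (0-100) and recall (1/0) per item
print(gamma([80, 60, 40, 20], [1, 1, 0, 1]))  # 2 conc, 1 disc -> 0.33
```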

Relevance: 70.00%

Abstract:

The purpose of this study is to investigate the effects of predictor-variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data are multiply imputed. Missing predictor data are multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data, and the general location model for mixed dichotomous and continuous data. Following the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data, and pattern of missing data. The distributional properties of the average mean, variance, and correlations among the predictor variables are assessed after the multiple imputation process. For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which is due in part to the sparseness of the data. The correlation structure of the predictor variables is not well retained in multiply imputed data from small samples with more than 50% missing data under this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. With all data types, a fully observed variable included alongside variables subject to missingness in the multiple imputation process and subsequent statistical analysis produced liberal (larger than nominal) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power.
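
A minimal sketch of the workflow under study: multiply impute a continuous predictor, fit a logistic regression to each completed data set, and pool with Rubin's rules. Here sklearn's IterativeImputer with sample_posterior=True stands in for the multivariate normal imputation model, and the data are simulated under the null so that rejections estimate the Type I error rate; repeating this over many simulated data sets would yield the error-rate estimates the study reports.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
n, m = 100, 20                                   # sample size, imputations
X = rng.multivariate_normal([0, 0], [[1, .5], [.5, 1]], n)
y = rng.integers(0, 2, n)                        # null: y independent of X
X[rng.random(n) < 0.3, 0] = np.nan               # 30% missing in predictor 1

betas, variances = [], []
for k in range(m):
    imp = IterativeImputer(sample_posterior=True, random_state=k)
    Xc = imp.fit_transform(X)                    # one completed data set
    fit = sm.Logit(y, sm.add_constant(Xc)).fit(disp=0)
    betas.append(fit.params[1])
    variances.append(fit.bse[1] ** 2)

qbar = np.mean(betas)                            # Rubin's rules pooling
T = np.mean(variances) + (1 + 1 / m) * np.var(betas, ddof=1)
print(f"pooled z = {qbar / T ** 0.5:.2f}")       # |z| > 1.96 -> Type I error
```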

Relevance: 70.00%

Abstract:

The Escherichia coli dnaQ gene encodes the proofreading 3' exonuclease (epsilon subunit) of DNA polymerase III holoenzyme and is a critical determinant of chromosomal replication fidelity. We constructed by site-specific mutagenesis a mutant, dnaQ926, by changing two conserved amino acid residues (Asp-12→Ala and Glu-14→Ala) in the Exo I motif, which, by analogy to other proofreading exonucleases, is essential for the catalytic activity. When residing on a plasmid, dnaQ926 confers a strong, dominant mutator phenotype, suggesting that the protein, although deficient in exonuclease activity, still binds to the polymerase subunit (alpha subunit or dnaE gene product). When dnaQ926 was transferred to the chromosome, replacing the wild-type gene, the cells became inviable. However, viable dnaQ926 strains could be obtained if they contained one of the dnaE alleles previously characterized in our laboratory as antimutator alleles or if they carried a multicopy plasmid containing the E. coli mutL+ gene. These results suggest that loss of proofreading exonuclease activity in dnaQ926 is lethal due to excessive error rates (error catastrophe). Error catastrophe results from both the loss of proofreading and the subsequent saturation of DNA mismatch repair. The probability of lethality by excessive mutation is supported by calculations estimating the number of inactivating mutations in essential genes per chromosome replication.
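
The closing calculation can be illustrated with back-of-the-envelope arithmetic: an assumed per-base error rate without proofreading and mismatch repair, multiplied by the genome size and the fraction of inactivating hits in essential genes, gives a Poisson expectation for lethal mutations per replication. All numbers below are illustrative assumptions, not values from the paper.

```python
import math

genome = 4.6e6          # E. coli genome size (bp)
mu = 5e-6               # assumed per-base error rate without proofreading/MMR
essential_frac = 0.07   # assumed fraction of sites in essential genes
inactivating = 0.3      # assumed fraction of such mutations that inactivate

lam = genome * mu * essential_frac * inactivating   # expected lethal hits
p_lethal = 1 - math.exp(-lam)                       # Poisson P(>= 1 hit)
print(f"lambda = {lam:.2f} lethal hits/replication, P(>=1) = {p_lethal:.2f}")
```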

Relevance: 70.00%

Abstract:

Medication errors are associated with significant morbidity, and people with mental health problems may be particularly susceptible to medication errors due to various factors. Primary care has a key role in improving medication safety in this vulnerable population. The complexity of services, involving primary and secondary care and social services, and potential training issues may increase error rates, with physical medicines representing a particular risk. Service users may be cognitively impaired and fail to identify an error, placing additional responsibility on clinicians. The potential role of carers in error prevention and medication safety requires further elaboration. A potential lack of trust between service users and clinicians may impair honest communication about medication issues, leading to errors. There is a need for detailed research in this field.

Relevance: 70.00%

Abstract:

We investigate the use of different direct-detection modulation formats in a wavelength-switched optical network. We find the minimum time it takes a tunable sampled-grating distributed Bragg reflector laser to recover after switching from one wavelength channel to another for different modulation formats. The recovery time is investigated using a field-programmable gate array that operates as a time-resolved bit-error-rate detector. The detector offers 93 ps resolution operating at 10.7 Gb/s and allows all the received data to contribute to the measurement, allowing low bit error rates to be measured at high speed. The recovery times for 10.7 Gb/s non-return-to-zero on-off keyed modulation, a 10.7 Gb/s differential phase-shift keyed signal, and 21.4 Gb/s differential quadrature phase-shift keyed formats can be as low as 4 ns, 7 ns, and 40 ns, respectively. The time-resolved phase noise associated with laser settling is simultaneously measured for 21.4 Gb/s differential quadrature phase-shift keyed data, showing that phase noise coupled with frequency error is the primary limitation on transmitting immediately after a laser switching event.
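
The time-resolved BER measurement can be mimicked in simulation: tag each received bit with its delay since the switch trigger, accumulate errors in time bins across many repeated switching events, and report the first bin where the BER drops below a target, as a proxy for recovery time. The settling profile, bin width and target BER below are illustrative assumptions; the real detector runs on an FPGA at 10.7 Gb/s with 93 ps resolution.

```python
import numpy as np

rng = np.random.default_rng(2)
bit_period = 1 / 10.7e9                        # seconds per bit at 10.7 Gb/s
events, bits_per_event = 5_000, 600            # repeated switch events
t = np.arange(bits_per_event) * bit_period     # time since switch, ~56 ns
p_err = 0.4 * np.exp(-t / 2e-9) + 1e-5         # toy settling error profile
errors = rng.random((events, bits_per_event)) < p_err

bin_width = 1e-9                               # 1 ns time bins
bins = (t / bin_width).astype(int)
err_per_bin = np.bincount(bins, weights=errors.sum(axis=0))
bits_per_bin = np.bincount(bins) * events
ber = err_per_bin / bits_per_bin               # BER vs. time since switch
recovery = np.argmax(ber < 1e-3) * bin_width   # first bin below target
print(f"recovery time ~ {recovery * 1e9:.0f} ns at BER < 1e-3")
```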
