901 results for false positives


Relevance:

100.00%

Publisher:

Abstract:

The impact of erroneous genotypes that have passed standard quality control (QC) can be severe in genome-wide association studies, genotype imputation, and the estimation of heritability and prediction of genetic risk based on single nucleotide polymorphisms (SNPs). To detect such genotyping errors, a simple two-locus QC method, based on the difference in the association test statistic between single SNPs and pairs of SNPs, was developed and applied. The proposed approach could detect many problematic SNPs with statistical significance in real data, even when standard single-SNP QC analyses failed to detect them. Depending on the data set used, the number of erroneous SNPs that were not filtered out by standard single-SNP QC but were detected by the proposed approach varied from a few hundred to thousands. Using simulated data, the proposed method was shown to be powerful and to perform better than other tested existing methods. The power of the proposed approach to detect erroneous genotypes was approximately 80% for a 3% error rate per SNP. This novel QC approach is easy to implement and computationally efficient, and can lead to better-quality genotypes for subsequent genotype-phenotype investigations.
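A minimal sketch of one plausible reading of such a two-locus check, assuming a case-control design and chi-square association statistics; the function names and the flagging rule below are illustrative, not the authors' exact procedure:

```python
import numpy as np
from scipy.stats import chi2_contingency

def single_snp_stat(genotypes, phenotype):
    """Chi-square association statistic for one SNP (genotypes coded 0/1/2, phenotype 0/1)."""
    table = np.zeros((3, 2))
    for g, p in zip(genotypes, phenotype):
        table[g, p] += 1
    table = table[table.sum(axis=1) > 0]        # drop empty genotype rows
    return chi2_contingency(table)[0]

def pair_snp_stat(g1, g2, phenotype):
    """Chi-square statistic for the joint genotype of a SNP pair (up to 9 classes)."""
    table = np.zeros((9, 2))
    for a, b, p in zip(g1, g2, phenotype):
        table[3 * a + b, p] += 1
    table = table[table.sum(axis=1) > 0]
    return chi2_contingency(table)[0]

def two_locus_flag(g1, g2, phenotype, threshold):
    """Flag a SNP pair whose joint statistic departs strongly from the single-SNP statistics.

    A large positive difference may point to genotyping artefacts rather than genuine
    signal; 'threshold' is a hypothetical significance cutoff.
    """
    diff = pair_snp_stat(g1, g2, phenotype) - (single_snp_stat(g1, phenotype)
                                               + single_snp_stat(g2, phenotype))
    return diff > threshold
```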

Relevance:

100.00%

Publisher:

Abstract:

The Bloom filter is a space-efficient randomized data structure for representing a set and supporting membership queries. Bloom filters intrinsically allow false positives. However, the space savings they offer outweigh this disadvantage if the false-positive rate is kept sufficiently low. Inspired by the recent application of the Bloom filter in a novel multicast forwarding fabric, this paper proposes a variant of the Bloom filter, the optihash. The optihash introduces an optimization of the false-positive rate at the stage of Bloom filter formation, using the same amount of space at the cost of slightly more processing than the classic Bloom filter. Bloom filters are often used in situations where a fixed amount of space is the primary constraint. We present the optihash as a good alternative to Bloom filters, since the amount of space is the same and the improvement in false positives can justify the additional processing. Specifically, we show via simulations and numerical analysis that the optihash reduces and controls false-positive occurrences at the cost of a small amount of additional processing. The simulations are carried out for in-packet forwarding. In this framework, the Bloom filter is used as a compact link/route identifier and is placed in the packet header to encode the route. At each node, the Bloom filter is queried for membership in order to make forwarding decisions. A false positive in the forwarding decision translates into packets being forwarded along an unintended outgoing link. By using the optihash, false positives can be reduced. The optimization processing is carried out in an entity termed the Topology Manager, which is part of the control plane of the multicast forwarding fabric. This processing is only carried out on a per-session basis, not for every packet. The aim of this paper is to present the optihash and evaluate its false-positive performance via simulations in order to measure the influence of different parameters on the false-positive rate. The false-positive rate of the optihash is then compared with the false-positive probability of the classic Bloom filter.
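For context, a minimal sketch of the classic Bloom filter that the optihash builds on (the optihash optimization itself is not reproduced here); the link identifiers in the example are invented:

```python
import hashlib
import math

class BloomFilter:
    """Classic Bloom filter: m-bit array, k hash functions, membership queries, no deletions."""

    def __init__(self, m_bits, k_hashes):
        self.m, self.k = m_bits, k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from two base hashes (double hashing).
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

def false_positive_probability(m, k, n):
    """Textbook approximation of the false-positive probability after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k

# Encode three link identifiers of a multicast route and query a link not on the route.
bf = BloomFilter(m_bits=256, k_hashes=4)
for link in ["link-A", "link-B", "link-C"]:
    bf.add(link)
print("link-A" in bf)                           # True
print("link-Z" in bf)                           # usually False; True would be a false positive
print(false_positive_probability(256, 4, 3))    # small for this load
```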

Relevance:

100.00%

Publisher:

Abstract:

Some unexpected promiscuous inhibitors were observed in a virtual screening protocol applied to select cruzain inhibitors from the ZINC database. Physicochemical and pharmacophore-model filters were used to reduce the database size, and the selected compounds were docked into the cruzain active site. Six hit compounds were tested as inhibitors. Although the compounds were designed to be nucleophilically attacked by the catalytic cysteine of cruzain, three of them showed typical promiscuous behavior, revealing that false positives are a prevalent concern in virtual screening programs. (C) 2007 Elsevier Ltd. All rights reserved.

Relevance:

100.00%

Publisher:

Abstract:

Background: A large number of probabilistic models used in sequence analysis assign non-zero probability values to most input sequences. The most common way to decide whether a given probability is sufficient is Bayesian binary classification, in which the probability under the model characterizing the sequence family of interest is compared to the probability under an alternative model, and a null model can serve as that alternative. This is the scoring technique used by sequence analysis tools such as HMMER, SAM and INFERNAL. The most prevalent null models are position-independent residue distributions, including the uniform distribution, the genomic distribution, the family-specific distribution and the target sequence distribution. This paper presents a study evaluating the impact of the choice of null model on the final classification. In particular, we are interested in minimizing the number of false predictions, a crucial issue for reducing the costs of biological validation.

Results: For all tests using random sequences, the target null model produced the lowest number of false positives. The study was performed on DNA sequences using GC content as the measure of compositional bias, but the results should also hold for protein sequences. To broaden the applicability of the results, the study used randomly generated sequences; previous studies were performed on amino acid sequences, used only one probabilistic model (HMM) and a specific benchmark, and therefore lacked more general conclusions about the performance of null models. Finally, a benchmark test with P. falciparum confirmed these results.

Conclusions: Of the evaluated models, the best suited for classification are the uniform model and the target model. However, the uniform model exhibits a GC bias that can cause more false positives for candidate sequences with extreme compositional bias, a characteristic not described in previous studies. In these cases the target model is more dependable for biological validation due to its higher specificity.
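A minimal sketch of null-model scoring with a position-independent residue distribution, using a toy per-position family model in place of a full HMM; the probabilities and sequence are invented for illustration:

```python
import math

def log_odds(seq, family_probs, null_probs):
    """Log-odds score: log P(seq | family model) - log P(seq | null model).

    family_probs: one residue distribution per position (a toy stand-in for a profile/HMM);
    null_probs: a single position-independent distribution, e.g. uniform or the
    target sequence's own composition.
    """
    score = 0.0
    for pos, residue in enumerate(seq):
        score += math.log(family_probs[pos][residue]) - math.log(null_probs[residue])
    return score

seq = "ACGG"
family = [{"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
          {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
          {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
          {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7}]

uniform_null = {b: 0.25 for b in "ACGT"}
target_null = {b: seq.count(b) / len(seq) for b in "ACGT"}   # target-composition null

print(log_odds(seq, family, uniform_null))   # a positive score favours the family model
print(log_odds(seq, family, target_null))    # the composition-aware null gives a different score
```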

Relevance:

100.00%

Publisher:

Abstract:

False-positive and false-negative values were calculated for five different designs of the trend test, and it was demonstrated that a design suggested by Portier and Hoel in 1984 for a different problem produced the lowest false-positive and false-negative rates when applied to historical spontaneous tumor rate data for Fischer rats.

Relevance:

100.00%

Publisher:

Abstract:

This study investigated the role of contextual factors in personnel selection. Specifically, I explored whether specific job factors, such as the wage, training requirements, available applicant pool and security concerns associated with a job, influenced personnel decisions. Additionally, I explored whether individual differences among decision makers played a role in how these job factors affected their decisions. A policy-capturing methodology was employed to determine the weight participants placed on the job factors when selecting candidates for different jobs. Regression and correlational analyses were computed with the beta weights obtained from the individual regression analyses. The results obtained from the two samples (student and general population) revealed that specific job characteristics did indeed influence personnel decisions. Participants were more concerned with making mistakes, and thus less likely to accept candidates, when selecting for jobs with high salaries and/or high training requirements.
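A minimal sketch of the policy-capturing idea, assuming each participant rates a set of hypothetical job scenarios described by the job-factor cues; the cue weights and data below are invented for illustration:

```python
import numpy as np

def policy_capture(cues, ratings):
    """Estimate one participant's beta weights by ordinary least squares.

    cues: (n_scenarios, n_cues) matrix of job factors (e.g. wage, training, applicant
    pool, security concerns); ratings: that participant's selection ratings.
    """
    X = np.column_stack([np.ones(len(cues)), cues])     # add an intercept column
    beta, *_ = np.linalg.lstsq(X, ratings, rcond=None)
    return beta[1:]                                      # drop the intercept

rng = np.random.default_rng(0)
cues = rng.random((20, 4))                               # 20 scenarios x 4 job factors
ratings = cues @ np.array([0.6, 0.3, 0.05, 0.05]) + rng.normal(0, 0.1, 20)
print(policy_capture(cues, ratings))                     # larger weights = more influential factors
```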

Relevance:

100.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevance:

70.00%

Publisher:

Abstract:

GC-MS data on veterinary drug residues in bovine urine are used for controlling the illegal practice of fattening cattle. According to the current detection criteria, the peak patterns of preferably four ions should agree to within 10 or 20% of a corresponding standard pattern. These criteria are rigid, rather arbitrary, and do not match daily practice. A new model, based on multivariate modeling of log peak-abundance ratios, provides a theoretical basis for the identification of analytes and optimizes the balance between avoiding false positives and avoiding false negatives. The performance of the model is demonstrated on data provided by five laboratories, each supplying GC-MS measurements on the detection of clenbuterol, dienestrol and 19β-nortestosterone in urine. The proposed model performs better than confirmation using the current criteria and provides a statistical basis for inspection criteria in terms of error probabilities.
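A minimal sketch of the general idea of checking log peak-abundance ratios statistically against reference standards; this is an illustration only (correlations between ratios are ignored), not the authors' multivariate model, and the ion abundances are invented:

```python
import numpy as np
from scipy.stats import chi2

def log_ion_ratios(abundances):
    """Log ratios of diagnostic-ion abundances relative to the first (base) ion."""
    a = np.asarray(abundances, dtype=float)
    return np.log(a[1:] / a[0])

def confirm(sample_abund, standard_abunds, alpha=0.01):
    """Confirm the analyte if the sample's log ion ratios are consistent with the spread
    observed for reference standards (per-ratio z-scores combined against a chi-square cutoff)."""
    ref = np.array([log_ion_ratios(s) for s in standard_abunds])
    z = (log_ion_ratios(sample_abund) - ref.mean(axis=0)) / ref.std(axis=0, ddof=1)
    return float(np.sum(z ** 2)) <= chi2.ppf(1 - alpha, df=ref.shape[1])

# Four monitored ions; replicate measurements of an analytical standard.
standards = [[1000, 450, 310, 120],
             [980, 440, 300, 125],
             [1020, 470, 320, 118],
             [995, 455, 305, 122],
             [1010, 460, 315, 121]]
print(confirm([990, 452, 308, 119], standards))   # True: pattern consistent with the standard
print(confirm([990, 600, 308, 119], standards))   # False: one ion ratio is far off
```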

Relevance:

70.00%

Publisher:

Abstract:

The unresolved issue of false-positive D-dimer results in the diagnostic workup of pulmonary embolism. Pulmonary embolism (PE) remains a difficult diagnosis, as it lacks specific symptoms and clinical signs. After the pretest probability of PE has been determined by a validated clinical score, the D-dimer (DD) assay is the initial blood test in the majority of patients whose probability is low or intermediate. The low specificity of DD results in a high number of false positives that then require thoracic angio-CT. A new clinical decision rule, the Pulmonary Embolism Rule-out Criteria (PERC), identifies patients at such low risk that PE can be safely ruled out without a DD test. Its safety has been confirmed in US emergency departments, but retrospective European studies showed that it would leave 5-7% of PEs undiagnosed. Alternative strategies are needed to reduce the proportion of false-positive DD results.
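For illustration, a minimal sketch of a rule-out check in the spirit of PERC, using the eight criteria commonly cited for it; the field names are made up, and this is not a clinical tool:

```python
def perc_negative(p):
    """Return True if none of the eight commonly cited PERC criteria is present.

    'p' is a dict of patient findings. In practice a fully negative PERC rule is
    combined with a low pretest probability to forgo D-dimer testing. Illustrative only.
    """
    return (p["age"] < 50
            and p["heart_rate"] < 100
            and p["oxygen_saturation"] >= 95
            and not p["hemoptysis"]
            and not p["estrogen_use"]
            and not p["prior_dvt_or_pe"]
            and not p["unilateral_leg_swelling"]
            and not p["recent_surgery_or_trauma"])

patient = {"age": 34, "heart_rate": 88, "oxygen_saturation": 98,
           "hemoptysis": False, "estrogen_use": False, "prior_dvt_or_pe": False,
           "unilateral_leg_swelling": False, "recent_surgery_or_trauma": False}
print(perc_negative(patient))   # True: PE could be ruled out without a D-dimer test
```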

Relevance:

70.00%

Publisher:

Abstract:

Recent studies have indicated that research practices in psychology may be susceptible to factors that increase false-positive rates, raising concerns about the possible prevalence of false-positive findings. The present article discusses several practices that may run counter to the inflation of false-positive rates. Taking these practices into account would lead to a more balanced view of the false-positive issue. Specifically, we argue that the inflation of false-positive rates would diminish, sometimes to a substantial degree, when researchers (a) have explicit a priori theoretical hypotheses, (b) include multiple replication studies in a single paper, and (c) collect additional data based on observed results. We report findings from simulation studies and statistical evidence that support these arguments. Awareness of these preventive factors allows researchers to avoid overestimating the pervasiveness of false positives in psychology and to gauge a paper's susceptibility to possible false positives in practical and fair ways.
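A minimal simulation of the replication argument: under a true null effect, the chance that every one of several independent studies reaches p < .05 falls roughly as alpha raised to the number of studies; the sample sizes and number of simulated papers are assumptions for illustration:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)

def false_positive_rate(n_replications, n_per_group=30, n_papers=10_000, alpha=0.05):
    """Proportion of simulated 'papers' (bundles of studies with a true null effect)
    in which every study is significant at the alpha level."""
    hits = 0
    for _ in range(n_papers):
        all_significant = all(
            ttest_ind(rng.normal(size=n_per_group),
                      rng.normal(size=n_per_group)).pvalue < alpha
            for _ in range(n_replications)
        )
        hits += all_significant
    return hits / n_papers

for k in (1, 2, 3):
    print(k, false_positive_rate(k))   # approaches alpha**k: 0.05, 0.0025, 0.000125
```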

Relevance:

60.00%

Publisher:

Abstract:

Current IEEE 802.11 wireless networks are vulnerable to session hijacking attacks, as the existing standards fail to address the lack of authentication of management frames and network card addresses and rely on loosely coupled state machines. Even the new WLAN security standard, IEEE 802.11i, does not address these issues. In our previous work, we proposed two new techniques for improving detection of session hijacking attacks that are passive, computationally inexpensive, reliable, and have minimal impact on network performance. These techniques utilise unspoofable characteristics from the MAC protocol and the physical layer to enhance confidence in the intrusion detection process. This paper extends our earlier work and explores the usability, robustness and accuracy of these intrusion detection techniques by applying them to eight distinct test scenarios. A correlation engine has also been introduced to keep false positives and false negatives at a manageable level. We also explore the process of selecting optimum thresholds for both detection techniques. For the purposes of our experiments, the Snort-Wireless open-source wireless intrusion detection system was extended to implement the new techniques and the correlation engine. The absence of any false negatives and the low number of false positives in all eight test scenarios demonstrated the effectiveness of the correlation engine and the accuracy of the detection techniques.
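A minimal sketch of the general idea behind correlating two detector outputs before raising an alert, so that isolated single-technique triggers do not become false positives; the technique names, thresholds and time window are illustrative assumptions, not the paper's implementation:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    technique: str      # e.g. "mac_layer_check" or "physical_layer_check" (hypothetical names)
    station: str        # MAC address of the suspected session
    timestamp: float    # seconds
    score: float        # how far the observation exceeds the technique's threshold

def correlate(detections, window_s=5.0, min_score=1.0):
    """Raise an alert only when both techniques flag the same station within a short
    time window, each with a score above its threshold."""
    alerts = set()
    for d1 in detections:
        for d2 in detections:
            if (d1.technique != d2.technique
                    and d1.station == d2.station
                    and abs(d1.timestamp - d2.timestamp) <= window_s
                    and d1.score >= min_score and d2.score >= min_score):
                alerts.add((d1.station, max(d1.timestamp, d2.timestamp)))
    return sorted(alerts)

events = [Detection("mac_layer_check", "aa:bb:cc:dd:ee:ff", 10.0, 2.3),
          Detection("physical_layer_check", "aa:bb:cc:dd:ee:ff", 11.5, 1.8),
          Detection("physical_layer_check", "11:22:33:44:55:66", 40.0, 1.2)]  # lone trigger
print(correlate(events))   # [('aa:bb:cc:dd:ee:ff', 11.5)]; the lone trigger raises no alert
```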

Relevance:

60.00%

Publisher:

Abstract:

This paper investigates the use of the FAB-MAP appearance-only SLAM algorithm as a method for performing visual data association for RatSLAM, a semi-metric full SLAM system. While both systems have shown the ability to map large (60-70 km) outdoor locations of approximately the same scale, for larger areas or longer time periods both algorithms encounter difficulties with false-positive matches. By combining the algorithms using a mapping between appearance space and pose space, both false positives and false negatives generated by FAB-MAP are significantly reduced during outdoor mapping with a forward-facing camera. The hybrid FAB-MAP-RatSLAM system developed here demonstrates the potential for successful SLAM over long periods of time.

Relevance:

60.00%

Publisher:

Abstract:

Identification of hot spots, also known as sites with promise, black spots, accident-prone locations, or priority investigation locations, is an important and routine activity for improving the overall safety of roadway networks. Extensive literature focuses on methods for hot spot identification (HSID), and a subset of this considerable literature is dedicated to assessing the performance of various HSID methods. A central issue in comparing HSID methods is the development and selection of quantitative and qualitative performance measures or criteria. The authors contend that currently employed HSID assessment criteria, namely false positives and false negatives, are necessary but not sufficient, and that additional criteria are needed to exploit the ordinal nature of site ranking data. With the intent to equip road safety professionals and researchers with more useful tools for comparing the performance of HSID methods and to improve the level of HSID assessments, this paper proposes four quantitative HSID evaluation tests that are, to the authors' knowledge, new and unique. These tests evaluate different aspects of HSID method performance, including reliability of results, ranking consistency, and false identification consistency and reliability. Road safety professionals can apply these evaluation tests, in addition to existing ones, to compare the performance of HSID methods and then select the most appropriate method to screen road networks and identify sites that require further analysis. The work demonstrates the four new criteria using 3 years of Arizona road section accident data and four commonly applied HSID methods: accident frequency ranking, accident rate ranking, accident reduction potential, and empirical Bayes (EB). The EB method reveals itself as superior in most of the evaluation tests, whereas identifying hot spots by accident rate ranking performs the least well. The accident frequency and accident reduction potential methods perform similarly, with slight differences explained. The authors believe that the four new evaluation tests offer insight into HSID performance heretofore unavailable to analysts and researchers.
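One simple example of a ranking-based check in this spirit (an illustration, not necessarily one of the paper's four tests): measure how many of the sites a method ranks in its top n in one period remain in its top n in the following period. The data are invented:

```python
def top_n_overlap(scores_period1, scores_period2, n=5):
    """Fraction of the top-n sites identified in period 1 that the same HSID method
    also places in its top n in period 2 (higher = more consistent ranking)."""
    top1 = {site for site, _ in sorted(scores_period1.items(),
                                       key=lambda kv: kv[1], reverse=True)[:n]}
    top2 = {site for site, _ in sorted(scores_period2.items(),
                                       key=lambda kv: kv[1], reverse=True)[:n]}
    return len(top1 & top2) / n

# Hypothetical crash-frequency scores for ten road sections in two successive periods.
period1 = {f"site{i}": f for i, f in enumerate([12, 3, 9, 1, 15, 4, 8, 2, 11, 5])}
period2 = {f"site{i}": f for i, f in enumerate([10, 2, 7, 3, 14, 5, 9, 1, 4, 8])}
print(top_n_overlap(period1, period2, n=5))   # 0.8: four of the five top-ranked sites recur
```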

Relevance:

60.00%

Publisher:

Abstract:

Identifying crash "hotspots", "blackspots", "sites with promise", or "high risk" locations is standard practice in departments of transportation throughout the US. The literature is replete with the development and discussion of statistical methods for hotspot identification (HSID). Theoretical derivations and empirical studies have been used to weigh the benefits of various HSID methods; however, only a small number of studies have used controlled experiments to systematically assess them. Using experimentally derived simulated data, which are argued to be superior to empirical data for this purpose, three hot spot identification methods observed in practice are evaluated: simple ranking, confidence interval, and Empirical Bayes. With simulated data, sites with promise are known a priori, in contrast to empirical data, where high-risk sites are not known with certainty. To conduct the evaluation, properties of observed crash data are used to generate simulated crash frequency distributions at hypothetical sites. A variety of factors is manipulated to simulate a host of 'real world' conditions. Various levels of confidence are explored, and false positives (identifying a safe site as high risk) and false negatives (identifying a high-risk site as safe) are compared across methods. Finally, the effects of crash history duration on the three HSID approaches are assessed. The results illustrate that the Empirical Bayes technique significantly outperforms the ranking and confidence interval techniques (with certain caveats). As found by others, false positives and false negatives are inversely related. Three years of crash history appears, in general, to provide an appropriate crash history duration.
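A minimal sketch of the Empirical Bayes screening idea under a Gamma-Poisson (negative binomial) model: each site's observed count is shrunk toward the frequency predicted for similar sites, with the overdispersion parameter controlling the weight. The counts, predictions and parameter value are illustrative assumptions:

```python
def eb_estimate(observed, predicted_mean, overdispersion_k):
    """Empirical Bayes expected crash frequency under a Gamma-Poisson model.

    observed: crash count at the site; predicted_mean: expected count for similar
    sites (e.g. from a safety performance function); overdispersion_k: negative
    binomial dispersion parameter (larger k = more trust in the prediction).
    """
    w = overdispersion_k / (overdispersion_k + predicted_mean)   # shrinkage weight
    return w * predicted_mean + (1.0 - w) * observed

# Rank hypothetical sites by EB estimate instead of raw observed counts.
sites = {"A": (9, 3.0), "B": (7, 6.5), "C": (2, 2.5)}            # (observed, predicted)
ranked = sorted(sites, key=lambda s: eb_estimate(*sites[s], overdispersion_k=2.0),
                reverse=True)
print(ranked)   # ['B', 'A', 'C']: site A's high count is shrunk toward its low prediction
```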

Relevance:

60.00%

Publisher:

Abstract:

Bananas are hosts to a large number of banana streak virus (BSV) species. However, diagnostic methods for BSV are inadequate because of the considerable genetic and serological diversity amongst BSV isolates and because integrated BSV sequences in some banana cultivars lead to false positives. In this study, a sequence non-specific, rolling-circle amplification (RCA) technique was developed and shown to overcome these limitations for the detection and subsequent characterisation of BSV isolates infecting banana. The technique was shown to discriminate between integrated and episomal BSV DNA, specifically detecting the latter in several banana cultivars known to contain episomal and/or integrated sequences of Banana streak Mysore virus (BSMyV), Banana streak OL virus (BSOLV) and Banana streak GF virus (BSGFV). Using RCA, the presence of BSMyV and BSOLV was confirmed in Australia, while BSOLV, BSGFV, Banana streak Uganda I virus (BSUgIV), Banana streak Uganda L virus (BSUgLV) and Banana streak Uganda M virus (BSUgMV) were detected in Uganda. This is the first confirmed report of episomally-derived BSUgIV, BSUgLV and BSUgMV in Uganda. As well as detecting BSV, RCA was shown to detect two other pararetroviruses, Sugarcane bacilliform virus in sugarcane and Cauliflower mosaic virus in turnip.