906 results for false positive
Abstract:
Motivation: In order to enhance genome annotation, the fully automatic fold recognition method GenTHREADER has been improved and benchmarked. The previous version of GenTHREADER consisted of a simple neural network which was trained to combine sequence alignment score, length information and energy potentials derived from threading into a single score representing the relationship between two proteins, as designated by CATH. The improved version incorporates PSI-BLAST searches, which have been jumpstarted with structural alignment profiles from FSSP, and now also makes use of PSIPRED predicted secondary structure and bi-directional scoring in order to calculate the final alignment score. Pairwise potentials and solvation potentials are calculated from the given sequence alignment which are then used as inputs to a multi-layer, feed-forward neural network, along with the alignment score, alignment length and sequence length. The neural network has also been expanded to accommodate the secondary structure element alignment (SSEA) score as an extra input and it is now trained to learn the FSSP Z-score as a measurement of similarity between two proteins. Results: The improvements made to GenTHREADER increase the number of remote homologues that can be detected with a low error rate, implying higher reliability of score, whilst also increasing the quality of the models produced. We find that up to five times as many true positives can be detected with low error rate per query. Total MaxSub score is doubled at low false positive rates using the improved method.
Abstract:
A method of automatically identifying and tracking polar-cap plasma patches, utilising data inversion and feature-tracking methods, is presented. A well-established and widely used 4-D ionospheric imaging algorithm, the Multi-Instrument Data Assimilation System (MIDAS), inverts slant total electron content (TEC) data from ground-based Global Navigation Satellite System (GNSS) receivers to produce images of the free electron distribution in the polar-cap ionosphere. These are integrated to form vertical TEC maps. A flexible feature-tracking algorithm, TRACK, previously used extensively in meteorological storm-tracking studies is used to identify and track maxima in the resulting 2-D data fields. Various criteria are used to discriminate between genuine patches and "false-positive" maxima such as the continuously moving day-side maximum, which results from the Earth's rotation rather than plasma motion. Results for a 12-month period at solar minimum, when extensive validation data are available, are presented. The method identifies 71 separate structures consistent with patch motion during this time. The limitations of solar minimum and the consequent small number of patches make climatological inferences difficult, but the feasibility of the method for patches larger than approximately 500 km in scale is demonstrated and a larger study incorporating other parts of the solar cycle is warranted. Possible further optimisation of discrimination criteria, particularly regarding the definition of a patch in terms of its plasma concentration enhancement over the surrounding background, may improve results.
Abstract:
Various fall-detection solutions have been previously proposed to create a reliable surveillance system for elderly people with high requirements on accuracy, sensitivity and specificity. In this paper, an enhanced fall detection system is proposed for elderly person monitoring that is based on smart sensors worn on the body and operating through consumer home networks. With treble thresholds, accidental falls can be detected in the home healthcare environment. By utilizing information gathered from an accelerometer, cardiotachometer and smart sensors, the impacts of falls can be logged and distinguished from normal daily activities. The proposed system has been deployed in a prototype system as detailed in this paper. From a test group of 30 healthy participants, it was found that the proposed fall detection system can achieve a high detection accuracy of 97.5%, while the sensitivity and specificity are 96.8% and 98.1% respectively. Therefore, this system can reliably be developed and deployed into a consumer product for use as an elderly person monitoring device with high accuracy and a low false positive rate.
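The accuracy, sensitivity, specificity and false positive rate quoted above all derive from the same four confusion-matrix counts. The following is a minimal sketch of that arithmetic; the function name `screening_metrics` and the example counts are hypothetical and are not taken from the paper, which reports only the resulting percentages.

```python
def screening_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Summarise a binary detector from its confusion-matrix counts.

    tp: falls correctly detected      fn: falls missed
    tn: normal activity left alone    fp: normal activity flagged as a fall
    """
    return {
        "sensitivity": tp / (tp + fn),          # share of real falls detected
        "specificity": tn / (tn + fp),          # share of non-falls left alone
        "false_positive_rate": fp / (tn + fp),  # equals 1 - specificity
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, chosen only to illustrate the formulas.
m = screening_metrics(tp=90, fn=10, tn=180, fp=20)
print(m)
```

Because the false positive rate is one minus the specificity, a reported specificity of 98.1% corresponds to a false positive rate of about 1.9%.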
Abstract:
Contamination of the electroencephalogram (EEG) by artifacts greatly reduces the quality of the recorded signals. There is a need for automated artifact removal methods. However, such methods are rarely evaluated against one another via rigorous criteria, with results often presented based upon visual inspection alone. This work presents a comparative study of automatic methods for removing blink, electrocardiographic, and electromyographic artifacts from the EEG. Three methods are considered: wavelet, blind source separation (BSS), and multivariate singular spectrum analysis (MSSA)-based correction. These are applied to data sets containing mixtures of artifacts. Metrics are devised to measure the performance of each method. The BSS method is seen to be the best approach for artifacts of high signal-to-noise ratio (SNR). By contrast, MSSA performs well at low SNRs but at the expense of a large number of false positive corrections.
Abstract:
In recent years an increasing number of papers have employed meta-analysis to integrate effect sizes of researchers’ own series of studies within a single paper (“internal meta-analysis”). Although this approach has the obvious advantage of obtaining narrower confidence intervals, we show that it could inadvertently inflate false-positive rates if researchers are motivated to use internal meta-analysis in order to obtain a significant overall effect. Specifically, if one decides whether to stop or continue a further replication experiment depending on the significance of the results in an internal meta-analysis, false-positive rates would increase beyond the nominal level. We conducted a set of Monte-Carlo simulations to demonstrate our argument, and provided a literature review to gauge awareness and prevalence of this issue. Furthermore, we made several recommendations when using internal meta-analysis to make a judgment on statistical significance.
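The inflation described above can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation: it assumes equal-sized studies whose z statistics are standard normal under the null and pools them with equal weights (a simple fixed-effect combination). The function name, parameters and defaults are hypothetical.

```python
import numpy as np


def false_positive_rate(max_studies, optional_stopping,
                        n_sims=20000, z_crit=1.96, seed=0):
    """Estimate the false-positive rate of an internal meta-analysis
    when the true effect is zero (assumed setup, see lead-in)."""
    rng = np.random.default_rng(seed)
    # Under the null, each study's z statistic is standard normal.
    z = rng.standard_normal((n_sims, max_studies))
    k = np.arange(1, max_studies + 1)
    # Equal-weight pooled z after 1, 2, ..., max_studies studies.
    pooled = np.cumsum(z, axis=1) / np.sqrt(k)
    significant = np.abs(pooled) > z_crit
    if optional_stopping:
        # Researcher stops as soon as the pooled result is significant.
        return significant.any(axis=1).mean()
    # Fixed design: a single test after all studies are collected.
    return significant[:, -1].mean()


fixed = false_positive_rate(max_studies=5, optional_stopping=False)
peeking = false_positive_rate(max_studies=5, optional_stopping=True)
print(f"fixed design: {fixed:.3f}, stop-when-significant: {peeking:.3f}")
```

With a fixed number of studies the estimate stays close to the nominal 5%, whereas checking the pooled result after every study and stopping at the first significant one pushes it well above that level.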
Abstract:
Aim: To validate a non-nutritive sucking (NNS) scoring system for oral feeding in preterm newborns (PTNB). Methods: A cohort study was carried out in two phases. In phase one of the study, 22 mastered speech-language pathologists received the protocol and procedure for a NNS scoring system to evaluate the content and presentation of the form and to define the grading scale. In phase two, six speech-language pathologists evaluated 51 PTNBs weekly, using the defined scoring system. Setting: This study was carried out in the Nursery Annex to the Maternity at the Intensive and Neonatal Pediatrics Service, Instituto da Crianca, Hospital das Clinicas, School of Medicine, University of Sao Paulo (FMUSP) during the period from May 2004 to May 2006. Participants: A total of 28 speech-language pathologist experts and 51 PTNBs. Results: In the first phase of the study, 22 speech-language pathologists selected the criteria utilized in the NNS evaluation, with 80% agreement. In the second phase of the study, the NNS evaluation was carried out on 51 PTNBs, and a scoring system of 50 points was proposed, which corresponds to the smallest number of false positive and false negative results regarding oral feeding ability. Conclusion: An NNS evaluation system was validated that was able to indicate when oral feeding could safely begin in PTNBs, with a high level of agreement among the participating speech-language pathologists.
Abstract:
Human respiratory syncytial virus (HRSV) is the main cause of acute lower respiratory tract infections in infants and children. Rapid diagnosis is required to permit appropriate care and treatment and to avoid unnecessary antibiotic use. Reverse transcriptase PCR (RT-PCR) and indirect immunofluorescence assay (IFA) methods have been considered important tools for virus detection due to their high sensitivity and specificity. In order to maximize ease of use and minimize the risk of sample cross-contamination inherent in two-step techniques, an RT-PCR method using only a single tube to detect HRSV in clinical samples was developed. Nasopharyngeal aspirates from 226 patients with acute respiratory illness, ranging from infants to 5 years old, were collected at the University Hospital of the University of Sao Paulo (HU-USP), and tested using IFA, one-step RT-PCR, and semi-nested RT-PCR. One hundred and two (45.1%) samples were positive by at least one of the three methods, and 75 (33.2%) were positive by all methods: 92 (40.7%) were positive by one-step RT-PCR, 84 (37.2%) by IFA, and 96 (42.5%) by the semi-nested RT-PCR technique. One-step RT-PCR was shown to be fast, sensitive, and specific for RSV diagnosis, without the added inconvenience and risk of false positive results associated with semi-nested PCR. The combined use of these two methods enhances HRSV detection. (C) 2007 Elsevier B.V. All rights reserved.
Abstract:
The gene SNRNP200 is composed of 45 exons and encodes a protein essential for pre-mRNA splicing, the 200 kDa helicase hBrr2. Two mutations in SNRNP200 have recently been associated with autosomal dominant retinitis pigmentosa (adRP), a retinal degenerative disease, in two families from China. In this work we analyzed the entire 35-Kb SNRNP200 genomic region in a cohort of 96 unrelated North American patients with adRP. To complete this large-scale sequencing project, we performed ultra high-throughput sequencing of pooled, untagged PCR products. We then validated the detected DNA changes by Sanger sequencing of individual samples from this cohort and from an additional one of 95 patients. One of the two previously known mutations (p.S1087L) was identified in 3 patients, while 4 new missense changes (p.R681C, p.R681H, p.V683L, p.Y689C) affecting highly conserved codons were identified in 6 unrelated individuals, indicating that the prevalence of SNRNP200-associated adRP is relatively high. We also took advantage of this research to evaluate the pool-and-sequence method, especially with respect to the generation of false positive and negative results. We conclude that, although this strategy can be adopted for rapid discovery of new disease-associated variants, it still requires extensive validation to be used in routine DNA screenings. (C) 2011 Wiley-Liss, Inc.
Abstract:
Only a small fraction of spectra acquired in LC-MS/MS runs matches peptides from target proteins upon database searches. The remaining, operationally termed background, spectra originate from a variety of poorly controlled sources and affect the throughput and confidence of database searches. Here, we report an algorithm and its software implementation that rapidly removes background spectra, regardless of their precise origin. The method estimates the dissimilarity distance between screened MS/MS spectra and unannotated spectra from a partially redundant background library compiled from several control and blank runs. Filtering MS/MS queries enhanced the protein identification capacity when searches lacked spectrum to sequence matching specificity. In sequence-similarity searches it reduced by, on average, 30-fold the number of orphan hits, which were not explicitly related to background protein contaminants and required manual validation. Removing high quality background MS/MS spectra, while preserving in the data set the genuine spectra from target proteins, decreased the false positive rate of stringent database searches and improved the identification of low-abundance proteins.
Abstract:
Data obtained during routine diagnosis of human T-cell lymphotropic virus type 1 (HTLV-1) and 2 (HTLV-2) in "at-risk" individuals from Sao Paulo, Brazil, using signal-to-cutoff (S/C) values obtained by first, second, and third generation enzyme immunoassay (EIA) kits, were compared. The highest S/C values were obtained with third generation EIA kits, but no correlation was detected between these values and specific antibody reactivity to HTLV-1, HTLV-2, or untyped HTLV (p = 0.302). In addition, use of these third generation kits resulted in HTLV-1/2 false-positive samples. In contrast, first and second generation EIA kits showed high specificity, and the second generation EIA kits showed the highest efficiency, despite lower S/C values. Using first and second generation EIA kits, significant differences in specific antibody detection of HTLV-1, relative to HTLV-2 (p = 0.019 for first generation and p < 0.001 for second generation EIA kits) and relative to untyped HTLV (p = 0.025 for first generation EIA kits), were observed. These results were explained by the composition and format of the assays. In addition, using receiver operating characteristics (ROC) analysis, a slight adjustment in cutoff values for third generation EIA kits improved their specificities and should be used when HTLV "at-risk" populations from this geographic area are to be evaluated. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
BACKGROUND: A major problem in Chagas disease donor screening is the high frequency of samples with inconclusive results. The objective of this study was to describe patterns of serologic results among donors to the three Brazilian REDS-II blood centers and correlate them with epidemiologic characteristics. STUDY DESIGN AND METHODS: The centers screened donor samples with one Trypanosoma cruzi lysate enzyme immunoassay (EIA). EIA-reactive samples were tested with a second lysate EIA, a recombinant-antigen based EIA, and an immunofluorescence assay. Based on the serologic results, samples were classified as confirmed positive (CP), probable positive (PP), possible other parasitic infection (POPI), and false positive (FP). RESULTS: In 2007 to 2008, a total of 877 of 615,433 donations were discarded due to Chagas assay reactivity. The prevalences (95% confidence intervals [CIs]) among first-time donors for CP, PP, POPI, and FP patterns were 114 (99-129), 26 (19-34), 10 (5-14), and 96 (82-110) per 100,000 donations, respectively. CP and PP had similar patterns of prevalence when analyzed by age, sex, education, and location, suggesting that PP cases represent true T. cruzi infections; in contrast, the demographics of donors with POPI were distinct and likely unrelated to Chagas disease. No CP cases were detected among 218,514 repeat donors followed for a total of 718,187 person-years. CONCLUSION: We have proposed a classification algorithm that may have practical importance for donor counseling and epidemiologic analyses of T. cruzi-seroreactive donors. The absence of incident T. cruzi infections is reassuring with respect to risk of window phase infections within Brazil and travel-related infections in nonendemic countries such as the United States.
Abstract:
Cytochrome P450 (CYP450) is a class of enzymes for which substrate identification is particularly important. It would help medicinal chemists to design drugs with fewer side effects due to drug-drug interactions and to extensive genetic polymorphism. Herein, we discuss the application of 2D and 3D similarity searches in identifying reference structures with higher capacity to retrieve substrates of three important CYP enzymes (CYP2C9, CYP2D6, and CYP3A4). On the basis of the complementarities of multiple reference structures selected by different similarity search methods, we proposed the fusion of their individual Tanimoto scores into a consensus Tanimoto score (T(consensus)). Using this new score, true positive rates of 63% (CYP2C9) and 81% (CYP2D6) were achieved with false positive rates of 4% for the CYP2C9-CYP2D6 data set. Extended similarity searches were carried out on a validation data set, and the results showed that by using the T(consensus) score, not only did the area under the ROC curve increase, but more substrates were also recovered at the beginning of a ranked list.
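The idea of fusing per-reference Tanimoto scores can be sketched as follows. The abstract does not state the fusion rule, so the arithmetic mean used here is an assumption (the maximum is another common choice), and fingerprints are represented as plain Python sets of "on" bit indices. The function names and the toy fingerprints are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def consensus_tanimoto(query: set, references: list) -> float:
    """Fuse the query's similarity to several reference structures.

    ASSUMPTION: fusion by arithmetic mean; the source does not specify the rule.
    """
    scores = [tanimoto(query, ref) for ref in references]
    return sum(scores) / len(scores)


# Toy fingerprints, for illustration only.
query = {1, 2, 3}
references = [{1, 2, 3}, {2, 3, 4}]
print(tanimoto(query, references[1]))          # 2 shared bits / 4 bits in union
print(consensus_tanimoto(query, references))   # mean of the two scores
```

Ranking a screening library by the fused score and sweeping a cutoff over it yields the true positive and false positive rates from which a ROC curve is drawn.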
Abstract:
A number of recent works have introduced statistical methods for detecting genetic loci that affect phenotypic variability, which we refer to as variability-controlling quantitative trait loci (vQTL). These are genetic variants whose allelic state predicts how much phenotype values will vary about their expected means. Such loci are of great potential interest in both human and non-human genetic studies, one reason being that a detected vQTL could represent a previously undetected interaction with other genes or environmental factors. The simultaneous publication of these new methods in different journals has in many cases precluded opportunity for comparison. We survey some of these methods, the respective trade-offs they imply, and the connections between them. The methods fall into three main groups: classical non-parametric, fully parametric, and semi-parametric two-stage approximations. Choosing between alternatives involves balancing the need for robustness, flexibility, and speed. For each method, we identify important assumptions and limitations, including those of practical importance, such as their scope for including covariates and random effects. We show in simulations that both parametric methods and their semi-parametric approximations can give elevated false positive rates when they ignore mean-variance relationships intrinsic to the data generation process. We conclude that choice of method depends on the trait distribution, the need to include non-genetic covariates, and the population size and structure, coupled with a critical evaluation of how these fit with the assumptions of the statistical model.
Abstract:
Outliers are observations that appear to be inconsistent with the rest. Also called atypical, extreme, or aberrant values, these inconsistencies can be caused by policy changes or economic crises, unexpected cold or heat waves, and measurement or typing errors, among others. Outliers are not necessarily incorrect values, but when they stem from measurement or typing errors they can distort the results of an analysis and lead the researcher to mistaken conclusions. The aim of this work is to study and compare different methods for detecting abnormalities in price series of the Consumer Price Index (Índice de Preços ao Consumidor, IPC), calculated by the Brazilian Institute of Economics (Instituto Brasileiro de Economia, IBRE) of the Fundação Getulio Vargas (FGV). The IPC measures the change in prices of a fixed set of goods and services that make up the usual expenses of families with an income between 1 and 33 monthly minimum wages, and it is used mainly as a reference index for assessing consumer purchasing power. Besides the method currently used at IBRE by price analysts, the methods considered in this study are: variations of the IBRE Method, the Boxplot Method, the SIQR Boxplot Method, the Adjusted Boxplot Method, the Resistant Fences Method, the Quartile Method, the Modified Quartile Method, the Median Absolute Deviation Method, and Tukey's Algorithm. These methods were applied to data from the municipalities of Rio de Janeiro and São Paulo. To analyze the performance of each method, the true extreme values must be known in advance. Therefore, in this work, the analysis was carried out assuming that the prices discarded or altered by the analysts during the review process are the true outliers. The IBRE Method is highly correlated with the prices altered or discarded by the analysts. Consequently, the assumption that the prices altered or discarded by the analysts are the true extreme values may influence the results, favoring the IBRE Method over the other methods. Nevertheless, this approach makes it possible to compute two measures by which the methods are evaluated. The first is the method's hit rate, which gives the proportion of true outliers detected. The second is the number of false positives produced by the method, which indicates how many values had to be flagged for one true outlier to be detected. The higher the hit rate and the lower the number of false positives, the better the method performs. It was thus possible to build a ranking of the methods' performance and identify the best among those analyzed. For the municipality of Rio de Janeiro, some of the variations of the IBRE Method performed as well as or better than the original method. For the municipality of São Paulo, the IBRE Method performed best. In future work, we expect to test the methods on data obtained by simulation or on data sets widely used in the literature, so that the assumption that the prices discarded or altered by the analysts during the review process are the true outliers does not interfere with the results.
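One of the listed methods and the two evaluation measures can be sketched as below. This is an illustration under stated assumptions, not the procedure used at IBRE: the fences are the usual boxplot rule (1.5 times the interquartile range beyond the quartiles), quartiles use NumPy's default linear interpolation, and "false positives" are counted as flagged values that are not in the reference set of true outliers. The function names and the price series are hypothetical.

```python
import numpy as np


def boxplot_flags(values, k=1.5):
    """Flag values outside the boxplot fences [Q1 - k*IQR, Q3 + k*IQR]."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])  # default linear interpolation
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


def evaluate(flags, true_outlier_idx):
    """Return (hit rate, number of false positives) for a set of flags.

    hit rate        : share of the true outliers that were flagged
    false positives : flagged values that are not true outliers
    """
    flagged = set(np.flatnonzero(flags).tolist())
    truth = set(true_outlier_idx)
    hits = len(flagged & truth)
    return hits / len(truth), len(flagged - truth)


# Hypothetical price quotes; index 7 plays the role of an analyst-rejected price.
prices = [10, 11, 12, 11, 10, 12, 11, 50, 10, 11]
flags = boxplot_flags(prices)
print(evaluate(flags, {7}))
```

Running the same evaluation for each candidate method against the analyst-defined outliers gives the pair of numbers on which a ranking like the one described can be built.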
Abstract:
Leishmania infantum and Trypanosoma cruzi are trypanosomatids of medical importance and are, respectively, the etiologic agents of visceral leishmaniasis (VL) and Chagas disease (CD) in Brazil. People infected with L. infantum or T. cruzi may remain asymptomatic, enabling the transmission of these pathogens through blood transfusion and/or organ transplantation. Screening for T. cruzi infection is among the tests performed on blood donors in Brazil; however, no tests for Leishmania are available. Serological tests for T. cruzi are very sensitive but not specific, and may cross-react with other microorganisms. Thus, the aim of this study was to determine the prevalence of Leishmania infection in blood donors and to assess whether the serological test for T. cruzi detects L. infantum. Among the 300 blood samples from donors discarded in 2011, 61 were T. cruzi positive, 203 were from donors with other infections, and 36 were from bags with low blood volume but without infection. We also assessed 144 samples from donors without infections who were able to donate blood, totaling 444 subjects. DNA was extracted from all blood samples to perform quantitative PCR (qPCR) to detect Leishmania DNA. The buffy coat obtained from all samples was cultured in supplemented Schneider medium and NNN medium. All samples were evaluated for the presence of anti-Leishmania antibodies. The serological results indicate Leishmania infection in 22% of blood samples obtained from discarded bags. A total of 60% of samples positive by ELISA for T. cruzi were negative by indirect immunofluorescence (IFI), used as the confirmatory test, i.e., 60% were false positive for Chagas. Among these samples false positive for Chagas, 72% were positive by ELISA for Leishmania, characterizing cross-reaction between the serologic assays. Of the 300 cultures performed, 18 grew parasites, which were typed by qPCR and specific isoenzymes and identified as Leishmania infantum. Among the 18 cultures, 4 came from bags discarded for low volume, with all blood bank serology negative, thus demonstrating that there is a real risk of Leishmania transmission via transfusion. It is concluded that, in an area endemic for leishmaniasis in Brazil, the serological diagnosis performed to detect T. cruzi infection among blood donors can identify infection by L. infantum and, although it causes false positives for Chagas, this cross-reactivity reduces the risk of Leishmania transmission via blood transfusion, since no tests specific for detecting the parasite are applied. There thus remains a need to discuss the implementation of a specific serological screening test for Leishmania in endemic countries such as Brazil.