906 results for "false positive"
Abstract:
Pedestrians using MP3 players or mobile phones are at risk of being hit by motor vehicles. We present an approach for detecting a crash risk level using the computing power and the microphone of mobile devices, which can be used to alert the user in advance of an approaching vehicle so as to avoid a crash. A single feature-extractor classifier is not usually able to deal with the diversity of risky acoustic scenarios. In this paper, we address the problem of detecting vehicles approaching a pedestrian with a novel, simple, non-resource-intensive acoustic method. The method uses a set of existing statistical tools to mine signal features. Audio features are adaptively thresholded for relevance and classified with a three-component heuristic. The resulting Acoustic Hazard Detection (AHD) system has a very low false-positive detection rate. The results of this study could help mobile device manufacturers to embed the presented features into future portable devices and contribute to road safety.
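The adaptive-thresholding and persistence steps described above can be sketched roughly as follows. This is an illustrative sketch only: the running-statistics update rule, the parameters, and the consecutive-frame heuristic are assumptions of ours, not the AHD authors' actual method.

```python
# Hypothetical sketch of adaptive thresholding on an audio-feature stream:
# a frame is flagged when it exceeds a running background estimate by k
# standard deviations, and an alert fires only after several consecutive
# flagged frames, which keeps false positives rare.

def adaptive_threshold(features, alpha=0.95, k=3.0):
    """Flag frames exceeding mean + k*std of a running background estimate."""
    mean, var, flags = features[0], 0.0, []
    for x in features:
        std = var ** 0.5
        flags.append(x > mean + k * std and std > 0)
        # update the background statistics only on non-flagged frames
        if not flags[-1]:
            mean = alpha * mean + (1 - alpha) * x
            var = alpha * var + (1 - alpha) * (x - mean) ** 2
    return flags

def hazard(flags, min_run=3):
    """Persistence heuristic: alert after min_run consecutive flags."""
    run = 0
    for f in flags:
        run = run + 1 if f else 0
        if run >= min_run:
            return True
    return False

quiet = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1]   # steady background
approaching = quiet + [5.0, 6.0, 7.0, 8.0]           # rising energy
assert not hazard(adaptive_threshold(quiet))
assert hazard(adaptive_threshold(approaching))
```

Requiring a run of flagged frames is one simple way a detector of this kind can trade a little latency for a much lower false-positive rate.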
Abstract:
Prostate cancer is a significant health problem faced by aging men. Currently, diagnostic strategies for the detection of prostate cancer are either unreliable, yielding high numbers of false positive results, or too invasive to be used widely as screening tests. Furthermore, the current therapeutic strategies for the treatment of the disease carry considerable side effects. Although organ-confined prostate cancer can be curable, most detectable clinical symptoms occur in advanced disease, when primary tumour cells have metastasised to distant sites - usually lymph nodes and bone. Many growth factors and steroids assist the continued growth and maintenance of prostatic tumour cells. Of these mitogens, androgens are important in the development of the normal prostate but are also required to sustain the growth of prostate cancer cells in the early stage of the disease. Not only are androgens required in the early stage of disease, but many other growth factors and hormones also interact to cause uncontrolled proliferation of malignant cells. The early, androgen-sensitive phase of disease is followed by an androgen-insensitive phase, whereby androgens are no longer required to stimulate the growth of the tumour cells. Growth factors such as transforming growth factors α and β (TGFα/β), epidermal growth factor (EGF), basic fibroblast growth factor (bFGF), insulin-like growth factors (IGFs), vitamin D and thyroid hormone have been suggested to be important at this stage of disease. Interestingly, some of the kallikrein family of genes, including prostate specific antigen (PSA), the current serum diagnostic marker for prostate cancer, are regulated by androgens and many of the aforementioned growth factors. The kallikrein gene family encodes a group of serine proteases that are involved in a diverse range of physiological processes: regulation of local blood flow, angiogenesis, tissue invasion and mitogenesis.
The earliest members of the kallikrein gene family (KLK1-KLK3) have been strongly associated with general disease states, such as hypertension, inflammation, pancreatitis and renal disease, but are also linked to various cancers. Recently, this family was extended to include 15 genes (KLK1-15). Several newer members of the kallikrein family have been implicated in the carcinogenesis and tumour metastasis of hormone-dependent cancers such as prostate, breast, endometrial and ovarian cancer. The aims of this project were to investigate the expression of the newly identified kallikrein, KLK4, in benign and malignant prostate tissues, and prostate cancer cell lines. This thesis has demonstrated the elevated expression of KLK4 mRNA transcripts in malignant prostate tissue compared to benign prostates. Additionally, expression of the full length KLK4 transcript was detected in the androgen dependent prostate cancer cell line, LNCaP. Based on the above finding, the LNCaP cell line was chosen to assess the potential regulation of full length KLK4 by androgen, thyroid hormone and epidermal growth factor. KLK4 mRNA and protein was found to be up-regulated by androgen and a combination of androgen and thyroid hormone. Thyroid hormone alone produced no significant change in KLK4 mRNA or protein over the control. Epidermal growth factor treatment also resulted in elevated expression levels of KLK4 mRNA and protein. To assess the potential functional role(s) of KLK4/hK4 in processes associated with tumour progression, full length KLK4 was transfected into PC-3 cells - a prostate cancer cell line originally derived from a secondary bone lesion. The KLK4/hK4 over-expressing cells were assessed for their proliferation, migration, invasion and attachment properties. The KLK4 over-expressing clones exhibited a marked change in morphology, indicative of a more aggressive phenotype. The KLK4 clones were irregularly shaped with compromised adhesion to the growth surface. 
In contrast, the control cell lines (parent PC-3 and empty vector clones) retained a rounded morphology with obvious cell-to-cell adhesion, as well as significant adhesion to their growth surface. The KLK4 clones exhibited significantly greater attachment to Collagen I and IV than native PC-3s and empty vector controls. Over a 12-hour period, in comparison to the control cells, the KLK4 clones displayed an increase in migration towards PC-3 native conditioned media, a 3-fold increase towards conditioned media from an osteoblastic cell line (Saos-2), and no change in migration towards conditioned media from neonatal foreskin fibroblast cells or 20% foetal bovine serum. Furthermore, the increase in migration exhibited by the KLK4 clones was partially blocked by the serine protease inhibitor aprotinin. The data presented in this thesis suggest that KLK4/hK4 is important in prostate carcinogenesis due to its over-expression in malignant prostate tissues, its regulation by hormones and growth factors associated with prostate disease, and the functional consequences of over-expression of KLK4/hK4 in the PC-3 cell line. These results indicate that KLK4/hK4 may play an important role in tumour invasion and bone metastasis via increased attachment to the bone matrix protein Collagen I and enhanced migration due to soluble factors produced by osteoblast cells. This suggestion is further supported by the morphological changes displayed by the KLK4 over-expressing cells. Overall, these data suggest that KLK4/hK4 should be further studied to more fully investigate its potential value as a diagnostic/prognostic biomarker or in therapeutic applications.
Abstract:
Studies continue to report ancient DNA sequences and viable microbial cells that are many millions of years old. In this paper we evaluate some of the most extravagant claims of geologically ancient DNA. We conclude that although exciting, the reports suffer from inadequate experimental setup and insufficient authentication of results. Consequently, it remains doubtful whether amplifiable DNA sequences and viable bacteria can survive over geological timescales. To enhance the credibility of future studies and assist in discarding false-positive results, we propose a rigorous set of authentication criteria for work with geologically ancient DNA.
Abstract:
Background: Cohort studies can provide valuable evidence of cause-and-effect relationships but are subject to loss of participants over time, limiting the validity of findings. Computerised record linkage offers a passive and ongoing method of obtaining health outcomes from existing routinely collected data sources. However, the quality of record linkage relies on the availability and accuracy of common identifying variables. We sought to develop and validate a method for linking a cohort study to a state-wide hospital admissions dataset with limited availability of unique identifying variables. Methods: A sample of 2000 participants from a cohort study (n = 41 514) was linked to a state-wide hospitalisations dataset in Victoria, Australia, using the national health insurance (Medicare) number and demographic data as identifying variables. Availability of the health insurance number was limited in both datasets; therefore linkage was undertaken both with and without this number, and agreement was tested between the two algorithms. Sensitivity was calculated for a sub-sample of 101 participants with a hospital admission confirmed by medical record review. Results: Of the 2000 study participants, 85% were found to have a record in the hospitalisations dataset when the national health insurance number and sex were used as linkage variables, and 92% when demographic details only were used. When agreement between the two methods was tested, the disagreement fraction was 9%, mainly due to "false positive" links when demographic details only were used. A final algorithm that used multiple combinations of identifying variables resulted in a match proportion of 87%. Sensitivity of this final linkage was 95%. Conclusions: High-quality record linkage of cohort data with a hospitalisations dataset that has limited identifiers can be achieved using combinations of a national health insurance number and demographic data as identifying variables.
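The core linkage idea, matching on the insurance number where available and falling back to demographic details otherwise, can be sketched as below. Field names, record values, and the exact key combinations are hypothetical illustrations, not the study's actual variables.

```python
# Deterministic linkage sketch: build an index of hospital records keyed on
# the chosen identifying variables, then look each cohort member up. Records
# missing any key variable simply fail to link under that key set.

def link(cohort, hospital, keys):
    index = {tuple(r[k] for k in keys): r["id"] for r in hospital
             if all(r.get(k) is not None for k in keys)}
    return {c["id"]: index.get(tuple(c.get(k) for k in keys)) for c in cohort}

cohort = [
    {"id": 1, "medicare": "A1", "dob": "1950-01-02", "sex": "F"},
    {"id": 2, "medicare": None, "dob": "1948-07-09", "sex": "M"},  # no number
]
hospital = [
    {"id": "h9", "medicare": "A1", "dob": "1950-01-02", "sex": "F"},
    {"id": "h3", "medicare": "B7", "dob": "1948-07-09", "sex": "M"},
]

by_number = link(cohort, hospital, ["medicare", "sex"])  # number-based pass
by_demog = link(cohort, hospital, ["dob", "sex"])        # demographic pass

assert by_number[1] == "h9" and by_number[2] is None  # missing number: no link
assert by_demog[2] == "h3"   # demographic fallback recovers the link
```

A final algorithm like the one in the abstract would combine several such passes and compare their agreement; demographic-only keys are also where "false positive" links can creep in, since demographic details need not be unique.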
Abstract:
Acoustic sensors provide an effective means of monitoring biodiversity at large spatial and temporal scales. They can continuously and passively record large volumes of data over extended periods; however, these data must be analysed to detect the presence of vocal species. Automated analysis of acoustic data for large numbers of species is complex and can be subject to high levels of false positive and false negative results. Manual analysis by experienced users can produce accurate results; however, the time and effort required to process even small volumes of data can make manual analysis prohibitive. Our research examined the use of sampling methods to reduce the cost of analysing large volumes of acoustic sensor data, while retaining high levels of species detection accuracy. Utilising five days of manually analysed acoustic sensor data from four sites, we examined a range of sampling rates and methods, including random, stratified and biologically informed. Our findings indicate that randomly selecting 120 one-minute samples from the three hours immediately following dawn provided the most effective sampling method. This method detected, on average, 62% of total species after 120 one-minute samples were analysed, compared to 34% of total species from traditional point counts. Our results demonstrate that targeted sampling methods can provide an effective means for analysing large volumes of acoustic sensor data efficiently and accurately.
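The dawn-sampling scheme described above can be sketched in a few lines. The dawn minute offset and the sampling window are illustrative assumptions; the key point is drawing 120 distinct one-minute samples uniformly at random from the three hours after dawn across the recording days.

```python
# Sketch of biologically informed random sampling: pool all one-minute
# slots in the three hours after dawn across five days of recordings,
# then sample 120 of them without replacement.
import random

def dawn_samples(days=5, dawn_minute=300, window=180, n=120, seed=1):
    pool = [(day, minute) for day in range(days)
            for minute in range(dawn_minute, dawn_minute + window)]
    return random.Random(seed).sample(pool, n)  # without replacement

picks = dawn_samples()
assert len(picks) == 120 and len(set(picks)) == 120  # distinct samples
assert all(300 <= minute < 480 for _, minute in picks)  # dawn window only
```

Only these sampled minutes would then be analysed manually, which is where the cost saving over full analysis comes from.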
Abstract:
Background: Right-to-left shunting via a patent foramen ovale (PFO) has a recognized association with embolic events in younger patients. The use of agitated saline contrast imaging (ASCi) for detecting atrial shunting is well documented; however, the optimal technique is not well described. The purpose of this study is to assess the efficacy and safety of ASCi via TTE for assessment of right-to-left atrial communication in a large cohort of patients. Method: A retrospective review was undertaken of 1162 consecutive transthoracic (TTE) ASCi studies, of which 195 had also undergone clinically indicated transesophageal (TEE) echo. ASCi shunt results were compared with color flow imaging (CFI) and the role of provocative maneuvers (PM) assessed. Results: 403 TTE studies (35%) had paradoxical shunting seen during ASCi. Of these, 48% were positive with PM only. There was strong agreement between TTE ASCi and reported TEE findings (99% sensitivity, 85% specificity), with six false-positive and two false-negative results. In hindsight, the latter were likely due to suboptimal right atrial opacification, and the former due to transpulmonary shunting. TTE CFI was found to be insensitive (22%) for the detection of a PFO compared with TTE ASCi. Conclusions: TTE ASCi is minimally invasive and highly accurate for the detection of right-to-left atrial communication when PM are used. TTE CFI was found to be insensitive for PFO screening. It is recommended that TTE ASCi should be considered the initial diagnostic tool for the detection of PFO in clinical practice. A dedicated protocol should be followed to ensure adequate agitated saline contrast delivery and performance of provocative maneuvers.
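The sensitivity and specificity figures above come from a standard confusion-table calculation, sketched below. The counts are illustrative values chosen to reproduce the reported 99%/85%, not the study's actual table.

```python
# Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
def sens_spec(tp, fp, fn, tn):
    return tp / (tp + fn), tn / (tn + fp)

# Illustrative counts: two false negatives and six false positives, as
# reported; the true-positive and true-negative totals are assumptions.
sens, spec = sens_spec(tp=198, fp=6, fn=2, tn=34)
assert round(sens, 2) == 0.99
assert round(spec, 2) == 0.85
```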
Abstract:
Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and move toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in Bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereupon enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation made was with regards to the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength. 
Our study of E. coli σ70 promoters found support at the 0.1 significance level for our hypothesis that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to σ70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E. coli transcription, we discovered a number of potentially useful features, some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption that promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests assessed for E. coli σ70 promoters returned a p-value of 0.072, which at the 0.1 significance level suggested support for our (alternative) hypothesis, albeit this trend may only be present for promoters where the corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel.
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter prediction [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately' conserved transcription factor binding sites, as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1%, but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving the inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled 'regulatory trees', inspired by the phylogenetic tree concept.
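The spectrum kernel mentioned above compares sequences through counts of their length-k substrings. A minimal sketch of the feature map and kernel evaluation, under the assumption of simple unnormalised k-mer counts, is:

```python
# Spectrum (k-mer) kernel sketch: map each sequence to a vector of counts
# of its length-k substrings; the kernel value is the dot product of the
# two count vectors, so sequences sharing many k-mers score highly.
from collections import Counter

def spectrum(seq, k=3):
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(a, b, k=3):
    ca, cb = spectrum(a, k), spectrum(b, k)
    return sum(ca[kmer] * cb[kmer] for kmer in ca)  # Counter returns 0 for misses

# A sequence matches itself better than an unrelated one ...
assert spectrum_kernel("ACGTACGT", "ACGTACGT") > spectrum_kernel("ACGTACGT", "TTTTTTTT")
# ... and sequences sharing no 3-mers have kernel value zero.
assert spectrum_kernel("ACGT", "TTTT") == 0
```

In practice this kernel would be plugged into an SVM (with the position features the thesis describes added as extra attributes), but the count-vector dot product above is the core of the method.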
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While the common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as analogous to 'hardware', the regulatory tree informs us of changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system: the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships and increased confidence in the predicted regulatory interactions. In the present study, we distinguish between relationships found across the full set of genomes, the 'core-regulatory-set', and interactions found only in a subset of the genomes explored, the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied, this core set potentially identifying basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were not present in either E. coli or B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study.
We identified a set of promising feature attributes, demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
Abstract:
This paper presents an efficient face detection method suitable for real-time surveillance applications. Improved efficiency is achieved by constraining the search window of an AdaBoost face detector to pre-selected regions. Firstly, the proposed method takes a sparse grid of sample pixels from the image to reduce whole-image scan time. A fusion of foreground segmentation and skin colour segmentation is then used to select candidate face regions. Finally, a classifier-based face detector is applied only to the selected regions to verify the presence of a face (the Viola-Jones detector is used in this paper). The proposed system is evaluated using 640 × 480 pixel test images and compared with other relevant methods. Experimental results show that the proposed method reduces the detection time to 42 ms, where the Viola-Jones detector alone requires 565 ms (on a desktop processor). This improvement makes the face detector suitable for real-time applications. Furthermore, the proposed method requires 50% of the computation time of the best competing method, while reducing the false positive rate by 3.2% and maintaining the same hit rate.
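The region pre-selection idea can be sketched as below. The explicit RGB skin rule and the grid step are illustrative stand-ins for the paper's fusion of foreground and skin-colour segmentation; the point is that the expensive classifier only ever runs on the small set of surviving cells.

```python
# Sketch of sparse-grid candidate selection: sample every `step`-th pixel,
# keep cells that pass a cheap skin-colour test, and (not shown) run the
# expensive face classifier only on those cells.

def is_skin(r, g, b):
    # a simple explicit RGB skin heuristic (illustrative, not the paper's)
    return r > 95 and g > 40 and b > 20 and r > g and r > b and abs(r - g) > 15

def candidate_cells(image, step=4):
    """image: dict (x, y) -> (r, g, b); keep skin-like on-grid pixels."""
    return {(x, y) for (x, y), (r, g, b) in image.items()
            if x % step == 0 and y % step == 0 and is_skin(r, g, b)}

image = {(0, 0): (200, 120, 100),   # skin-like, on grid -> kept
         (4, 0): (20, 20, 20),      # background, on grid -> dropped
         (1, 0): (200, 120, 100)}   # skin-like but off grid -> never sampled
assert candidate_cells(image) == {(0, 0)}
```

Because the classifier cost dominates, shrinking the search region like this is what turns a 565 ms scan into tens of milliseconds in systems of this kind.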
Abstract:
Anomaly detection compensates for shortcomings of signature-based detection, such as protecting against zero-day exploits. However, anomaly detection can be resource-intensive and is plagued by a high false-positive rate. In this work, we address these problems by presenting a cooperative intrusion detection approach based on the Artificial Immune System (AIS) as an example of an anomaly detection approach. In particular, we show how the cooperative approach reduces the false-positive rate of the detection and how the overall detection process can be organized to account for the resource constraints of the participating devices. Evaluations are carried out with the novel network simulation environment NeSSi as well as formally with an extension to the epidemic spread model SIR.
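One way to see why cooperation lowers the false-positive rate is a simple k-of-n agreement argument, sketched below. This is an illustrative model under an independence assumption, not the paper's AIS mechanism: if each device fires falsely with probability p, requiring at least k of n devices to agree before alerting drives the combined false-positive rate down sharply.

```python
# False-positive rate of a k-of-n voting rule over n independent detectors,
# each with per-detector false-positive probability p.
from math import comb

def k_of_n_fp_rate(p, n, k):
    """P(at least k of n independent detectors fire falsely)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

single = 0.10                        # one detector: 10% false positives
coop = k_of_n_fp_rate(0.10, 5, 3)    # require 3-of-5 agreement
assert coop < single                 # cooperation cuts the rate by ~10x here
```

Real cooperating detectors are not independent, so the actual reduction is smaller, but the direction of the effect is the same.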
Abstract:
Population-wide associations between loci due to linkage disequilibrium can be used to map quantitative trait loci (QTL) with high resolution. However, spurious associations between markers and QTL can also arise as a consequence of population stratification. Statistical methods that cannot differentiate associations due to linkage disequilibrium from those arising in other ways can yield false-positive results. The transmission-disequilibrium test (TDT) is a robust test for detecting QTL. The TDT exploits within-family associations that are not affected by population stratification. However, some TDTs are formulated in a rigid form, with reduced potential applications. In this study we generalize the TDT using mixed linear models to allow greater statistical flexibility. Allelic effects are estimated with two independent parameters: one exploiting the robust within-family information and the other the potentially biased between-family information. A significant difference between these two parameters can be used as evidence for spurious association. This methodology was then used to test the effects of the fourth melanocortin receptor (MC4R) on production traits in the pig. The new analyses supported the previously reported results; i.e., the studied polymorphism is either causal or in very strong linkage disequilibrium with the causal mutation, and provided no evidence for spurious association.
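The two-parameter comparison at the heart of this approach can be illustrated numerically. The estimates and standard errors below are made up; the point is the test logic: a significant gap between the within-family and between-family estimates flags a spurious (stratification-driven) component.

```python
# Compare the robust within-family allelic-effect estimate with the
# potentially biased between-family estimate via a simple z-statistic on
# their difference (assuming independent estimates, as in the abstract).

def z_difference(est_within, se_within, est_between, se_between):
    se_diff = (se_within**2 + se_between**2) ** 0.5
    return (est_between - est_within) / se_diff

# consistent estimates -> small z: no evidence of stratification bias
assert abs(z_difference(0.30, 0.08, 0.33, 0.10)) < 1.96
# inflated between-family estimate -> large z: likely spurious association
assert abs(z_difference(0.05, 0.08, 0.60, 0.10)) > 1.96
```

In the mixed-model setting the two parameters are fitted jointly rather than compared post hoc, but the interpretation of their difference is the same.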
Abstract:
In a classification problem we typically face two challenging issues: the diverse characteristics of negative documents, and the fact that many negative documents are close to positive documents. It is therefore hard for a single classifier to clearly classify incoming documents into classes. This paper proposes a novel gradual problem-solving approach that creates a two-stage classifier. The first stage identifies reliable negatives (negative documents with weak positive characteristics). It concentrates on minimizing the number of false negative documents (recall-oriented). We use Rocchio, an existing recall-based classifier, for this stage. The second stage is a precision-oriented "fine tuning" stage that concentrates on minimizing the number of false positive documents by applying pattern (statistical phrase) mining techniques. In this stage a pattern-based scoring is followed by threshold setting (thresholding). Experiments show that our statistical-phrase-based two-stage classifier is promising.
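The recall-oriented first stage can be sketched with a minimal Rocchio classifier, as below. The term vectors and the zero decision threshold are illustrative; a document scores by its similarity to the positive centroid minus its similarity to the negative centroid, and a permissive threshold keeps false negatives rare, leaving precision to the pattern-based second stage (not shown).

```python
# Minimal Rocchio sketch over sparse term-frequency vectors (dicts).

def centroid(docs):
    terms = {t for d in docs for t in d}
    return {t: sum(d.get(t, 0) for d in docs) / len(docs) for t in terms}

def dot(a, b):
    return sum(v * b.get(t, 0) for t, v in a.items())

def rocchio_score(doc, pos_centroid, neg_centroid):
    return dot(doc, pos_centroid) - dot(doc, neg_centroid)

pos = [{"risk": 2, "hazard": 1}, {"risk": 1, "alert": 1}]
neg = [{"sport": 2}, {"music": 1, "sport": 1}]
cp, cn = centroid(pos), centroid(neg)

assert rocchio_score({"risk": 1, "hazard": 1}, cp, cn) > 0  # kept as positive
assert rocchio_score({"sport": 2}, cp, cn) < 0              # reliable negative
```

Documents scoring above the threshold pass on to the second stage, so a first-stage miss (false negative) is unrecoverable; that asymmetry is why this stage is tuned for recall.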
Abstract:
Acoustic sensors can be used to estimate species richness for vocal species such as birds. They can continuously and passively record large volumes of data over extended periods. These data must subsequently be analyzed to detect the presence of vocal species. Automated analysis of acoustic data for large numbers of species is complex and can be subject to high levels of false positive and false negative results. Manual analysis by experienced surveyors can produce accurate results; however, the time and effort required to process even small volumes of data can make manual analysis prohibitive. This study examined the use of sampling methods to reduce the cost of analyzing large volumes of acoustic sensor data, while retaining high levels of species detection accuracy. Utilizing five days of manually analyzed acoustic sensor data from four sites, we examined a range of sampling frequencies and methods including random, stratified, and biologically informed. We found that randomly selecting 120 one-minute samples from the three hours immediately following dawn over five days of recordings detected the highest number of species. On average, this method detected 62% of total species from 120 one-minute samples, compared to 34% of total species detected from traditional area search methods. Our results demonstrate that targeted sampling methods can provide an effective means for analyzing large volumes of acoustic sensor data efficiently and accurately. Development of automated and semi-automated techniques is required to assist in analyzing large volumes of acoustic sensor data.
Abstract:
Genetic variability in the strength and precision of fear memory is hypothesised to contribute to the etiology of anxiety disorders, including post-traumatic stress disorder. We generated fear-susceptible (F-S) or fear-resistant (F-R) phenotypes from an F8 advanced intercross line (AIL) of C57BL/6J and DBA/2J inbred mice by selective breeding. We identified specific traits underlying individual variability in Pavlovian conditioned fear learning and memory. Offspring of selected lines differed in the acquisition of conditioned fear. Furthermore, F-S mice showed greater cued fear memory and generalised fear in response to a novel context than F-R mice. F-S mice showed greater basal corticosterone levels and hypothalamic corticotrophin-releasing hormone (CRH) mRNA levels than F-R mice, consistent with higher hypothalamic-pituitary-adrenal (HPA) axis drive. Hypothalamic mineralocorticoid receptor and CRH receptor 1 mRNA levels were decreased in F-S mice as compared with F-R mice. Manganese-enhanced magnetic resonance imaging (MEMRI) was used to investigate basal levels of brain activity. MEMRI identified a pattern of increased brain activity in F-S mice that was driven primarily by the hippocampus and amygdala, indicating excessive limbic circuit activity in F-S mice as compared with F-R mice. Thus, selection pressure applied to the AIL population leads to the accumulation of heritable trait-relevant characteristics within each line, whereas non-behaviorally relevant traits remain distributed. Selected lines therefore minimise false-positive associations between behavioral phenotypes and physiology. We demonstrate that intrinsic differences in HPA axis function and limbic excitability contribute to phenotypic differences in the acquisition and consolidation of associative fear memory. Identification of system-wide traits predisposing to variability in fear memory may help in the direction of more targeted and efficacious treatments for fear-related pathology. 
Through short-term selection in a B6D2 advanced intercross line we created mouse populations divergent for the retention of Pavlovian fear memory. Trait distinctions in HPA-axis drive and fear network circuitry could be made between naïve animals in the two lines. These data demonstrate underlying physiological and neurological differences between Fear-Susceptible and Fear-Resistant animals in a natural population. F-S and F-R mice may therefore be relevant to a spectrum of disorders including depression, anxiety disorders and PTSD for which altered fear processing occurs.
Abstract:
Hot spot identification (HSID) aims to identify potential sites—roadway segments, intersections, crosswalks, interchanges, ramps, etc.—with disproportionately high crash risk relative to similar sites. An inefficient HSID methodology might result in either identifying a safe site as high risk (false positive) or a high-risk site as safe (false negative), and consequently lead to the misuse of available public funds, poor investment decisions, and inefficient risk management practice. Current HSID methods suffer from issues like underreporting of minor injury and property damage only (PDO) crashes, challenges in incorporating crash severity into the methodology, and selection of a proper safety performance function to model crash data that is often heavily skewed by a preponderance of zeros. Addressing these challenges, this paper proposes a combination of a PDO equivalency calculation and a quantile regression technique to identify hot spots in a transportation network. In particular, issues related to underreporting and crash severity are tackled by incorporating equivalent PDO crashes, whilst the concerns related to the non-count nature of equivalent PDO crashes and the skewness of crash data are addressed by the non-parametric quantile regression technique. The proposed method identifies covariate effects on various quantiles of a population, rather than the population mean like most methods in practice, which more closely corresponds with how black spots are identified in practice. The proposed methodology is illustrated using rural road segment data from Korea and compared against the traditional EB method with negative binomial regression.
Application of a quantile regression model to equivalent PDO crashes enables identification of a set of high-risk sites that reflect the true safety costs to society, simultaneously reduces the influence of under-reported PDO and minor injury crashes, and overcomes the limitation of the traditional NB model in dealing with the preponderance-of-zeros problem and right-skewed datasets.
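The two ingredients combined above can be sketched together. The severity weights and the 90% cut-off below are illustrative assumptions, and the marginal empirical quantile stands in for the paper's quantile regression (which conditions on covariates); the mechanics are the same: convert crashes to equivalent-PDO scores, then flag sites in the upper tail.

```python
# Sketch: severity-weighted equivalent-PDO (EPDO) scoring followed by
# flagging sites at or above a high empirical quantile of the scores.

WEIGHTS = {"fatal": 12.0, "injury": 3.0, "pdo": 1.0}  # hypothetical EPDO weights

def epdo(crashes):
    """crashes: dict severity -> count."""
    return sum(WEIGHTS[sev] * n for sev, n in crashes.items())

def hot_spots(sites, q=0.90):
    scores = {site: epdo(c) for site, c in sites.items()}
    ordered = sorted(scores.values())
    cutoff = ordered[int(q * len(ordered))]      # empirical q-quantile
    return {site for site, v in scores.items() if v >= cutoff}

# Nine low-severity sites plus one with a fatal crash (2 + 3 + 12 = 17 EPDO).
sites = {f"s{i}": {"pdo": i % 3 + 1} for i in range(9)}
sites["s9"] = {"pdo": 2, "injury": 1, "fatal": 1}
assert hot_spots(sites) == {"s9"}
```

Weighting by severity is what lets a site with one fatal crash outrank sites with several minor ones, which is the under-reporting correction the abstract describes.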