957 resultados para false negative rate
Resumo:
Motivation: An important problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. We provide a straightforward and easily implemented method for estimating the posterior probability that an individual gene is null. The problem can be expressed in a two-component mixture framework, using an empirical Bayes approach. Current methods of implementing this approach either have some limitations due to the minimal assumptions made or with more specific assumptions are computationally intensive. Results: By converting to a z-score the value of the test statistic used to test the significance of each gene, we propose a simple two-component normal mixture that models adequately the distribution of this score. The usefulness of our approach is demonstrated on three real datasets.
Resumo:
Nucleic acid amplification tests (NAATs) for the detection of Neisseria gonorrhoeae became available in the early 1990s. Although offering several advantages over traditional detection methods, N. gonorrhoeae NAATs do have some limitations. These include cost, risk of carryover contamination, inhibition, and inability to provide antibiotic resistance data. In addition, there are sequence-related limitations that are unique to N. gonorrhoeae NAATs. In particular, false-positive results are a major consideration. These primarily stem from the frequent horizontal genetic exchange occurring within the Neisseria genus, leading to commensal Neisseria species acquiring N. gonorrhoeae genes. Furthermore, some N. gonorrhoeae subtypes may lack specific sequences targeted by a particular NAAT. Therefore, NAAT false-negative results because of sequence variation may occur in some gonococcal populations. Overall, the N. gonorrhoeae species continues to present a considerable challenge for molecular diagnostics. The need to evaluate N. gonorrhoeae NAATs before their use in any new patient population and to educate physicians on the limitations of these tests is emphasized in this review.
Resumo:
PTS1 proteins are peroxisomal matrix proteins that have a well conserved targeting motif at the C-terminal end. However, this motif is present in many non peroxisomal proteins as well, thus predicting peroxisomal proteins involves differentiating fake PTS1 signals from actual ones. In this paper we report on the development of an SVM classifier with a separately trained logistic output function. The model uses an input window containing 12 consecutive residues at the C-terminus and the amino acid composition of the full sequence. The final model gives a Matthews Correlation Coefficient of 0.77, representing an increase of 54% compared with the well-known PeroxiP predictor. We test the model by applying it to several proteomes of eukaryotes for which there is no evidence of a peroxisome, producing a false positive rate of 0.088%.
Resumo:
This Thesis addresses the problem of automated false-positive free detection of epileptic events by the fusion of information extracted from simultaneously recorded electro-encephalographic (EEG) and the electrocardiographic (ECG) time-series. The approach relies on a biomedical case for the coupling of the Brain and Heart systems through the central autonomic network during temporal lobe epileptic events: neurovegetative manifestations associated with temporal lobe epileptic events consist of alterations to the cardiac rhythm. From a neurophysiological perspective, epileptic episodes are characterised by a loss of complexity of the state of the brain. The description of arrhythmias, from a probabilistic perspective, observed during temporal lobe epileptic events and the description of the complexity of the state of the brain, from an information theory perspective, are integrated in a fusion-of-information framework towards temporal lobe epileptic seizure detection. The main contributions of the Thesis include the introduction of a biomedical case for the coupling of the Brain and Heart systems during temporal lobe epileptic seizures, partially reported in the clinical literature; the investigation of measures for the characterisation of ictal events from the EEG time series towards their integration in a fusion-of-knowledge framework; the probabilistic description of arrhythmias observed during temporal lobe epileptic events towards their integration in a fusion-of-knowledge framework; and the investigation of the different levels of the fusion-of-information architecture at which to perform the combination of information extracted from the EEG and ECG time-series. The performance of the method designed in the Thesis for the false-positive free automated detection of epileptic events achieved a false-positives rate of zero on the dataset of long-term recordings used in the Thesis.
Resumo:
Oxidative post-translational modifications (oxPTMs) can alter the function of proteins, and are important in the redox regulation of cell behaviour. The most informative technique to detect and locate oxPTMs within proteins is mass spectrometry (MS). However, proteomic MS data are usually searched against theoretical databases using statistical search engines, and the occurrence of unspecified or multiple modifications, or other unexpected features, can lead to failure to detect the modifications and erroneous identifications of oxPTMs. We have developed a new approach for mining data from accurate mass instruments that allows multiple modifications to be examined. Accurate mass extracted ion chromatograms (XIC) for specific reporter ions from peptides containing oxPTMs were generated from standard LC-MSMS data acquired on a rapid-scanning high-resolution mass spectrometer (ABSciex 5600 Triple TOF). The method was tested using proteins from human plasma or isolated LDL. A variety of modifications including chlorotyrosine, nitrotyrosine, kynurenine, oxidation of lysine, and oxidized phospholipid adducts were detected. For example, the use of a reporter ion at 184.074 Da/e, corresponding to phosphocholine, was used to identify for the first time intact oxidized phosphatidylcholine adducts on LDL. In all cases the modifications were confirmed by manual sequencing. ApoB-100 containing oxidized lipid adducts was detected even in healthy human samples, as well as LDL from patients with chronic kidney disease. The accurate mass XIC method gave a lower false positive rate than normal database searching using statistical search engines, and identified more oxidatively modified peptides. A major advantage was that additional modifications could be searched after data collection, and multiple modifications on a single peptide identified. The oxPTMs present on albumin and ApoB-100 have potential as indicators of oxidative damage in ageing or inflammatory diseases.
Resumo:
We propose a novel template matching approach for the discrimination of handwritten and machine-printed text. We first pre-process the scanned document images by performing denoising, circles/lines exclusion and word-block level segmentation. We then align and match characters in a flexible sized gallery with the segmented regions, using parallelised normalised cross-correlation. The experimental results over the Pattern Recognition & Image Analysis Research Lab-Natural History Museum (PRImA-NHM) dataset show remarkably high robustness of the algorithm in classifying cluttered, occluded and noisy samples, in addition to those with significant high missing data. The algorithm, which gives 84.0% classification rate with false positive rate 0.16 over the dataset, does not require training samples and generates compelling results as opposed to the training-based approaches, which have used the same benchmark.
Resumo:
Traffic incidents are non-recurring events that can cause a temporary reduction in roadway capacity. They have been recognized as a major contributor to traffic congestion on our nation’s highway systems. To alleviate their impacts on capacity, automatic incident detection (AID) has been applied as an incident management strategy to reduce the total incident duration. AID relies on an algorithm to identify the occurrence of incidents by analyzing real-time traffic data collected from surveillance detectors. Significant research has been performed to develop AID algorithms for incident detection on freeways; however, similar research on major arterial streets remains largely at the initial stage of development and testing. This dissertation research aims to identify design strategies for the deployment of an Artificial Neural Network (ANN) based AID algorithm for major arterial streets. A section of the US-1 corridor in Miami-Dade County, Florida was coded in the CORSIM microscopic simulation model to generate data for both model calibration and validation. To better capture the relationship between the traffic data and the corresponding incident status, Discrete Wavelet Transform (DWT) and data normalization were applied to the simulated data. Multiple ANN models were then developed for different detector configurations, historical data usage, and the selection of traffic flow parameters. To assess the performance of different design alternatives, the model outputs were compared based on both detection rate (DR) and false alarm rate (FAR). The results show that the best models were able to achieve a high DR of between 90% and 95%, a mean time to detect (MTTD) of 55-85 seconds, and a FAR below 4%. The results also show that a detector configuration including only the mid-block and upstream detectors performs almost as well as one that also includes a downstream detector. In addition, DWT was found to be able to improve model performance, and the use of historical data from previous time cycles improved the detection rate. Speed was found to have the most significant impact on the detection rate, while volume was found to contribute the least. The results from this research provide useful insights on the design of AID for arterial street applications.
Resumo:
The 9/11 Act mandates the inspection of 100% of cargo shipments entering the U.S. by 2012 and 100% inspection of air cargo by March 2010. So far, only 5% of inbound shipping containers are inspected thoroughly while air cargo inspections have fared better at 50%. Government officials have admitted that these milestones cannot be met since the appropriate technology does not exist. This research presents a novel planar solid phase microextraction (PSPME) device with enhanced surface area and capacity for collection of the volatile chemical signatures in air that are emitted from illicit compounds for direct introduction into ion mobility spectrometers (IMS) for detection. These IMS detectors are widely used to detect particles of illicit substances and do not have to be adapted specifically to this technology. For static extractions, PDMS and sol-gel PDMS PSPME devices provide significant increases in sensitivity over conventional fiber SPME. Results show a 50–400 times increase in mass detected of piperonal and a 2–4 times increase for TNT. In a blind study of 6 cases suspected to contain varying amounts of MDMA, PSPME-IMS correctly detected 5 positive cases with no false positives or negatives. One of these cases had minimal amounts of MDMA resulting in a false negative response for fiber SPME-IMS. A La (dihed) phase chemistry has shown an increase in the extraction efficiency of TNT and 2,4-DNT and enhanced retention over time. An alternative PSPME device was also developed for the rapid (seconds) dynamic sampling and preconcentration of large volumes of air for direct thermal desorption into an IMS. This device affords high extraction efficiencies due to strong retention properties under ambient conditions resulting in ppt detection limits when 3.5 L of air are sampled over the course of 10 seconds. Dynamic PSPME was used to sample the headspace over the following: MDMA tablets (12–40 ng detected of piperonal), high explosives (Pentolite) (0.6 ng detected of TNT), and several smokeless powders (26–35 ng of 2,4-DNT and 11–74 ng DPA detected). PSPME-IMS technology is flexible to end-user needs, is low-cost, rapid, sensitive, easy to use, easy to implement, and effective. ^
Resumo:
This dissertation develops an innovative approach towards less-constrained iris biometrics. Two major contributions are made in this research endeavor: (1) Designed an award-winning segmentation algorithm in the less-constrained environment where image acquisition is made of subjects on the move and taken under visible lighting conditions, and (2) Developed a pioneering iris biometrics method coupling segmentation and recognition of the iris based on video of moving persons under different acquisitions scenarios. The first part of the dissertation introduces a robust and fast segmentation approach using still images contained in the UBIRIS (version 2) noisy iris database. The results show accuracy estimated at 98% when using 500 randomly selected images from the UBIRIS.v2 partial database, and estimated at 97% in a Noisy Iris Challenge Evaluation (NICE.I) in an international competition that involved 97 participants worldwide involving 35 countries, ranking this research group in sixth position. This accuracy is achieved with a processing speed nearing real time. The second part of this dissertation presents an innovative segmentation and recognition approach using video-based iris images. Following the segmentation stage which delineates the iris region through a novel segmentation strategy, some pioneering experiments on the recognition stage of the less-constrained video iris biometrics have been accomplished. In the video-based and less-constrained iris recognition, the test or subject iris videos/images and the enrolled iris images are acquired with different acquisition systems. In the matching step, the verification/identification result was accomplished by comparing the similarity distance of encoded signature from test images with each of the signature dataset from the enrolled iris images. With the improvements gained, the results proved to be highly accurate under the unconstrained environment which is more challenging. This has led to a false acceptance rate (FAR) of 0% and a false rejection rate (FRR) of 17.64% for 85 tested users with 305 test images from the video, which shows great promise and high practical implications for iris biometrics research and system design.
Resumo:
Weakly electric fish produce a dual function electric signal that makes them ideal models for the study of sensory computation and signal evolution. This signal, the electric organ discharge (EOD), is used for communication and navigation. In some families of gymnotiform electric fish, the EOD is a dynamic signal that increases in amplitude during social interactions. Amplitude increase could facilitate communication by increasing the likelihood of being sensed by others or by impressing prospective mates or rivals. Conversely, by increasing its signal amplitude a fish might increase its sensitivity to objects by lowering its electrolocation detection threshold. To determine how EOD modulations elicited in the social context affect electrolocation, I developed an automated and fast method for measuring electroreception thresholds using a classical conditioning paradigm. This method employs a moving shelter tube, which these fish occupy at rest during the day, paired with an electrical stimulus. A custom built and programmed robotic system presents the electrical stimulus to the fish, slides the shelter tube requiring them to follow, and records video of their movements. I trained the electric fish of the genus Sternopygus was trained to respond to a resistive stimulus on this apparatus in 2 days. The motion detection algorithm correctly identifies the responses 91% of the time, with a false positive rate of only 4%. This system allows for a large number of trials, decreasing the amount of time needed to determine behavioral electroreception thresholds. This novel method enables the evaluation the evolutionary interplay between two conflicting sensory forces, social communication and navigation.
Resumo:
Ensemble Stream Modeling and Data-cleaning are sensor information processing systems have different training and testing methods by which their goals are cross-validated. This research examines a mechanism, which seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process the sensed events to eliminate the noises that are uncorrelated, and choose the most likely model without over fitting thus obtaining higher model confidence. Higher quality streams can be realized by combining many short streams into an ensemble which has the desired quality. The framework for the investigation is an existing data mining tool. First, to accommodate feature extraction such as a bush or natural forest-fire event we make an assumption of the burnt area (BA*), sensed ground truth as our target variable obtained from logs. Even though this is an obvious model choice the results are disappointing. The reasons for this are two: One, the histogram of fire activity is highly skewed. Two, the measured sensor parameters are highly correlated. Since using non descriptive features does not yield good results, we resort to temporal features. By doing so we carefully eliminate the averaging effects; the resulting histogram is more satisfactory and conceptual knowledge is learned from sensor streams. Second is the process of feature induction by cross-validating attributes with single or multi-target variables to minimize training error. We use F-measure score, which combines precision and accuracy to determine the false alarm rate of fire events. The multi-target data-cleaning trees use information purity of the target leaf-nodes to learn higher order features. A sensitive variance measure such as ƒ-test is performed during each node's split to select the best attribute. Ensemble stream model approach proved to improve when using complicated features with a simpler tree classifier. The ensemble framework for data-cleaning and the enhancements to quantify quality of fitness (30% spatial, 10% temporal, and 90% mobility reduction) of sensor led to the formation of streams for sensor-enabled applications. Which further motivates the novelty of stream quality labeling and its importance in solving vast amounts of real-time mobile streams generated today.
Resumo:
Traffic incidents are non-recurring events that can cause a temporary reduction in roadway capacity. They have been recognized as a major contributor to traffic congestion on our national highway systems. To alleviate their impacts on capacity, automatic incident detection (AID) has been applied as an incident management strategy to reduce the total incident duration. AID relies on an algorithm to identify the occurrence of incidents by analyzing real-time traffic data collected from surveillance detectors. Significant research has been performed to develop AID algorithms for incident detection on freeways; however, similar research on major arterial streets remains largely at the initial stage of development and testing. This dissertation research aims to identify design strategies for the deployment of an Artificial Neural Network (ANN) based AID algorithm for major arterial streets. A section of the US-1 corridor in Miami-Dade County, Florida was coded in the CORSIM microscopic simulation model to generate data for both model calibration and validation. To better capture the relationship between the traffic data and the corresponding incident status, Discrete Wavelet Transform (DWT) and data normalization were applied to the simulated data. Multiple ANN models were then developed for different detector configurations, historical data usage, and the selection of traffic flow parameters. To assess the performance of different design alternatives, the model outputs were compared based on both detection rate (DR) and false alarm rate (FAR). The results show that the best models were able to achieve a high DR of between 90% and 95%, a mean time to detect (MTTD) of 55-85 seconds, and a FAR below 4%. The results also show that a detector configuration including only the mid-block and upstream detectors performs almost as well as one that also includes a downstream detector. In addition, DWT was found to be able to improve model performance, and the use of historical data from previous time cycles improved the detection rate. Speed was found to have the most significant impact on the detection rate, while volume was found to contribute the least. The results from this research provide useful insights on the design of AID for arterial street applications.
Resumo:
Funding — Forest Enterprise Scotland and the University of Aberdeen provided funding for the project. The Carnegie Trust supported the lead author, E. McHenry, in this research through the award of a tuition fees bursary.
Resumo:
Colorectal cancer is the third most commonly diagnosed cancer, accounting for 53,219 deaths in 2007 and an estimated 146,970 new cases in the USA during 2009. The combination of FDG PET and CT has proven to be of great benefit for the assessment of colorectal cancer. This is most evident in the detection of occult metastases, particularly intra- or extrahepatic sites of disease, that would preclude a curative procedure or in the detection of local recurrence. FDG PET is generally not used for the diagnosis of colorectal cancer although there are circumstances where PET-CT may make the initial diagnosis, particularly with its more widespread use. In addition, precancerous adenomatous polyps can also be detected incidentally on whole-body images performed for other indications; sensitivity increases with increasing polyp size. False-negative FDG PET findings have been reported with mucinous adenocarcinoma, and false-positive findings have been reported due to inflammatory conditions such as diverticulitis, colitis, and postoperative scarring. Therefore, detailed evaluation of the CT component of a PET/CT exam, including assessment of the entire colon, is essential.
Resumo:
Human use of the oceans is increasingly in conflict with conservation of endangered species. Methods for managing the spatial and temporal placement of industries such as military, fishing, transportation and offshore energy, have historically been post hoc; i.e. the time and place of human activity is often already determined before assessment of environmental impacts. In this dissertation, I build robust species distribution models in two case study areas, US Atlantic (Best et al. 2012) and British Columbia (Best et al. 2015), predicting presence and abundance respectively, from scientific surveys. These models are then applied to novel decision frameworks for preemptively suggesting optimal placement of human activities in space and time to minimize ecological impacts: siting for offshore wind energy development, and routing ships to minimize risk of striking whales. Both decision frameworks relate the tradeoff between conservation risk and industry profit with synchronized variable and map views as online spatial decision support systems.
For siting offshore wind energy development (OWED) in the U.S. Atlantic (chapter 4), bird density maps are combined across species with weights of OWED sensitivity to collision and displacement and 10 km2 sites are compared against OWED profitability based on average annual wind speed at 90m hub heights and distance to transmission grid. A spatial decision support system enables toggling between the map and tradeoff plot views by site. A selected site can be inspected for sensitivity to a cetaceans throughout the year, so as to capture months of the year which minimize episodic impacts of pre-operational activities such as seismic airgun surveying and pile driving.
Routing ships to avoid whale strikes (chapter 5) can be similarly viewed as a tradeoff, but is a different problem spatially. A cumulative cost surface is generated from density surface maps and conservation status of cetaceans, before applying as a resistance surface to calculate least-cost routes between start and end locations, i.e. ports and entrance locations to study areas. Varying a multiplier to the cost surface enables calculation of multiple routes with different costs to conservation of cetaceans versus cost to transportation industry, measured as distance. Similar to the siting chapter, a spatial decisions support system enables toggling between the map and tradeoff plot view of proposed routes. The user can also input arbitrary start and end locations to calculate the tradeoff on the fly.
Essential to the input of these decision frameworks are distributions of the species. The two preceding chapters comprise species distribution models from two case study areas, U.S. Atlantic (chapter 2) and British Columbia (chapter 3), predicting presence and density, respectively. Although density is preferred to estimate potential biological removal, per Marine Mammal Protection Act requirements in the U.S., all the necessary parameters, especially distance and angle of observation, are less readily available across publicly mined datasets.
In the case of predicting cetacean presence in the U.S. Atlantic (chapter 2), I extracted datasets from the online OBIS-SEAMAP geo-database, and integrated scientific surveys conducted by ship (n=36) and aircraft (n=16), weighting a Generalized Additive Model by minutes surveyed within space-time grid cells to harmonize effort between the two survey platforms. For each of 16 cetacean species guilds, I predicted the probability of occurrence from static environmental variables (water depth, distance to shore, distance to continental shelf break) and time-varying conditions (monthly sea-surface temperature). To generate maps of presence vs. absence, Receiver Operator Characteristic (ROC) curves were used to define the optimal threshold that minimizes false positive and false negative error rates. I integrated model outputs, including tables (species in guilds, input surveys) and plots (fit of environmental variables, ROC curve), into an online spatial decision support system, allowing for easy navigation of models by taxon, region, season, and data provider.
For predicting cetacean density within the inner waters of British Columbia (chapter 3), I calculated density from systematic, line-transect marine mammal surveys over multiple years and seasons (summer 2004, 2005, 2008, and spring/autumn 2007) conducted by Raincoast Conservation Foundation. Abundance estimates were calculated using two different methods: Conventional Distance Sampling (CDS) and Density Surface Modelling (DSM). CDS generates a single density estimate for each stratum, whereas DSM explicitly models spatial variation and offers potential for greater precision by incorporating environmental predictors. Although DSM yields a more relevant product for the purposes of marine spatial planning, CDS has proven to be useful in cases where there are fewer observations available for seasonal and inter-annual comparison, particularly for the scarcely observed elephant seal. Abundance estimates are provided on a stratum-specific basis. Steller sea lions and harbour seals are further differentiated by ‘hauled out’ and ‘in water’. This analysis updates previous estimates (Williams & Thomas 2007) by including additional years of effort, providing greater spatial precision with the DSM method over CDS, novel reporting for spring and autumn seasons (rather than summer alone), and providing new abundance estimates for Steller sea lion and northern elephant seal. In addition to providing a baseline of marine mammal abundance and distribution, against which future changes can be compared, this information offers the opportunity to assess the risks posed to marine mammals by existing and emerging threats, such as fisheries bycatch, ship strikes, and increased oil spill and ocean noise issues associated with increases of container ship and oil tanker traffic in British Columbia’s continental shelf waters.
Starting with marine animal observations at specific coordinates and times, I combine these data with environmental data, often satellite derived, to produce seascape predictions generalizable in space and time. These habitat-based models enable prediction of encounter rates and, in the case of density surface models, abundance that can then be applied to management scenarios. Specific human activities, OWED and shipping, are then compared within a tradeoff decision support framework, enabling interchangeable map and tradeoff plot views. These products make complex processes transparent for gaming conservation, industry and stakeholders towards optimal marine spatial management, fundamental to the tenets of marine spatial planning, ecosystem-based management and dynamic ocean management.