900 resultados para Detection sensitivity
Resumo:
For the first time in human history, large volumes of spoken audio are being broadcast, made available on the internet, archived, and monitored for surveillance every day. New technologies are urgently required to unlock these vast and powerful stores of information. Spoken Term Detection (STD) systems provide access to speech collections by detecting individual occurrences of specified search terms. The aim of this work is to develop improved STD solutions based on phonetic indexing. In particular, this work aims to develop phonetic STD systems for applications that require open-vocabulary search, fast indexing and search speeds, and accurate term detection. Within this scope, novel contributions are made within two research themes, that is, accommodating phone recognition errors and, secondly, modelling uncertainty with probabilistic scores. A state-of-the-art Dynamic Match Lattice Spotting (DMLS) system is used to address the problem of accommodating phone recognition errors with approximate phone sequence matching. Extensive experimentation on the use of DMLS is carried out and a number of novel enhancements are developed that provide for faster indexing, faster search, and improved accuracy. Firstly, a novel comparison of methods for deriving a phone error cost model is presented to improve STD accuracy, resulting in up to a 33% improvement in the Figure of Merit. A method is also presented for drastically increasing the speed of DMLS search by at least an order of magnitude with no loss in search accuracy. An investigation is then presented of the effects of increasing indexing speed for DMLS, by using simpler modelling during phone decoding, with results highlighting the trade-off between indexing speed, search speed and search accuracy. The Figure of Merit is further improved by up to 25% using a novel proposal to utilise word-level language modelling during DMLS indexing. Analysis shows that this use of language modelling can, however, be unhelpful or even disadvantageous for terms with a very low language model probability. The DMLS approach to STD involves generating an index of phone sequences using phone recognition. An alternative approach to phonetic STD is also investigated that instead indexes probabilistic acoustic scores in the form of a posterior-feature matrix. A state-of-the-art system is described and its use for STD is explored through several experiments on spontaneous conversational telephone speech. A novel technique and framework is proposed for discriminatively training such a system to directly maximise the Figure of Merit. This results in a 13% improvement in the Figure of Merit on held-out data. The framework is also found to be particularly useful for index compression in conjunction with the proposed optimisation technique, providing for a substantial index compression factor in addition to an overall gain in the Figure of Merit. These contributions significantly advance the state-of-the-art in phonetic STD, by improving the utility of such systems in a wide range of applications.
Resumo:
Double-stranded RNA species ranging in molecular weight from 0.95 to 6.3 × 106 were detected in grapevines in New York. We recently showed that two of the species (Mr = 5.3 and 4.4 × 106) are associated with rupestris stem pitting disease. In this report, we show that the other eight detectable dsRNA species are associated with the powdery mildew fungus, Uncinula necator. These dsRNAs associated with the powdery mildew fungus were previously detected in leaves and epidermal stem tissue of grapevines infected with powdery mildew. The same dsRNA species were also detected from extracts of isolated cleistothecia and conidia of U. necator devoid of plant tissue. Isometric and rigid rodlike particles were observed in single cleistothecia preparations when examined under transmission electron microscopy.
Resumo:
Rupestris stem pitting (rSP), a graft-transmissible grapevine disease, can be identified only by its reaction (pitted wood) on inoculated Vitis rupestris ‘St. George.’ DsRNA was extracted from grapevines from California and Canada that indexed positive for rSP on St. George. Two distinct dsRNA species (B and C) (Mr = 5.3 × 106 and 4.4 × 106, respectively) were detected from the stem tissue of rSP-positive samples. Although similar dsRNA species (B and C) were detected in extracts of grapevines from New York, the association of dsRNA B and C with rSP in New York samples was not consistent. Also, eight different dsRNAs, known to be associated with the powdery mildew fungus, Uncinula necator, were detected in leaves of New York samples. In New York, the dsRNAs were not observed in leaves or stem samples collected from June through late August during the 1988 and 1989 growing seasons, suggesting that dsRNA detection in the grape tissue is variable throughout the season. We suggest that dsRNA species B and C are associated with rSP disease. The inconsistent results with New York samples are discussed.
Resumo:
Purpose. The objective of this study was to explore the discriminative capacity of non-contact corneal esthesiometry (NCCE) when compared with the neuropathy disability score (NDS) score—a validated, standard method of diagnosing clinically significant diabetic neuropathy. Methods. Eighty-one participants with type 2 diabetes, no history of ocular disease, trauma, or surgery and no history of systemic disease that may affect the cornea were enrolled. Participants were ineligible if there was history of neuropathy due to non-diabetic cause or current diabetic foot ulcer or infection. Corneal sensitivity threshold was measured on the eye of dominant hand side at a distance of 10 mm from the center of the cornea using a stimulus duration of 0.9 s. The NDS was measured producing a score ranging from 0 to 10. To determine the optimal cutoff point of corneal sensitivity that identified the presence of neuropathy (diagnosed by NDS), the Youden index and “closest-to-(0,1)” criteria were used. Results. The receiver-operator characteristic curve for NCCE for the presence of neuropathy (NDS ≥3) had an area under the curve of 0.73 (p = 0.001) and, for the presence of moderate neuropathy (NDS ≥6), area of 0.71 (p = 0.003). By using the Youden index, for an NDS ≥3, the sensitivity of NCCE was 70% and specificity was 75%, and a corneal sensitivity threshold of 0.66 mbar or higher indicated the presence of neuropathy. When NDS ≥6 (indicating risk of foot ulceration) was applied, the sensitivity was 52% with a specificity of 85%. Conclusions. NCCE is a sensitive test for the diagnosis of minimal and more advanced diabetic neuropathy and may serve as a useful surrogate marker for diabetic and perhaps other neuropathies.
Resumo:
Corneal-height data are typically measured with videokeratoscopes and modeled using a set of orthogonal Zernike polynomials. We address the estimation of the number of Zernike polynomials, which is formalized as a model-order selection problem in linear regression. Classical information-theoretic criteria tend to overestimate the corneal surface due to the weakness of their penalty functions, while bootstrap-based techniques tend to underestimate the surface or require extensive processing. In this paper, we propose to use the efficient detection criterion (EDC), which has the same general form of information-theoretic-based criteria, as an alternative to estimating the optimal number of Zernike polynomials. We first show, via simulations, that the EDC outperforms a large number of information-theoretic criteria and resampling-based techniques. We then illustrate that using the EDC for real corneas results in models that are in closer agreement with clinical expectations and provides means for distinguishing normal corneal surfaces from astigmatic and keratoconic surfaces.
Resumo:
For several reasons, the Fourier phase domain is less favored than the magnitude domain in signal processing and modeling of speech. To correctly analyze the phase, several factors must be considered and compensated, including the effect of the step size, windowing function and other processing parameters. Building on a review of these factors, this paper investigates a spectral representation based on the Instantaneous Frequency Deviation, but in which the step size between processing frames is used in calculating phase changes, rather than the traditional single sample interval. Reflecting these longer intervals, the term delta-phase spectrum is used to distinguish this from instantaneous derivatives. Experiments show that mel-frequency cepstral coefficients features derived from the delta-phase spectrum (termed Mel-Frequency delta-phase features) can produce broadly similar performance to equivalent magnitude domain features for both voice activity detection and speaker recognition tasks. Further, it is shown that the fusion of the magnitude and phase representations yields performance benefits over either in isolation.
Resumo:
This paper presents a method of voice activity detection (VAD) suitable for high noise scenarios, based on the fusion of two complementary systems. The first system uses a proposed non-Gaussianity score (NGS) feature based on normal probability testing. The second system employs a histogram distance score (HDS) feature that detects changes in the signal through conducting a template-based similarity measure between adjacent frames. The decision outputs by the two systems are then merged using an open-by-reconstruction fusion stage. Accuracy of the proposed method was compared to several baseline VAD methods on a database created using real recordings of a variety of high-noise environments.
Resumo:
In automatic facial expression detection, very accurate registration is desired which can be achieved via a deformable model approach where a dense mesh of 60-70 points on the face is used, such as an active appearance model (AAM). However, for applications where manually labeling frames is prohibitive, AAMs do not work well as they do not generalize well to unseen subjects. As such, a more coarse approach is taken for person-independent facial expression detection, where just a couple of key features (such as face and eyes) are tracked using a Viola-Jones type approach. The tracked image is normally post-processed to encode for shift and illumination invariance using a linear bank of filters. Recently, it was shown that this preprocessing step is of no benefit when close to ideal registration has been obtained. In this paper, we present a system based on the Constrained Local Model (CLM) which is a generic or person-independent face alignment algorithm which gains high accuracy. We show these results against the LBP feature extraction on the CK+ and GEMEP datasets.
Resumo:
Methicillin-resistant Staphylococcus Aureus (MRSA) is a pathogen that continues to be of major concern in hospitals. We develop models and computational schemes based on observed weekly incidence data to estimate MRSA transmission parameters. We extend the deterministic model of McBryde, Pettitt, and McElwain (2007, Journal of Theoretical Biology 245, 470–481) involving an underlying population of MRSA colonized patients and health-care workers that describes, among other processes, transmission between uncolonized patients and colonized health-care workers and vice versa. We develop new bivariate and trivariate Markov models to include incidence so that estimated transmission rates can be based directly on new colonizations rather than indirectly on prevalence. Imperfect sensitivity of pathogen detection is modeled using a hidden Markov process. The advantages of our approach include (i) a discrete valued assumption for the number of colonized health-care workers, (ii) two transmission parameters can be incorporated into the likelihood, (iii) the likelihood depends on the number of new cases to improve precision of inference, (iv) individual patient records are not required, and (v) the possibility of imperfect detection of colonization is incorporated. We compare our approach with that used by McBryde et al. (2007) based on an approximation that eliminates the health-care workers from the model, uses Markov chain Monte Carlo and individual patient data. We apply these models to MRSA colonization data collected in a small intensive care unit at the Princess Alexandra Hospital, Brisbane, Australia.
Resumo:
This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second zone system uses a novel combination of cross-correlation and zero-crossing rate of the normalised autocorrelation to approximate a measure of signal pitch and periodicity (CrossCorr) that is hypothesised to be noise robust. The score outputs by the two systems are then merged using weighted sum fusion to create the proposed autocorrelation zero-crossing rate (AZR) VAD. Accuracy of AZR was compared to state of the art and standardised VAD methods and was shown to outperform the best performing system with an average relative improvement of 24.8% in half-total error rate (HTER) on the QUT-NOISE-TIMIT database created using real recordings from high-noise environments.
Resumo:
A recent advance in biosecurity surveillance design aims to benefit island conservation through early and improved detection of incursions by non-indigenous species. The novel aspects of the design are that it achieves a specified power of detection in a cost-managed system, while acknowledging heterogeneity of risk in the study area and stratifying the area to target surveillance deployment. The design also utilises a variety of surveillance system components, such as formal scientific surveys, trapping methods, and incidental sightings by non-biologist observers. These advances in design were applied to black rats (Rattus rattus) representing the group of invasive rats including R. norvegicus, and R. exulans, which are potential threats to Barrow Island, Australia, a high value conservation nature reserve where a proposed liquefied natural gas development is a potential source of incursions. Rats are important to consider as they are prevalent invaders worldwide, difficult to detect early when present in low numbers, and able to spread and establish relatively quickly after arrival. The ‘exemplar’ design for the black rat is then applied in a manner that enables the detection of a range of non-indigenous species of rat that could potentially be introduced. Many of the design decisions were based on expert opinion as data gaps exist in empirical data. The surveillance system was able to take into account factors such as collateral effects on native species, the availability of limited resources on an offshore island, financial costs, demands on expertise and other logistical constraints. We demonstrate the flexibility and robustness of the surveillance system and discuss how it could be updated as empirical data are collected to supplement expert opinion and provide a basis for adaptive management. Overall, the surveillance system promotes an efficient use of resources while providing defined power to detect early rat incursions, translating to reduced environmental, resourcing and financial costs.
Resumo:
As organizations reach to higher levels of business process management maturity, they often find themselves maintaining repositories of hundreds or even thousands of process models, representing valuable knowledge about their operations. Over time, process model repositories tend to accumulate duplicate fragments (also called clones) as new process models are created or extended by copying and merging fragments from other models. This calls for methods to detect clones in process models, so that these clones can be refactored as separate subprocesses in order to improve maintainability. This paper presents an indexing structure to support the fast detection of clones in large process model repositories. The proposed index is based on a novel combination of a method for process model decomposition (specifically the Refined Process Structure Tree), with established graph canonization and string matching techniques. Experiments show that the algorithm scales to repositories with hundreds of models. The experimental results also show that a significant number of non-trivial clones can be found in process model repositories taken from industrial practice.
Resumo:
This work proposes to improve spoken term detection (STD) accuracy by optimising the Figure of Merit (FOM). In this article, the index takes the form of phonetic posterior-feature matrix. Accuracy is improved by formulating STD as a discriminative training problem and directly optimising the FOM, through its use as an objective function to train a transformation of the index. The outcome of indexing is then a matrix of enhanced posterior-features that are directly tailored for the STD task. The technique is shown to improve the FOM by up to 13% on held-out data. Additional analysis explores the effect of the technique on phone recognition accuracy, examines the actual values of the learned transform, and demonstrates that using an extended training data set results in further improvement in the FOM.
Resumo:
Detection of Region of Interest (ROI) in a video leads to more efficient utilization of bandwidth. This is because any ROIs in a given frame can be encoded in higher quality than the rest of that frame, with little or no degradation of quality from the perception of the viewers. Consequently, it is not necessary to uniformly encode the whole video in high quality. One approach to determine ROIs is to use saliency detectors to locate salient regions. This paper proposes a methodology for obtaining ground truth saliency maps to measure the effectiveness of ROI detection by considering the role of user experience during the labelling process of such maps. User perceptions can be captured and incorporated into the definition of salience in a particular video, taking advantage of human visual recall within a given context. Experiments with two state-of-the-art saliency detectors validate the effectiveness of this approach to validating visual saliency in video. This paper will provide the relevant datasets associated with the experiments.