958 results for Error detection
Abstract:
We present a novel approach for the detection of severe obstructive sleep apnea (OSA) based on patients' voices introducing nonlinear measures to describe sustained speech dynamics. Nonlinear features were combined with state-of-the-art speech recognition systems using statistical modeling techniques (Gaussian mixture models, GMMs) over cepstral parameterization (MFCC) for both continuous and sustained speech. Tests were performed on a database including speech records from both severe OSA and control speakers. A 10 % relative reduction in classification error was obtained for sustained speech when combining MFCC-GMM and nonlinear features, and 33 % when fusing nonlinear features with both sustained and continuous MFCC-GMM. Accuracy reached 88.5 % allowing the system to be used in OSA early detection. Tests showed that nonlinear features and MFCCs are lightly correlated on sustained speech, but uncorrelated on continuous speech. Results also suggest the existence of nonlinear effects in OSA patients' voices, which should be found in continuous speech.
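The relative reductions quoted above are ratios of error rates, not differences in accuracy. The following is a minimal sketch of that arithmetic. The function names are illustrative, and treating the 33 % reduction and the 88.5 % accuracy as describing the same fused system is an assumption made for the example, not something the abstract states.

```python
def relative_error_reduction(baseline_error, new_error):
    """Fraction by which new_error improves on baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


def implied_baseline_error(new_error, relative_reduction):
    """Baseline error that a given relative reduction would start from."""
    return new_error / (1.0 - relative_reduction)


# Assumption: 88.5 % accuracy belongs to the system with the 33 % reduction.
final_error = 1.0 - 0.885
baseline_error = implied_baseline_error(final_error, 0.33)
print(f"implied baseline error: {100 * baseline_error:.1f} %")
```

Under that assumption the baseline error would be roughly 17 %, which shows how a 33 % relative reduction corresponds to only a few points of absolute accuracy.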
Abstract:
This Thesis analyzes the possibilities that speech technologies currently offer for the detection of clinical pathologies associated with the upper airway. The study of speech, which traditionally covers both production and the process by which the message and the signals involved are transformed on their way from the speaker to the listener, offers an alternative route for studying these pathologies. The fact that the emitted signal contains not only this message but also information about the speaker has motivated the development of systems aimed at identifying and verifying speaker identity. This work has recently received a new impetus, turning both towards the characterization of traits common to several speakers and towards the differences between recordings of the same speaker. The former are especially relevant to this Thesis, since such traits could reveal characteristics related to a condition shared by several speakers, independent of their identity. Such is the case addressed in this Thesis, where the traits identified would relate to a particular pathology directly linked to the physical system that shapes speech. The case of Sleep Apnea-Hypopnea Syndrome (SAHS) is paradigmatic. It is a pathology with high prevalence worldwide, which increases with age. Patients with this pathology experience episodes of involuntary cessation of breathing during sleep, which last several seconds and recur throughout the night, preventing proper rest. In obstructive apnea, these episodes are due to the inability to keep a path open through the airway, so that the air flow is interrupted.
At present, these patients are diagnosed through a polysomnographic study, which focuses on the analysis of apnea episodes during sleep and requires the patient to stay in hospital for one night. The complexity and high cost of these procedures, together with growing waiting lists, have shown the need for fast detection techniques which, although they might not reach such high rates, would make it possible to reorganize waiting lists according to the severity of the pathology in each patient. Among others, diagnostic imaging systems and the anthropometric characterization of patients have shown the existence of anatomical patterns that would have a direct influence on speech. Studies of how SAHS affects speech have been scarce, and some of them even contradictory. However, the existence of specific patterns relating to articulation, phonation and resonance has been known since the late 1980s. Their description was difficult to exploit through an automatic recognition system, but it pointed to a link between voice and SAHS. In recent years, automatic processing techniques have enabled the development of automatic systems that can already identify significant differences in the speech of SAHS patients and distinguish them from healthy speakers. By contrast, little is known about the connection between these new results, those obtained in the past, and the pathogenesis of SAHS.
This Thesis continues the work carried out in this field by specifically considering: the study of how SAHS affects patients' speech, the improvement of automatic classification rates, and the combination of the information obtained with the predictors used by clinical specialists in their preliminary assessments. The first two tasks pose problems that are symbiotic but different. While studying the connection between SAHS and speech requires bounded models that can be interpreted easily, recognition systems rely on a large number of dimensions to characterize and subsequently identify patterns. Thus, the first task should allow progress on the second, as should the incorporation of the predictors used by clinical specialists. The Thesis addresses the study of both continuous and sustained speech, in order to exploit the synergies and differences between them. The analysis of continuous speech took as its starting point a scheme that had already been evaluated previously, on which the evaluation and optimization of the speech representation, as well as the characterization of the specific patterns associated with SAHS, were addressed. This revealed the connection between SAHS and the fundamental elements of the voice signal: the formants. The results obtained show that the success of these systems is due, fundamentally, to the ability of these representations to describe those components while leaving out dimensions that are noisy or have little discriminative power. The resulting scheme offers an error rate below 18%, using classifiers notably less complex than those described in the state of the art and a single short voice recording.
Regarding the connection between SAHS and the observed patterns, it was necessary to consider inter- and intra-group differences, focusing on the speaker's characteristic articulation and replacing the complex classification models with the study of spectral averages. The result points clearly towards certain regions of the frequency axis, suggesting a systematic narrowing of the tract cross-section in the oropharyngeal region, already anticipated in the pathogenesis of this syndrome. As for sustained speech, the studies carried out on continuous speech were reproduced on recordings of the sustained vowel /a/. The results are qualitatively analogous to the previous ones, although in this case the classification rates turn out to be lower. To identify the meaning of this result, the study of spectral averages and of inter- and intra-group variability was reproduced. Both studies showed important differences from the previous ones that could explain these results. However, sustained speech offers other opportunities by establishing a controlled setting for the study of phonation, which had also been identified as a source of information for SAHS detection. Its study showed that, in the available data set, there are no variations that could easily be associated with phonation. Only those dimensions that describe the distribution of energy along the frequency axis showed significant differences, pointing once again in the direction of spectral resonances. Having analyzed the above results, the Thesis tackles the fusion of both sources of information in a single classification system. This makes it possible to improve classification rates, under the hypothesis that the information present in continuous speech and in sustained speech is fundamentally different.
This task was carried out through a simple fusion scheme that obtained 88.6% correct classification (11.4% error rate), which represents a significant improvement over the state of the art. Finally, combining this classifier with the predictors used by clinical specialists yielded a rate of 91.3% (8.7% error rate), which lies within the range offered by more costly and intrusive schemes that, unlike the one proposed, cannot be used in the preliminary evaluation of patients. Overall, the Thesis offers a clear view of the relationship between SAHS and speech, showing the degree of maturity reached by speech technology in the characterization and detection of SAHS, making clear that its use for patient evaluation would already be possible, and leaving the door open to future research that continues the work begun here.
ABSTRACT
This Thesis explores the potential of speech technologies for the detection of clinical disorders connected to the upper airway. The study of speech traditionally covers both the production process and the post-processing of the signals involved, from the speaker up to the listener, offering an alternative path to study these pathologies. The fact that utterances embed not just the encoded message but also information about the speaker has motivated the development of automatic systems oriented to the identification and verification of the speaker’s identity. These have recently been boosted and reoriented either towards the characterization of traits that are common to several speakers, or to the differences between records of the same speaker collected under different conditions. The former are particularly relevant to this Thesis, as these patterns could reveal the presence of features that are related to a common condition shared among different speakers, regardless of their identity.
Such is the case faced in this Thesis, where the traits identified would relate to a particular pathology, directly connected to the speech production system. The Obstructive Sleep Apnea syndrome (OSA) is a paradigmatic case for analysis. It is a disorder with high prevalence among adults, affecting a larger number of them as they grow older. Patients suffering from this disorder experience episodes of involuntary cessation of breath during sleep that may last a few seconds and recur throughout the night, preventing proper rest. In the case of obstructive apnea, these episodes are related to the collapse of the pharynx, which interrupts the air flow. Currently, OSA diagnosis is done through a polysomnographic study, which focuses on the analysis of apnea episodes during sleep, requiring the patient to stay at the hospital for the whole night. The complexity and high cost of the procedures involved, combined with the waiting lists, have evidenced the need for screening techniques, which perhaps would not achieve outstanding performance rates but would allow clinicians to reorganize these lists, ranking patients according to the severity of their condition. Among others, imaging diagnosis and anthropometric characterization of patients have evidenced the existence of anatomical patterns related to OSA that have direct influence on speech. Contributions devoted to the study of how this disorder affects speech are scarce and somewhat contradictory. However, the existence of specific patterns related to articulation, phonation and resonance has been known since the late 1980s. At that time these descriptions were virtually useless for the development of an automatic system, but they pointed out the existence of a link between speech and OSA. In recent years automatic processing techniques have evolved and are now able to identify significant differences in the speech of OSA patients when compared to records from healthy subjects.
Nevertheless, little is known about the connection between these new results, those published in the past, and the pathogenesis of the OSA syndrome. This Thesis aims to progress beyond the previous research done in this area by addressing: the study of how OSA affects patients’ speech, the enhancement of automatic OSA classification based on speech analysis, and its integration with the information embedded in the predictors generally used by clinicians in preliminary patients’ examination. The first two tasks, though they may appear symbiotic at first, are quite different. While studying the connection between speech and OSA requires simple, narrow models that can be easily interpreted, classification requires larger models including a large number of dimensions for the characterization and subsequent identification of the observed patterns. In any case, progress made in the first task should allow us to improve our performance on the second one, and the incorporation of the predictors used by clinicians should contribute in this same direction. The Thesis considers both continuous and sustained speech analysis, to exploit the synergies and differences between them. On continuous speech analysis, a conventional speech processing scheme, designed and evaluated before this Thesis, was taken as a baseline. Over this initial system several alternative representations of the speech information were proposed, optimized and tested to select those most suitable for the characterization of OSA-specific patterns. Evidence was found of a connection between OSA and the fundamental constituents of speech: the formants. Experimental results showed that the success of the proposed solution is well explained by the ability of speech representations to describe these specific OSA-related components, ignoring the noisy ones as well as those presenting low discrimination capabilities.
The resulting scheme obtained an 18% error rate, on a classification scheme significantly less complex than those described in the literature and operating on a single speech record. Regarding the connection between OSA and the observed patterns, it was necessary to consider inter- and intra-group differences for this analysis, and to focus on the articulation, replacing the complex classification models by the long-term average spectra. Results clearly point to certain regions on the frequency axis, suggesting the existence of a systematic narrowing in the vocal tract section at the oropharynx. This was already described in the pathogenesis of this syndrome. Regarding sustained speech, experiments similar to those conducted on continuous speech were reproduced on sustained phonations of the vowel /a/. Results were qualitatively similar to the previous ones, though in this case performance rates were found to be noticeably lower. To derive further knowledge from this result, experiments on the long-term average spectra and intra- and inter-group variability ratios were also reproduced on sustained speech records. Results on both experiments showed significant differences from the previous ones obtained from continuous speech, which could explain the differences observed in performance. However, sustained speech also provided the opportunity to study phonation within the controlled framework it provides. Phonation had also been identified in the literature as a source of information for the detection of OSA. In this study it was found that, for the available dataset, no systematic differences related to phonation could be found between the two groups of speakers. Only those dimensions which relate to energy distribution along the frequency axis provided significant differences, pointing once again towards the resonant components.
Once classification schemes on both continuous and sustained speech were developed, the Thesis addressed their combination into a single classification system. Under the assumption that the information in continuous and sustained speech is fundamentally different, it should be possible to successfully merge the two of them. This was tested through a simple fusion scheme which obtained an 88.6% correct classification rate (11.4% error rate), which represents a significant improvement over the state of the art. Finally, the combination of this classifier with the variables used by clinicians obtained a 91.3% accuracy (8.7% error rate). This is within the range of alternative, but costly and intrusive, schemes, which unlike the one proposed cannot be used in the preliminary assessment of patients’ condition. In the end, this Thesis has shed new light on the underlying connection between OSA and speech, and evidenced the degree of maturity reached by speech technology in OSA characterization and detection, leaving the door open for future research in the multiple directions that have been pointed out.
Abstract:
In this letter, we propose a novel method for unsupervised change detection (CD) in multitemporal satellite images that applies the relative dimensionless global error in synthesis index (Erreur Relative Globale Adimensionnelle de Synthèse, ERGAS) locally. To obtain the change image, the index is calculated over a pixel neighborhood (3x3 window), processing all available spectral bands simultaneously. To find the binary change masks, six thresholding methods are selected. A comparison between the proposed method and the change vector analysis method is reported. The CD accuracy shown in the experimental results demonstrates the effectiveness of the proposed method.
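A local ERGAS computation of the kind described above might look like the following sketch. It uses the standard ERGAS definition, 100 · (h/l) · sqrt(mean over bands of RMSE² / mean²), evaluated inside the window. The function names, the choice of the first date as the reference mean, the resolution ratio of 1 for co-registered images, and the single fixed threshold are all assumptions for illustration; the abstract does not name its six thresholding methods.

```python
import math


def local_ergas(img1, img2, row, col, half=1, ratio=1.0):
    """ERGAS between two dates inside a (2*half+1)^2 window at (row, col).

    img1, img2: lists of bands, each band a 2-D list of the same shape.
    ratio: resolution ratio h/l, assumed 1.0 for co-registered images.
    """
    total = 0.0
    for band1, band2 in zip(img1, img2):
        rows = range(max(0, row - half), min(len(band1), row + half + 1))
        cols = range(max(0, col - half), min(len(band1[0]), col + half + 1))
        pairs = [(band1[r][c], band2[r][c]) for r in rows for c in cols]
        mse = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
        mean = sum(a for a, _ in pairs) / len(pairs)  # first date as reference
        total += mse / max(mean * mean, 1e-12)  # guard against a zero mean
    return 100.0 * ratio * math.sqrt(total / len(img1))


def change_mask(img1, img2, threshold):
    """Binary change mask from a fixed threshold (hypothetical choice)."""
    height, width = len(img1[0]), len(img1[0][0])
    return [[local_ergas(img1, img2, r, c) > threshold for c in range(width)]
            for r in range(height)]


# One-band 3x3 toy scene in which only the centre pixel changes.
before = [[[10.0] * 3 for _ in range(3)]]
after = [[[10.0] * 3 for _ in range(3)]]
after[0][1][1] = 19.0
center_value = local_ergas(before, after, 1, 1)
mask = change_mask(before, after, threshold=20.0)
```

Because every window in this 3x3 toy scene contains the changed pixel, the whole mask is flagged; on a real image the response stays confined to the neighborhood of each change.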
Abstract:
The aim of this study was to obtain the exact value of the keratometric index (nkexact) and to clinically validate a variable keratometric index (nkadj) that minimizes this error. Methods: The nkexact value was determined by obtaining differences (ΔPc) between keratometric corneal power (Pk) and Gaussian corneal power (PGauss c) equal to 0. The nkadj was defined as the value associated with an equivalent difference in the magnitude of ΔPc for extreme values of posterior corneal radius (r2c) for each anterior corneal radius value (r1c). This nkadj was considered for the calculation of the adjusted corneal power (Pkadj). Values of r1c ∈ (4.2, 8.5) mm and r2c ∈ (3.1, 8.2) mm were considered. Differences of True Net Power with PGauss c, Pkadj, and Pk(1.3375) were calculated in a clinical sample of 44 eyes with keratoconus. Results: nkexact ranged from 1.3153 to 1.3396 and nkadj from 1.3190 to 1.3339 depending on the eye model analyzed. All the nkadj values adjusted perfectly to 8 linear algorithms. Differences between Pkadj and PGauss c did not exceed ±0.7 D (diopters). Clinically, nk = 1.3375 was not valid in any case. Pkadj differed significantly from both True Net Power and Pk(1.3375) (P < 0.01), whereas no differences were found between PGauss c and Pkadj (P > 0.01). Conclusions: The use of a single value of nk for the calculation of the total corneal power in keratoconus has been shown to be imprecise, leading to inaccuracies in the detection and classification of this corneal condition. Furthermore, our study shows the relevance of corneal thickness in corneal power calculations in keratoconus.
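The condition "difference between Pk and Gaussian power equal to 0" can be illustrated with the standard thick-lens formula. This is a sketch only: the refractive indices (1.376 for the cornea, 1.336 for the aqueous humor), the 0.5 mm corneal thickness, and the example radii are Gullstrand-style assumptions, not values taken from the study, and the function names are hypothetical.

```python
N_CORNEA = 1.376   # assumed corneal refractive index
N_AQUEOUS = 1.336  # assumed aqueous humor refractive index


def keratometric_power(nk, r1_mm):
    """Pk = (nk - 1) / r1c, in diopters (radius given in mm)."""
    return 1000.0 * (nk - 1.0) / r1_mm


def gaussian_power(r1_mm, r2_mm, thickness_mm=0.5):
    """Thick-lens corneal power: P1 + P2 - (e/n) * P1 * P2, in diopters."""
    p1 = 1000.0 * (N_CORNEA - 1.0) / r1_mm
    p2 = 1000.0 * (N_AQUEOUS - N_CORNEA) / r2_mm
    return p1 + p2 - (thickness_mm / 1000.0 / N_CORNEA) * p1 * p2


def exact_keratometric_index(r1_mm, r2_mm, thickness_mm=0.5):
    """Index for which Pk equals the Gaussian power (zero difference)."""
    return 1.0 + (r1_mm / 1000.0) * gaussian_power(r1_mm, r2_mm, thickness_mm)


# Illustrative radii only (r1c = 7.7 mm, r2c = 6.8 mm).
nk_exact = exact_keratometric_index(7.7, 6.8)
print(f"nk_exact = {nk_exact:.4f}")
print(f"Pk(1.3375) - PGauss = "
      f"{keratometric_power(1.3375, 7.7) - gaussian_power(7.7, 6.8):.2f} D")
```

For these assumed radii the classical index 1.3375 overestimates the Gaussian power, which is the kind of discrepancy the variable index is meant to reduce.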
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-04
Abstract:
Thesis (Master's)--University of Washington, 2016-06
Abstract:
This letter presents an analytical model for evaluating the Bit Error Rate (BER) of a Direct Sequence Code Division Multiple Access (DS-CDMA) system, with M-ary orthogonal modulation and noncoherent detection, employing an array antenna operating in a Nakagami fading environment. An expression of the Signal to Interference plus Noise Ratio (SINR) at the output of the receiver is derived, which allows the BER to be evaluated using a closed form expression. The analytical model is validated by comparing the obtained results with simulation results.
Abstract:
This correspondence considers block detection for blind wireless digital transmission. At high signal-to-noise ratio (SNR), block detection errors are primarily due to the received sequence having multiple possible decoded sequences with the same likelihood. We derive analytic expressions for the probability of detection ambiguity written in terms of a Dedekind zeta function, in the zero noise case with large constellations. Expressions are also provided for finite constellations, which can be evaluated efficiently, independent of the block length. Simulations demonstrate that the analytically derived error floors exist at high SNR.
Abstract:
A field study was performed in a hospital pharmacy aimed at identifying positive and negative influences on the process of detection of and further recovery from initial errors or other failures, thus avoiding negative consequences. Confidential reports and follow-up interviews provided data on 31 near-miss incidents involving such recovery processes. Analysis revealed that organizational culture with regard to following procedures needed reinforcement, that some procedures could be improved, that building in extra checks was worthwhile and that supporting unplanned recovery was essential for problems not covered by procedures. Guidance is given on how performance in recovery could be measured. A case is made for supporting recovery as an addition to prevention-based safety methods.
Abstract:
The matched filter detector is well known as the optimum detector for use in communication, as well as in radar systems, for signals corrupted by Additive White Gaussian Noise (A.W.G.N.). Non-coherent F.S.K. and differentially coherent P.S.K. (D.P.S.K.) detection schemes, which employ a new approach in realizing the matched filter processor, are investigated. The new approach utilizes pulse compression techniques, well known in radar systems, to facilitate the implementation of the matched filter in the form of the Pulse Compressor Matched Filter (P.C.M.F.). Both detection schemes feature a mixer-P.C.M.F. compound as their predetector processor. The compound is utilized to convert F.S.K. modulation into pulse position modulation, and P.S.K. modulation into pulse polarity modulation. The mechanisms of both detection schemes are studied through examining the properties of the Autocorrelation function (A.C.F.) at the output of the P.C.M.F. The effects produced by time delay and carrier interference on the output A.C.F. are determined. Work related to the F.S.K. detection scheme is mostly confined to verifying its validity, whereas the D.P.S.K. detection scheme has not been reported before. Consequently, an experimental system was constructed, which utilized combined hardware and software, and operated under the supervision of a microprocessor system. The experimental system was used to develop error-rate models for both detection schemes under investigation. Performances of both F.S.K. and D.P.S.K. detection schemes were established in the presence of A.W.G.N., practical imperfections, time delay, and carrier interference. The results highlight the candidacy of both detection schemes for use in the field of digital data communication and, in particular, the D.P.S.K. detection scheme, which performed very close to optimum in a background of A.W.G.N.
Abstract:
The detection of signals in the presence of noise is one of the most basic and important problems encountered by communication engineers. Although the literature abounds with analyses of communications in Gaussian noise, relatively little work has appeared dealing with communications in non-Gaussian noise. In this thesis several digital communication systems disturbed by non-Gaussian noise are analysed. The thesis is divided into two main parts. In the first part, a filtered-Poisson impulse noise model is utilized to calculate error probability characteristics of a linear receiver operating in additive impulsive noise. Firstly, the effect that non-Gaussian interference has on the performance of a receiver that has been optimized for Gaussian noise is determined. The factors affecting the choice of modulation scheme so as to minimize the detrimental effects of non-Gaussian noise are then discussed. In the second part, a new theoretical model of impulsive noise that fits well with the observed statistics of noise in radio channels below 100 MHz has been developed. This empirical noise model is applied to the detection of known signals in the presence of noise to determine the optimal receiver structure. The performance of such a detector has been assessed and is found to depend on the signal shape, the time-bandwidth product, as well as the signal-to-noise ratio. The optimal signal to minimize the probability of error of the detector is determined. Attention is then turned to the problem of threshold detection. Detector structure, large sample performance and robustness against errors in the detector parameters are examined. Finally, estimators of such parameters as the occurrence of an impulse and the parameters in an empirical noise model are developed for the case of an adaptive system with slowly varying conditions.
Abstract:
We consider the detection of biased information sources in the ubiquitous code-division multiple-access (CDMA) scheme. We propose a simple modification to both the popular single-user matched-filter detector and a recently introduced near-optimal message-passing-based multiuser detector. This modification allows for detecting modulated biased sources directly with no need for source coding. Analytical results and simulations with excellent agreement are provided, demonstrating substantial improvement in bit error rate in comparison with the unmodified detectors and the alternative of source compression. The robustness of error-performance improvement is shown under practical model settings, including bias estimation mismatch and finite-length spreading codes. © 2007 IOP Publishing Ltd.
Abstract:
Purpose To investigate the utility of uncorrected visual acuity measures in screening for refractive error in white school children aged 6-7-years and 12-13-years. Methods The Northern Ireland Childhood Errors of Refraction (NICER) study used a stratified random cluster design to recruit children from schools in Northern Ireland. Detailed eye examinations included assessment of logMAR visual acuity and cycloplegic autorefraction. Spherical equivalent refractive data from the right eye were used to classify significant refractive error as myopia of at least 1DS, hyperopia as greater than +3.50DS and astigmatism as greater than 1.50DC, whether it occurred in isolation or in association with myopia or hyperopia. Results Results are presented from 661 white 12-13-year-old and 392 white 6-7-year-old school-children. Using a cut-off of uncorrected visual acuity poorer than 0.20 logMAR to detect significant refractive error gave a sensitivity of 50% and specificity of 92% in 6-7-year-olds and 73% and 93% respectively in 12-13-year-olds. In 12-13-year-old children a cut-off of poorer than 0.20 logMAR had a sensitivity of 92% and a specificity of 91% in detecting myopia and a sensitivity of 41% and a specificity of 84% in detecting hyperopia. Conclusions Vision screening using logMAR acuity can reliably detect myopia, but not hyperopia or astigmatism in school-age children. Providers of vision screening programs should be cognisant that where detection of uncorrected hyperopic and/or astigmatic refractive error is an aspiration, current UK protocols will not effectively deliver.
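The sensitivity and specificity figures above follow from applying the 0.20 logMAR cut-off to each child and comparing the outcome with the cycloplegic classification. The following minimal sketch shows the calculation; the function names and the five sample values are invented for illustration and are not NICER data.

```python
def screen_positive(uncorrected_logmar, cutoff=0.20):
    """A child fails screening when acuity is poorer (higher) than the cut-off."""
    return uncorrected_logmar > cutoff


def sensitivity_specificity(acuities, has_refractive_error, cutoff=0.20):
    """Return (sensitivity, specificity) of the acuity cut-off."""
    tp = fn = tn = fp = 0
    for logmar, affected in zip(acuities, has_refractive_error):
        flagged = screen_positive(logmar, cutoff)
        if affected and flagged:
            tp += 1
        elif affected:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical sample: three children with refractive error, two without.
acuities = [0.30, 0.10, 0.25, 0.00, 0.22]
has_error = [True, True, False, False, True]
sens, spec = sensitivity_specificity(acuities, has_error)
```

A low sensitivity, as reported for hyperopia, means many affected children still read better than the cut-off and pass the screening.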
Abstract:
This thesis considers two basic aspects of impact damage in composite materials, namely damage severity discrimination and impact damage location by using Acoustic Emissions (AE) and Artificial Neural Networks (ANNs). The experimental work embodies a study of such factors as the application of AE as Non-destructive Damage Testing (NDT), and the evaluation of ANN modelling. ANNs, however, played an important role in modelling implementation. In the first aspect of the study, different impact energies were used to produce different levels of damage in two composite materials (T300/914 and T800/5245). The impacts were detected by their acoustic emissions (AE). The AE waveform signals were analysed and modelled using a Back Propagation (BP) neural network model. The Mean Square Error (MSE) from the output was then used as a damage indicator in the damage severity discrimination study. To evaluate the ANN model, a comparison was made of the correlation coefficients of different parameters, such as MSE, AE energy, AE counts, etc. MSE gave the best correlation of the parameters compared. In the second aspect, a new artificial neural network model was developed to provide impact damage location on a quasi-isotropic composite panel. It was successfully trained to locate impact sites by correlating the arrival time differences of AE signals at transducers located on the panel with the impact site coordinates. The performance of the ANN model, which was evaluated by calculating the distance deviation between model output and real location coordinates, supports the application of ANN as an impact damage location identifier. In the study, the accuracy of location prediction decreased when approaching the central area of the panel. Further investigation indicated that this is due to the small arrival time differences, which degrade the performance of ANN prediction.
This research suggested increasing the number of processing neurons in the ANNs as a practical solution.
Abstract:
Few-mode fiber transmission systems are typically impaired by mode-dependent loss (MDL). In an MDL-impaired link, maximum-likelihood (ML) detection yields a significant advantage in system performance compared to linear equalizers, such as zero-forcing and minimum-mean square error equalizers. However, the computational effort of the ML detection increases exponentially with the number of modes and the cardinality of the constellation. We present two methods that allow for near-ML performance without being afflicted with the enormous computational complexity of ML detection: improved reduced-search ML detection and sphere decoding. Both algorithms are tested regarding their performance and computational complexity in simulations of three and six spatial modes with QPSK and 16QAM constellations.
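The exponential cost of ML detection mentioned above comes from testing every combination of constellation points across all modes. The following is a brute-force sketch of that search, not the reduced-search or sphere-decoding algorithms of the paper; the channel matrix, the noise offset, and the function names are illustrative assumptions.

```python
import itertools

# Unnormalized QPSK constellation (four points).
QPSK = [complex(a, b) for a in (-1, 1) for b in (-1, 1)]


def apply_channel(H, x):
    """Multiply channel matrix H (list of rows) by symbol vector x."""
    return [sum(h * s for h, s in zip(row, x)) for row in H]


def ml_detect(H, y, constellation=QPSK):
    """Exhaustive ML detection: minimize ||y - Hx||^2 over all candidates."""
    n_modes = len(H[0])
    best, best_metric = None, float("inf")
    for cand in itertools.product(constellation, repeat=n_modes):
        metric = sum(abs(yi - ri) ** 2
                     for yi, ri in zip(y, apply_channel(H, cand)))
        if metric < best_metric:
            best, best_metric = cand, metric
    return list(best)


# Hypothetical 3-mode channel with unequal gains (a crude stand-in for MDL).
H = [[1.00, 0.20, 0.10],
     [0.10, 0.60, 0.20],
     [0.05, 0.10, 0.30]]
sent = [QPSK[0], QPSK[3], QPSK[1]]
received = [v + 0.01 for v in apply_channel(H, sent)]  # small fixed offset
detected = ml_detect(H, received)

candidates_qpsk_3_modes = len(QPSK) ** 3  # 64 candidates
candidates_16qam_6_modes = 16 ** 6        # 16,777,216 candidates
```

Three modes with QPSK need only 64 metric evaluations per received vector, while six modes with 16QAM need more than 16 million, which is why reduced-complexity detectors are attractive.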