961 resultados para automatic speech recognition
Resumo:
The identification of people by measuring some traits of individual anatomy or physiology has led to a specific research area called biometric recognition. This thesis is focused on improving fingerprint recognition systems considering three important problems: fingerprint enhancement, fingerprint orientation extraction and automatic evaluation of fingerprint algorithms. An effective extraction of salient fingerprint features depends on the quality of the input fingerprint. If the fingerprint is very noisy, we are not able to detect a reliable set of features. A new fingerprint enhancement method, which is both iterative and contextual, is proposed. This approach detects high-quality regions in fingerprints, selectively applies contextual filtering and iteratively expands like wildfire toward low-quality ones. A precise estimation of the orientation field would greatly simplify the estimation of other fingerprint features (singular points, minutiae) and improve the performance of a fingerprint recognition system. The fingerprint orientation extraction is improved following two directions. First, after the introduction of a new taxonomy of fingerprint orientation extraction methods, several variants of baseline methods are implemented and, pointing out the role of pre- and post- processing, we show how to improve the extraction. Second, the introduction of a new hybrid orientation extraction method, which follows an adaptive scheme, allows to improve significantly the orientation extraction in noisy fingerprints. Scientific papers typically propose recognition systems that integrate many modules and therefore an automatic evaluation of fingerprint algorithms is needed to isolate the contributions that determine an actual progress in the state-of-the-art. The lack of a publicly available framework to compare fingerprint orientation extraction algorithms, motivates the introduction of a new benchmark area called FOE (including fingerprints and manually-marked orientation ground-truth) along with fingerprint matching benchmarks in the FVC-onGoing framework. The success of such framework is discussed by providing relevant statistics: more than 1450 algorithms submitted and two international competitions.
Resumo:
In this paper, the fusion of probabilistic knowledge-based classification rules and learning automata theory is proposed and as a result we present a set of probabilistic classification rules with self-learning capability. The probabilities of the classification rules change dynamically guided by a supervised reinforcement process aimed at obtaining an optimum classification accuracy. This novel classifier is applied to the automatic recognition of digital images corresponding to visual landmarks for the autonomous navigation of an unmanned aerial vehicle (UAV) developed by the authors. The classification accuracy of the proposed classifier and its comparison with well-established pattern recognition methods is finally reported.
Resumo:
La presente Tesis analiza las posibilidades que ofrecen en la actualidad las tecnologías del habla para la detección de patologías clínicas asociadas a la vía aérea superior. El estudio del habla que tradicionalmente cubre tanto la producción como el proceso de transformación del mensaje y las señales involucradas, desde el emisor hasta alcanzar al receptor, ofrece una vía de estudio alternativa para estas patologías. El hecho de que la señal emitida no solo contiene este mensaje, sino también información acerca del locutor, ha motivado el desarrollo de sistemas orientados a la identificación y verificación de la identidad de los locutores. Estos trabajos han recibido recientemente un nuevo impulso, orientándose tanto hacia la caracterización de rasgos que son comunes a varios locutores, como a las diferencias existentes entre grabaciones de un mismo locutor. Los primeros resultan especialmente relevantes para esta Tesis dado que estos rasgos podrían evidenciar la presencia de características relacionadas con una cierta condición común a varios locutores, independiente de su identidad. Tal es el caso que se enfrenta en esta Tesis, donde los rasgos identificados se relacionarían con una de la patología particular y directamente vinculada con el sistema de físico de conformación del habla. El caso del Síndrome de Apneas Hipopneas durante el Sueno (SAHS) resulta paradigmático. Se trata de una patología con una elevada prevalencia mundo, que aumenta con la edad. Los pacientes de esta patología experimentan episodios de cese involuntario de la respiración durante el sueño, que se prolongan durante varios segundos y que se reproducen a lo largo de la noche impidiendo el correcto descanso. En el caso de la apnea obstructiva, estos episodios se deben a la imposibilidad de mantener un camino abierto a través de la vía aérea, de forma que el flujo de aire se ve interrumpido. En la actualidad, el diagnostico de estos pacientes se realiza a través de un estudio polisomnográfico, que se centra en el análisis de los episodios de apnea durante el sueño, requiriendo que el paciente permanezca en el hospital durante una noche. La complejidad y el elevado coste de estos procedimientos, unidos a las crecientes listas de espera, han evidenciado la necesidad de contar con técnicas rápidas de detección, que si bien podrían no obtener tasas tan elevadas, permitirían reorganizar las listas de espera en función del grado de severidad de la patología en cada paciente. Entre otros, los sistemas de diagnostico por imagen, así como la caracterización antropométrica de los pacientes, han evidenciado la existencia de patrones anatómicos que tendrían influencia directa sobre el habla. Los trabajos dedicados al estudio del SAHS en lo relativo a como esta afecta al habla han sido escasos y algunos de ellos incluso contradictorios. Sin embargo, desde finales de la década de 1980 se conoce la existencia de patrones específicos relativos a la articulación, la fonación y la resonancia. Sin embargo, su descripción resultaba difícilmente aprovechable a través de un sistema de reconocimiento automático, pero apuntaba la existencia de un nexo entre voz y SAHS. En los últimos anos las técnicas de procesado automático han permitido el desarrollo de sistemas automáticos que ya son capaces de identificar diferencias significativas en el habla de los pacientes del SAHS, y que los distinguen de los locutores sanos. Por contra, poco se conoce acerca de la conexión entre estos nuevos resultados, los sé que habían obtenido en el pasado y la patogénesis del SAHS. Esta Tesis continua la labor desarrollada en este ámbito considerando específicamente: el estudio de la forma en que el SAHS afecta el habla de los pacientes, la mejora en las tasas de clasificación automática y la combinación de la información obtenida con los predictores utilizados por los especialistas clínicos en sus evaluaciones preliminares. Las dos primeras tareas plantean problemas simbióticos, pero diferentes. Mientras el estudio de la conexión entre el SAHS y el habla requiere de modelos acotados que puedan ser interpretados con facilidad, los sistemas de reconocimiento se sirven de un elevado número de dimensiones para la caracterización y posterior identificación de patrones. Así, la primera tarea debe permitirnos avanzar en la segunda, al igual que la incorporación de los predictores utilizados por los especialistas clínicos. La Tesis aborda el estudio tanto del habla continua como del habla sostenida, con el fin de aprovechar las sinergias y diferencias existentes entre ambas. En el análisis del habla continua se tomo como punto de partida un esquema que ya fue evaluado con anterioridad, y sobre el cual se ha tratado la evaluación y optimización de la representación del habla, así como la caracterización de los patrones específicos asociados al SAHS. Ello ha evidenciado la conexión entre el SAHS y los elementos fundamentales de la señal de voz: los formantes. Los resultados obtenidos demuestran que el éxito de estos sistemas se debe, fundamentalmente, a la capacidad de estas representaciones para describir dichas componentes, obviando las dimensiones ruidosas o con poca capacidad discriminativa. El esquema resultante ofrece una tasa de error por debajo del 18%, sirviéndose de clasificadores notablemente menos complejos que los descritos en el estado del arte y de una única grabación de voz de corta duración. En relación a la conexión entre el SAHS y los patrones observados, fue necesario considerar las diferencias inter- e intra-grupo, centrándonos en la articulación característica del locutor, sustituyendo los complejos modelos de clasificación por el estudio de los promedios espectrales. El resultado apunta con claridad hacia ciertas regiones del eje de frecuencias, sugiriendo la existencia de un estrechamiento sistemático en la sección del tracto en la región de la orofaringe, ya prevista en la patogénesis de este síndrome. En cuanto al habla sostenida, se han reproducido los estudios realizados sobre el habla continua en grabaciones de la vocal /a/ sostenida. Los resultados son cualitativamente análogos a los anteriores, si bien en este caso las tasas de clasificación resultan ser más bajas. Con el objetivo de identificar el sentido de este resultado se reprodujo el estudio de los promedios espectrales y de la variabilidad inter e intra-grupo. Ambos estudios mostraron importantes diferencias con los anteriores que podrían explicar estos resultados. Sin embargo, el habla sostenida ofrece otras oportunidades al establecer un entorno controlado para el estudio de la fonación, que también había sido identificada como una fuente de información para la detección del SAHS. De su estudio se pudo observar que, en el conjunto de datos disponibles, no existen variaciones que pudieran asociarse fácilmente con la fonación. Únicamente aquellas dimensiones que describen la distribución de energía a lo largo del eje de frecuencia evidenciaron diferencias significativas, apuntando, una vez más, en la dirección de las resonancias espectrales. Analizados los resultados anteriores, la Tesis afronta la fusión de ambas fuentes de información en un único sistema de clasificación. Con ello es posible mejorar las tasas de clasificación, bajo la hipótesis de que la información presente en el habla continua y el habla sostenida es fundamentalmente distinta. Esta tarea se realizo a través de un sencillo esquema de fusión que obtuvo un 88.6% de aciertos en clasificación (tasa de error del 11.4%), lo que representa una mejora significativa respecto al estado del arte. Finalmente, la combinación de este clasificador con los predictores utilizados por los especialistas clínicos ofreció una tasa del 91.3% (tasa de error de 8.7%), que se encuentra dentro del margen ofrecido por esquemas más costosos e intrusivos, y que a diferencia del propuesto, no pueden ser utilizados en la evaluación previa de los pacientes. Con todo, la Tesis ofrece una visión clara sobre la relación entre el SAHS y el habla, evidenciando el grado de madurez alcanzado por la tecnología del habla en la caracterización y detección del SAHS, poniendo de manifiesto que su uso para la evaluación de los pacientes ya sería posible, y dejando la puerta abierta a futuras investigaciones que continúen el trabajo aquí iniciado. ABSTRACT This Thesis explores the potential of speech technologies for the detection of clinical disorders connected to the upper airway. The study of speech traditionally covers both the production process and post processing of the signals involved, from the speaker up to the listener, offering an alternative path to study these pathologies. The fact that utterances embed not just the encoded message but also information about the speaker, has motivated the development of automatic systems oriented to the identification and verificaton the speaker’s identity. These have recently been boosted and reoriented either towards the characterization of traits that are common to several speakers, or to the differences between records of the same speaker collected under different conditions. The first are particularly relevant to this Thesis as these patterns could reveal the presence of features that are related to a common condition shared among different speakers, regardless of their identity. Such is the case faced in this Thesis, where the traits identified would relate to a particular pathology, directly connected to the speech production system. The Obstructive Sleep Apnea syndrome (OSA) is a paradigmatic case for analysis. It is a disorder with high prevalence among adults and affecting a larger number of them as they grow older. Patients suffering from this disorder experience episodes of involuntary cessation of breath during sleep that may last a few seconds and reproduce throughout the night, preventing proper rest. In the case of obstructive apnea, these episodes are related to the collapse of the pharynx, which interrupts the air flow. Currently, OSA diagnosis is done through a polysomnographic study, which focuses on the analysis of apnea episodes during sleep, requiring the patient to stay at the hospital for the whole night. The complexity and high cost of the procedures involved, combined with the waiting lists, have evidenced the need for screening techniques, which perhaps would not achieve outstanding performance rates but would allow clinicians to reorganize these lists ranking patients according to the severity of their condition. Among others, imaging diagnosis and anthropometric characterization of patients have evidenced the existence of anatomical patterns related to OSA that have direct influence on speech. Contributions devoted to the study of how this disorder affects scpeech are scarce and somehow contradictory. However, since the late 1980s the existence of specific patterns related to articulation, phonation and resonance is known. By that time these descriptions were virtually useless when coming to the development of an automatic system, but pointed out the existence of a link between speech and OSA. In recent years automatic processing techniques have evolved and are now able to identify significant differences in the speech of OSAS patients when compared to records from healthy subjects. Nevertheless, little is known about the connection between these new results with those published in the past and the pathogenesis of the OSA syndrome. This Thesis is aimed to progress beyond the previous research done in this area by addressing: the study of how OSA affects patients’ speech, the enhancement of automatic OSA classification based on speech analysis, and its integration with the information embedded in the predictors generally used by clinicians in preliminary patients’ examination. The first two tasks, though may appear symbiotic at first, are quite different. While studying the connection between speech and OSA requires simple narrow models that can be easily interpreted, classification requires larger models including a large number dimensions for the characterization and posterior identification of the observed patterns. Anyhow, it is clear that any progress made in the first task should allow us to improve our performance on the second one, and that the incorporation of the predictors used by clinicians shall contribute in this same direction. The Thesis considers both continuous and sustained speech analysis, to exploit the synergies and differences between them. On continuous speech analysis, a conventional speech processing scheme, designed and evaluated before this Thesis, was taken as a baseline. Over this initial system several alternative representations of the speech information were proposed, optimized and tested to select those more suitable for the characterization of OSA-specific patterns. Evidences were found on the existence of a connection between OSA and the fundamental constituents of the speech: the formants. Experimental results proved that the success of the proposed solution is well explained by the ability of speech representations to describe these specific OSA-related components, ignoring the noisy ones as well those presenting low discrimination capabilities. The resulting scheme obtained a 18% error rate, on a classification scheme significantly less complex than those described in the literature and operating on a single speech record. Regarding the connection between OSA and the observed patterns, it was necessary to consider inter-and intra-group differences for this analysis, and to focus on the articulation, replacing the complex classification models by the long-term average spectra. Results clearly point to certain regions on the frequency axis, suggesting the existence of a systematic narrowing in the vocal tract section at the oropharynx. This was already described in the pathogenesis of this syndrome. Regarding sustained speech, similar experiments as those conducted on continuous speech were reproduced on sustained phonations of vowel / a /. Results were qualitatively similar to the previous ones, though in this case perfomance rates were found to be noticeably lower. Trying to derive further knowledge from this result, experiments on the long-term average spectra and intraand inter-group variability ratios were also reproduced on sustained speech records. Results on both experiments showed significant differences from the previous ones obtained from continuous speech which could explain the differences observed on peformance. However, sustained speech also provided the opportunity to study phonation within the controlled framework it provides. This was also identified in the literature as a source of information for the detection of OSA. In this study it was found that, for the available dataset, no sistematic differences related to phonation could be found between the two groups of speakers. Only those dimensions which relate energy distribution along the frequency axis provided significant differences, pointing once again towards the direction of resonant components. Once classification schemes on both continuous and sustained speech were developed, the Thesis addressed their combination into a single classification system. Under the assumption that the information in continuous and sustained speech is fundamentally different, it should be possible to successfully merge the two of them. This was tested through a simple fusion scheme which obtained a 88.6% correct classification (11.4% error rate), which represents a significant improvement over the state of the art. Finally, the combination of this classifier with the variables used by clinicians obtained a 91.3% accuracy (8.7% error rate). This is within the range of alternative, but costly and intrusive schemes, which unlike the one proposed can not be used in the preliminary assessment of patients’ condition. In the end, this Thesis has shed new light on the underlying connection between OSA and speech, and evidenced the degree of maturity reached by speech technology on OSA characterization and detection, leaving the door open for future research which shall continue in the multiple directions that have been pointed out and left as future work.
Resumo:
La diabetes comprende un conjunto de enfermedades metabólicas que se caracterizan por concentraciones de glucosa en sangre anormalmente altas. En el caso de la diabetes tipo 1 (T1D, por sus siglas en inglés), esta situación es debida a una ausencia total de secreción endógena de insulina, lo que impide a la mayoría de tejidos usar la glucosa. En tales circunstancias, se hace necesario el suministro exógeno de insulina para preservar la vida del paciente; no obstante, siempre con la precaución de evitar caídas agudas de la glucemia por debajo de los niveles recomendados de seguridad. Además de la administración de insulina, las ingestas y la actividad física son factores fundamentales que influyen en la homeostasis de la glucosa. En consecuencia, una gestión apropiada de la T1D debería incorporar estos dos fenómenos fisiológicos, en base a una identificación y un modelado apropiado de los mismos y de sus sorrespondientes efectos en el balance glucosa-insulina. En particular, los sistemas de páncreas artificial –ideados para llevar a cabo un control automático de los niveles de glucemia del paciente– podrían beneficiarse de la integración de esta clase de información. La primera parte de esta tesis doctoral cubre la caracterización del efecto agudo de la actividad física en los perfiles de glucosa. Con este objetivo se ha llevado a cabo una revisión sistemática de la literatura y meta-análisis que determinen las respuestas ante varias modalidades de ejercicio para pacientes con T1D, abordando esta caracterización mediante unas magnitudes que cuantifican las tasas de cambio en la glucemia a lo largo del tiempo. Por otro lado, una identificación fiable de los periodos con actividad física es un requisito imprescindible para poder proveer de esa información a los sistemas de páncreas artificial en condiciones libres y ambulatorias. Por esta razón, la segunda parte de esta tesis está enfocada a la propuesta y evaluación de un sistema automático diseñado para reconocer periodos de actividad física, clasificando su nivel de intensidad (ligera, moderada o vigorosa); así como, en el caso de periodos vigorosos, identificando también la modalidad de ejercicio (aeróbica, mixta o de fuerza). En este sentido, ambos aspectos tienen una influencia específica en el mecanismo metabólico que suministra la energía para llevar a cabo el ejercicio y, por tanto, en las respuestas glucémicas en T1D. En este trabajo se aplican varias combinaciones de técnicas de aprendizaje máquina y reconocimiento de patrones sobre la fusión multimodal de señales de acelerometría y ritmo cardíaco, las cuales describen tanto aspectos mecánicos del movimiento como la respuesta fisiológica del sistema cardiovascular ante el ejercicio. Después del reconocimiento de patrones se incorpora también un módulo de filtrado temporal para sacar partido a la considerable coherencia temporal presente en los datos, una redundancia que se origina en el hecho de que en la práctica, las tendencias en cuanto a actividad física suelen mantenerse estables a lo largo de cierto tiempo, sin fluctuaciones rápidas y repetitivas. El tercer bloque de esta tesis doctoral aborda el tema de las ingestas en el ámbito de la T1D. En concreto, se propone una serie de modelos compartimentales y se evalúan éstos en función de su capacidad para describir matemáticamente el efecto remoto de las concetraciones plasmáticas de insulina exógena sobre las tasas de eleiminación de la glucosa atribuible a la ingesta; un aspecto hasta ahora no incorporado en los principales modelos de paciente para T1D existentes en la literatura. Los datos aquí utilizados se obtuvieron gracias a un experimento realizado por el Institute of Metabolic Science (Universidad de Cambridge, Reino Unido) con 16 pacientes jóvenes. En el experimento, de tipo ‘clamp’ con objetivo variable, se replicaron los perfiles individuales de glucosa, según lo observado durante una visita preliminar tras la ingesta de una cena con o bien alta carga glucémica, o bien baja. Los seis modelos mecanísticos evaluados constaban de: a) submodelos de doble compartimento para las masas de trazadores de glucosa, b) un submodelo de único compartimento para reflejar el efecto remoto de la insulina, c) dos tipos de activación de este mismo efecto remoto (bien lineal, bien con un punto de corte), y d) diversas condiciones iniciales. ABSTRACT Diabetes encompasses a series of metabolic diseases characterized by abnormally high blood glucose concentrations. In the case of type 1 diabetes (T1D), this situation is caused by a total absence of endogenous insulin secretion, which impedes the use of glucose by most tissues. In these circumstances, exogenous insulin supplies are necessary to maintain patient’s life; although caution is always needed to avoid acute decays in glycaemia below safe levels. In addition to insulin administrations, meal intakes and physical activity are fundamental factors influencing glucose homoeostasis. Consequently, a successful management of T1D should incorporate these two physiological phenomena, based on an appropriate identification and modelling of these events and their corresponding effect on the glucose-insulin balance. In particular, artificial pancreas systems –designed to perform an automated control of patient’s glycaemia levels– may benefit from the integration of this type of information. The first part of this PhD thesis covers the characterization of the acute effect of physical activity on glucose profiles. With this aim, a systematic review of literature and metaanalyses are conduced to determine responses to various exercise modalities in patients with T1D, assessed via rates-of-change magnitudes to quantify temporal variations in glycaemia. On the other hand, a reliable identification of physical activity periods is an essential prerequisite to feed artificial pancreas systems with information concerning exercise in ambulatory, free-living conditions. For this reason, the second part of this thesis focuses on the proposal and evaluation of an automatic system devised to recognize physical activity, classifying its intensity level (light, moderate or vigorous) and for vigorous periods, identifying also its exercise modality (aerobic, mixed or resistance); since both aspects have a distinctive influence on the predominant metabolic pathway involved in fuelling exercise, and therefore, in the glycaemic responses in T1D. Various combinations of machine learning and pattern recognition techniques are applied on the fusion of multi-modal signal sources, namely: accelerometry and heart rate measurements, which describe both mechanical aspects of movement and the physiological response of the cardiovascular system to exercise. An additional temporal filtering module is incorporated after recognition in order to exploit the considerable temporal coherence (i.e. redundancy) present in data, which stems from the fact that in practice, physical activity trends are often maintained stable along time, instead of fluctuating rapid and repeatedly. The third block of this PhD thesis addresses meal intakes in the context of T1D. In particular, a number of compartmental models are proposed and compared in terms of their ability to describe mathematically the remote effect of exogenous plasma insulin concentrations on the disposal rates of meal-attributable glucose, an aspect which had not yet been incorporated to the prevailing T1D patient models in literature. Data were acquired in an experiment conduced at the Institute of Metabolic Science (University of Cambridge, UK) on 16 young patients. A variable-target glucose clamp replicated their individual glucose profiles, observed during a preliminary visit after ingesting either a high glycaemic-load or a low glycaemic-load evening meal. The six mechanistic models under evaluation here comprised: a) two-compartmental submodels for glucose tracer masses, b) a single-compartmental submodel for insulin’s remote effect, c) two types of activations for this remote effect (either linear or with a ‘cut-off’ point), and d) diverse forms of initial conditions.
Resumo:
This paper describes a module for the prediction of emotions in text chats in Spanish, oriented to its use in specific-domain text-to-speech systems. A general overview of the system is given, and the results of some evaluations carried out with two corpora of real chat messages are described. These results seem to indicate that this system offers a performance similar to other systems described in the literature, for a more complex task than other systems (identification of emotions and emotional intensity in the chat domain).
Resumo:
Rock mass characterization requires a deep geometric understanding of the discontinuity sets affecting rock exposures. Recent advances in Light Detection and Ranging (LiDAR) instrumentation currently allow quick and accurate 3D data acquisition, yielding on the development of new methodologies for the automatic characterization of rock mass discontinuities. This paper presents a methodology for the identification and analysis of flat surfaces outcropping in a rocky slope using the 3D data obtained with LiDAR. This method identifies and defines the algebraic equations of the different planes of the rock slope surface by applying an analysis based on a neighbouring points coplanarity test, finding principal orientations by Kernel Density Estimation and identifying clusters by the Density-Based Scan Algorithm with Noise. Different sources of information —synthetic and 3D scanned data— were employed, performing a complete sensitivity analysis of the parameters in order to identify the optimal value of the variables of the proposed method. In addition, raw source files and obtained results are freely provided in order to allow to a more straightforward method comparison aiming to a more reproducible research.
Resumo:
Mode of access: Internet.
Resumo:
Profound hearing loss is a disability that affects personality and when it involves teenagers before language acquisition, these bio-psychosocial conflicts can be exacerbated, requiring careful evaluation and choice of them for cochlear implant. Aim: To evaluate speech perception by adolescents with profound hearing loss, users of cochlear Implants. Study Design: Prospective. Materials and Methods: Twenty-five individuals with severe or profound pre-lingual hearing loss who underwent cochlear implantation during adolescence, between 10 to 17 years and 11 months, who went through speech perception tests before the implant and 2 years after device activation. For comparison and analysis we used the results from tests of four choice, recognition of vowels and recognition of sentences in a closed setting and the open environment. Results: The average percentage of correct answers in the four choice test before the implant was 46.9% and after 24 months of device use, this value went up to 86.1% in the vowels recognition test, the average difference was 45.13% to 83.13% and the sentences recognition test together in closed and open settings was 19.3% to 60.6% and 1.08% to 20.47% respectively. Conclusion: All patients, although with mixed results, achieved statistical improvement in all speech tests that were employed.
Resumo:
The impact of basal ganglia dysfunction on semantic processing was investigated by comparing the performance of individuals with nonthalamic subcortical (NS) vascular lesions, Parkinson's disease (PD), cortical lesions, and matched controls on a semantic priming task. Unequibiased lexical ambiguity primes were used in auditory prime-target pairs comprising 4 critical conditions; dominant related (e.g., bank-money), subordinate related (e.g., bank-river), dominant unrelated (e.g.,foot-money) and subordinate unrelated (e.g., bat-river). Participants made speeded lexical decisions (word/nonword) on targets using a go-no-go response. When a short prime-target interstimulus interval (ISI) of 200 ins was employed, all groups demonstrated priming for dominant and subordinate conditions, indicating nonselective meaning facilitation and intact automatic lexical processing. Differences emerged at the long ISI (1250 ms), where control and cortical lesion participants evidenced selective facilitation of the dominant meaning, whereas NS and PD groups demonstrated a protracted period of nonselective meaning facilitation. This finding suggests a circumscribed deficit in the selective attentional engagement of the semantic network on the basis of meaning frequency, possibly implicating a disturbance of frontal-subcortical systems influencing inhibitory semantic mechanisms.
Resumo:
Dental implant recognition in patients without available records is a time-consuming and not straightforward task. The traditional method is a complete user-dependent process, where the expert compares a 2D X-ray image of the dental implant with a generic database. Due to the high number of implants available and the similarity between them, automatic/semi-automatic frameworks to aide implant model detection are essential. In this study, a novel computer-aided framework for dental implant recognition is suggested. The proposed method relies on image processing concepts, namely: (i) a segmentation strategy for semi-automatic implant delineation; and (ii) a machine learning approach for implant model recognition. Although the segmentation technique is the main focus of the current study, preliminary details of the machine learning approach are also reported. Two different scenarios are used to validate the framework: (1) comparison of the semi-automatic contours against implant’s manual contours of 125 X-ray images; and (2) classification of 11 known implants using a large reference database of 601 implants. Regarding experiment 1, 0.97±0.01, 2.24±0.85 pixels and 11.12±6 pixels of dice metric, mean absolute distance and Hausdorff distance were obtained, respectively. In experiment 2, 91% of the implants were successfully recognized while reducing the reference database to 5% of its original size. Overall, the segmentation technique achieved accurate implant contours. Although the preliminary classification results prove the concept of the current work, more features and an extended database should be used in a future work.
Resumo:
The first and second authors would like to thank the support of the PhD grants with references SFRH/BD/28817/2006 and SFRH/PROTEC/49517/2009, respectively, from Fundação para a Ciência e Tecnol ogia (FCT). This work was partially done in the scope of the project “Methodologies to Analyze Organs from Complex Medical Images – Applications to Fema le Pelvic Cavity”, wi th reference PTDC/EEA- CRO/103320/2008, financially supported by FCT.
Resumo:
Liver steatosis is mainly a textural abnormality of the hepatic parenchyma due to fat accumulation on the hepatic vesicles. Today, the assessment is subjectively performed by visual inspection. Here a classifier based on features extracted from ultrasound (US) images is described for the automatic diagnostic of this phatology. The proposed algorithm estimates the original ultrasound radio-frequency (RF) envelope signal from which the noiseless anatomic information and the textural information encoded in the speckle noise is extracted. The features characterizing the textural information are the coefficients of the first order autoregressive model that describes the speckle field. A binary Bayesian classifier was implemented and the Bayes factor was calculated. The classification has revealed an overall accuracy of 100%. The Bayes factor could be helpful in the graphical display of the quantitative results for diagnosis purposes.
Resumo:
The relation of automatic auditory discrimination, measured with MMN, with the type of stimuli has not been well established in the literature, despite its importance as an electrophysiological measure of central sound representation. In this study, MMN response was elicited by pure-tone and speech binaurally passive auditory oddball paradigm in a group of 8 normal young adult subjects at the same intensity level (75 dB SPL). The frequency difference in pure-tone oddball was 100 Hz (standard = 1 000 Hz; deviant = 1 100 Hz; same duration = 100 ms), in speech oddball (standard /ba/; deviant /pa/; same duration = 175 ms) the Portuguese phonemes are both plosive bi-labial in order to maintain a narrow frequency band. Differences were found across electrode location between speech and pure-tone stimuli. Larger MMN amplitude, duration and higher latency to speech were verified compared to pure-tone in Cz and Fz as well as significance differences in latency and amplitude between mastoids. Results suggest that speech may be processed differently than non-speech; also it may occur in a later stage due to overlapping processes since more neural resources are required to speech processing.
Resumo:
In the last few years, the number of systems and devices that use voice based interaction has grown significantly. For a continued use of these systems, the interface must be reliable and pleasant in order to provide an optimal user experience. However there are currently very few studies that try to evaluate how pleasant is a voice from a perceptual point of view when the final application is a speech based interface. In this paper we present an objective definition for voice pleasantness based on the composition of a representative feature subset and a new automatic voice pleasantness classification and intensity estimation system. Our study is based on a database composed by European Portuguese female voices but the methodology can be extended to male voices or to other languages. In the objective performance evaluation the system achieved a 9.1% error rate for voice pleasantness classification and a 15.7% error rate for voice pleasantness intensity estimation.