961 results for Automatic speech recognition (ASR)


Relevance: 40.00%

Abstract:

The identification of people by measuring traits of individual anatomy or physiology has led to a specific research area called biometric recognition. This thesis focuses on improving fingerprint recognition systems by addressing three important problems: fingerprint enhancement, fingerprint orientation extraction and automatic evaluation of fingerprint algorithms. Effective extraction of salient fingerprint features depends on the quality of the input fingerprint: if the fingerprint is very noisy, a reliable set of features cannot be detected. A new fingerprint enhancement method, which is both iterative and contextual, is proposed. This approach detects high-quality regions in fingerprints, selectively applies contextual filtering and iteratively expands, like wildfire, toward low-quality regions. A precise estimation of the orientation field would greatly simplify the estimation of other fingerprint features (singular points, minutiae) and improve the performance of a fingerprint recognition system. Fingerprint orientation extraction is improved in two directions. First, after introducing a new taxonomy of fingerprint orientation extraction methods, several variants of baseline methods are implemented and, by pointing out the role of pre- and post-processing, we show how to improve the extraction. Second, a new hybrid orientation extraction method, which follows an adaptive scheme, significantly improves orientation extraction in noisy fingerprints. Scientific papers typically propose recognition systems that integrate many modules, so an automatic evaluation of fingerprint algorithms is needed to isolate the contributions that represent actual progress in the state of the art. The lack of a publicly available framework for comparing fingerprint orientation extraction algorithms motivates the introduction of a new benchmark area called FOE (including fingerprints and manually marked orientation ground truth), alongside the fingerprint matching benchmarks in the FVC-onGoing framework. The success of this framework is discussed by providing relevant statistics: more than 1450 algorithms submitted and two international competitions.

Relevance: 40.00%

Abstract:

In this paper, the fusion of probabilistic knowledge-based classification rules and learning automata theory is proposed and as a result we present a set of probabilistic classification rules with self-learning capability. The probabilities of the classification rules change dynamically guided by a supervised reinforcement process aimed at obtaining an optimum classification accuracy. This novel classifier is applied to the automatic recognition of digital images corresponding to visual landmarks for the autonomous navigation of an unmanned aerial vehicle (UAV) developed by the authors. The classification accuracy of the proposed classifier and its comparison with well-established pattern recognition methods is finally reported.
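
The abstract above does not say which reinforcement scheme drives the rule probabilities. As a rough illustration only, the sketch below assumes the classic linear reward-inaction scheme from learning automata theory: when the chosen rule classifies a training sample correctly, its probability grows and the others shrink proportionally; otherwise nothing changes. The function name `update_probabilities` and the `rate` parameter are hypothetical and are not taken from the paper.

```python
def update_probabilities(probs, chosen, rewarded, rate=0.1):
    """Linear reward-inaction update for a single learning automaton.

    probs    -- current selection probabilities of the rules (sum to 1)
    chosen   -- index of the rule that was applied to the sample
    rewarded -- True if the supervised signal says the rule was correct
    rate     -- learning rate in (0, 1); an illustrative default
    """
    if not rewarded:
        # "Inaction": a wrong decision leaves the probabilities unchanged.
        return list(probs)
    # Shrink every probability, then give the freed mass to the chosen rule.
    updated = [p * (1.0 - rate) for p in probs]
    updated[chosen] = probs[chosen] + rate * (1.0 - probs[chosen])
    return updated


if __name__ == "__main__":
    rules = [0.5, 0.5]
    rules = update_probabilities(rules, chosen=0, rewarded=True)
    print(rules)
```

The update keeps the probabilities normalised, so it can be applied repeatedly over a training set without any renormalisation step.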

Relevance: 40.00%

Abstract:

This Thesis explores the potential of speech technologies for the detection of clinical disorders connected to the upper airway. The study of speech traditionally covers both the production process and the subsequent processing of the signals involved, from the speaker to the listener, and offers an alternative path for studying these pathologies. The fact that utterances carry not just the encoded message but also information about the speaker has motivated the development of automatic systems for identifying and verifying speakers' identities. This work has recently been given new impetus and reoriented either towards the characterization of traits that are common to several speakers, or towards the differences between recordings of the same speaker collected under different conditions. The former are particularly relevant to this Thesis, as such patterns could reveal features related to a common condition shared among different speakers, regardless of their identity. Such is the case addressed in this Thesis, where the traits identified relate to a particular pathology that is directly connected to the speech production system.

Obstructive sleep apnea (OSA) syndrome is a paradigmatic case. It is a disorder with high prevalence among adults, and its prevalence increases with age. Patients experience episodes of involuntary cessation of breathing during sleep that may last several seconds and recur throughout the night, preventing proper rest. In obstructive apnea, these episodes are related to the collapse of the pharynx, which interrupts the air flow. Currently, OSA is diagnosed through a polysomnographic study, which focuses on the analysis of apnea episodes during sleep and requires the patient to stay at the hospital for a whole night. The complexity and high cost of these procedures, combined with growing waiting lists, have shown the need for fast screening techniques which, although they might not reach such high detection rates, would allow clinicians to reorder these lists according to the severity of each patient's condition. Among others, diagnostic imaging and anthropometric characterization of patients have revealed anatomical patterns related to OSA that have a direct influence on speech.

Studies of how this disorder affects speech are scarce and, in some cases, contradictory. Nevertheless, specific patterns related to articulation, phonation and resonance have been known since the late 1980s. At that time these descriptions could hardly be exploited by an automatic recognition system, but they pointed to a link between speech and OSA. In recent years, automatic processing techniques have evolved and can now identify significant differences between the speech of OSA patients and that of healthy subjects. However, little is known about how these new results connect with those published in the past and with the pathogenesis of OSA. This Thesis aims to progress beyond previous research in this area by addressing three issues: the study of how OSA affects patients' speech, the improvement of automatic OSA classification based on speech analysis, and the integration of this information with the predictors generally used by clinicians in their preliminary examination of patients.

The first two tasks, though they may appear symbiotic at first, are quite different. While studying the connection between speech and OSA requires small, constrained models that can be easily interpreted, classification systems rely on a large number of dimensions for the characterization and subsequent identification of the observed patterns. Even so, progress in the first task should improve performance in the second, as should the incorporation of the predictors used by clinicians. The Thesis considers both continuous and sustained speech, in order to exploit the synergies and differences between them.

For continuous speech, a conventional speech processing scheme, designed and evaluated before this Thesis, was taken as a baseline. On top of this initial system, several alternative representations of the speech information were proposed, optimized and tested to select those most suitable for characterizing OSA-specific patterns. Evidence was found of a connection between OSA and the fundamental constituents of speech: the formants. Experimental results showed that the success of the proposed solution is largely explained by the ability of the speech representations to describe these specific OSA-related components while ignoring dimensions that are noisy or have low discriminative power. The resulting scheme obtained an 18% error rate, using classifiers significantly less complex than those described in the literature and a single short speech recording. Regarding the connection between OSA and the observed patterns, it was necessary to consider inter- and intra-group differences and to focus on the speaker's characteristic articulation, replacing the complex classification models with the study of long-term average spectra. The results clearly point to certain regions of the frequency axis, suggesting a systematic narrowing of the vocal tract section at the oropharynx, which had already been described in the pathogenesis of this syndrome.

Regarding sustained speech, experiments similar to those conducted on continuous speech were reproduced on sustained phonations of the vowel /a/. The results were qualitatively similar to the previous ones, though in this case classification rates were noticeably lower. To understand this result, the experiments on long-term average spectra and on intra- and inter-group variability were also reproduced on the sustained speech recordings. Both experiments showed significant differences from those obtained on continuous speech, which could explain the difference in performance. However, sustained speech also offers a controlled framework in which to study phonation, which had also been identified in the literature as a source of information for the detection of OSA. In this study, for the available dataset, no systematic differences related to phonation were found between the two groups of speakers. Only the dimensions that describe the distribution of energy along the frequency axis showed significant differences, pointing once again towards the resonant components.

Once classification schemes for both continuous and sustained speech had been developed, the Thesis addressed their combination into a single classification system. Under the assumption that the information in continuous and sustained speech is fundamentally different, it should be possible to merge the two and improve classification rates. This was tested through a simple fusion scheme that obtained 88.6% correct classification (11.4% error rate), which represents a significant improvement over the state of the art. Finally, the combination of this classifier with the variables used by clinicians obtained 91.3% accuracy (8.7% error rate). This is within the range of alternative schemes that are more costly and intrusive and that, unlike the one proposed, cannot be used in the preliminary assessment of patients.

Overall, this Thesis offers a clear view of the connection between OSA and speech. It shows the degree of maturity reached by speech technology in the characterization and detection of OSA, indicates that its use for the evaluation of patients would already be possible, and leaves the door open for future research that continues the work begun here.

Relevance: 40.00%

Abstract:

During speech production, anatomical and physiological factors of the phonatory system, as well as psychosocial factors acquired by the speaker, leave particular traits in the voice. ASR systems attempt to find those characteristic nuances of a voice and associate them with an individual or a group. Age and gender are factors inherent to the speaker that are present in the voice. This work attempts to differentiate those characteristics, isolate them and use them to detect a speaker's gender and age. To that end, features based on the glottal pulse and the vocal tract are studied and analyzed. Classical techniques (such as pitch and its derivatives) are avoided because of the restrictions they impose. The final scores reach almost 100% in gender recognition, whereas in age recognition they are around 80%. Factors related to gender and hormones seem to affect the voice, although they are not audible.

Relevance: 40.00%

Abstract:

A new language recognition technique is described, based on applying the Shifted Delta Coefficients (SDC) approach to phone log-likelihood ratio (PLLR) features. The new methodology allows long-span phonetic information to be incorporated at a frame-by-frame level while dealing with the temporal length of each phone unit. The proposed features are used to train an i-vector based system, which is tested on the Albayzin LRE 2012 dataset. The results show a relative improvement of 33.3% in Cavg in comparison with different state-of-the-art acoustic i-vector based systems. The paper also presents the integration of parallel phone ASR systems, each of which is used to generate multiple PLLR coefficients that are stacked together and then projected into a reduced dimension. Finally, the paper shows how incorporating state information from the phone ASR provides additional improvements, and how fusion with the other acoustic and phonotactic systems provides an important improvement of 25.8% over the system presented during the competition.
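
The shifted-delta idea in this abstract can be sketched briefly. The example below assumes the conventional SDC definition, in which each frame t is described by k stacked delta vectors and block i equals c[t + i*p + d] - c[t + i*p - d]. The function name, the default values of d, p and k, and the edge handling (clamping to the first or last frame) are illustrative assumptions, not settings reported in the paper. Any per-frame feature vector, such as PLLR values, could be passed in as `frames`.

```python
def shifted_delta_coefficients(frames, d=1, p=3, k=7):
    """Return one stacked SDC vector per input frame.

    frames -- list of equal-length feature vectors, one per time frame
    d      -- delta spread: each delta is c[t+d] - c[t-d]
    p      -- shift, in frames, between consecutive delta blocks
    k      -- number of delta blocks stacked per frame
    """
    last = len(frames) - 1

    def frame_at(index):
        # Clamp out-of-range indices to the first or last frame (edge padding).
        return frames[min(max(index, 0), last)]

    stacked = []
    for t in range(len(frames)):
        row = []
        for i in range(k):
            ahead = frame_at(t + i * p + d)
            behind = frame_at(t + i * p - d)
            row.extend(a - b for a, b in zip(ahead, behind))
        stacked.append(row)
    return stacked


if __name__ == "__main__":
    ramp = [[float(t)] for t in range(30)]
    print(shifted_delta_coefficients(ramp, d=1, p=3, k=2)[5])
```

Each output vector has k times the input dimension, which is how the representation captures context spanning roughly k*p frames while still producing one vector per frame.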

Relevance: 40.00%

Abstract:

Diabetes encompasses a series of metabolic diseases characterized by abnormally high blood glucose concentrations. In type 1 diabetes (T1D), this situation is caused by a total absence of endogenous insulin secretion, which prevents most tissues from using glucose. In these circumstances, exogenous insulin must be supplied to preserve the patient's life, always taking care to avoid acute drops in glycaemia below safe levels. In addition to insulin administration, meal intake and physical activity are fundamental factors influencing glucose homoeostasis. Consequently, successful management of T1D should incorporate these two physiological phenomena, based on appropriate identification and modelling of these events and of their corresponding effects on the glucose-insulin balance. In particular, artificial pancreas systems designed to control patients' glycaemia levels automatically may benefit from the integration of this type of information.

The first part of this PhD thesis covers the characterization of the acute effect of physical activity on glucose profiles. To this end, a systematic review of the literature and meta-analyses are conducted to determine the responses to various exercise modalities in patients with T1D, assessed via rate-of-change magnitudes that quantify temporal variations in glycaemia.

A reliable identification of physical activity periods is, in turn, an essential prerequisite for feeding artificial pancreas systems with information about exercise in ambulatory, free-living conditions. For this reason, the second part of this thesis focuses on the proposal and evaluation of an automatic system devised to recognize physical activity, classifying its intensity level (light, moderate or vigorous) and, for vigorous periods, also identifying the exercise modality (aerobic, mixed or resistance). Both aspects have a distinctive influence on the predominant metabolic pathway involved in fuelling exercise and, therefore, on the glycaemic responses in T1D. Various combinations of machine learning and pattern recognition techniques are applied to the fusion of multi-modal signal sources, namely accelerometry and heart rate measurements, which describe both the mechanical aspects of movement and the physiological response of the cardiovascular system to exercise. An additional temporal filtering module is incorporated after recognition in order to exploit the considerable temporal coherence (i.e. redundancy) present in the data, which stems from the fact that, in practice, physical activity trends tend to remain stable over time instead of fluctuating rapidly and repeatedly.

The third block of this PhD thesis addresses meal intake in the context of T1D. In particular, a number of compartmental models are proposed and compared in terms of their ability to describe mathematically the remote effect of exogenous plasma insulin concentrations on the disposal rates of meal-attributable glucose, an aspect that had not yet been incorporated into the prevailing T1D patient models in the literature. Data were acquired in an experiment conducted at the Institute of Metabolic Science (University of Cambridge, UK) on 16 young patients. A variable-target glucose clamp replicated their individual glucose profiles, as observed during a preliminary visit after ingesting an evening meal with either a high or a low glycaemic load. The six mechanistic models under evaluation comprised: a) two-compartment submodels for glucose tracer masses, b) a single-compartment submodel for insulin's remote effect, c) two types of activation for this remote effect (either linear or with a cut-off point), and d) diverse forms of initial conditions.
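
The abstract does not specify how the temporal filtering module after the activity recogniser is built. One simple way to exploit temporal coherence in a label sequence is a sliding-window majority vote, sketched below under that assumption. The function name `majority_filter`, the window length and the tie-breaking rule are all hypothetical choices made for illustration.

```python
from collections import Counter


def majority_filter(labels, window=5):
    """Replace each label by the most frequent label in a centred window.

    labels -- per-epoch class labels produced by a classifier
    window -- window length in epochs (an illustrative default)
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - half): i + half + 1]
        counts = Counter(neighbourhood)
        best = max(counts.values())
        if counts[labels[i]] == best:
            # On ties, keep the original label so stable runs are untouched.
            smoothed.append(labels[i])
        else:
            smoothed.append(counts.most_common(1)[0][0])
    return smoothed


if __name__ == "__main__":
    raw = ["light"] * 4 + ["vigorous"] + ["light"] * 4
    print(majority_filter(raw))
```

An isolated "vigorous" epoch inside a long "light" run is overwritten, while a genuine transition between two sustained activities is preserved.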

Relevance: 40.00%

Abstract:

This paper describes a module for the prediction of emotions in text chats in Spanish, oriented to its use in specific-domain text-to-speech systems. A general overview of the system is given, and the results of some evaluations carried out with two corpora of real chat messages are described. These results seem to indicate that this system offers a performance similar to other systems described in the literature, for a more complex task than other systems (identification of emotions and emotional intensity in the chat domain).

Relevance: 40.00%

Abstract:

Rock mass characterization requires a deep geometric understanding of the discontinuity sets affecting rock exposures. Recent advances in Light Detection and Ranging (LiDAR) instrumentation allow quick and accurate 3D data acquisition, leading to the development of new methodologies for the automatic characterization of rock mass discontinuities. This paper presents a methodology for the identification and analysis of flat surfaces outcropping in a rocky slope using the 3D data obtained with LiDAR. The method identifies and defines the algebraic equations of the different planes of the rock slope surface by applying an analysis based on a coplanarity test of neighbouring points, finding principal orientations by Kernel Density Estimation and identifying clusters by the Density-Based Scan Algorithm with Noise. Different sources of information (synthetic and 3D scanned data) were employed, and a complete sensitivity analysis of the parameters was performed in order to identify the optimal values of the variables of the proposed method. In addition, raw source files and the results obtained are freely provided to allow more straightforward comparison between methods and more reproducible research.
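
The clustering step in this abstract can be illustrated with a small sketch. It assumes that the "Density-Based Scan Algorithm with Noise" named above is the standard DBSCAN algorithm, and it uses a brute-force neighbour search that is only suitable for small inputs. The function name and the parameter values in the usage example are illustrative and are not taken from the paper; a real LiDAR point cloud would need a spatial index.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster id per point (0, 1, ...) or NOISE.

    points  -- list of coordinate tuples, for example (x, y, z)
    eps     -- neighbourhood radius
    min_pts -- minimum neighbours (including the point itself) for a core point
    """
    def neighbours(i):
        # Brute-force range query: every point within eps of point i.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbours(j)
            if len(reach) >= min_pts:
                queue.extend(reach)  # j is a core point: keep expanding
        cluster += 1
    return labels


if __name__ == "__main__":
    cloud = [(0, 0, 0), (0.1, 0, 0), (0, 0.1, 0),
             (5, 5, 5), (5.1, 5, 5), (5, 5.1, 5),
             (20, 20, 20)]
    print(dbscan(cloud, eps=0.5, min_pts=3))
```

In the usage example, two tight groups of points form separate clusters and the isolated point is labelled as noise.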

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Profound hearing loss is a disability that affects personality, and when it involves teenagers who lost their hearing before language acquisition, the resulting bio-psychosocial conflicts can be exacerbated, requiring careful evaluation and selection of candidates for cochlear implantation. Aim: To evaluate speech perception in adolescents with profound hearing loss who use cochlear implants. Study Design: Prospective. Materials and Methods: Twenty-five individuals with severe or profound pre-lingual hearing loss who underwent cochlear implantation during adolescence, between 10 years and 17 years and 11 months of age, took speech perception tests before the implant and 2 years after device activation. For comparison and analysis we used the results from the four-choice test, the vowel recognition test and the sentence recognition test in closed and open settings. Results: The average percentage of correct answers in the four-choice test was 46.9% before the implant and rose to 86.1% after 24 months of device use. In the vowel recognition test, the average rose from 45.13% to 83.13%, and in the sentence recognition test it rose from 19.3% to 60.6% in the closed setting and from 1.08% to 20.47% in the open setting. Conclusion: All patients, although with mixed results, achieved statistical improvement in all speech tests that were employed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The impact of basal ganglia dysfunction on semantic processing was investigated by comparing the performance of individuals with nonthalamic subcortical (NS) vascular lesions, Parkinson's disease (PD), cortical lesions, and matched controls on a semantic priming task. Unequibiased lexical ambiguity primes were used in auditory prime-target pairs comprising 4 critical conditions: dominant related (e.g., bank-money), subordinate related (e.g., bank-river), dominant unrelated (e.g., foot-money) and subordinate unrelated (e.g., bat-river). Participants made speeded lexical decisions (word/nonword) on targets using a go-no-go response. When a short prime-target interstimulus interval (ISI) of 200 ms was employed, all groups demonstrated priming for dominant and subordinate conditions, indicating nonselective meaning facilitation and intact automatic lexical processing. Differences emerged at the long ISI (1250 ms), where control and cortical lesion participants evidenced selective facilitation of the dominant meaning, whereas NS and PD groups demonstrated a protracted period of nonselective meaning facilitation. This finding suggests a circumscribed deficit in the selective attentional engagement of the semantic network on the basis of meaning frequency, possibly implicating a disturbance of frontal-subcortical systems influencing inhibitory semantic mechanisms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dental implant recognition in patients without available records is a time-consuming and far from straightforward task. The traditional method is a completely user-dependent process, in which the expert compares a 2D X-ray image of the dental implant with a generic database. Due to the high number of implants available and the similarity between them, automatic/semi-automatic frameworks to aid implant model detection are essential. In this study, a novel computer-aided framework for dental implant recognition is suggested. The proposed method relies on image processing concepts, namely: (i) a segmentation strategy for semi-automatic implant delineation; and (ii) a machine learning approach for implant model recognition. Although the segmentation technique is the main focus of the current study, preliminary details of the machine learning approach are also reported. Two different scenarios are used to validate the framework: (1) comparison of the semi-automatic contours against manual contours of the implants in 125 X-ray images; and (2) classification of 11 known implants using a large reference database of 601 implants. In experiment 1, a Dice metric of 0.97±0.01, a mean absolute distance of 2.24±0.85 pixels and a Hausdorff distance of 11.12±6 pixels were obtained. In experiment 2, 91% of the implants were successfully recognized while reducing the reference database to 5% of its original size. Overall, the segmentation technique achieved accurate implant contours. Although the preliminary classification results prove the concept of the current work, more features and an extended database should be used in future work.
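For readers unfamiliar with the three contour-agreement measures reported for experiment 1, the sketch below shows one common way to compute them. It is not the authors' implementation: representing regions as sets of pixel coordinates and contours as lists of points is an assumption, and so is the symmetric averaging used for the mean absolute distance (definitions vary between papers). All function names are hypothetical.

```python
import math

def dice(mask_a, mask_b):
    """Dice coefficient of two regions given as collections of pixel coordinates."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        return 1.0  # two empty regions agree trivially
    return 2.0 * len(a & b) / (len(a) + len(b))

def _nearest(point, contour):
    """Euclidean distance from a point to the closest point of a contour."""
    return min(math.dist(point, q) for q in contour)

def mean_absolute_distance(contour_a, contour_b):
    """Average nearest-point distance, symmetrised over both directions (assumed form)."""
    d_ab = [_nearest(p, contour_b) for p in contour_a]
    d_ba = [_nearest(q, contour_a) for q in contour_b]
    return (sum(d_ab) / len(d_ab) + sum(d_ba) / len(d_ba)) / 2.0

def hausdorff_distance(contour_a, contour_b):
    """Largest nearest-point distance in either direction."""
    return max(max(_nearest(p, contour_b) for p in contour_a),
               max(_nearest(q, contour_a) for q in contour_b))

semi_automatic = [(0, 0), (1, 0)]
manual = [(0, 3), (1, 3), (5, 3)]
print(mean_absolute_distance(semi_automatic, manual))
print(hausdorff_distance(semi_automatic, manual))
```

The Dice metric rewards overlap of the enclosed regions, whereas the two distance measures describe how far the contours lie apart on average and in the worst case. This is why a segmentation can score a high Dice value and still show a comparatively large Hausdorff distance.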

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The first and second authors would like to thank the support of the PhD grants with references SFRH/BD/28817/2006 and SFRH/PROTEC/49517/2009, respectively, from Fundação para a Ciência e Tecnologia (FCT). This work was partially done in the scope of the project "Methodologies to Analyze Organs from Complex Medical Images - Applications to Female Pelvic Cavity", with reference PTDC/EEA-CRO/103320/2008, financially supported by FCT.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Liver steatosis is mainly a textural abnormality of the hepatic parenchyma due to fat accumulation in the hepatic vesicles. Today, the assessment is performed subjectively by visual inspection. Here a classifier based on features extracted from ultrasound (US) images is described for the automatic diagnosis of this pathology. The proposed algorithm estimates the original ultrasound radio-frequency (RF) envelope signal, from which the noiseless anatomic information and the textural information encoded in the speckle noise are extracted. The features characterizing the textural information are the coefficients of the first-order autoregressive model that describes the speckle field. A binary Bayesian classifier was implemented and the Bayes factor was calculated. The classification revealed an overall accuracy of 100%. The Bayes factor could be helpful in the graphical display of the quantitative results for diagnostic purposes.
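To make the autoregressive texture feature more concrete, here is a deliberately simplified sketch. The abstract describes a first-order autoregressive model of the speckle field, which is a 2D image. The code below instead assumes a 1D sequence modelled as x[t] = a * x[t-1] + e[t] and estimates the single coefficient a by least squares. The function name `ar1_coefficient` and the `remove_mean` option are hypothetical and do not come from the source.

```python
def ar1_coefficient(samples, remove_mean=True):
    """Least-squares estimate of a in x[t] = a * x[t-1] + e[t] (1D simplification)."""
    x = [float(s) for s in samples]
    if len(x) < 2:
        raise ValueError("at least two samples are needed")
    if remove_mean:
        mean = sum(x) / len(x)
        x = [s - mean for s in x]
    # Minimising sum (x[t] - a * x[t-1])^2 gives a = sum x[t] x[t-1] / sum x[t-1]^2.
    numerator = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    denominator = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    if denominator == 0.0:
        raise ValueError("the sequence is constant, so the coefficient is undefined")
    return numerator / denominator

# A sequence that halves at every step follows the model exactly with a = 0.5.
print(ar1_coefficient([8, 4, 2, 1, 0.5], remove_mean=False))
```

In a 2D setting, each pixel would be regressed on several neighbouring pixels, which yields a small vector of coefficients rather than a single number. Such a vector could then serve as the input to a classifier like the one described above.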