Biblioteca Digital

80 resultados para automatic speech recognition

Feature Extraction from Whole-Sky Ground-Based Images for Cloud-Type Recognition

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Several features that can be extracted from digital images of the sky and that can be useful for cloud-type classification of such images are presented. Some features are statistical measurements of image texture, some are based on the Fourier transform of the image and, finally, others are computed from the image where cloudy pixels are distinguished from clear-sky pixels. The use of the most suitable features in an automatic classification algorithm is also shown and discussed. Both the features and the classifier are developed over images taken by two different camera devices, namely, a total sky imager (TSI) and a whole sky imager (WSC), which are placed in two different areas of the world (Toowoomba, Australia; and Girona, Spain, respectively). The performance of the classifier is assessed by comparing its image classification with an a priori classification carried out by visual inspection of more than 200 images from each camera. The index of agreement is 76% when five different sky conditions are considered: clear, low cumuliform clouds, stratiform clouds (overcast), cirriform clouds, and mottled clouds (altocumulus, cirrocumulus). Discussion on the future directions of this research is also presented, regarding both the use of other features and the use of other classification techniques

Automatic Prediction of Facial Trait Judgments: Appearance vs. Structural Models

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Evaluating other individuals with respect to personality characteristics plays a crucial role in human relations and it is the focus of attention for research in diverse fields such as psychology and interactive computer systems. In psychology, face perception has been recognized as a key component of this evaluation system. Multiple studies suggest that observers use face information to infer personality characteristics. Interactive computer systems are trying to take advantage of these findings and apply them to increase the natural aspect of interaction and to improve the performance of interactive computer systems. Here, we experimentally test whether the automatic prediction of facial trait judgments (e.g. dominance) can be made by using the full appearance information of the face and whether a reduced representation of its structure is sufficient. We evaluate two separate approaches: a holistic representation model using the facial appearance information and a structural model constructed from the relations among facial salient points. State of the art machine learning methods are applied to a) derive a facial trait judgment model from training data and b) predict a facial trait value for any face. Furthermore, we address the issue of whether there are specific structural relations among facial points that predict perception of facial traits. Experimental results over a set of labeled data (9 different trait evaluations) and classification rules (4 rules) suggest that a) prediction of perception of facial traits is learnable by both holistic and structural approaches; b) the most reliable prediction of facial trait judgments is obtained by certain type of holistic descriptions of the face appearance; and c) for some traits such as attractiveness and extroversion, there are relationships between specific structural features and social perceptions.

Application of the Mutual Information Minimization to speaker recognition / identification improvement

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognizer system. We study the effect of saturations on the test signals, trying to take into account real situations where the training material has been recorded in a controlled situation but the testing signals present some mismatch with the input signal level (saturations). The experimental results for speaker recognition shows that a combination of several strategies can improve the recognition rates with saturated test sentences from 80% to 89.39%, while the results with clean speech (without saturation) is 87.76% for one microphone, and for speaker identification can reduce the minimum detection cost function with saturated test sentences from 6.42% to 4.15%, while the results with clean speech (without saturation) is 5.74% for one microphone and 7.02% for the other one.

Bootstrapping a statistical speech translator from a rule-based one

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We describe a series of experiments in which we start with English to French and English to Japanese versions of an Open Source rule-based speech translation system for a medical domain, and bootstrap correspondign statistical systems. Comparative evaluation reveals that the rule-based systems are still significantly better than the statistical ones, despite the fact that considerable effort has been invested in tuning both the recognition and translation components; also, a hybrid system only marginally improved recall at the cost of a los in precision. The result suggests that rule-based architectures may still be preferable to statistical ones for safety-critical speech translation tasks.

Speaker recognition improvement using blind inversion of distortions

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we propose the inversion of nonlinear distortions in order to improve the recognition rates of a speaker recognizer system. We study the effect of saturations on the test signals, trying to take into account real situations where the training material has been recorded in a controlled situation but the testing signals present some mismatch with the input signal level (saturations). The experimental results shows that a combination of several strategies can improve the recognition rates with saturated test sentences from 80% to 89.39%, while the results with clean speech (without saturation) is 87.76% for one microphone.

Model-based objects recognition in man-made environments

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We describe a model-based objects recognition system which is part of an image interpretation system intended to assist autonomous vehicles navigation. The system is intended to operate in man-made environments. Behavior-based navigation of autonomous vehicles involves the recognition of navigable areas and the potential obstacles. The recognition system integrates color, shape and texture information together with the location of the vanishing point. The recognition process starts from some prior scene knowledge, that is, a generic model of the expected scene and the potential objects. The recognition system constitutes an approach where different low-level vision techniques extract a multitude of image descriptors which are then analyzed using a rule-based reasoning system to interpret the image content. This system has been implemented using CEES, the C++ embedded expert system shell developed in the Systems Engineering and Automatic Control Laboratory (University of Girona) as a specific rule-based problem solving tool. It has been especially conceived for supporting cooperative expert systems, and uses the object oriented programming paradigm

Dance movement patterns recognition

Relevância:

20.00% 20.00%

Publicador:

Resumo:

"Es tracta d'un projecte dividit en dues parts independents però complementàries, realitzades per autors diferents. Aquest document conté originàriament altre material i/o programari només consultable a la Biblioteca de Ciència i Tecnologia"

Object recognition for autonomous objects: comparison of two approaches

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Report for the scientific sojourn at the Swiss Federal Institute of Technology Zurich, Switzerland, between September and December 2007. In order to make robots useful assistants for our everyday life, the ability to learn and recognize objects is of essential importance. However, object recognition in real scenes is one of the most challenging problems in computer vision, as it is necessary to deal with difficulties. Furthermore, in mobile robotics a new challenge is added to the list: computational complexity. In a dynamic world, information about the objects in the scene can become obsolete before it is ready to be used if the detection algorithm is not fast enough. Two recent object recognition techniques have achieved notable results: the constellation approach proposed by Lowe and the bag of words approach proposed by Nistér and Stewénius. The Lowe constellation approach is the one currently being used in the robot localization project of the COGNIRON project. This report is divided in two main sections. The first section is devoted to briefly review the currently used object recognition system, the Lowe approach, and bring to light the drawbacks found for object recognition in the context of indoor mobile robot navigation. Additionally the proposed improvements for the algorithm are described. In the second section the alternative bag of words method is reviewed, as well as several experiments conducted to evaluate its performance with our own object databases. Furthermore, some modifications to the original algorithm to make it suitable for object detection in unsegmented images are proposed.

Automatic DVB signal analyser

Relevância:

20.00% 20.00%

Publicador:

Resumo:

El problema de controlar les emissions de televisió digital a tota Europa pel desenvolupament de receptors robustos i fiables és cada vegada més significant, per això, sorgeix la necessitat d’automatitzar el procés d’anàlisi i control d’aquests senyals. Aquest projecte presenta el desenvolupament software d’una aplicació que vol solucionar una part d’aquest problema. L’aplicació s’encarrega d’analitzar, gestionar i capturar senyals de televisió digital. Aquest document fa una introducció a la matèria central que és la televisió digital i la informació que porten els senyals de televisió, concretament, la que es refereix a l’estàndard "Digital Video Broadcasting". A continuació d’aquesta part, l’escrit es concentra en l’explicació i descripció de les funcionalitats que necessita cobrir l'aplicació, així com introduir i explicar cada etapa d’un procés de desenvolupament software. Finalment, es resumeixen els avantatges de la creació d’aquest programa per l’automatització de l’anàlisi de senyal digital partint d’una optimització de recursos.

On the automatic application of inequality indexes in the analysis of the international distribution of environmental indicators

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In recent years traditional inequality measures have been used to quite a considerable extent to examine the international distribution of environmental indicators. One of its main characteristics is that each one assigns different weights to the changes that occur in the different sections of the variable distribution and, consequently, the results they yield can potentially be very different. Hence, we suggest the appropriateness of using a range of well-recommended measures to achieve more robust results. We also provide an empirical test for the comparative behaviour of several suitable inequality measures and environmental indicators. Our findings support the hypothesis that in some cases there are differences among measures in both the sign of the evolution and its size. JEL codes: D39; Q43; Q56. Keywords: international environment factor distribution; Kaya factors; Inequality measurement

Automatic self-configuration of the logical network using distributed software agents

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a system for dynamic network resource configuration in environments with bandwidth reservation. The proposed system is completely distributed and automates the mechanisms for adapting the logical network to the offered load. The system is able to manage dynamically a logical network such as a virtual path network in ATM or a label switched path network in MPLS or GMPLS. The system design and implementation is based on a multi-agent system (MAS) which make the decisions of when and how to change a logical path. Despite the lack of a centralised global network view, results show that MAS manages the network resources effectively, reducing the connection blocking probability and, therefore, achieving better utilisation of network resources. We also include details of its architecture and implementation

Automatic classification of breast density

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A recent trend in digital mammography is computer-aided diagnosis systems, which are computerised tools designed to assist radiologists. Most of these systems are used for the automatic detection of abnormalities. However, recent studies have shown that their sensitivity is significantly decreased as the density of the breast increases. This dependence is method specific. In this paper we propose a new approach to the classification of mammographic images according to their breast parenchymal density. Our classification uses information extracted from segmentation results and is based on the underlying breast tissue texture. Classification performance was based on a large set of digitised mammograms. Evaluation involves different classifiers and uses a leave-one-out methodology. Results demonstrate the feasibility of estimating breast density using image processing and analysis techniques

Overview of coded light projection techniques for automatic 3D profiling

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Obtaining automatic 3D profile of objects is one of the most important issues in computer vision. With this information, a large number of applications become feasible: from visual inspection of industrial parts to 3D reconstruction of the environment for mobile robots. In order to achieve 3D data, range finders can be used. Coded structured light approach is one of the most widely used techniques to retrieve 3D information of an unknown surface. An overview of the existing techniques as well as a new classification of patterns for structured light sensors is presented. This kind of systems belong to the group of active triangulation method, which are based on projecting a light pattern and imaging the illuminated scene from one or more points of view. Since the patterns are coded, correspondences between points of the image(s) and points of the projected pattern can be easily found. Once correspondences are found, a classical triangulation strategy between camera(s) and projector device leads to the reconstruction of the surface. Advantages and constraints of the different patterns are discussed

An automatic correction tool for relational database schemas

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A Web-based tool developed to automatically correct relational database schemas is presented. This tool has been integrated into a more general e-learning platform and is used to reinforce teaching and learning on database courses. This platform assigns to each student a set of database problems selected from a common repository. The student has to design a relational database schema and enter it into the system through a user friendly interface specifically designed for it. The correction tool corrects the design and shows detected errors. The student has the chance to correct them and send a new solution. These steps can be repeated as many times as required until a correct solution is obtained. Currently, this system is being used in different introductory database courses at the University of Girona with very promising results

Real-Time implementation of a blind authentication method using self-synchronous speech watermarking

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A blind speech watermarking scheme that meets hard real-time deadlines is presented and implemented. In addition, one of the key issues in these block-oriented watermarking techniques is to preserve the synchronization. Namely, to recover the exact position of each block in the mark extract process. In fact, the presented scheme can be split up into two distinguished parts, the synchronization and the information mark methods. The former is embedded into the time domain and it is fast enough to be run meeting real-time requirements. The latter contains the authentication information and it is embedded into the wavelet domain. The synchronization and information mark techniques are both tunable in order to allow a con gurable method. Thus, capacity, transparency and robustness can be con gured depending on the needs. It makes the scheme useful for professional applications, such telephony authentication or even sending information throw radio applications.

«
1
2
3
4
5
6
»