12 resultados para Audio-visual Speech Recognition, Visual Feature Extraction, Free-parts, Monolithic, ROI
em Universidade Federal do Rio Grande do Norte(UFRN)
Resumo:
Visual attention is a very important task in autonomous robotics, but, because of its complexity, the processing time required is significant. We propose an architecture for feature selection using foveated images that is guided by visual attention tasks and that reduces the processing time required to perform these tasks. Our system can be applied in bottom-up or top-down visual attention. The foveated model determines which scales are to be used on the feature extraction algorithm. The system is able to discard features that are not extremely necessary for the tasks, thus, reducing the processing time. If the fovea is correctly placed, then it is possible to reduce the processing time without compromising the quality of the tasks outputs. The distance of the fovea from the object is also analyzed. If the visual system loses the tracking in top-down attention, basic strategies of fovea placement can be applied. Experiments have shown that it is possible to reduce up to 60% the processing time with this approach. To validate the method, we tested it with the feature algorithm known as Speeded Up Robust Features (SURF), one of the most efficient approaches for feature extraction. With the proposed architecture, we can accomplish real time requirements of robotics vision, mainly to be applied in autonomous robotics
Resumo:
The skin cancer is the most common of all cancers and the increase of its incidence must, in part, caused by the behavior of the people in relation to the exposition to the sun. In Brazil, the non-melanoma skin cancer is the most incident in the majority of the regions. The dermatoscopy and videodermatoscopy are the main types of examinations for the diagnosis of dermatological illnesses of the skin. The field that involves the use of computational tools to help or follow medical diagnosis in dermatological injuries is seen as very recent. Some methods had been proposed for automatic classification of pathology of the skin using images. The present work has the objective to present a new intelligent methodology for analysis and classification of skin cancer images, based on the techniques of digital processing of images for extraction of color characteristics, forms and texture, using Wavelet Packet Transform (WPT) and learning techniques called Support Vector Machine (SVM). The Wavelet Packet Transform is applied for extraction of texture characteristics in the images. The WPT consists of a set of base functions that represents the image in different bands of frequency, each one with distinct resolutions corresponding to each scale. Moreover, the characteristics of color of the injury are also computed that are dependants of a visual context, influenced for the existing colors in its surround, and the attributes of form through the Fourier describers. The Support Vector Machine is used for the classification task, which is based on the minimization principles of the structural risk, coming from the statistical learning theory. The SVM has the objective to construct optimum hyperplanes that represent the separation between classes. The generated hyperplane is determined by a subset of the classes, called support vectors. For the used database in this work, the results had revealed a good performance getting a global rightness of 92,73% for melanoma, and 86% for non-melanoma and benign injuries. The extracted describers and the SVM classifier became a method capable to recognize and to classify the analyzed skin injuries
Resumo:
The focus of this thesis is a production of video biographies with/by sheltered teenagers. The general objective is discuss the potential production of video biographies while a device of research-action-formation. From de point of view of research, the study interrogates cultural practices that demarcates the passage of teenagers in shelter institutional; From the point of view of action, seeks to identify the modes of appropriation of space for audiovisual creation by teenagers; and, from the point of view of formation, asks the potentiality of audiovisual language while the way from which teenagers can auto-configure themselves responsibly, in the reinvention of places and others worlds for them. The research falls in the intersection of qualitative approach of ethnographic and of research-action-formation. Is anchored theoretically in autobiographical approaches - Pineau (2005); Passeggi (2008); Delory-Momberger (2008); Josso (2010) e Bertaux (2010) - and in the filmic method - Ramos (2003); Wohlgemuth (2005) and Comoli (2009). Participated in the research eleven teenagers members of the production cycle of biographical traces, among these, the three teenagers who advanced to cycles of audiovisual recording life narratives and reflective exercises around produced reports, procedures of which we extract the set of empirical material analyzed The analysis revealed that teens use in under three types of practices: the practices of mess , as way of expression; the practices of evasion , as resistance to restraint the right to come and go, and the practices of claim a regime of "truth" to the institutional environment, which emerge as a survival tactic in the face of paths desvínculos, family abandonment and neglect. The study also showed the appropriation of spaces of audio-visual creation meaningful expression through music by teenagers and encourage dialogue between and with the teenagers and the achievement of reflective exercises focused for awareness of their stories in becoming. These findings show a broader sense the thesis that visual language is a potent mobilizer artifact reflections and empowerment of individuals in situations of social exclusion
Resumo:
This paper proposes a method based on the theory of electromagnetic waves reflected to evaluate the behavior of these waves and the level of attenuation caused in bone tissue. For this, it was proposed the construction of two antennas in microstrip structure with resonance frequency at 2.44 GHz The problem becomes relevant because of the diseases osteometabolic reach a large portion of the population, men and women. With this method, the signal is classified into two groups: tissue mass with bony tissues with normal or low bone mass. For this, techniques of feature extraction (Wavelet Transform) and pattern recognition (KNN and ANN) were used. The tests were performed on bovine bone and tissue with chemicals, the methodology and results are described in the work
Resumo:
This research aims to identify the process of appropriation of audio visual (in digital video) for collective symbolic production (participatory video practice that expresses popular culture) in a socio-cultural context where the minorities are. Therefore, we based this study on the students‟ experience of the film and video workshops held by Cinema para Todos (Culture Point Cine for All), in Natal, RN. Culture Point is the basis of the project Programa Nacional de Cultura, Educação e Cidadania Cultura Viva (National Program of Culture, Education and Citizenship - Living Culture) of the Brazilian Ministry of Culture. In 2010, the Cine for All developed three film and video workshops in the state of Rio Grande do Norte (Northeastern of Brazil), in the municipalities of Açu, Lajes and São Gonçalo do Amarante, within a public policy project of socio-cultural inclusion. These three workshops are the focus of this empirical investigation. To support the analysis of this research we consider that the spaces of workshops are sites of the social practices origin, assuming they show the development of the changes of actions that generate new practices. In this socio-cultural context the students as interlocutors process the communicative culture mediation, where come from the logics of action (using digital video) to become a mediatic practice (video auto ethnographic). With the overall goal set and the field of investigative action delimited we considered the methodology of case study suitable for the observation of the object, because it is a phenomenon of modernity, occurring in a context of real life and with little or no control over events. Participant observation and interviews was also applied to this case study. The analytical theoretical support comes from the notion of mediation by Jesús Martín-Barbero and the concept of habitus by Pierre Bourdieu. The research found that a new way of communicating has been developing by this social group, and this reflects the technological change experienced by them. The auto ethnographic videos, short movies, reveal an allegorical trial of the mainstream media, because while they use the mainstream format, they rework the aesthetic, but without revising the history, in fact they proposing to retell the history of themselves full of colorful details and with richness of their forgotten popular culture.
Resumo:
ARAUJO, Márcio V. ; ALSINA, Pablo J. ; MEDEIROS, Adelardo A. D. ; PEREIRA, Jonathan P.P. ; DOMINGOS, Elber C. ; ARAÚJO, Fábio M.U. ; SILVA, Jáder S. . Development of an Active Orthosis Prototype for Lower Limbs. In: INTERNATIONAL CONGRESS OF MECHANICAL ENGINEERING, 20., 2009, Gramado, RS. Proceedings… Gramado, RS: [s. n.], 2009
Resumo:
The automatic speech recognition by machine has been the target of researchers in the past five decades. In this period have been numerous advances, such as in the field of recognition of isolated words (commands), which has very high rates of recognition, currently. However, we are still far from developing a system that could have a performance similar to the human being (automatic continuous speech recognition). One of the great challenges of searches for continuous speech recognition is the large amount of pattern. The modern languages such as English, French, Spanish and Portuguese have approximately 500,000 words or patterns to be identified. The purpose of this study is to use smaller units than the word such as phonemes, syllables and difones units as the basis for the speech recognition, aiming to recognize any words without necessarily using them. The main goal is to reduce the restriction imposed by the excessive amount of patterns. In order to validate this proposal, the system was tested in the isolated word recognition in dependent-case. The phonemes characteristics of the Brazil s Portuguese language were used to developed the hierarchy decision system. These decisions are made through the use of neural networks SVM (Support Vector Machines). The main speech features used were obtained from the Wavelet Packet Transform. The descriptors MFCC (Mel-Frequency Cepstral Coefficient) are also used in this work. It was concluded that the method proposed in this work, showed good results in the steps of recognition of vowels, consonants (syllables) and words when compared with other existing methods in literature
Resumo:
The human voice is an important communication tool and any disorder of the voice can have profound implications for social and professional life of an individual. Techniques of digital signal processing have been used by acoustic analysis of vocal disorders caused by pathologies in the larynx, due to its simplicity and noninvasive nature. This work deals with the acoustic analysis of voice signals affected by pathologies in the larynx, specifically, edema, and nodules on the vocal folds. The purpose of this work is to develop a classification system of voices to help pre-diagnosis of pathologies in the larynx, as well as monitoring pharmacological treatments and after surgery. Linear Prediction Coefficients (LPC), Mel Frequency cepstral coefficients (MFCC) and the coefficients obtained through the Wavelet Packet Transform (WPT) are applied to extract relevant characteristics of the voice signal. For the classification task is used the Support Vector Machine (SVM), which aims to build optimal hyperplanes that maximize the margin of separation between the classes involved. The hyperplane generated is determined by the support vectors, which are subsets of points in these classes. According to the database used in this work, the results showed a good performance, with a hit rate of 98.46% for classification of normal and pathological voices in general, and 98.75% in the classification of diseases together: edema and nodules
Resumo:
In this work, the Markov chain will be the tool used in the modeling and analysis of convergence of the genetic algorithm, both the standard version as for the other versions that allows the genetic algorithm. In addition, we intend to compare the performance of the standard version with the fuzzy version, believing that this version gives the genetic algorithm a great ability to find a global optimum, own the global optimization algorithms. The choice of this algorithm is due to the fact that it has become, over the past thirty yares, one of the more importan tool used to find a solution of de optimization problem. This choice is due to its effectiveness in finding a good quality solution to the problem, considering that the knowledge of a good quality solution becomes acceptable given that there may not be another algorithm able to get the optimal solution for many of these problems. However, this algorithm can be set, taking into account, that it is not only dependent on how the problem is represented as but also some of the operators are defined, to the standard version of this, when the parameters are kept fixed, to their versions with variables parameters. Therefore to achieve good performance with the aforementioned algorithm is necessary that it has an adequate criterion in the choice of its parameters, especially the rate of mutation and crossover rate or even the size of the population. It is important to remember that those implementations in which parameters are kept fixed throughout the execution, the modeling algorithm by Markov chain results in a homogeneous chain and when it allows the variation of parameters during the execution, the Markov chain that models becomes be non - homogeneous. Therefore, in an attempt to improve the algorithm performance, few studies have tried to make the setting of the parameters through strategies that capture the intrinsic characteristics of the problem. These characteristics are extracted from the present state of execution, in order to identify and preserve a pattern related to a solution of good quality and at the same time that standard discarding of low quality. Strategies for feature extraction can either use precise techniques as fuzzy techniques, in the latter case being made through a fuzzy controller. A Markov chain is used for modeling and convergence analysis of the algorithm, both in its standard version as for the other. In order to evaluate the performance of a non-homogeneous algorithm tests will be applied to compare the standard fuzzy algorithm with the genetic algorithm, and the rate of change adjusted by a fuzzy controller. To do so, pick up optimization problems whose number of solutions varies exponentially with the number of variables
Resumo:
Lung cancer is one of the most common types of cancer and has the highest mortality rate. Patient survival is highly correlated with early detection. Computed Tomography technology services the early detection of lung cancer tremendously by offering aminimally invasive medical diagnostic tool. However, the large amount of data per examination makes the interpretation difficult. This leads to omission of nodules by human radiologist. This thesis presents a development of a computer-aided diagnosis system (CADe) tool for the detection of lung nodules in Computed Tomography study. The system, called LCD-OpenPACS (Lung Cancer Detection - OpenPACS) should be integrated into the OpenPACS system and have all the requirements for use in the workflow of health facilities belonging to the SUS (Brazilian health system). The LCD-OpenPACS made use of image processing techniques (Region Growing and Watershed), feature extraction (Histogram of Gradient Oriented), dimensionality reduction (Principal Component Analysis) and classifier (Support Vector Machine). System was tested on 220 cases, totaling 296 pulmonary nodules, with sensitivity of 94.4% and 7.04 false positives per case. The total time for processing was approximately 10 minutes per case. The system has detected pulmonary nodules (solitary, juxtavascular, ground-glass opacity and juxtapleural) between 3 mm and 30 mm.
Resumo:
ARAUJO, Márcio V. ; ALSINA, Pablo J. ; MEDEIROS, Adelardo A. D. ; PEREIRA, Jonathan P.P. ; DOMINGOS, Elber C. ; ARAÚJO, Fábio M.U. ; SILVA, Jáder S. . Development of an Active Orthosis Prototype for Lower Limbs. In: INTERNATIONAL CONGRESS OF MECHANICAL ENGINEERING, 20., 2009, Gramado, RS. Proceedings… Gramado, RS: [s. n.], 2009
Resumo:
This work uses computer vision algorithms related to features in the identification of medicine boxes for the visually impaired. The system is for people who have a disease that compromises his vision, hindering the identification of the correct medicine to be ingested. We use the camera, available in several popular devices such as computers, televisions and phones, to identify the box of the correct medicine and audio through the image, showing the poor information about the medication, such: as the dosage, indication and contraindications of the medication. We utilize a model of object detection using algorithms to identify the features in the boxes of drugs and playing the audio at the time of detection of feauteres in those boxes. Experiments carried out with 15 people show that where 93 % think that the system is useful and very helpful in identifying drugs for boxes. So, it is necessary to make use of this technology to help several people with visual impairments to take the right medicine, at the time indicated in advance by the physician