873 results for Audio-visual Speech Recognition, Visual Feature Extraction, Free-parts, Monolithic, ROI


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Malayalam is one of the 22 scheduled languages in India with more than 130 million speakers. This paper reports on the development of a speaker-independent, continuous transcription system for Malayalam. The system employs Hidden Markov Models (HMMs) for acoustic modeling and Mel Frequency Cepstral Coefficients (MFCCs) for feature extraction. It is trained with 21 male and female speakers aged 20 to 40 years. When tested on a set of continuous speech data, the system obtained a word recognition accuracy of 87.4% and a sentence recognition accuracy of 84%.
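The abstract does not say how the two accuracy figures were computed. As a hedged illustration only, the sketch below uses one common convention: word accuracy as 1 minus the word-level edit distance divided by the reference length, and sentence accuracy as the fraction of exactly matching sentences. The function names are hypothetical and are not taken from the paper.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Assumed metric: 1 - (substitutions + deletions + insertions) / N."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


def sentence_accuracy(references, hypotheses):
    """Fraction of sentences transcribed without any word error."""
    pairs = list(zip(references, hypotheses))
    correct = sum(1 for r, h in pairs if r.split() == h.split())
    return correct / len(pairs)
```

With this convention, one substituted word in a four-word reference gives a word accuracy of 0.75, while the sentence counts as wrong for sentence accuracy.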

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we propose a handwritten character recognition system for the Malayalam language. The feature extraction phase consists of gradient and curvature calculation, followed by dimensionality reduction using Principal Component Analysis. Directional information from the arc tangent of the gradient is used as the gradient feature. The strength of the gradient in the curvature direction is used as the curvature feature. The proposed system uses a combination of the gradient and curvature features, in reduced dimension, as the feature vector. For classification, the discriminative power of the Support Vector Machine (SVM) is evaluated. The results reveal that an SVM with a Radial Basis Function (RBF) kernel yields the best performance, with accuracies of 96.28% and 97.96% on two different datasets. This is the highest accuracy reported so far on these datasets.
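The gradient feature described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: central differences as the gradient operator, 8 direction bins, and magnitude-weighted voting are all assumed, and the curvature feature and PCA step are omitted.

```python
import math


def gradient_direction_histogram(image, bins=8):
    """Magnitude-weighted histogram of gradient directions.

    `image` is a list of rows of grey levels. Border pixels are skipped
    so that central differences are always defined.
    """
    rows, cols = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Direction from the arc tangent of the gradient, in [0, 2*pi).
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = min(int(angle / (2 * math.pi) * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist
```

For an image whose intensity rises only from left to right, every gradient points along the positive x axis, so all of the weight falls into the first bin.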

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents an efficient Online Handwritten character Recognition System for Malayalam Characters (OHR-M) using a Kohonen network. It helps recognize Malayalam text entered using pen-like devices, which offer users a more natural and efficient way to enter text than a keyboard and mouse. To distinguish between similar characters in Malayalam, a novel feature extraction method has been adopted: a combination of a context bitmap and normalized (x, y) coordinates. The writer-independent system reported an accuracy of 88.75%, with a recognition time of 15-32 milliseconds.
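The abstract does not describe how the (x, y) coordinates are normalized or how the context bitmap is built. As a hedged sketch of the first part only, the hypothetical helper below translates a pen stroke to the origin and scales it by its larger side, which preserves the aspect ratio; that choice is an assumption, not the paper's stated method.

```python
def normalize_stroke(points):
    """Map pen samples into the unit square, preserving aspect ratio.

    `points` is a list of (x, y) tuples in device coordinates.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    min_x, min_y = min(xs), min(ys)
    # Scale by the larger side; fall back to 1.0 for a single-point stroke.
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    return [((x - min_x) / span, (y - min_y) / span) for x, y in points]
```

Normalizing this way makes the same character written at different sizes or screen positions produce comparable coordinate sequences.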

Relevance:

100.00% 100.00%

Publisher:

Abstract:

To create audio-visual material. To improve the quality of teaching. To study the use of audio-visual programmes in the classroom. To find a methodology suited to the didactic use of audio-visual media. To check the differences that may exist between different audio-visual media (slides versus video). The sample consists of the pupils of three second-year BUP classes at the Colegio Escoles Pies de Sarrià (Barcelona): 102 subjects in total, all of whom studied first-year BUP at the same school. The theoretical framework is set out. The variables are described (audio-visual media, school performance, prior school performance, methodology, intelligence, social class, teacher and age). The sample is described. The sample is divided into three classes (no audio-visual medium, video, slides). The audio-visual material is produced. The relevant sessions are held in each class. The objective test is administered. The data are analysed. Conclusions and alternatives are offered. Objective achievement test. Differential aptitude test (Test d'aptituds diferencials). Scale of previous marks. Difference of means, descriptive statistics, analysis of variance and the Scheffé test, to establish whether there are differences between the groups that worked with an audio-visual medium, a visual medium and no audio-visual medium. The experimental methodology applied did not produce the expected results; there are grounds to state that uncontrolled factors external to the experiment intervened. Pupils showed great interest in the use of video as a motivating element. The importance of working in this field by creating suitable active methodologies and series of valid programmes is stressed. Intensive research into the possibilities and effects of such methodologies is needed.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

It has been shown through a number of experiments that neural networks can be used for a phonetic typewriter. Algorithms can be looked on as producing self-organizing feature maps which correspond to phonemes. In the Chinese language the utterance of a Chinese character consists of a very simple string of Chinese phonemes. With this as a starting point, a neural network feature map for Chinese phonemes can be built up. In this paper, feature map structures for Chinese phonemes are discussed and tested. This research on a Chinese phonetic feature map is important both for Chinese speech recognition and for building a Chinese phonetic typewriter.
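To make the idea of a self-organizing feature map concrete, here is a minimal sketch of Kohonen-style training on a one-dimensional map. The map topology, the linear decay schedule, the Gaussian neighbourhood and all parameter values are assumptions chosen for illustration; the abstract does not specify the structures actually tested.

```python
import math
import random


def best_matching_unit(weights, sample):
    """Index of the unit whose weight vector is closest to the sample."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - s) ** 2 for w, s in zip(weights[i], sample)),
    )


def train_som(samples, n_units, epochs=50, lr=0.5, radius=1.0, seed=0):
    """Train a 1-D self-organizing map and return the unit weight vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs       # shrinks from 1 towards 0
        sigma = max(radius * decay, 0.01)  # neighbourhood width
        rate = lr * decay                  # learning rate
        for sample in samples:
            bmu = best_matching_unit(weights, sample)
            for i, unit in enumerate(weights):
                # Units near the winner on the map are pulled more strongly.
                influence = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(dim):
                    unit[d] += rate * influence * (sample[d] - unit[d])
    return weights
```

After training, each input vector (for example an acoustic feature frame) is labelled by the index of its best matching unit, so neighbouring units on the map tend to respond to similar sounds.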

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The academic discipline of television studies has been constituted by the claim that television is worth studying because it is popular. Yet this claim has also entailed a need to defend the subject against the triviality that is associated with the television medium because of its very popularity. This article analyses the many attempts in the later twentieth and twenty-first centuries to constitute critical discourses about television as a popular medium. It focuses on how the theoretical currents of Television Studies emerged and changed in the UK, where a disciplinary identity for the subject was founded by borrowing from related disciplines, yet argued for the specificity of the medium as an object of criticism. Eschewing technological determinism, moral pathologization and sterile debates about television's supposed effects, UK writers such as Raymond Williams addressed television as an aspect of culture. Television theory in Britain has been part of, and also separate from, the disciplinary fields of media theory, literary theory and film theory. It has focused its attention on institutions, audio-visual texts, genres, authors and viewers according to the ways that research problems and theoretical inadequacies have emerged over time. But a consistent feature has been the problem of moving from a descriptive discourse to an analytical and evaluative one, and from studies of specific texts, moments and locations of television to larger theories. By discussing some historically significant critical work about television, the article considers how academic work has constructed relationships between the different kinds of objects of study. The article argues that a fundamental tension between descriptive and politically activist discourses has confused academic writing about ›the popular‹. 
Television study in Britain arose not to supply graduate professionals to the television industry, nor to perfect the instrumental techniques of allied sectors such as advertising and marketing, but to analyse and critique the medium's aesthetic forms and to evaluate its role in culture. Since television cannot be made by ›the people‹, the empowerment that discourses of television theory and analysis aimed for was focused on disseminating the tools for critique. Recent developments in factual entertainment television (in Britain and elsewhere) have greatly increased the visibility of ›the people‹ in programmes, notably in docusoaps, game shows and other participative formats. This has led to renewed debates about whether such ›popular‹ programmes appropriately represent ›the people‹ and how factual entertainment that is often despised relates to genres hitherto considered to be of high quality, such as scripted drama and socially-engaged documentary television. A further aspect of this problem of evaluation is how television globalisation has been addressed, and the example that the issue has crystallised around most is the reality TV contest Big Brother. Television theory has been largely based on studying the texts, institutions and audiences of television in the Anglophone world, and thus in specific geographical contexts. The transnational contexts of popular television have been addressed as spaces of contestation, for example between Americanisation and national or regional identities. Commentators have been ambivalent about whether the discipline's role is to celebrate or critique television, and whether to do so within a national, regional or global context. 
In the discourses of the television industry, ›popular television‹ is a quantitative and comparative measure, and because of the overlap between the programming with the largest audiences and the scheduling of established programme types at the times of day when the largest audiences are available, it has a strong relationship with genre. The measurement of audiences and the design of schedules are carried out in predominantly national contexts, but the article refers to programmes like Big Brother that have been broadcast transnationally, and programmes that have been extensively exported, to consider in what ways they too might be called popular. Strands of work in television studies have at different times attempted to diagnose what is at stake in the most popular programme types, such as reality TV, situation comedy and drama series. This has centred on questions of how aesthetic quality might be discriminated in television programmes, and how quality relates to popularity. The interaction of the designations ›popular‹ and ›quality‹ is exemplified in the ways that critical discourse has addressed US drama series that have been widely exported around the world, and the article shows how the two critical terms are both distinct and interrelated. In this context and in the article as a whole, the aim is not to arrive at a definitive meaning for ›the popular‹ inasmuch as it designates programmes or indeed the medium of television itself. Instead the aim is to show how, in historically and geographically contingent ways, these terms and ideas have been dynamically adopted and contested in order to address a multiple and changing object of analysis.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Multispectral iris recognition uses information from multiple bands of the electromagnetic spectrum to better represent certain physiological characteristics of the iris texture and to enhance recognition accuracy. This paper addresses the question of single- versus cross-spectral performance and compares score-level fusion accuracy for different feature types, combining different wavelengths to overcome limitations in less constrained recording environments. It further investigates whether Doddington's “goats” (users who are particularly difficult to recognize) in one spectrum also extend to other spectra. Focusing on the question of feature stability at different wavelengths, this work uses manual ground truth segmentation, avoiding bias from segmentation errors. Experiments on the public UTIRIS multispectral iris dataset using 4 feature extraction techniques reveal a significant enhancement when combining NIR + Red for 2-channel and NIR + Red + Blue for 3-channel fusion, across different feature types. Selective feature-level fusion is investigated and shown to improve overall and especially cross-spectral performance without increasing the overall length of the iris code.
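Score-level fusion, as mentioned above, combines the comparison scores produced separately for each spectral channel. The abstract does not state which fusion rule was used, so the sketch below assumes min-max normalization followed by a weighted sum rule; the function names and the equal default weights are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale one channel's scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse_scores(score_lists, weights=None):
    """Weighted sum-rule fusion of per-channel score lists.

    `score_lists[c][k]` is the score of comparison k in channel c
    (for example NIR, red, blue). All channels must cover the same
    comparisons in the same order.
    """
    n_channels = len(score_lists)
    if weights is None:
        weights = [1.0 / n_channels] * n_channels
    normalized = [min_max_normalize(scores) for scores in score_lists]
    n_scores = len(normalized[0])
    return [
        sum(w * channel[k] for w, channel in zip(weights, normalized))
        for k in range(n_scores)
    ]
```

Normalizing first matters because different feature extractors produce scores on different scales, and an unnormalized sum would be dominated by the channel with the largest range.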

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper investigates the potential of fusion at normalisation/segmentation level prior to feature extraction. While there are several biometric fusion methods at data/feature level, score level and rank/decision level, combining raw biometric signals, scores, or ranks/decisions, this type of fusion is still in its infancy. However, the increasing demand for more relaxed and less invasive recording conditions, especially for on-the-move iris recognition, suggests that fusion at this very low level merits further investigation. This paper focuses on multi-segmentation fusion for iris biometric systems, investigating the benefit of combining the segmentation results of multiple normalisation algorithms, using four methods from two different public iris toolkits (USIT, OSIRIS) on the public CASIA and IITD iris datasets. Evaluations based on recognition accuracy and ground truth segmentation data indicate high sensitivity with regard to the type of errors made by segmentation algorithms.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This bachelor's thesis examines the film sound in the war films Apocalypse Now and Saving Private Ryan. The aim is to contribute to a better understanding of the uses and functions of film sound, primarily in the films in question but also in war films in general. Film sound in this context comprises all the sound in a film, excluding all non-diegetic music. Both films were examined through an audio-visual analysis. Such an analysis scrutinises the sound and image content of each film separately and then examines the same film sequence as a whole, with sound and image put back together. The audio-visual analysis method used in the thesis is Michel Chion's method, Masking. The 30 minutes of film analysed were then divided into different film-sound zones, where the sound content of each zone showed, among other things, the main functions that the film sound had in these films. These functions are to maintain the viewer's focus and interest, to create closeness to the characters, and to add a strong sense of realism and presence. The intention behind the film sound appeared to be to move the viewer into the film's reality and let the viewer become one with the film. Conveying this sense of realism, presence, focus and interest also turned out to be the intention already present in the pre-production stages of both films. This shows that the filmmakers achieved what they set out to do. Whether film sound is used in the same way, or has the same functions, in war films in general cannot be determined from this study.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The project introduces an application that uses computer vision for hand gesture recognition. A camera records a live video stream, from which a snapshot is taken through the interface. The system is trained at least once for each type of counting hand gesture (one, two, three, four, and five). A test gesture is then given to the system, which tries to recognize it. A number of algorithms were investigated to find the one that best differentiates hand gestures, and the diagonal sum algorithm was found to give the highest accuracy. In the preprocessing phase, a self-developed algorithm removes the background of each training gesture. The image is then converted into a binary image and the sums of all diagonal elements of the picture are taken. These sums are used to differentiate and classify the hand gestures. Previous systems have used data gloves or markers for input. This system imposes no such constraints: the user can make hand gestures naturally in view of the camera. A completely robust hand gesture recognition system is still under heavy research and development; the implemented system serves as an extensible foundation for future work.
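The diagonal sum step can be sketched as below. The abstract leaves the details open, so this is one plausible reading rather than the project's code: one sum per diagonal of the binary image, with a test gesture assigned to the training gesture whose diagonal sums are closest in L1 distance. Both the per-diagonal interpretation and the nearest-template rule are assumptions.

```python
def diagonal_sums(binary_image):
    """Sum the pixels along every top-left to bottom-right diagonal.

    `binary_image` is a list of rows containing 0 or 1. The result has
    rows + cols - 1 entries, ordered from the bottom-left diagonal to
    the top-right one.
    """
    rows, cols = len(binary_image), len(binary_image[0])
    sums = [0] * (rows + cols - 1)
    for r in range(rows):
        for c in range(cols):
            sums[c - r + rows - 1] += binary_image[r][c]
    return sums


def classify(test_image, templates):
    """Return the label of the stored template nearest to the test image.

    `templates` maps a gesture label to the diagonal sums of its
    training image (all images are assumed to share one size).
    """
    feature = diagonal_sums(test_image)
    return min(
        templates,
        key=lambda label: sum(
            abs(a - b) for a, b in zip(feature, templates[label])
        ),
    )
```

Usage: build `templates` once from the background-removed, binarized training snapshots, then call `classify` on each binarized test snapshot.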

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Skin cancer is the most common of all cancers, and the increase in its incidence is due in part to people's behaviour regarding sun exposure. In Brazil, non-melanoma skin cancer is the most frequent type in most regions. Dermatoscopy and videodermatoscopy are the main types of examination for diagnosing dermatological diseases of the skin. The use of computational tools to support or follow up the medical diagnosis of dermatological lesions is a very recent field. Some methods have been proposed for the automatic classification of skin pathologies from images. The present work presents a new intelligent methodology for the analysis and classification of skin cancer images, based on digital image processing techniques for extracting colour, shape and texture characteristics, using the Wavelet Packet Transform (WPT) and the learning technique known as the Support Vector Machine (SVM). The Wavelet Packet Transform is applied to extract texture characteristics from the images. The WPT consists of a set of basis functions that represent the image in different frequency bands, each with a distinct resolution corresponding to its scale. In addition, the colour characteristics of the lesion are computed; these depend on the visual context and are influenced by the colours in the surrounding area. The shape attributes are computed through Fourier descriptors. The Support Vector Machine is used for the classification task; it is based on the structural risk minimization principle from statistical learning theory. The SVM aims to construct optimal hyperplanes that represent the separation between classes. The generated hyperplane is determined by a subset of the training samples, called support vectors. For the database used in this work, the results showed good performance, with an overall accuracy of 92.73% for melanoma and 86% for non-melanoma and benign lesions. The extracted descriptors and the SVM classifier form a method capable of recognizing and classifying the analysed skin lesions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Feature selection has been actively pursued in recent years, since finding the most discriminative set of features can enhance recognition rates and also make feature extraction faster. In this paper, we propose a new feature selection technique called Binary Cuckoo Search, which is based on the behavior of cuckoo birds. The experiments were carried out in the context of theft detection in power distribution systems on two datasets obtained from a Brazilian electrical power company, and have demonstrated the robustness of the proposed technique against several other nature-inspired optimization techniques. © 2013 IEEE.
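The abstract gives no algorithmic detail, so the following is only a rough sketch of how a binary variant of cuckoo search for feature selection might look. It assumes Lévy-flight steps generated with Mantegna's algorithm, a sigmoid transfer function to turn continuous nest positions into 0/1 feature masks, and abandonment of the worst fraction `pa` of nests. All names, parameter values and the toy fitness function used in the usage note are hypothetical.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """One Levy-distributed step (Mantegna's algorithm)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma = (num / den) ** (1 / beta)
    u, v = rng.gauss(0, sigma), rng.gauss(0, 1)
    return u / (abs(v) or 1e-12) ** (1 / beta)


def binarize(position, rng):
    """Sigmoid transfer: each coordinate becomes 1 with probability S(x)."""
    return [1 if rng.random() < 1 / (1 + math.exp(-x)) else 0
            for x in position]


def binary_cuckoo_search(fitness, n_features, n_nests=10, iterations=50,
                         pa=0.25, alpha=1.0, seed=0):
    """Maximize `fitness(mask)` over 0/1 masks of length n_features."""
    rng = random.Random(seed)

    def new_nest():
        position = [rng.uniform(-1, 1) for _ in range(n_features)]
        mask = binarize(position, rng)
        return position, mask, fitness(mask)

    nests = [new_nest() for _ in range(n_nests)]
    for _ in range(iterations):
        for i in range(n_nests):
            # Clamp positions so the sigmoid never overflows.
            candidate = [max(-10.0, min(10.0, x + alpha * levy_step(rng)))
                         for x in nests[i][0]]
            mask = binarize(candidate, rng)
            score = fitness(mask)
            j = rng.randrange(n_nests)  # compare with a random nest
            if score > nests[j][2]:
                nests[j] = (candidate, mask, score)
        # Abandon the worst nests and rebuild them at random.
        nests.sort(key=lambda nest: nest[2])
        for k in range(int(pa * n_nests)):
            nests[k] = new_nest()
    best = max(nests, key=lambda nest: nest[2])
    return best[1], best[2]
```

In practice `fitness` would wrap a classifier's validation accuracy on the selected features; for a quick check, a toy fitness such as the number of positions where the mask agrees with a known target mask is enough to see the search run.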

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Graduate Programme in Computer Science - IBILCE

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)