944 results for Visual Speech Recognition, Multiple Views, Frontal View, Profile View





This perspectives paper and its associated commentaries examine Alan Rugman's conceptual contribution to international business scholarship. Most significantly, we highlight Rugman's version of internalization theory as an approach that integrates transaction cost economics and ‘classical’ internalization theory with elements from the resource-based view, such that it is especially relevant to strategic management. In reviewing his oeuvre, we also offer observations on his ideas for ‘new internalization theory’. We classify his other novel insights into four categories: Network Multinationals; National competitiveness; Development and public policy; and Emerging Economy MNEs. This special section offers multiple views on how his work informed the larger academic debate and considers how these ideas might evolve in the longer term.





Background: Real-time voice processing is challenging. A drawback of previous work on Hypokinetic Dysarthria (HKD) recognition is its requirement for controlled settings in a laboratory environment. A personal digital assistant (PDA) has been developed for home assessment of PD patients. The PDA offers sound processing capabilities, which allow a module for the recognition and quantification of HKD to be developed. Objective: To compose an algorithm for assessing PD speech severity in the home environment based on a review synthesis. Methods: A two-tier review methodology is used. The first tier focuses on real-time problems in speech detection. In the second tier, acoustic features that are robust to medication changes in Levodopa-responsive patients are investigated for HKD recognition. Keywords such as Hypokinetic Dysarthria and Speech recognition in real time were used in the search engines. IEEE Xplore produced the most useful search hits compared with Google Scholar, ELIN, EBRARY, PubMed and LIBRIS. Results: Vowel and consonant formants are the most relevant acoustic parameters for reflecting PD medication changes. Since the relevant speech segments (consonants and vowels) contain a minority of the speech energy, intelligibility can be improved by amplifying the voice signal using amplitude compression. Pause detection and peak-to-average power rate calculations for voice segmentation produce rich voice features in real time. Voice segmentation can be enhanced by introducing the zero-crossing rate (ZCR): consonants have a high ZCR, whereas vowels have a low ZCR. The wavelet transform was found to be promising for voice analysis since it quantizes non-stationary voice signals over time series using scale and translation parameters. In this way, voice intelligibility in the waveforms can be analyzed in each time frame. Conclusions: This review evaluated HKD recognition algorithms to develop a tool for PD speech home assessment using modern mobile technology.
An algorithm that tackles real-time constraints in HKD recognition, based on the review synthesis, is proposed. We suggest that speech features may be further processed using wavelet transforms and used with a neural network for detection and quantification of speech anomalies related to PD. Based on this model, patients' speech can be automatically categorized according to UPDRS speech ratings.
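
The zero-crossing-rate cue mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the frame length, the hop size, and the 0.25 threshold are all assumptions chosen for illustration, and a real system would tune them on recorded speech.

```python
import math


def zero_crossing_rate(frame):
    """Return the fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def split_into_frames(samples, frame_len, hop):
    """Cut the signal into overlapping fixed-length frames."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def label_frames(samples, frame_len=200, hop=100, zcr_threshold=0.25):
    """Label each frame as high-ZCR (consonant-like) or low-ZCR (vowel-like).

    The threshold is a hypothetical value, not one taken from the paper.
    """
    labels = []
    for frame in split_into_frames(samples, frame_len, hop):
        is_high = zero_crossing_rate(frame) > zcr_threshold
        labels.append("high-ZCR" if is_high else "low-ZCR")
    return labels
```

A low-frequency tone stands in for a vowel and a high-frequency tone for a consonant when trying this out; real consonants are noise-like rather than tonal, so the sketch only shows the direction of the effect.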





This paper maps the current debates surrounding school-based and university-based teacher education models, and presents a ‘multiple-space’ model of teacher education that both explores and values the many ‘forgotten’ spaces that teachers work in. It draws from a variety of research studies, including my own doctoral work, to argue for a new approach to teacher education programs. I suggest that in order for teacher education to move beyond separatist, binary models, we need to adopt a ‘multiple-space’ view of learning to be a teacher that embraces the notion that teachers do not learn about theory in a university space, nor do they simply work in a classroom space.





Speaker recognition is the process of automatically recognizing the speaker by analyzing individual information contained in the speech waves. In this paper, we discuss the development of an intelligent system for text-dependent speaker recognition. The system comprises two main modules: a wavelet-based signal-processing module for extracting features from speech waves, and an artificial-neural-network-based classifier module to identify and categorize the speakers. Wavelets are used for de-noising and compressing the speech signals. The wavelet family that we used is the Daubechies wavelets. After the necessary features were extracted from the speech waves, they were fed to a neural-network-based classifier to identify the speakers. We have implemented the Fuzzy ARTMAP (FAM) network in the classifier module to categorize the de-noised and compressed signals. The proposed intelligent learning system has been applied to a case study of a text-dependent speaker recognition problem.
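
As a rough illustration of wavelet de-noising, the sketch below applies a single-level Haar transform (the simplest member of the Daubechies family, often written db1) followed by soft thresholding of the detail coefficients. The abstract does not say which Daubechies order, decomposition depth, or thresholding rule was used, so those choices and all function names here are assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_dwt(signal):
    """Single-level Haar transform: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt, interleaving the reconstructed sample pairs."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / SQRT2)
        signal.append((a - d) / SQRT2)
    return signal


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """Suppress small detail coefficients, then reconstruct the signal."""
    approx, detail = haar_dwt(signal)
    return haar_idwt(approx, soft_threshold(detail, threshold))
```

Keeping only the approximation coefficients halves the number of values stored, which is one simple way the same transform can serve compression as well as de-noising.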





Super-resolution is a method of post-processing image enhancement that increases the spatial resolution of video or images. Existing super-resolution techniques apply only to images captured of a planar scene. This paper aims to extend super-resolution concepts from the 2D domain to the 3D domain, drawing on ideas from both super-resolution and multi-view geometry, two fields of research that until now have predominantly been studied in isolation. 2D super-resolution methods are not without their complexities and limitations. However, once multiple views of a scene are considered within a super-resolution framework, a new range of issues arises that must also be resolved. For example, when input images of a scene with variation in depth are considered, it is no longer clear how and where the images should be registered. This paper describes the use of sparse 3D reconstruction in order to ‘register’ the input images, which are then transferred to a novel image plane and combined to increase the perceived detail in the scene. Experimental results using real images captured from generally positioned input cameras are presented.





The work described in this thesis aims to support the distributed design of integrated systems and considers specifically the need for collaborative interaction among designers. Particular emphasis was given to issues which were only marginally considered in previous approaches, such as the abstraction of the distribution of design automation resources over the network, the possibility of both synchronous and asynchronous interaction among designers, and the support for extensible design data models. Such issues demand a rather complex software infrastructure, as possible solutions must encompass a wide range of software modules: from user interfaces to middleware to databases. To build such a structure, several engineering techniques were employed and some original solutions were devised. The core of the proposed solution is based on the joint application of two homonymic technologies: CAD Frameworks and object-oriented frameworks. The former concept was coined in the late 1980s within the electronic design automation community and comprises a layered software environment which aims to support CAD tool developers, CAD administrators/integrators and designers. The latter, developed during the last decade by the software engineering community, is a software architecture model for building extensible and reusable object-oriented software subsystems. In this work, we proposed to create an object-oriented framework which includes extensible sets of design data primitives and design tool building blocks. This object-oriented framework is included within a CAD Framework, where it plays important roles in typical CAD Framework services such as design data representation and management, versioning, user interfaces, design management and tool integration.
The implemented CAD Framework - named Cave2 - followed the classical layered architecture presented by Barnes, Harrison, Newton and Spickelmier, but the possibilities granted by the use of the object-oriented framework foundations allowed a series of improvements which were not available in previous approaches: - object-oriented frameworks are extensible by design, so the same should hold for the implemented sets of design data primitives and design tool building blocks. This means that both the design representation model and the software modules dealing with it can be upgraded or adapted to a particular design methodology, and that such extensions and adaptations will still inherit the architectural and functional aspects implemented in the object-oriented framework foundation; - the design semantics and the design visualization are both part of the object-oriented framework, but in clearly separated models. This allows for different visualization strategies for a given design data set, which gives collaborating parties the flexibility to choose individual visualization settings; - the control of the consistency between semantics and visualization - a particularly important issue in a design environment with multiple views of a single design - is also included in the foundations of the object-oriented framework. This mechanism is generic enough to be used by further extensions of the design data model as well, as it is based on the inversion of control between view and semantics. The view receives the user input and propagates this event to the semantic model, which evaluates whether a state change is possible. If so, it triggers the change of state of both semantics and view.
Our approach took advantage of this inversion of control and included a layer between semantics and view to take into account the possibility of multi-view consistency; - to optimize the consistency control mechanism between views and semantics, we propose an event-based approach that captures each discrete interaction of a designer with his/her respective design views. The information about each interaction is encapsulated inside an event object, which may be propagated to the design semantics - and thus to other possible views - according to the consistency policy in use. Furthermore, the use of event pools allows for a late synchronization between view and semantics when no network connection between them is available; - the use of proxy objects significantly raised the abstraction of the integration of design automation resources, as both remote and local tools and services are accessed through method calls on a local object. The connection to remote tools and services using a look-up protocol also completely abstracted the network location of such resources, allowing for resource addition and removal at runtime; - the implemented CAD Framework is completely based on Java technology, so it relies on the Java Virtual Machine as the layer which grants the independence between the CAD Framework and the operating system. All these improvements contributed to a higher level of abstraction in the distribution of design automation resources and also introduced a new paradigm for the remote interaction between designers. The resulting CAD Framework is able to support fine-grained collaboration based on events, so every single design update performed by a designer can be propagated to the rest of the design team regardless of their location in the distributed environment.
This can increase group awareness and allow a richer transfer of experiences among designers, significantly improving the collaboration potential when compared to previously proposed file-based or record-based approaches. Three different case studies were conducted to validate the proposed approach, each one focusing on a subset of the contributions of this thesis. The first one uses the proxy-based resource distribution architecture to implement a prototyping platform using reconfigurable hardware modules. The second one extends the foundations of the implemented object-oriented framework to support interface-based design. Such extensions - design representation primitives and tool blocks - are used to implement a design entry tool named IBlaDe, which allows the collaborative creation of functional and structural models of integrated systems. The third case study concerns the possibility of integrating multimedia metadata into the design data model. This possibility is explored in the frame of an online educational and training platform.
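
The event-based consistency mechanism described in this abstract (a view forwards user input to the semantic model, which validates it and then updates every attached view, with an event pool for late synchronization) can be sketched as follows. This is an illustrative toy in Python, not the Cave2 API, which the abstract says is built on Java; every class, method, and field name below is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignEvent:
    """One discrete designer interaction (hypothetical structure)."""
    source_view: str
    key: str
    value: int


class SemanticModel:
    """Owns the design state and decides whether a change is allowed."""

    def __init__(self, validator):
        self.state = {}
        self.views = []
        self._validator = validator

    def attach(self, view):
        self.views.append(view)

    def handle(self, event):
        # Inversion of control: the model, not the view, commits the change.
        if not self._validator(event):
            return False
        self.state[event.key] = event.value
        for view in self.views:
            view.refresh(event)
        return True


class View:
    """Displays the design and forwards user input as events."""

    def __init__(self, name, model):
        self.name = name
        self.model = model
        self.shown = {}
        self.connected = True
        self.pool = []  # events held back while disconnected
        model.attach(self)

    def user_input(self, key, value):
        event = DesignEvent(self.name, key, value)
        if self.connected:
            self.model.handle(event)
        else:
            self.pool.append(event)

    def reconnect(self):
        # Late synchronization: replay pooled events in their original order.
        self.connected = True
        pending, self.pool = self.pool, []
        for event in pending:
            self.model.handle(event)

    def refresh(self, event):
        self.shown[event.key] = event.value
```

Because a view never writes to its own display directly, a rejected event leaves every view unchanged, which is what keeps multiple views of one design consistent in this sketch.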





The results of the analyses carried out on these data indicated significant differences in the increase in the amplitude of the nasal horizontal meridian plane of the monocular visual field, measured in angular units. The differences were interpreted as indicating the influence of the three different levels of complexity of the visual stimuli. It was therefore concluded that the collative variable of complexity influences the perceptual act of visual recognition.





Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)





This paper presents some results of the application of Evolvable Hardware (EHW) in the area of voice recognition. Evolvable Hardware is able to change its inner connections using genetic learning techniques, adapting its own functionality to changes in external conditions. This technique became feasible through improvements in Programmable Logic Devices. Nowadays, it is possible for a single device to change part of its own circuit on-line and in real time. This work proposes a reconfigurable architecture for a system that is able to receive voice commands to execute special tasks, such as helping handicapped persons with their daily home routines. The idea is to collect several voice samples and process them through algorithms based on mel-cepstral theory to obtain numerical coefficients for each sample, which compose the search space used by the genetic algorithm. The voice patterns considered are limited to seven sustained Portuguese vowel phonemes (a, eh, e, i, oh, o, u).





An intelligent system that emulates human decision behaviour based on visual data acquisition is proposed. The approach is useful in applications where images are used to supply information to specialists who will choose suitable actions. An artificial neural classifier aids a fuzzy decision support system to deal with uncertainty and imprecision present in available information. Advantages of both techniques are exploited complementarily. As an example, this method was applied in automatic focus checking and adjustment in video monitor manufacturing. Copyright © 2005 IFAC.





The applications of Automatic Vowel Recognition (AVR), a sub-part of fundamental importance in most speech processing systems, range from automatic interpretation of spoken language to biometrics. State-of-the-art systems for AVR are based on traditional machine learning models such as Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs). However, such classifiers cannot deliver efficiency and effectiveness at the same time, leaving a gap to be explored when real-time processing is required. In this work, we present an algorithm for AVR based on the Optimum-Path Forest (OPF), an emergent pattern recognition technique recently introduced in the literature. Adopting a supervised training procedure and using speech tags from two public datasets, we observed that OPF outperformed ANNs, SVMs, and other classifiers in terms of training time and accuracy. ©2010 IEEE.





Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)





This work addresses the application of the characteristics of an augmented reality environment, such as easy and intuitive interaction and a large space for data visualization, to the implementation of, interaction with, and visualization of multiple coordinated data views. Multiple data views allow the user to analyze the data better from different aspects, and the coordination among the multiple views aims to reduce the cognitive overload placed on the user. The augmented environment was built with ARToolKit, and interaction takes place through an interface based on marker cards. The technique implemented was the 3D scatter plot, accompanied by a variety of filters and settings for the data views. Finally, some preliminary usability tests of the developed prototype are presented.





Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)





This text is part of the theoretical reflections of our postdoctoral research, carried out at the Laboratório de Antropologia Visual of the Universidade Aberta de Portugal, which addressed intercultural aspects of the photoethnographic study of advertising and food consumption in Brazil and Portugal. Here we highlight the aspects concerning the contributions of semiotics to the study of food advertising communications. The proposal is to understand the models of semiotic analysis of advertising as a means of operationalizing thick description, from the ethnographic perspective, through the interdisciplinary interface with the production of meaning in advertising images in the field of food, presenting the analysis of a Gallo olive oil advertisement as an example.