904 results for Audio-Visual Automatic Speech Recognition
Abstract:
New low-cost sensors and new free and open libraries for 3D image processing are enabling important advances in robot vision applications such as three-dimensional object recognition, semantic mapping, robot navigation and localization, human detection, and gesture recognition for human-machine interaction. In this paper, a method to recognize the human hand and track the fingers is proposed. This new method is based on point clouds from RGBD range images. It does not require visual markers, camera calibration, knowledge of the environment, or complex and expensive acquisition systems. Furthermore, this method has been implemented to create a human interface for moving a robot hand. The human hand is recognized and the movement of the fingers is analyzed. Afterwards, the movement is imitated by a Barrett hand, using communication events programmed in ROS.
Abstract:
This paper presents a method for the fast calculation of a robot’s egomotion using visual features. The method is part of a complete system for automatic map building and Simultaneous Localization and Mapping (SLAM). The method uses optical flow to determine whether the robot has undergone a movement. If so, visual features that do not satisfy several criteria are deleted, and then egomotion is calculated. Thus, the proposed method improves the efficiency of the whole process because not all the data is processed. We use a state-of-the-art algorithm (TORO) to rectify the map and solve the SLAM problem. Additionally, a study of different visual detectors and descriptors has been conducted to identify which of them are more suitable for the SLAM problem. Finally, a navigation method is described using the map obtained from the SLAM solution.
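The gating step described above (check optical flow for motion, then discard features before estimating egomotion) might be sketched as follows. The abstract does not state the actual filtering criteria, so the median-based thresholds, the function names, and the sample coordinates below are all assumptions made for illustration only.

```python
import math
from statistics import median

def flow_vectors(prev_pts, curr_pts):
    """Per-feature displacement (dx, dy) between two consecutive frames."""
    return [(cx - px, cy - py) for (px, py), (cx, cy) in zip(prev_pts, curr_pts)]

def has_moved(flows, min_median_px=1.0):
    """Hypothetical gate: report motion when the median flow magnitude exceeds a threshold."""
    mags = [math.hypot(dx, dy) for dx, dy in flows]
    return median(mags) > min_median_px

def keep_consistent(flows, max_dev_px=2.0):
    """Hypothetical criterion: keep indices of features whose flow is close to the median flow."""
    mx = median(dx for dx, _ in flows)
    my = median(dy for _, dy in flows)
    return [i for i, (dx, dy) in enumerate(flows)
            if math.hypot(dx - mx, dy - my) <= max_dev_px]

# Made-up feature positions; the last feature is deliberately inconsistent.
prev_pts = [(10.0, 10.0), (50.0, 20.0), (30.0, 40.0), (70.0, 70.0)]
curr_pts = [(13.0, 10.0), (53.0, 20.5), (33.0, 39.5), (60.0, 90.0)]

flows = flow_vectors(prev_pts, curr_pts)
moved = has_moved(flows)
# Only the surviving features would be passed on to egomotion estimation.
inliers = keep_consistent(flows) if moved else []
```

Skipping frames with no detected motion and dropping inconsistent features are both ways of processing less data, which is the efficiency argument the abstract makes.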
Abstract:
In this project, we propose the implementation of a 3D object recognition system optimized to operate under demanding time constraints. The system must be robust so that objects can be recognized properly in poor lighting conditions and in cluttered scenes with significant levels of occlusion. An important requirement must be met: the system must deliver reasonable performance on a low-power mobile GPU computing platform (NVIDIA Jetson TK1) so that it can be integrated into mobile robotics systems and ambient intelligence or ambient assisted living applications. The acquisition system is based on color and depth (RGB-D) data streams provided by low-cost 3D sensors such as the Microsoft Kinect or PrimeSense Carmine. The range of algorithms and applications to be implemented and integrated will be broad, from acquisition, outlier removal, and filtering of the input data, through segmentation and characterization of regions of interest in the scene, to object recognition and pose estimation themselves. Furthermore, in order to validate the proposed system, we will create a 3D object dataset. It will be composed of a set of 3D models reconstructed from common household objects, as well as a handful of test scenes in which those objects appear. The scenes will be characterized by different levels of occlusion, varying distances from the elements to the sensor, and variations in the pose of the target objects. Creating this dataset requires the additional development of 3D data acquisition and 3D object reconstruction applications. The resulting system has many possible applications, ranging from mobile robot navigation and semantic scene labeling to human-computer interaction (HCI) systems based on visual information.
Abstract:
Sensing techniques are important for resolving the uncertainty inherent in intelligent grasping tasks. The main goal here is to present a visual sensing system based on range imaging technology for robot manipulation of non-rigid objects. Our proposal provides a visual perception system for complex grasping tasks that supports a robot controller when other sensor systems, such as tactile and force sensors, cannot obtain data relevant to the grasping task. In particular, a new visual approach based on RGBD data was implemented to help a robot controller carry out intelligent manipulation tasks with flexible objects. The proposed method supervises the interaction between the grasped object and the robot hand in order to avoid poor contact between the fingertips and the object when neither force nor pressure data are available. This new approach is also used to measure changes in the shape of an object’s surfaces, which allows us to find deformations caused by inappropriate pressure applied by the hand’s fingers. Tests were carried out on grasping tasks involving several flexible household objects with a multi-fingered robot hand working in real time. Our approach generates pulses from the deformation detection method and sends an event message to the robot controller when surface deformation is detected. In comparison with other methods, the results show that our visual pipeline does not use deformation models of objects and materials, and that the approach works well with both planar and 3D household objects in real time. In addition, our method does not depend on the pose of the robot hand, because the location of the reference system is computed by recognizing a pattern placed on the robot forearm. The experiments presented demonstrate that the proposed method monitors grasping tasks well for several objects and different grasping configurations in indoor environments.
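The pulse-and-event behaviour described above could look roughly like the sketch below. It is not the authors' pipeline: the mean absolute depth deviation, the 5 mm threshold, the function names, and the depth samples are all hypothetical choices, used only to show a pulse train turning into controller events on each rising edge.

```python
def mean_abs_deviation(reference, current):
    """Mean absolute difference between reference and current depth samples (metres)."""
    return sum(abs(c - r) for r, c in zip(reference, current)) / len(reference)

def deformation_pulses(reference, frames, threshold_m=0.005):
    """Return one pulse per frame: 1 if the surface deviates beyond the threshold, else 0."""
    return [1 if mean_abs_deviation(reference, f) > threshold_m else 0 for f in frames]

def rising_edges(pulses):
    """Frame indices where the pulse train goes from 0 to 1 (one event message per onset)."""
    return [i for i, p in enumerate(pulses)
            if p == 1 and (i == 0 or pulses[i - 1] == 0)]

# Made-up depth samples on the object's surface, in metres.
reference = [0.50, 0.50, 0.50, 0.50]
frames = [
    [0.50, 0.50, 0.50, 0.50],    # unchanged
    [0.501, 0.50, 0.499, 0.50],  # sensor-noise-sized change
    [0.49, 0.48, 0.49, 0.50],    # surface pushed in by the fingers
    [0.49, 0.48, 0.49, 0.50],    # still deformed
]

pulses = deformation_pulses(reference, frames)
events = rising_edges(pulses)  # frames at which the controller would be notified
```

Sending a message only on the rising edge keeps the controller from being flooded while the deformation persists.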
Abstract:
Hearing impairment affects millions of people worldwide and gives rise to a range of problems, particularly psychosocial ones, that compromise an individual's quality of life. Hearing impairment influences behaviour, particularly by making communication difficult. With technological progress, assistive products, in particular hearing aids and cochlear implants, improve that quality of life by improving communication. Assessment scales let us determine how hearing impairment influences daily life, with or without amplification, and how it affects an individual's psychosocial, emotional, or professional performance; this information is important for determining the need for, and the success of, amplification, regardless of the type and degree of hearing impairment. The aim of the present study was to translate and adapt to Portuguese culture The Speech, Spatial and Qualities of Hearing Scale (SSQ), developed by Stuart Gatehouse and William Noble in 2004. The work was carried out at the hearing centres of Widex Portugal. After the translation and back-translation procedures, the Portuguese version was tested on 12 individuals aged between 36 and 80 years, of whom 6 had used a hearing aid for more than one year, one had used a hearing aid for less than one year, and 5 had never used one. With the translation and cultural adaptation to European Portuguese of the “Questionário sobre as Qualidades Espaciais do Discurso – SSQ”, we contribute to a better assessment of individuals who are, or will be, undergoing auditory rehabilitation programmes.
Abstract:
The objective of this study was to compare onset of deep and superficial cervical flexor muscle activity during rapid, unilateral arm movements between ten patients with chronic neck pain and 12 control subjects. Deep cervical flexor (DCF) electromyographic activity (EMG) was recorded with custom electrodes inserted via the nose and fixed by suction to the posterior mucosa of the oropharynx. Surface electrodes were placed over the sternocleidomastoid (SCM) and anterior scalene (AS) muscles. While standing, subjects flexed and extended the right arm in response to a visual stimulus. For the control group, activation of DCF, SCM and AS muscles occurred less than 50 ms after the onset of deltoid activity, which is consistent with feedforward control of the neck during arm flexion and extension. When subjects with a history of neck pain flexed the arm, the onsets of DCF and contralateral SCM and AS muscles were significantly delayed (p<0.05). It is concluded that the delay in neck muscle activity associated with movement of the arm in patients with neck pain indicates a significant deficit in the automatic feedforward control of the cervical spine. As the deep cervical muscles are fundamentally important for support of the cervical lordosis and the cervical joints, change in the feedforward response may leave the cervical spine vulnerable to reactive forces from arm movement.
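The comparison above rests on the latency between deltoid onset and neck-muscle onset. A minimal sketch of that arithmetic follows, assuming a simple threshold detector (baseline mean plus three standard deviations) that the abstract does not specify. The function names, sampling rate, and synthetic signals are hypothetical; only the 50 ms window comes from the abstract.

```python
from statistics import mean, stdev

FEEDFORWARD_WINDOW_MS = 50.0  # window quoted in the abstract for the control group

def onset_index(signal, baseline_n, k=3.0):
    """First sample after the baseline window exceeding baseline mean + k * SD, or None."""
    base = signal[:baseline_n]
    threshold = mean(base) + k * stdev(base)
    for i in range(baseline_n, len(signal)):
        if signal[i] > threshold:
            return i
    return None

def onset_latency_ms(muscle, deltoid, baseline_n, sample_rate_hz):
    """Muscle onset minus deltoid onset, in milliseconds (None if either onset is missing)."""
    m = onset_index(muscle, baseline_n)
    d = onset_index(deltoid, baseline_n)
    if m is None or d is None:
        return None
    return (m - d) * 1000.0 / sample_rate_hz

def synthetic(burst_at, n=300):
    """Made-up rectified EMG: a noisy baseline followed by a burst starting at `burst_at`."""
    return [(1.0 if i % 2 == 0 else 1.2) if i < burst_at else 5.0 for i in range(n)]

deltoid = synthetic(150)
neck_control = synthetic(180)  # onset 30 samples after the deltoid
neck_delayed = synthetic(230)  # onset 80 samples after the deltoid

latency_control = onset_latency_ms(neck_control, deltoid, baseline_n=100, sample_rate_hz=1000)
latency_delayed = onset_latency_ms(neck_delayed, deltoid, baseline_n=100, sample_rate_hz=1000)
within_window = latency_control < FEEDFORWARD_WINDOW_MS
```

In practice, onset detection on real EMG usually involves filtering and visual verification; the threshold rule here only illustrates how a latency is derived from two onsets.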
Abstract:
To investigate the effects of dopamine on the dynamics of semantic activation, 39 healthy volunteers were randomly assigned to ingest either a placebo (n = 24) or a levodopa (n = 16) capsule. Participants then performed a lexical decision task that implemented a masked priming paradigm. Direct and indirect semantic priming were measured across stimulus onset asynchronies (SOAs) of 250, 500 and 1200 ms. The results revealed significant direct and indirect semantic priming effects for the placebo group at SOAs of 250 ms and 500 ms, but no significant direct or indirect priming effects at the 1200 ms SOA. In contrast, the levodopa group showed significant direct and indirect semantic priming effects at the 250 ms SOA, while no significant direct or indirect priming effects were evident at the SOAs of 500 ms or 1200 ms. These results suggest that dopamine has a role in modulating both automatic and attentional aspects of semantic activation according to a specific time course. The implications of these results for current theories of dopaminergic modulation of semantic activation are discussed.
Abstract:
Drawing from ethnographic, empirical, and historical/cultural perspectives, we examine the extent to which visual aspects of music contribute to the communication that takes place between performers and their listeners. First, we introduce a framework for understanding how media and genres shape aural and visual experiences of music. Second, we present case studies of two performances, and describe the relation between visual and aural aspects of performance. Third, we report empirical evidence that visual aspects of performance reliably influence perceptions of musical structure (pitch related features) and affective interpretations of music. Finally, we trace new and old media trajectories of aural and visual dimensions of music, and highlight how our conceptions, perceptions and appreciation of music are intertwined with technological innovation and media deployment strategies.
Abstract:
Previously it has been shown that the branching pattern of pyramidal cells varies markedly between different cortical areas in simian primates. These differences are thought to influence the functional complexity of the cells. In particular, there is a progressive increase in the fractal dimension of pyramidal cells with anterior progression through cortical areas in the occipitotemporal (OT) visual stream, including the primary visual area (V1), the second visual area (V2), the dorsolateral area (DL, corresponding to the fourth visual area) and inferotemporal cortex (IT). However, there are as yet no data on the fractal dimension of these neurons in prosimian primates. Here we focused on the nocturnal prosimian galago (Otolemur garnetti). The fractal dimension (D) and aspect ratio (a measure of branching symmetry) were determined for layer III pyramidal cells in V1, V2, DL and IT. We found, as in simian primates, that the fractal dimension of neurons increased with anterior progression from V1 through V2, DL, and IT. Two important conclusions can be drawn from these results: (1) the trend for increasing branching complexity with anterior progression through OT areas was likely to be present in a common primate ancestor, and (2) specialization in neuron structure more likely facilitates object recognition than spectral processing.
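The abstract does not say how the fractal dimension D was estimated. One common estimator is box counting, sketched below as an assumption and not as the authors' method: count the grid boxes N(s) occupied by the traced structure at several box sizes s, and take the slope of log N(s) against log(1/s). The function names and test shapes are hypothetical; a real analysis would use points sampled from a dendritic tracing.

```python
import math

def box_count(points, size):
    """Number of grid boxes of side `size` that contain at least one point."""
    return len({(math.floor(x / size), math.floor(y / size)) for x, y in points})

def box_counting_dimension(points, sizes):
    """Least-squares slope of log N(s) against log(1/s) over the given box sizes."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

sizes = [1 / 4, 1 / 8, 1 / 16, 1 / 32]

# Sanity checks on shapes with known dimension: a line segment and a filled square.
line = [(i / 1024, 0.0) for i in range(1024)]
square = [(i / 64, j / 64) for i in range(64) for j in range(64)]

d_line = box_counting_dimension(line, sizes)
d_square = box_counting_dimension(square, sizes)
```

A branching structure that fills more of the plane gives a value between these two extremes, which is why D is used as a measure of branching complexity.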