66 results for Visual Speaker Recognition, Visual Speech Recognition, Cascading Appearance-Based Features


Relevance:

60.00%

Publisher:

Abstract:

International exhibitions were greatly responsible for the modernization of Western society. The motive for these events was the possibility of enhancing the host country’s international status. The genesis of world exhibitions lay in the conviction that humanity as a whole would improve through the continual flow of new practical applications, in the development of modern communication techniques, and in the social need for a medium that could acquaint the general public with changes in technology, economy and society.
Since the first national industrial exhibitions in Paris during the eighteenth century, and especially since the first Great Exhibition in London’s Hyde Park in 1851, these international events spread steadily all over Europe and the United States, reaching Latin America at the beginning of the twentieth century. The work of professionals such as Daniel Burnham, Werner Hegemann and Elbert Peets tied exhibitions and urban transformation more closely together, setting a precedent for subsequent exhibitions.
In Buenos Aires, the celebration of the centennial of independence from Spain in 1910 had many meanings and repercussions. A series of factors allowed for a moment of change in the city. Official optimism, economic progress, inequality and social conflict made this a suitable time for transformation. With the organization of the Exposición Internacional the government had, among others, one specific aim: to build a network of visual tools that would foster a feeling of belonging and provide an identity for the mixture of cultures that populated the city of Buenos Aires at the time. Another important objective of the government was to put Buenos Aires on the level of European cities.
Foreign professionals had a great influence on the conceptual and factual shaping of the exhibition and on the subsequent changes in the urban condition. The exhibition played an important role in the ways of thinking about the city and in the ideas of leisure it introduced. As a didactic tool, it served as a precedent for conceiving future leisure spaces. Urban and landscape planners such as Joseph Bouvard and Charles Thays were instrumental in much of the design of the Exhibition, but it was not only the architects and designers who shaped the identity of the fair. Visitors such as Jules Huret and Georges Clemenceau were responsible for giving the city an international image it did not previously have.
This paper will explore, on the one hand, the significance of the exhibition of 1910 for the shaping of the city and its image and, on the other hand, the role of foreign professionals and the reach of their influence.

Relevance:

60.00%

Publisher:

Abstract:

While current speech recognisers give acceptable performance in carefully controlled environments, their performance degrades rapidly when they are applied in more realistic situations. Environmental noise can generally be divided into two classes: wide-band noise and narrow-band noise. While the multi-band model has been shown to be capable of dealing with speech corrupted by narrow-band noise, it is ineffective for wide-band noise. In this paper, we suggest combining the frequency-filtering technique with the probabilistic union model in the multi-band approach. The new system has been tested on the TIDIGITS database corrupted, in turn, by white noise, noise collected from a railway station, and narrow-band noise. The results show that this approach is capable of dealing with noise of narrow-band or wide-band characteristics, assuming no knowledge about the noisy environment.
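The abstract does not give the formula for the probabilistic union model, so the following is only a minimal sketch of one common formulation, stated here as an assumption: with N sub-bands and an assumed number M of corrupted bands (the "order"), the score is the sum, over every subset of N - M bands, of the product of the per-band likelihoods. The function name `union_log_likelihood` and its parameters are hypothetical and are not taken from the paper.

```python
from itertools import combinations
import math


def union_log_likelihood(band_log_likes, order):
    """Log of an order-`order` union score over sub-band likelihoods.

    Assumed formulation (not confirmed by the abstract): sum over all
    subsets of (N - order) bands of the product of their likelihoods,
    computed in the log domain for numerical stability.
    """
    n = len(band_log_likes)
    keep = n - order
    if not 1 <= keep <= n:
        raise ValueError("order must satisfy 0 <= order < number of bands")
    # One log-domain product per subset of retained bands.
    terms = [
        sum(band_log_likes[i] for i in subset)
        for subset in combinations(range(n), keep)
    ]
    # Log-sum-exp over the subset terms.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Three bands, the third one badly corrupted (tiny likelihood).
scores = [math.log(0.5), math.log(0.4), math.log(1e-9)]
print(union_log_likelihood(scores, 0))  # plain product of all bands
print(union_log_likelihood(scores, 1))  # tolerates one corrupted band
```

With order 0 the score reduces to the ordinary full-band product, so a single corrupted band drags it down. With order 1 the sum is dominated by the product of the two clean bands, without needing to know which band was corrupted.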

Relevance:

60.00%

Publisher:

Abstract:

This paper presents a novel method of audio-visual fusion for person identification where both the speech and facial modalities may be corrupted, and there is a lack of prior knowledge about the corruption. Furthermore, we assume there is a limited amount of training data for each modality (e.g., a short training speech segment and a single training facial image for each person). A new representation and a modified cosine similarity are introduced for combining and comparing bimodal features with limited training data as well as vastly differing data rates and feature sizes. Optimal feature selection and multicondition training are used to reduce the mismatch between training and testing, thereby making the system robust to unknown bimodal corruption. Experiments have been carried out on a bimodal data set created from the SPIDRE and AR databases with variable noise corruption of speech and occlusion in the face images. The new method has demonstrated improved recognition accuracy.

Relevance:

60.00%

Publisher:

Abstract:

To use virtual reality as a sport analysis tool, we need to be sure that an immersed athlete reacts realistically in a virtual environment. This has been validated for a real handball goalkeeper facing a virtual thrower. However, we do not yet know which visual variables induce realistic motor behavior in the immersed handball goalkeeper. In this study, we used virtual reality to dissociate the visual information related to the movements of the player from the visual information related to the trajectory of the ball. The aim is to evaluate the relative influence of these different sources of visual information on the goalkeeper's motor behavior. We tested 10 handball goalkeepers who had to predict the final position of the virtual ball in the goal when facing the following: only the throwing action of the attacking player (TA condition), only the resulting ball trajectory (BA condition), and both the throwing action of the attacking player and the resulting ball trajectory (TB condition). Here we show that performance was better in the BA and TB conditions but, contrary to expectations, substantially worse in the TA condition. A significant effect of ball landing zone does, however, suggest that the relative importance of visual information from the player and from the ball depends on the targeted zone in the goal. In some cases, body-based cues embedded in the throwing actions may have a minor influence on the ball trajectory and vice versa. Kinematic analysis was then combined with these results to determine why such differences occur depending on the ball landing zone, and consequently how this can clarify the role of different sources of visual information in the motor behavior of an athlete immersed in a virtual environment.

Relevance:

60.00%

Publisher:

Abstract:

This paper investigated the use of lip movements as a behavioural biometric for person authentication. The system was trained, evaluated and tested using the XM2VTS dataset, following the Lausanne Protocol configuration II. Features were selected from the DCT coefficients of the greyscale lip image. The paper investigated the number of DCT coefficients selected, the selection process, and static and dynamic feature combinations. Using a Gaussian Mixture Model - Universal Background Model framework, an Equal Error Rate of 2.20% was achieved during evaluation, and on an unseen test set a False Acceptance Rate of 1.7% and a False Rejection Rate of 3.0% were achieved. This compares favourably with face authentication results on the same dataset whilst not being susceptible to spoofing attacks.
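The error rates quoted above can be illustrated with a small sketch. It assumes that higher scores mean "more likely genuine" and that the decision threshold is chosen where False Acceptance Rate and False Rejection Rate are closest on an evaluation set. The function names and the toy scores are hypothetical and do not come from the paper.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Scan observed scores as thresholds; return (EER estimate, threshold).

    The estimate is the mean of FAR and FRR at the threshold where
    the two rates are closest to each other.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy evaluation-set scores (made up for illustration only).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(eer, threshold)
```

A threshold fixed this way on evaluation data can then be passed to `far_frr` with scores from a held-out test set, which is one plausible reading of why the abstract reports an Equal Error Rate for evaluation but separate acceptance and rejection rates for the test set.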

Relevance:

50.00%

Publisher:

Abstract:

The authors are concerned with the development of computer systems that are capable of using information from faces and voices to recognise people's emotions in real-life situations. The paper addresses the nature of the challenges that lie ahead, and provides an assessment of the progress that has been made in the areas of signal processing and analysis techniques (with regard to speech and face), and the psychological and linguistic analyses of emotion. Ongoing developmental work by the authors in each of these areas is described.