854 results for Artificial intelligence|Computer science


Relevance:

100.00% 100.00%

Publisher:

Abstract:

In spite of over two decades of intense research, illumination and pose invariance remain prohibitively challenging aspects of face recognition for most practical applications. The objective of this work is to recognize faces using video sequences both for training and recognition input, in a realistic, unconstrained setup in which lighting, pose and user motion pattern have a wide variability and face images are of low resolution. In particular there are three areas of novelty: (i) we show how a photometric model of image formation can be combined with a statistical model of generic face appearance variation, learnt offline, to generalize in the presence of extreme illumination changes; (ii) we use the smoothness of geodesically local appearance manifold structure and a robust same-identity likelihood to achieve invariance to unseen head poses; and (iii) we introduce an accurate video sequence "reillumination" algorithm to achieve robustness to face motion patterns in video. We describe a fully automatic recognition system based on the proposed method and an extensive evaluation on 171 individuals and over 1300 video sequences with extreme illumination, pose and head motion variation. On this challenging data set our system consistently demonstrated a nearly perfect recognition rate (over 99.7%), significantly outperforming state-of-the-art commercial software and methods from the literature. © Springer-Verlag Berlin Heidelberg 2006.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. We motivate five simple cues designed to model specific patterns of motion and 3D world structure that vary with object category. We introduce features that project the 3D cues back to the 2D image plane while modeling spatial layout and context. A randomized decision forest combines many such features to achieve a coherent 2D segmentation and recognize the object categories present. Our main contribution is to show how semantic segmentation is possible based solely on motion-derived 3D world structure. Our method works well on sparse, noisy point clouds, and unlike existing approaches, does not need appearance-based descriptors. Experiments were performed on a challenging new video database containing sequences filmed from a moving car in daylight and at dusk. The results confirm that indeed, accurate segmentation and recognition are possible using only motion and 3D world structure. Further, we show that the motion-derived information complements an existing state-of-the-art appearance-based method, improving both qualitative and quantitative performance. © 2008 Springer Berlin Heidelberg.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Density modeling is notoriously difficult for high dimensional data. One approach to the problem is to search for a lower dimensional manifold which captures the main characteristics of the data. Recently, the Gaussian Process Latent Variable Model (GPLVM) has successfully been used to find low dimensional manifolds in a variety of complex data. The GPLVM consists of a set of points in a low dimensional latent space, and a stochastic map to the observed space. We show how it can be interpreted as a density model in the observed space. However, the GPLVM is not trained as a density model and therefore yields bad density estimates. We propose a new training strategy and obtain improved generalisation performance and better density estimates in comparative evaluations on several benchmark data sets. © 2010 Springer-Verlag.
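
One way to read the density interpretation mentioned above is sketched below. It is an assumption for illustration, not necessarily the authors' exact formulation: the latent density is taken to be an equally weighted set of point masses at the N latent points x_n, and \mu(x) and \sigma^2(x) denote the Gaussian process predictive mean and variance at a latent location x. Pushing the point masses through the stochastic map then gives a Gaussian mixture in the observed space.

```latex
p(x) = \frac{1}{N} \sum_{n=1}^{N} \delta(x - x_n),
\qquad
p(y) = \int p(y \mid x)\, p(x)\, dx
     = \frac{1}{N} \sum_{n=1}^{N} \mathcal{N}\!\left(y \,\middle|\, \mu(x_n),\, \sigma^2(x_n)\, I\right)
```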

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we present the process of designing an efficient speech corpus for the first unit selection speech synthesis system for Bulgarian, along with some significant preliminary results regarding the quality of the resulting system. As the initial corpus is a crucial factor for the quality delivered by the Text-to-Speech system, special effort has been put into designing a complete and efficient corpus for use in a unit selection TTS system. The targeted domain of the TTS system, and hence that of the corpus, is news reports; although this domain is restricted, it is characterized by an unlimited vocabulary. The paper focuses on issues regarding the design of an optimal corpus for such a framework and the ideas on which our approach was based. A novel multi-stage approach is presented, with special attention given to language and speaker dependent issues, as they affect the entire process. The paper concludes with the presentation of our results and the evaluation experiments, which provide clear evidence of the quality level achieved. © 2011 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We introduce a new algorithm to automatically identify the time and pixel location of foot contact events in high speed video of sprinters. We use this information to autonomously synchronise and overlay multiple recorded performances to provide feedback to athletes and coaches during their training sessions. The algorithm exploits the variation in speed of different parts of the body during sprinting. We use an array of foreground accumulators to identify short-term static pixels and a temporal analysis of the associated static regions to identify foot contacts. We evaluated the technique using 13 videos of three sprinters. It successfully identified 55 of the 56 contacts, with a mean localisation error of 1.39±1.05 pixels. Some videos were also seen to produce additional, spurious contacts. We present heuristics to help identify the true contacts. © 2011 Springer-Verlag Berlin Heidelberg.
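
The accumulator idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `static_pixel_mask`, the binary foreground masks and the `min_frames` threshold are all hypothetical. Each pixel keeps a counter that grows while the pixel stays foreground and resets otherwise, so a planted foot accumulates a long run while faster-moving body parts do not.

```python
from typing import List

Mask = List[List[int]]  # 1 = foreground, 0 = background (assumed input format)


def static_pixel_mask(frames: List[Mask], min_frames: int) -> Mask:
    """Flag pixels that have been foreground for at least `min_frames`
    consecutive frames, counted up to and including the last frame."""
    rows, cols = len(frames[0]), len(frames[0][0])
    acc = [[0] * cols for _ in range(rows)]  # one accumulator per pixel
    for mask in frames:
        for r in range(rows):
            for c in range(cols):
                # Grow the run while the pixel stays foreground, else reset it.
                acc[r][c] = acc[r][c] + 1 if mask[r][c] else 0
    return [[1 if acc[r][c] >= min_frames else 0 for c in range(cols)]
            for r in range(rows)]
```

A temporal analysis of the flagged regions, which the abstract uses to pick out the actual contacts, is not covered by this sketch.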

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The objective of this study was to determine if the responses of basal forebrain neurons are related to the cognitive processes necessary for the performance of behavioural tasks, or to the hedonic attributes of the reinforcers delivered to the monkey as

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Current state-of-the-art techniques for determination of the change in volume of human chests, used in lung-function measurement, calculate the volume bounded by a reconstructed chest surface and its projection on to an approximately parallel static plane over a series of time instants. This method works well so long as the subject does not move globally relative to the reconstructed surface's co-ordinate system. In practice this means the subject has to be braced, which restricts the technique's use. We present here a method to compensate for global motion of the subject, allowing accurate measurement while free-standing, and also while undergoing intentional motion. © 2012 Springer-Verlag.
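
The baseline volume calculation described above can be sketched as a simple sum over a height map. This is an illustration only, assuming the reconstructed surface has been sampled on a regular grid of distances from the reference plane; the names `volume_above_plane` and `volume_changes` and the grid representation are hypothetical, and the motion compensation that the paper contributes is not shown.

```python
from typing import List

HeightMap = List[List[float]]  # distance of the surface from the plane, per grid cell


def volume_above_plane(heights: HeightMap, cell_area: float) -> float:
    """Approximate the volume between the surface and the plane as a Riemann sum."""
    return cell_area * sum(sum(row) for row in heights)


def volume_changes(height_maps: List[HeightMap], cell_area: float) -> List[float]:
    """Volume at each time instant relative to the first instant."""
    volumes = [volume_above_plane(h, cell_area) for h in height_maps]
    return [v - volumes[0] for v in volumes]
```

Because the heights are measured against a fixed plane, any global movement of the subject changes every height at once, which is why the uncompensated method requires the subject to be braced.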

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We propose a new learning method to infer a mid-level feature representation that combines the advantage of semantic attribute representations with the higher expressive power of non-semantic features. The idea lies in augmenting an existing attribute-based representation with additional dimensions for which an autoencoder model is coupled with a large-margin principle. This construction allows a smooth transition between the zero-shot regime with no training example, the unsupervised regime with training examples but without class labels, and the supervised regime with training examples and with class labels. The resulting optimization problem can be solved efficiently, because several of the necessary steps have closed-form solutions. Through extensive experiments we show that the augmented representation achieves better results in terms of object categorization accuracy than the semantic representation alone. © 2012 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a heterogeneous reconfigurable system for real-time applications applying particle filters. The system consists of an FPGA and a multi-threaded CPU. We propose a method to adapt the number of particles dynamically and utilise the run-time reconfigurability of the FPGA for reduced power and energy consumption. An application is developed which involves simultaneous mobile robot localisation and people tracking. It shows that the proposed adaptive particle filter can reduce computation time by up to 99%. Using run-time reconfiguration, we achieve a 34% reduction in idle power and save 26-34% of system energy. Our proposed system is up to 7.39 times faster and 3.65 times more energy efficient than the Intel Xeon X5650 CPU with 12 threads, and 1.3 times faster and 2.13 times more energy efficient than an NVIDIA Tesla C2070 GPU. © 2013 Springer-Verlag.
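
Adapting the number of particles can be sketched with a common heuristic based on the effective sample size (ESS). The abstract does not state the paper's adaptation rule, so this rule is an assumption, and the names `effective_sample_size`, `next_particle_count`, `target_ess`, `n_min` and `n_max` are hypothetical. The sketch scales the particle count so that the ESS would reach a chosen target, within fixed bounds.

```python
from typing import Sequence


def effective_sample_size(weights: Sequence[float]) -> float:
    """ESS = 1 / sum(w_i^2) for normalised weights; assumes a positive weight sum."""
    total = sum(weights)
    normalised = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalised)


def next_particle_count(weights: Sequence[float], target_ess: float,
                        n_min: int, n_max: int) -> int:
    """Scale the current particle count by target_ess / ESS, clamped to [n_min, n_max]."""
    ess = effective_sample_size(weights)
    proposed = round(len(weights) * target_ess / ess)
    return max(n_min, min(n_max, proposed))
```

With uniform weights the ESS equals the particle count, so the count shrinks towards the target; with weight concentrated on a few particles the ESS is small and the count grows until it hits `n_max`.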

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The protection of the environment against pollutants produced by aviation is of great concern in the 21st century. Among the multiplicity of proposed solutions, modifying flight profiles for existing aircraft is a promising approach. The aim is to deliver and understand the trade-off between environmental impact and operating costs. This work illustrates the optimisation of aircraft trajectories by minimising fuel consumption and flight time for the climb phase of an aircraft that belongs to the A320 category. To achieve this, a new variant of a multi-objective Tabu Search optimiser was developed and integrated within a computational framework, called GATAC, that simulates flight profiles based on altitude and speed. © 2013 Springer-Verlag.
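
The trade-off between the two objectives can be illustrated with a Pareto filter. The sketch below is not the Tabu Search optimiser or the GATAC framework described in the abstract; it only shows the dominance test on which minimising fuel and time together relies. The names `dominates` and `pareto_front` are hypothetical, and the sample values are made up for illustration.

```python
from typing import List, Tuple

Objectives = Tuple[float, float]  # (fuel burned, flight time), both to be minimised


def dominates(a: Objectives, b: Objectives) -> bool:
    """a dominates b if it is no worse in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(candidates: List[Objectives]) -> List[Objectives]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative, invented (fuel, time) pairs for four candidate climb profiles.
profiles = [(100.0, 20.0), (90.0, 25.0), (110.0, 22.0), (95.0, 21.0)]
front = pareto_front(profiles)
```

Here the third profile is dropped because the first uses less fuel and less time; the remaining three each trade fuel against time and so stay on the front.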

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Due to its importance, video segmentation has regained interest recently. However, there is no common agreement about the necessary ingredients for best performance. This work contributes a thorough analysis of various within- and between-frame affinities suitable for video segmentation. Our results show that a frame-based superpixel segmentation combined with a few motion and appearance-based affinities is sufficient to obtain good video segmentation performance. A second contribution of the paper is the extension of [1] to include motion-cues, which makes the algorithm globally aware of motion, thus improving its performance for video sequences. Finally, we contribute an extension of an established image segmentation benchmark [1] to videos, allowing coarse-to-fine video segmentations and multiple human annotations. Our results are tested on BMDS [2], and compared to existing methods. © 2013 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Many visual datasets are traditionally used to analyze the performance of different learning techniques. The evaluation is usually done within each dataset, so it is questionable whether such results are a reliable indicator of true generalization ability. We propose here an algorithm to exploit the existing data resources when learning on a new multiclass problem. Our main idea is to identify an image representation that decomposes orthogonally into two subspaces: a part specific to each dataset, and a part generic to, and therefore shared between, all the considered source sets. This allows us to use the generic representation as unbiased reference knowledge for a novel classification task. By casting the method in the multi-view setting, we also make it possible to use different features for different databases. We call the algorithm MUST, Multitask Unaligned Shared knowledge Transfer. Through extensive experiments on five public datasets, we show that MUST consistently improves the cross-dataset generalization performance. © 2013 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper discusses user target intention recognition algorithms for pointing and clicking tasks, which aim to reduce users' pointing time and difficulty. Target prediction by comparing the bearing angles to targets, proposed as one of the first algorithms [1], is compared with a Kalman Filter prediction algorithm. Accuracy and sensitivity of prediction are used as performance criteria. The outcomes of a standard point and click experiment, collected from both able-bodied and impaired users, are used for performance comparison. © 2013 Springer-Verlag Berlin Heidelberg.
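
Bearing-angle prediction can be sketched as picking the target that lies closest to the cursor's current direction of travel. This is a minimal illustration assuming 2D screen coordinates and a single movement step; the function names `bearing_angle` and `predict_target` are hypothetical, and the algorithm in [1] may accumulate angles over the whole movement rather than use one step. The Kalman filter alternative is not shown.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def bearing_angle(prev: Point, curr: Point, target: Point) -> float:
    """Angle in radians, in [0, pi], between the cursor's motion and the direction to target."""
    move = (curr[0] - prev[0], curr[1] - prev[1])
    to_target = (target[0] - curr[0], target[1] - curr[1])
    diff = math.atan2(to_target[1], to_target[0]) - math.atan2(move[1], move[0])
    # Wrap the difference into [-pi, pi] before taking its magnitude.
    return abs(math.atan2(math.sin(diff), math.cos(diff)))


def predict_target(prev: Point, curr: Point, targets: Sequence[Point]) -> int:
    """Return the index of the target with the smallest bearing angle."""
    return min(range(len(targets)),
               key=lambda i: bearing_angle(prev, curr, targets[i]))
```

A cursor moving from (0, 0) to (1, 0), for instance, is heading almost directly at a target at (10, 0.5), so that target is preferred over ones above or behind the cursor.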

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper reports a survey on people with age-related and physical impairments in India. The survey evaluates functional parameters related to human computer interaction and reports users' subjective attitudes and exposure to technology. We found a significant cognitive decline in elderly users, while their functional parameters are sufficient to use existing electronic devices. However, young disabled users were found to be experienced with computers but lacked access to appropriate assistive devices, which would benefit them. Most users used desktop computers and mobile phones, but none used tablets, smartphones or kiosks, though they are keen to learn new technologies. Overall we hope that our results will be useful for HCI practitioners in developing countries. © 2013 Springer-Verlag Berlin Heidelberg.