956 results for computer science, artificial Intelligence


Relevance:

100.00% 100.00%

Publisher:

Abstract:

We propose an algorithm for semantic segmentation based on 3D point clouds derived from ego-motion. We motivate five simple cues designed to model specific patterns of motion and 3D world structure that vary with object category. We introduce features that project the 3D cues back to the 2D image plane while modeling spatial layout and context. A randomized decision forest combines many such features to achieve a coherent 2D segmentation and recognize the object categories present. Our main contribution is to show how semantic segmentation is possible based solely on motion-derived 3D world structure. Our method works well on sparse, noisy point clouds, and unlike existing approaches, does not need appearance-based descriptors. Experiments were performed on a challenging new video database containing sequences filmed from a moving car in daylight and at dusk. The results confirm that indeed, accurate segmentation and recognition are possible using only motion and 3D world structure. Further, we show that the motion-derived information complements an existing state-of-the-art appearance-based method, improving both qualitative and quantitative performance. © 2008 Springer Berlin Heidelberg.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Density modeling is notoriously difficult for high dimensional data. One approach to the problem is to search for a lower dimensional manifold which captures the main characteristics of the data. Recently, the Gaussian Process Latent Variable Model (GPLVM) has successfully been used to find low dimensional manifolds in a variety of complex data. The GPLVM consists of a set of points in a low dimensional latent space, and a stochastic map to the observed space. We show how it can be interpreted as a density model in the observed space. However, the GPLVM is not trained as a density model and therefore yields poor density estimates. We propose a new training strategy and obtain improved generalisation performance and better density estimates in comparative evaluations on several benchmark data sets. © 2010 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we present the process of designing an efficient speech corpus for the first unit selection speech synthesis system for Bulgarian, along with some significant preliminary results regarding the quality of the resulting system. As the initial corpus is a crucial factor for the quality delivered by the Text-to-Speech system, special effort has been devoted to designing a complete and efficient corpus for use in a unit selection TTS system. The targeted domain of the TTS system, and hence that of the corpus, is news reports, and although it is a restricted one, it is characterized by an unlimited vocabulary. The paper focuses on issues regarding the design of an optimal corpus for such a framework and the ideas on which our approach was based. A novel multi-stage approach is presented, with special attention given to language and speaker dependent issues, as they affect the entire process. The paper concludes with the presentation of our results and the evaluation experiments, which provide clear evidence of the quality level achieved. © 2011 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We introduce a new algorithm to automatically identify the time and pixel location of foot contact events in high speed video of sprinters. We use this information to autonomously synchronise and overlay multiple recorded performances to provide feedback to athletes and coaches during their training sessions. The algorithm exploits the variation in speed of different parts of the body during sprinting. We use an array of foreground accumulators to identify short-term static pixels and a temporal analysis of the associated static regions to identify foot contacts. We evaluated the technique using 13 videos of three sprinters. It successfully identified 55 of the 56 contacts, with a mean localisation error of 1.39±1.05 pixels. Some videos were also seen to produce additional, spurious contacts. We present heuristics to help identify the true contacts. © 2011 Springer-Verlag Berlin Heidelberg.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Current state-of-the-art techniques for determination of the change in volume of human chests, used in lung-function measurement, calculate the volume bounded by a reconstructed chest surface and its projection on to an approximately parallel static plane over a series of time instants. This method works well so long as the subject does not move globally relative to the reconstructed surface's co-ordinate system. In practice this means the subject has to be braced, which restricts the technique's use. We present here a method to compensate for global motion of the subject, allowing accurate measurement while free-standing, and also while undergoing intentional motion. © 2012 Springer-Verlag.
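The baseline measurement described in this abstract (volume between a reconstructed surface and a static reference plane, tracked over time) can be illustrated with a minimal sketch. It assumes the surface is available as a regular height map above the plane z = 0; the function names, the grid spacing parameters `dx` and `dy`, and the synthetic frames are hypothetical, and the sketch does not include the paper's global-motion compensation.

```python
import numpy as np

def volume_above_plane(height_map, dx, dy):
    """Approximate the volume between a height map and the plane z = 0.

    Each grid cell contributes height * cell area (a simple Riemann sum).
    """
    return float(np.sum(height_map) * dx * dy)

def volume_change(frames, dx, dy):
    """Volume of each frame relative to the first frame in the sequence."""
    volumes = np.array([volume_above_plane(f, dx, dy) for f in frames])
    return volumes - volumes[0]

# Synthetic example: a flat surface that rises from 1.0 to 1.5 units.
frames = [np.full((10, 10), 1.0), np.full((10, 10), 1.5)]
deltas = volume_change(frames, dx=0.1, dy=0.1)
```

If the subject moves relative to the reference plane, the heights change even when the chest volume does not, which is the limitation the abstract sets out to address.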

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We propose a new learning method to infer a mid-level feature representation that combines the advantage of semantic attribute representations with the higher expressive power of non-semantic features. The idea lies in augmenting an existing attribute-based representation with additional dimensions for which an autoencoder model is coupled with a large-margin principle. This construction allows a smooth transition between the zero-shot regime with no training examples, the unsupervised regime with training examples but without class labels, and the supervised regime with training examples and with class labels. The resulting optimization problem can be solved efficiently, because several of the necessary steps have closed-form solutions. Through extensive experiments we show that the augmented representation achieves better results in terms of object categorization accuracy than the semantic representation alone. © 2012 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents a heterogeneous reconfigurable system for real-time applications applying particle filters. The system consists of an FPGA and a multi-threaded CPU. We propose a method to adapt the number of particles dynamically and utilise the run-time reconfigurability of the FPGA for reduced power and energy consumption. An application is developed which involves simultaneous mobile robot localisation and people tracking. It shows that the proposed adaptive particle filter can reduce computation time by up to 99%. Using run-time reconfiguration, we achieve a 34% reduction in idle power and save 26-34% of system energy. Our proposed system is up to 7.39 times faster and 3.65 times more energy efficient than the Intel Xeon X5650 CPU with 12 threads, and 1.3 times faster and 2.13 times more energy efficient than an NVIDIA Tesla C2070 GPU. © 2013 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The protection of the environment against pollutants produced by aviation is of great concern in the 21st century. Among the multiplicity of proposed solutions, modifying flight profiles for existing aircraft is a promising approach. The aim is to deliver and understand the trade-off between environmental impact and operating costs. This work illustrates the optimisation process of aircraft trajectories by minimising fuel consumption and flight time for the climb phase of an aircraft that belongs to the A320 category. To achieve this purpose a new variant of a multi-objective Tabu Search optimiser was evolved and integrated within a computational framework, called GATAC, that simulates flight profiles based on altitude and speed. © 2013 Springer-Verlag.
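The trade-off between two minimised objectives, such as fuel consumption and flight time, is commonly expressed as a Pareto front. The sketch below shows only that dominance test, not the Tabu Search optimiser or the GATAC framework; the function names and the (fuel, time) values are made up for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimised)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# Hypothetical (fuel, time) pairs for candidate climb profiles.
candidates = [(100, 20), (90, 25), (110, 18), (105, 22), (95, 30)]
front = pareto_front(candidates)
```

Here (105, 22) is dominated by (100, 20) and (95, 30) by (90, 25), so three non-dominated trade-offs remain.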

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Due to its importance, video segmentation has regained interest recently. However, there is no common agreement about the necessary ingredients for best performance. This work contributes a thorough analysis of various within- and between-frame affinities suitable for video segmentation. Our results show that a frame-based superpixel segmentation combined with a few motion and appearance-based affinities is sufficient to obtain good video segmentation performance. A second contribution of the paper is the extension of [1] to include motion cues, which makes the algorithm globally aware of motion, thus improving its performance for video sequences. Finally, we contribute an extension of an established image segmentation benchmark [1] to videos, allowing coarse-to-fine video segmentations and multiple human annotations. Our results are tested on BMDS [2], and compared to existing methods. © 2013 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Many visual datasets are traditionally used to analyze the performance of different learning techniques. The evaluation is usually done within each dataset, so it is questionable whether such results are a reliable indicator of true generalization ability. We propose here an algorithm to exploit the existing data resources when learning on a new multiclass problem. Our main idea is to identify an image representation that decomposes orthogonally into two subspaces: a part specific to each dataset, and a part generic to, and therefore shared between, all the considered source sets. This allows us to use the generic representation as unbiased reference knowledge for a novel classification task. By casting the method in the multi-view setting, we also make it possible to use different features for different databases. We call the algorithm MUST, Multitask Unaligned Shared knowledge Transfer. Through extensive experiments on five public datasets, we show that MUST consistently improves the cross-dataset generalization performance. © 2013 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper discusses user target intention recognition algorithms for point-and-click tasks, which aim to reduce users' pointing time and difficulty. Target prediction by comparing the bearing angles to targets, proposed as one of the first algorithms [1], is compared with a Kalman Filter prediction algorithm. Accuracy and sensitivity of prediction are used as performance criteria. The outcomes of a standard point and click experiment, collected from both able-bodied and impaired users, are used for performance comparison. © 2013 Springer-Verlag Berlin Heidelberg.
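One plausible reading of bearing-angle prediction is to pick the target whose direction from the cursor deviates least from the cursor's current direction of movement. The abstract does not give the details of the algorithm in [1], so the sketch below is an assumption; the function names and coordinates are hypothetical.

```python
import math

def bearing_angle(prev, curr, target):
    """Angle in radians (0 to pi) between the cursor's movement direction
    (prev -> curr) and the direction from the cursor to the target."""
    move = math.atan2(curr[1] - prev[1], curr[0] - prev[0])
    to_target = math.atan2(target[1] - curr[1], target[0] - curr[0])
    diff = abs(move - to_target) % (2 * math.pi)
    return min(diff, 2 * math.pi - diff)

def predict_target(prev, curr, targets):
    """Index of the target with the smallest bearing angle."""
    return min(range(len(targets)),
               key=lambda i: bearing_angle(prev, curr, targets[i]))

# The cursor moves to the right, so the target lying almost straight ahead wins.
targets = [(0, 10), (10, 1), (-5, 0)]
predicted = predict_target((0, 0), (1, 0), targets)
```

A Kalman Filter approach would instead estimate the cursor's state over time and extrapolate it; that variant is not sketched here.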

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper reports a survey on people with age-related and physical impairments in India. The survey evaluates functional parameters related to human computer interaction and reports subjective attitude and exposure of users towards technology. We found a significant cognitive decline in elderly users, while their functional parameters are sufficient to use existing electronic devices. However, young disabled users were found to be experienced with computers but did not have access to appropriate assistive devices, which would benefit them. Most users used desktop computers and mobile phones but none used tablets, smartphones or kiosks, though they are keen to learn new technologies. Overall we hope that our results will be useful for HCI practitioners in developing countries. © 2013 Springer-Verlag Berlin Heidelberg.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present Multi Scale Shape Index (MSSI), a novel feature for 3D object recognition. Inspired by the scale space filtering theory and the Shape Index measure proposed by Koenderink & Van Doorn [6], this feature associates different forms of shape, such as umbilics, saddle regions and parabolic regions, with a real-valued index. This association is useful for representing an object based on its constituent shape forms. We derive closed form scale space equations that compute a characteristic scale at each 3D point in a point cloud without an explicit mesh structure. This characteristic scale is then used to estimate the Shape Index. We quantitatively evaluate the robustness and repeatability of the MSSI feature for varying object scales and changing point cloud density. We also quantify the performance of MSSI for object category recognition on a publicly available dataset. © 2013 Springer-Verlag.
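For orientation, the Shape Index of Koenderink & Van Doorn maps the two principal curvatures k1 >= k2 at a surface point to a value in [-1, 1] via s = (2/pi) * arctan((k2 + k1) / (k2 - k1)). The sketch below evaluates only that formula, assuming the principal curvatures are already known; it does not cover the paper's scale-space estimation, and the function name is hypothetical. Sign conventions depend on the orientation of the surface normal.

```python
import math

def shape_index(k1, k2):
    """Shape Index from two principal curvatures.

    Returns a value in [-1, 1], or NaN for a planar point (both
    curvatures zero), where the index is undefined.
    """
    k1, k2 = max(k1, k2), min(k1, k2)  # enforce k1 >= k2
    if k1 == 0 and k2 == 0:
        return float("nan")
    # arctan((k2 + k1) / (k2 - k1)) rewritten with atan2 so that the
    # umbilic case k1 == k2 (zero denominator) is handled as a limit.
    return -(2.0 / math.pi) * math.atan2(k1 + k2, k1 - k2)

saddle = shape_index(1.0, -1.0)   # symmetric saddle: index 0
umbilic = shape_index(1.0, 1.0)   # umbilic point: index of magnitude 1
ridge = shape_index(1.0, 0.0)     # parabolic point: index of magnitude 0.5
```

Umbilics sit at the two ends of the range, symmetric saddles at zero, and parabolic points at plus or minus one half, which is how a single real value can label the shape forms listed in the abstract.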

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents an insight into leather manufacturing processes, depicting peculiarities and challenges faced by the leather industry. An analysis of this industry reveals the need for a new approach to optimize the productivity of leather processing operations, ensure consistent quality of leather, mitigate the adverse health effects in tannery workers exposed to chemicals and comply with environmental regulation. The holonic manufacturing systems (HMS) paradigm represents a bottom-up distributed approach that provides stability, adaptability, efficient use of resources and a plug and operate functionality to the manufacturing system. A vision of how HMS might operate in a tannery is illustrated, presenting the rationales behind its application in this industry. © 2013 Springer-Verlag.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents an overview of the Text-to-Speech synthesis system developed at the Institute for Language and Speech Processing (ILSP). It focuses on the key issues regarding the design of the system components. The system currently fully supports three languages (Greek, English, Bulgarian) and is designed to be as language and speaker independent as possible. Also, experimental results are presented which show that the system produces high quality synthetic speech in terms of naturalness and intelligibility. The system was recently ranked among the first three systems worldwide in terms of achieved quality for the English language, at the international Blizzard Challenge 2013 workshop. © 2014 Springer International Publishing.