883 results for vision-based place recognition


Relevance:

40.00% 40.00%

Publisher:

Abstract:

This book brings together experts in the fields of spatial planning, land use and infrastructure management to explore the emerging agenda of spatially oriented integrated evaluation. It weaves together the latest theories, case studies, methods, policy and practice to examine and assess the values, impacts, benefits and overall success of integrated land-use management. In doing so, the book clarifies the nature and roles of evaluation and puts forward guidance for future policy and practice.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Emerging technologies have opened up a new dimension of self, the ‘technoself’, driven by socio-technical innovations, and have taken an important step forward in pervasive learning. Technology Enhanced Learning (TEL) research has increasingly focused on emergent technologies such as Augmented Reality (AR) for augmented learning, mobile learning, and game-based learning in order to improve the self-motivation and self-engagement of learners in enriched multimodal learning environments. This research takes advantage of technological innovations in hardware and software across different platforms and devices, including tablets, phoneblets and even game consoles, and of their increasing popularity for pervasive learning, together with the significant development of personalization processes that place the student at the center of the learning process. In particular, AR research has matured to a level that facilitates augmented learning, which is defined as an on-demand learning technique where the learning environment adapts to the needs and inputs of learners. In this paper we first study the role of the Technology Acceptance Model (TAM), one of the most influential theories applied in TEL, in how learners come to accept and use a new technology. We then present the design methodology of the technoself approach for pervasive learning and introduce technoself enhanced learning as a novel pedagogical model to improve student engagement by shaping personal learning focus and setting. Furthermore, we describe the design and development of an AR-based interactive digital interpretation system for augmented learning and discuss its key features. By incorporating mobile devices, game simulation, voice recognition, and multimodal interaction through Augmented Reality, the learning content can be geared toward learners' needs, and learners can stimulate discovery and gain greater understanding.
The system demonstrates that Augmented Reality can provide a rich contextual learning environment and content tailored to individuals. Augmented learning via AR can bridge the gap between theoretical and practical learning, and focuses on how the real and the virtual can be combined to fulfill different learning objectives, requirements, and even environments. Finally, we validate and evaluate the AR-based technoself enhanced learning approach to enhancing student motivation and engagement in the learning process through experimental learning practices. The evaluation shows that Augmented Reality is well aligned with constructive learning strategies, as learners can control their own learning and manipulate objects that are not real in an augmented environment to derive and acquire understanding and knowledge in a broad diversity of learning practices, including constructive and analytical activities.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Active Vision Systems can be considered dynamical systems that close the loop around artificial visual perception, controlling camera parameters and motion, and also controlling processing, in order to simplify and accelerate visual perception and make it more robust. Research and development in Active Vision Systems [Aloi87], [Bajc88] is a main area of interest in Computer Vision, mainly because of its potential application in scenarios where real-time performance is needed, such as robot navigation, surveillance and visual inspection, among many others. Several systems have been developed in recent years using robotic heads for this purpose...

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Images acquired from unmanned aerial vehicles (UAVs) can provide data with unprecedented spatial and temporal resolution for three-dimensional (3D) modeling. Solutions developed for this purpose mainly operate on photogrammetry concepts and are known as UAV-Photogrammetry Systems (UAV-PS). Such systems are used in applications where both geospatial and visual information of the environment is required. These applications include, but are not limited to, natural resource management such as precision agriculture, military and police-related services such as traffic-law enforcement, precision engineering such as infrastructure inspection, and health services such as epidemic emergency management. UAV-photogrammetry systems can be differentiated based on their spatial characteristics in terms of accuracy and resolution. That is, some applications, such as precision engineering, require high-resolution and high-accuracy information of the environment (e.g. 3D modeling with less than one centimeter accuracy and resolution). In other applications, lower levels of accuracy might be sufficient (e.g. wildlife management needing a few decimeters of resolution). However, even in those applications, the specific characteristics of UAV-PSs should be well considered in the steps of both system development and application in order to yield satisfying results. In this regard, this thesis presents a comprehensive review of the applications of unmanned aerial imagery, where the objective was to determine the challenges that remote-sensing applications of UAV systems currently face. This review also made it possible to recognize the specific characteristics and requirements of UAV-PSs, which are mostly ignored or not thoroughly assessed in recent studies. Accordingly, the focus of the first part of this thesis is on exploring the methodological and experimental aspects of implementing a UAV-PS.
The developed system was extensively evaluated for precise modeling of an open-pit gravel mine and performing volumetric-change measurements. This application was selected for two main reasons. Firstly, this case study provided a challenging environment for 3D modeling, in terms of scale changes, terrain relief variations as well as structure and texture diversities. Secondly, open-pit-mine monitoring demands high levels of accuracy, which justifies our efforts to improve the developed UAV-PS to its maximum capacities. The hardware of the system consisted of an electric-powered helicopter, a high-resolution digital camera, and an inertial navigation system. The software of the system included the in-house programs specifically designed for camera calibration, platform calibration, system integration, onboard data acquisition, flight planning and ground control point (GCP) detection. The detailed features of the system are discussed in the thesis, and solutions are proposed in order to enhance the system and its photogrammetric outputs. The accuracy of the results was evaluated under various mapping conditions, including direct georeferencing and indirect georeferencing with different numbers, distributions and types of ground control points. Additionally, the effects of imaging configuration and network stability on modeling accuracy were assessed. The second part of this thesis concentrates on improving the techniques of sparse and dense reconstruction. The proposed solutions are alternatives to traditional aerial photogrammetry techniques, properly adapted to specific characteristics of unmanned, low-altitude imagery. Firstly, a method was developed for robust sparse matching and epipolar-geometry estimation. The main achievement of this method was its capacity to handle a very high percentage of outliers (errors among corresponding points) with remarkable computational efficiency (compared to the state-of-the-art techniques). 
Secondly, a block bundle adjustment (BBA) strategy was proposed based on the integration of intrinsic camera calibration parameters as pseudo-observations into the Gauss-Helmert model. The principal advantage of this strategy was controlling the adverse effect of unstable imaging networks and noisy image observations on the accuracy of self-calibration. A sparse implementation of this strategy was also developed, which allowed its application to data sets containing a large number of tie points. Finally, the concepts of intrinsic curves were revisited for dense stereo matching. The proposed technique could achieve a high level of accuracy and efficiency by searching only through a small fraction of the whole disparity search space as well as internally handling occlusions and matching ambiguities. These photogrammetric solutions were extensively tested using synthetic data, close-range images and the images acquired from the gravel-pit mine. Achieving an absolute 3D mapping accuracy of 11±7 mm illustrated the success of this system for high-precision modeling of the environment.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Visual recognition is a fundamental research topic in computer vision. This dissertation explores datasets, features, learning, and models used for visual recognition. In order to train visual models and evaluate different recognition algorithms, this dissertation develops an approach to collect object image datasets on web pages using an analysis of text around the image and of image appearance. This method exploits established online knowledge resources (Wikipedia pages for text; Flickr and Caltech data sets for images). The resources provide rich text and object appearance information. This dissertation describes results on two datasets. The first is Berg’s collection of 10 animal categories; on this dataset, we significantly outperform previous approaches. On an additional set of 5 categories, experimental results show the effectiveness of the method. Images are represented as features for visual recognition. This dissertation introduces a text-based image feature and demonstrates that it consistently improves performance on hard object classification problems. The feature is built using an auxiliary dataset of images annotated with tags, downloaded from the Internet. Image tags are noisy. The method obtains the text features of an unannotated image from the tags of its k-nearest neighbors in this auxiliary collection. A visual classifier presented with an object viewed under novel circumstances (say, a new viewing direction) must rely on its visual examples. This text feature may not change, because the auxiliary dataset likely contains a similar picture. While the tags associated with images are noisy, they are more stable when appearance changes. The performance of this feature is tested using PASCAL VOC 2006 and 2007 datasets. This feature performs well; it consistently improves the performance of visual object classifiers, and is particularly effective when the training dataset is small. 
With more and more collected training data, computational cost becomes a bottleneck, especially when training sophisticated classifiers such as kernelized SVM. This dissertation proposes a fast training algorithm called Stochastic Intersection Kernel Machine (SIKMA). This proposed training method will be useful for many vision problems, as it can produce a kernel classifier that is more accurate than a linear classifier, and can be trained on tens of thousands of examples in two minutes. It processes training examples one by one in a sequence, so memory cost is no longer the bottleneck to process large scale datasets. This dissertation applies this approach to train classifiers of Flickr groups with many group training examples. The resulting Flickr group prediction scores can be used to measure image similarity between two images. Experimental results on the Corel dataset and a PASCAL VOC dataset show the learned Flickr features perform better on image matching, retrieval, and classification than conventional visual features. Visual models are usually trained to best separate positive and negative training examples. However, when recognizing a large number of object categories, there may not be enough training examples for most objects, due to the intrinsic long-tailed distribution of objects in the real world. This dissertation proposes an approach to use comparative object similarity. The key insight is that, given a set of object categories which are similar and a set of categories which are dissimilar, a good object model should respond more strongly to examples from similar categories than to examples from dissimilar categories. This dissertation develops a regularized kernel machine algorithm to use this category dependent similarity regularization. Experiments on hundreds of categories show that our method can make significant improvement for categories with few or even no positive examples.
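The kernel that gives SIKMA its name can be illustrated briefly. The sketch below computes only the histogram intersection kernel, K(x, y) = sum over i of min(x_i, y_i), in plain Python. It is not the stochastic training algorithm described in the dissertation, and the function name and example histograms are hypothetical.

```python
def intersection_kernel(x, y):
    """Histogram intersection kernel: the sum of element-wise minima."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same length")
    return sum(min(a, b) for a, b in zip(x, y))


# Two hypothetical normalized histograms with three bins each.
h1 = [0.2, 0.5, 0.3]
h2 = [0.4, 0.1, 0.5]
similarity = intersection_kernel(h1, h2)  # min per bin: 0.2 + 0.1 + 0.3
```

For histograms normalized to sum to 1, the kernel value lies between 0 (no overlap) and 1 (identical histograms).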

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Hand detection in images has important applications in recognizing people's activities. This thesis focuses on the PASCAL Visual Object Classes (VOC) system for hand detection. VOC has become a popular system for object detection, based on twenty common objects, and was released with a successful deformable parts model in VOC2007. A hand is counted as detected in an image when the system returns a bounding box that overlaps with at least 50% of any ground-truth bounding box for a hand in the image. The initial average precision of this detector is around 0.215, compared with a state-of-the-art value of 0.104; however, color and frequency features for detected bounding boxes contain important information for re-scoring, and the average precision can be improved to 0.218 with these features. Results show that these features help achieve higher precision at low recall, even though the average precision is similar.
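The overlap criterion can be sketched as follows. The snippet assumes the measure usually associated with PASCAL VOC, intersection over union (IoU) with a threshold of 0.5. The abstract only says "at least 50%", so the exact measure is an assumption, as are the function names and the (x_min, y_min, x_max, y_max) box layout.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_correct_detection(detection, ground_truths, threshold=0.5):
    """True if the detection overlaps any ground-truth box enough."""
    return any(iou(detection, gt) >= threshold for gt in ground_truths)
```

With this criterion, a detection covering (0, 0, 10, 10) matches a ground-truth hand at (0, 0, 10, 8), since the two boxes share 80 of 100 units of combined area.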

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The research described in this thesis was motivated by the need for a robust model capable of representing 3D data obtained with 3D sensors, which are inherently noisy. In addition, time constraints have to be considered, as these sensors are capable of providing a 3D data stream in real time. This thesis proposed the use of Self-Organizing Maps (SOMs) as a 3D representation model. In particular, we proposed the use of the Growing Neural Gas (GNG) network, which has been successfully used for clustering, pattern recognition and topology representation of multi-dimensional data. Until now, Self-Organizing Maps have been primarily computed offline, and their application to 3D data has mainly focused on noise-free models, without considering time constraints. A hardware implementation is proposed that leverages the computing power of modern GPUs and takes advantage of a new paradigm known as General-Purpose Computing on Graphics Processing Units (GPGPU). The proposed methods were applied to different problems and applications in the area of computer vision, such as the recognition and localization of objects, visual surveillance and 3D reconstruction.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This paper presents a semi-parametric algorithm for parsing football video structures. The approach relies on two interleaved processes that closely collaborate towards a common goal. The core part of the proposed method performs fast automatic football video annotation by looking at the enhanced entropy variance within a series of shot frames. The entropy is extracted from the Hue parameter of the HSV color system, not as a global feature but in the spatial domain, to identify regions within a shot that characterize a certain activity within the shot period. The second part of the algorithm works towards the identification of dominant color regions that could represent players and the playfield for further activity recognition. Experimental results show that the proposed football video segmentation algorithm performs with high accuracy.
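A minimal sketch of region-wise hue entropy, under stated assumptions, may make the idea concrete. A frame is taken to be a list of rows of 8-bit RGB tuples; the grid size, the number of hue bins and the function names are all hypothetical, and the paper's own entropy-variance measure across shot frames is not reproduced here.

```python
import colorsys
import math


def hue_entropy(pixels, bins=16):
    """Shannon entropy (in bits) of the hue histogram of RGB pixels."""
    counts = [0] * bins
    total = 0
    for r, g, b in pixels:
        hue, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        counts[min(int(hue * bins), bins - 1)] += 1
        total += 1
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


def block_hue_entropy(frame, rows=2, cols=2, bins=16):
    """Split a frame into a rows x cols grid and return entropy per cell."""
    height, width = len(frame), len(frame[0])
    result = []
    for i in range(rows):
        row_values = []
        for j in range(cols):
            cell = [frame[y][x]
                    for y in range(i * height // rows, (i + 1) * height // rows)
                    for x in range(j * width // cols, (j + 1) * width // cols)]
            row_values.append(hue_entropy(cell, bins))
        result.append(row_values)
    return result
```

Cells dominated by a single color, such as a uniformly green playfield, yield entropy near zero, while cells mixing several hues yield higher values. Tracking how these per-cell values change over the frames of a shot would be one way to approximate the variance the abstract mentions.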

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Master's dissertation, Informatics Engineering (Engenharia Informática), Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2014

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This article describes the Robot Vision challenge, a competition that evaluates solutions for the visual place classification problem. Since its origin, this challenge has been proposed as a common benchmark where proposals from around the world are measured using a common overall score. Each new edition of the competition introduced novelties, both in the type of input data and in the subobjectives of the challenge. All the techniques used by the participants have been gathered and published to make them accessible for future developments. The legacy of the Robot Vision challenge includes data sets, benchmarking techniques, and extensive experience in place classification research, which is reflected in this article.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Visual inputs to artificial and biological visual systems are often quantized: cameras accumulate photons from the visual world, and the brain receives action potentials from visual sensory neurons. Collecting more information quanta leads to a longer acquisition time and better performance. In many visual tasks, collecting a small number of quanta is sufficient to solve the task well. The ability to determine the right number of quanta is pivotal in situations where visual information is costly to obtain, such as photon-starved or time-critical environments. In these situations, conventional vision systems that always collect a fixed and large amount of information are infeasible. I develop a framework that judiciously determines the number of information quanta to observe based on the cost of observation and the requirement for accuracy. The framework implements the optimal speed versus accuracy tradeoff when two assumptions are met, namely that the task is fully specified probabilistically and constant over time. I also extend the framework to address scenarios that violate the assumptions. I deploy the framework to three recognition tasks: visual search (where both assumptions are satisfied), scotopic visual recognition (where the model is not specified), and visual discrimination with unknown stimulus onset (where the model is dynamic over time). Scotopic classification experiments suggest that the framework leads to dramatic improvement in photon-efficiency compared to conventional computer vision algorithms. Human psychophysics experiments confirmed that the framework provides a parsimonious and versatile explanation for human behavior under time pressure in both static and dynamic environments.
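One classical way to decide how many observations to collect before committing to an answer is Wald's sequential probability ratio test (SPRT). The abstract does not state that the framework takes this form, so the sketch below is only an illustrative assumption: binary observations, two simple hypotheses with success probabilities `p0` and `p1`, and hypothetical error targets `alpha` and `beta`.

```python
import math


def sprt(observations, p0, p1, alpha=0.05, beta=0.05):
    """Sequential test of H0 (rate p0) against H1 (rate p1) on 0/1 data.

    Returns (decision, number_of_observations_used).
    """
    upper = math.log((1 - beta) / alpha)   # cross above: accept H1
    lower = math.log(beta / (1 - alpha))   # cross below: accept H0
    log_ratio = 0.0
    used = 0
    for used, x in enumerate(observations, start=1):
        if x:
            log_ratio += math.log(p1 / p0)
        else:
            log_ratio += math.log((1 - p1) / (1 - p0))
        if log_ratio >= upper:
            return "H1", used
        if log_ratio <= lower:
            return "H0", used
    return "undecided", used
```

The test stops as soon as the accumulated evidence crosses either threshold, so easy inputs are resolved after a few observations while ambiguous ones consume more. Loosening `alpha` and `beta` trades accuracy for speed.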