890 resultados para Distributed artificial intelligence - multiagent systems
Resumo:
In this paper we use a sequence-based visual localization algorithm to reveal surprising answers to the question, how much visual information is actually needed to conduct effective navigation? The algorithm actively searches for the best local image matches within a sliding window of short route segments or 'sub-routes', and matches sub-routes by searching for coherent sequences of local image matches. In contract to many existing techniques, the technique requires no pre-training or camera parameter calibration. We compare the algorithm's performance to the state-of-the-art FAB-MAP 2.0 algorithm on a 70 km benchmark dataset. Performance matches or exceeds the state of the art feature-based localization technique using images as small as 4 pixels, fields of view reduced by a factor of 250, and pixel bit depths reduced to 2 bits. We present further results demonstrating the system localizing in an office environment with near 100% precision using two 7 bit Lego light sensors, as well as using 16 and 32 pixel images from a motorbike race and a mountain rally car stage. By demonstrating how little image information is required to achieve localization along a route, we hope to stimulate future 'low fidelity' approaches to visual navigation that complement probabilistic feature-based techniques.
Resumo:
Appearance-based localization can provide loop closure detection at vast scales regardless of accumulated metric error. However, the computation time and memory requirements of current appearance-based methods scale not only with the size of the environment but also with the operation time of the platform. Additionally, repeated visits to locations will develop multiple competing representations, which will reduce recall performance over time. These properties impose severe restrictions on long-term autonomy for mobile robots, as loop closure performance will inevitably degrade with increased operation time. In this paper we present a graphical extension to CAT-SLAM, a particle filter-based algorithm for appearance-based localization and mapping, to provide constant computation and memory requirements over time and minimal degradation of recall performance during repeated visits to locations. We demonstrate loop closure detection in a large urban environment with capped computation time and memory requirements and performance exceeding previous appearance-based methods by a factor of 2. We discuss the limitations of the algorithm with respect to environment size, appearance change over time and applications in topological planning and navigation for long-term robot operation.
In the pursuit of effective affective computing : the relationship between features and registration
Resumo:
For facial expression recognition systems to be applicable in the real world, they need to be able to detect and track a previously unseen person's face and its facial movements accurately in realistic environments. A highly plausible solution involves performing a "dense" form of alignment, where 60-70 fiducial facial points are tracked with high accuracy. The problem is that, in practice, this type of dense alignment had so far been impossible to achieve in a generic sense, mainly due to poor reliability and robustness. Instead, many expression detection methods have opted for a "coarse" form of face alignment, followed by an application of a biologically inspired appearance descriptor such as the histogram of oriented gradients or Gabor magnitudes. Encouragingly, recent advances to a number of dense alignment algorithms have demonstrated both high reliability and accuracy for unseen subjects [e.g., constrained local models (CLMs)]. This begs the question: Aside from countering against illumination variation, what do these appearance descriptors do that standard pixel representations do not? In this paper, we show that, when close to perfect alignment is obtained, there is no real benefit in employing these different appearance-based representations (under consistent illumination conditions). In fact, when misalignment does occur, we show that these appearance descriptors do work well by encoding robustness to alignment error. For this work, we compared two popular methods for dense alignment-subject-dependent active appearance models versus subject-independent CLMs-on the task of action-unit detection. These comparisons were conducted through a battery of experiments across various publicly available data sets (i.e., CK+, Pain, M3, and GEMEP-FERA). We also report our performance in the recent 2011 Facial Expression Recognition and Analysis Challenge for the subject-independent task.
Resumo:
The chief challenge facing persistent robotic navigation using vision sensors is the recognition of previously visited locations under different lighting and illumination conditions. The majority of successful approaches to outdoor robot navigation use active sensors such as LIDAR, but the associated weight and power draw of these systems makes them unsuitable for widespread deployment on mobile robots. In this paper we investigate methods to combine representations for visible and long-wave infrared (LWIR) thermal images with time information to combat the time-of-day-based limitations of each sensing modality. We calculate appearance-based match likelihoods using the state-of-the-art FAB-MAP [1] algorithm to analyse loop closure detection reliability across different times of day. We present preliminary results on a dataset of 10 successive traverses of a combined urban-parkland environment, recorded in 2-hour intervals from before dawn to after dusk. Improved location recognition throughout an entire day is demonstrated using the combined system compared with methods which use visible or thermal sensing alone.
Resumo:
Thermal-infrared imagery is relatively robust to many of the failure conditions of visual and laser-based SLAM systems, such as fog, dust and smoke. The ability to use thermal-infrared video for localization is therefore highly appealing for many applications. However, operating in thermal-infrared is beyond the capacity of existing SLAM implementations. This paper presents the first known monocular SLAM system designed and tested for hand-held use in the thermal-infrared modality. The implementation includes a flexible feature detection layer able to achieve robust feature tracking in high-noise, low-texture thermal images. A novel approach for structure initialization is also presented. The system is robust to irregular motion and capable of handling the unique mechanical shutter interruptions common to thermal-infrared cameras. The evaluation demonstrates promising performance of the algorithm in several environments.
Resumo:
Much of our understanding of human thinking is based on probabilistic models. This innovative book by Jerome R. Busemeyer and Peter D. Bruza argues that, actually, the underlying mathematical structures from quantum theory provide a much better account of human thinking than traditional models. They introduce the foundations for modelling probabilistic-dynamic systems using two aspects of quantum theory. The first, "contextuality", is a way to understand interference effects found with inferences and decisions under conditions of uncertainty. The second, "entanglement", allows cognitive phenomena to be modelled in non-reductionist ways. Employing these principles drawn from quantum theory allows us to view human cognition and decision in a totally new light...
Resumo:
In this paper a real-time vision based power line extraction solution is investigated for active UAV guidance. The line extraction algorithm starts from ridge points detected by steerable filters. A collinear line segments fitting algorithm is followed up by considering global and local information together with multiple collinear measurements. GPU boosted algorithm implementation is also investigated in the experiment. The experimental result shows that the proposed algorithm outperforms two baseline line detection algorithms and is able to fitting long collinear line segments. The low computational cost of the algorithm make suitable for real-time applications.
Resumo:
Quality based frame selection is a crucial task in video face recognition, to both improve the recognition rate and to reduce the computational cost. In this paper we present a framework that uses a variety of cues (face symmetry, sharpness, contrast, closeness of mouth, brightness and openness of the eye) to select the highest quality facial images available in a video sequence for recognition. Normalized feature scores are fused using a neural network and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face recognition system. Experiments on the Honda/UCSD database shows that the proposed method selects the best quality face images in the video sequence, resulting in improved recognition performance.
Resumo:
Having a reliable understanding about the behaviours, problems, and performance of existing processes is important in enabling a targeted process improvement initiative. Recently, there has been an increase in the application of innovative process mining techniques to facilitate evidence-based understanding about organizations' business processes. Nevertheless, the application of these techniques in the domain of finance in Australia is, at best, scarce. This paper details a 6-month case study on the application of process mining in one of the largest insurance companies in Australia. In particular, the challenges encountered, the lessons learned, and the results obtained from this case study are detailed. Through this case study, we not only validated existing `lessons learned' from other similar case studies, but also added new insights that can be beneficial to other practitioners in applying process mining in their respective fields.
Resumo:
This paper proposes a technique that supports process participants in making risk-informed decisions, with the aim to reduce the process risks. Risk reduction involves decreasing the likelihood and severity of a process fault from occurring. Given a process exposed to risks, e.g. a financial process exposed to a risk of reputation loss, we enact this process and whenever a process participant needs to provide input to the process, e.g. by selecting the next task to execute or by filling out a form, we prompt the participant with the expected risk that a given fault will occur given the particular input. These risks are predicted by traversing decision trees generated from the logs of past process executions and considering process data, involved resources, task durations and contextual information like task frequencies. The approach has been implemented in the YAWL system and its effectiveness evaluated. The results show that the process instances executed in the tests complete with substantially fewer faults and with lower fault severities, when taking into account the recommendations provided by our technique.
Resumo:
Due to the explosive growth of the Web, the domain of Web personalization has gained great momentum both in the research and commercial areas. One of the most popular web personalization systems is recommender systems. In recommender systems choosing user information that can be used to profile users is very crucial for user profiling. In Web 2.0, one facility that can help users organize Web resources of their interest is user tagging systems. Exploring user tagging behavior provides a promising way for understanding users’ information needs since tags are given directly by users. However, free and relatively uncontrolled vocabulary makes the user self-defined tags lack of standardization and semantic ambiguity. Also, the relationships among tags need to be explored since there are rich relationships among tags which could provide valuable information for us to better understand users. In this paper, we propose a novel approach for learning tag ontology based on the widely used lexical database WordNet for capturing the semantics and the structural relationships of tags. We present personalization strategies to disambiguate the semantics of tags by combining the opinion of WordNet lexicographers and users’ tagging behavior together. To personalize further, clustering of users is performed to generate a more accurate ontology for a particular group of users. In order to evaluate the usefulness of the tag ontology, we use the tag ontology in a pilot tag recommendation experiment for improving the recommendation performance by exploiting the semantic information in the tag ontology. The initial result shows that the personalized information has improved the accuracy of the tag recommendation.
Resumo:
Substantial research efforts have been expended to deal with the complexity of concurrent systems that is inherent to their analysis, e.g., works that tackle the well-known state space explosion problem. Approaches differ in the classes of properties that they are able to suitably check and this is largely a result of the way they balance the trade-off between analysis time and space employed to describe a concurrent system. One interesting class of properties is concerned with behavioral characteristics. These properties are conveniently expressed in terms of computations, or runs, in concurrent systems. This article introduces the theory of untanglings that exploits a particular representation of a collection of runs in a concurrent system. It is shown that a representative untangling of a bounded concurrent system can be constructed that captures all and only the behavior of the system. Representative untanglings strike a unique balance between time and space, yet provide a single model for the convenient extraction of various behavioral properties. Performance measurements in terms of construction time and size of representative untanglings with respect to the original specifications of concurrent systems, conducted on a collection of models from practice, confirm the scalability of the approach. Finally, this article demonstrates practical benefits of using representative untanglings when checking various behavioral properties of concurrent systems.
Resumo:
This paper illustrates the complexity of pointing as it is employed in a design workshop. Using the method of interaction analysis, we argue that pointing is not merely employed to index, locate, or fix reference to an object. It also constitutes a practice for reestablishing intersubjectivity and solving interactional trouble such as misunderstandings or disagreements by virtue of enlisting something as part of the participants’ shared experience. We use this analysis to discuss implications for how such practices might be supported with computer mediation, arguing for a “bricolage” approach to systems development that emphasizes the provision of resources for users to collaboratively negotiate the accomplishment of intersubjectivity ra- ther than systems that try to support pointing as a specific gestural action.