411 resultados para 280213 Other Artificial Intelligence
Resumo:
The ability to build high-fidelity 3D representations of the environment from sensor data is critical for autonomous robots. Multi-sensor data fusion allows for more complete and accurate representations. Furthermore, using distinct sensing modalities (i.e. sensors using a different physical process and/or operating at different electromagnetic frequencies) usually leads to more reliable perception, especially in challenging environments, as modalities may complement each other. However, they may react differently to certain materials or environmental conditions, leading to catastrophic fusion. In this paper, we propose a new method to reliably fuse data from multiple sensing modalities, including in situations where they detect different targets. We first compute distinct continuous surface representations for each sensing modality, with uncertainty, using Gaussian Process Implicit Surfaces (GPIS). Second, we perform a local consistency test between these representations, to separate consistent data (i.e. data corresponding to the detection of the same target by the sensors) from inconsistent data. The consistent data can then be fused together, using another GPIS process, and the rest of the data can be combined as appropriate. The approach is first validated using synthetic data. We then demonstrate its benefit using a mobile robot, equipped with a laser scanner and a radar, which operates in an outdoor environment in the presence of large clouds of airborne dust and smoke.
Resumo:
Real-world environments such as houses and offices change over time, meaning that a mobile robot’s map will become out of date. In this work, we introduce a method to update the reference views in a hybrid metrictopological map so that a mobile robot can continue to localize itself in a changing environment. The updating mechanism, based on the multi-store model of human memory, incorporates a spherical metric representation of the observed visual features for each node in the map, which enables the robot to estimate its heading and navigate using multi-view geometry, as well as representing the local 3D geometry of the environment. A series of experiments demonstrate the persistence performance of the proposed system in real changing environments, including analysis of the long-term stability.
Resumo:
This paper investigates the effect of topic dependent language models (TDLM) on phonetic spoken term detection (STD) using dynamic match lattice spotting (DMLS). Phonetic STD consists of two steps: indexing and search. The accuracy of indexing audio segments into phone sequences using phone recognition methods directly affects the accuracy of the final STD system. If the topic of a document in known, recognizing the spoken words and indexing them to an intermediate representation is an easier task and consequently, detecting a search word in it will be more accurate and robust. In this paper, we propose the use of TDLMs in the indexing stage to improve the accuracy of STD in situations where the topic of the audio document is known in advance. It is shown that using TDLMs instead of the traditional general language model (GLM) improves STD performance according to figure of merit (FOM) criteria.
Resumo:
Environmental monitoring has become increasingly important due to the significant impact of human activities and climate change on biodiversity. Environmental sound sources such as rain and insect vocalizations are a rich and underexploited source of information in environmental audio recordings. This paper is concerned with the classification of rain within acoustic sensor re-cordings. We present the novel application of a set of features for classifying environmental acoustics: acoustic entropy, the acoustic complexity index, spectral cover, and background noise. In order to improve the performance of the rain classification system we automatically classify segments of environmental recordings into the classes of heavy rain or non-rain. A decision tree classifier is experientially compared with other classifiers. The experimental results show that our system is effective in classifying segments of environmental audio recordings with an accuracy of 93% for the binary classification of heavy rain/non-rain.
Resumo:
While existing multi-biometic Dempster-Shafer the- ory fusion approaches have demonstrated promising perfor- mance, they do not model the uncertainty appropriately, sug- gesting that further improvement can be achieved. This research seeks to develop a unified framework for multimodal biometric fusion to take advantage of the uncertainty concept of Dempster- Shafer theory, improving the performance of multi-biometric authentication systems. Modeling uncertainty as a function of uncertainty factors affecting the recognition performance of the biometric systems helps to address the uncertainty of the data and the confidence of the fusion outcome. A weighted combination of quality measures and classifiers performance (Equal Error Rate) are proposed to encode the uncertainty concept to improve the fusion. We also found that quality measures contribute unequally to the recognition performance, thus selecting only significant factors and fusing them with a Dempster-Shafer approach to generate an overall quality score play an important role in the success of uncertainty modeling. The proposed approach achieved a competitive performance (approximate 1% EER) in comparison with other Dempster-Shafer based approaches and other conventional fusion approaches.
Resumo:
Although the collection of player and ball tracking data is fast becoming the norm in professional sports, large-scale mining of such spatiotemporal data has yet to surface. In this paper, given an entire season's worth of player and ball tracking data from a professional soccer league (approx 400,000,000 data points), we present a method which can conduct both individual player and team analysis. Due to the dynamic, continuous and multi-player nature of team sports like soccer, a major issue is aligning player positions over time. We present a "role-based" representation that dynamically updates each player's relative role at each frame and demonstrate how this captures the short-term context to enable both individual player and team analysis. We discover role directly from data by utilizing a minimum entropy data partitioning method and show how this can be used to accurately detect and visualize formations, as well as analyze individual player behavior.
Resumo:
This paper deals with constrained image-based visual servoing of circular and conical spiral motion about an unknown object approximating a single image point feature. Effective visual control of such trajectories has many applications for small unmanned aerial vehicles, including surveillance and inspection, forced landing (homing), and collision avoidance. A spherical camera model is used to derive a novel visual-predictive controller (VPC) using stability-based design methods for general nonlinear model-predictive control. In particular, a quasi-infinite horizon visual-predictive control scheme is derived. A terminal region, which is used as a constraint in the controller structure, can be used to guide appropriate reference image features for spiral tracking with respect to nominal stability and feasibility. Robustness properties are also discussed with respect to parameter uncertainty and additive noise. A comparison with competing visual-predictive control schemes is made, and some experimental results using a small quad rotor platform are given.
Resumo:
The UAV challenge takes place every year. Teams of compteitors compete to use an Unmanned Airborne Vehicle to locate a simulated lost person and deliver water.
Resumo:
As a result of the more distributed nature of organisations and the inherently increasing complexity of their business processes, a significant effort is required for the specification and verification of those processes. The composition of the activities into a business process that accomplishes a specific organisational goal has primarily been a manual task. Automated planning is a branch of artificial intelligence (AI) in which activities are selected and organised by anticipating their expected outcomes with the aim of achieving some goal. As such, automated planning would seem to be a natural fit to the BPM domain to automate the specification of control flow. A number of attempts have been made to apply automated planning to the business process and service composition domain in different stages of the BPM lifecycle. However, a unified adoption of these techniques throughout the BPM lifecycle is missing. As such, we propose a new intention-centric BPM paradigm, which aims on minimising the specification effort by exploiting automated planning techniques to achieve a pre-stated goal. This paper provides a vision on the future possibilities of enhancing BPM using automated planning. A research agenda is presented, which provides an overview of the opportunities and challenges for the exploitation of automated planning in BPM.
Resumo:
This proposal describes the innovative and competitive lunar payload solution developed at the Queensland University of Technology (QUT)–the LunaRoo: a hopping robot designed to exploit the Moon's lower gravity to leap up to 20m above the surface. It is compact enough to fit within a 10cm cube, whilst providing unique observation and mission capabilities by creating imagery during the hop. This first section is deliberately kept short and concise for web submission; additional information can be found in the second chapter.
Resumo:
In this paper we present a robust method to detect handwritten text from unconstrained drawings on normal whiteboards. Unlike printed text on documents, free form handwritten text has no pattern in terms of size, orientation and font and it is often mixed with other drawings such as lines and shapes. Unlike handwritings on paper, handwritings on a normal whiteboard cannot be scanned so the detection has to be based on photos. Our work traces straight edges on photos of the whiteboard and builds graph representation of connected components. We use geometric properties such as edge density, graph density, aspect ratio and neighborhood similarity to differentiate handwritten text from other drawings. The experiment results show that our method achieves satisfactory precision and recall. Furthermore, the method is robust and efficient enough to be deployed in a mobile device. This is an important enabler of business applications that support whiteboard-centric visual meetings in enterprise scenarios. © 2012 IEEE.
Resumo:
Even though crashes between trains and road users are rare events at railway level crossings, they are one of the major safety concerns for the Australian railway industry. Nearmiss events at level crossings occur more frequently, and can provide more information about factors leading to level crossing incidents. In this paper we introduce a video analytic approach for automatically detecting and localizing vehicles from cameras mounted on trains for detecting near-miss events. To detect and localize vehicles at level crossings we extract patches from an image and classify each patch for detecting vehicles. We developed a region proposals algorithm for generating patches, and we use a Convolutional Neural Network (CNN) for classifying each patch. To localize vehicles in images we combine the patches that are classified as vehicles according to their CNN scores and positions. We compared our system with the Deformable Part Models (DPM) and Regions with CNN features (R-CNN) object detectors. Experimental results on a railway dataset show that the recall rate of our proposed system is 29% higher than what can be achieved with DPM or R-CNN detectors.
Resumo:
Clustering identities in a video is a useful task to aid in video search, annotation and retrieval, and cast identification. However, reliably clustering faces across multiple videos is challenging task due to variations in the appearance of the faces, as videos are captured in an uncontrolled environment. A person's appearance may vary due to session variations including: lighting and background changes, occlusions, changes in expression and make up. In this paper we propose the novel Local Total Variability Modelling (Local TVM) approach to cluster faces across a news video corpus; and incorporate this into a novel two stage video clustering system. We first cluster faces within a single video using colour, spatial and temporal cues; after which we use face track modelling and hierarchical agglomerative clustering to cluster faces across the entire corpus. We compare different face recognition approaches within this framework. Experiments on a news video database show that the Local TVM technique is able effectively model the session variation observed in the data, resulting in improved clustering performance, with much greater computational efficiency than other methods.
Resumo:
Natural history collections are an invaluable resource housing a wealth of knowledge with a long tradition of contributing to a wide range of fields such as taxonomy, quarantine, conservation and climate change. It is recognized however [Smith and Blagoderov 2012] that such physical collections are often heavily underutilized as a result of the practical issues of accessibility. The digitization of these collections is a step towards removing these access issues, but other hurdles must be addressed before we truly unlock the potential of this knowledge.