722 results for Multi-Touch Recognition
Abstract:
This paper presents a robust place recognition algorithm for mobile robots. The framework proposed combines nonlinear dimensionality reduction, nonlinear regression under noise, and variational Bayesian learning to create consistent probabilistic representations of places from images. These generative models are learnt from a few images and used for multi-class place recognition where classification is computed from a set of feature-vectors. Recognition can be performed in near real-time and accounts for complexity such as changes in illumination, occlusions and blurring. The algorithm was tested with a mobile robot in indoor and outdoor environments with sequences of 1579 and 3820 images respectively. This framework has several potential applications such as map building, autonomous navigation, search-rescue tasks and context recognition.
Abstract:
Objective: To highlight the registration issues for nurses who wish to practice nationally, particularly those practicing within the telehealth sector. Design: As part of a national clinical research study, applications were made to every state and territory for mutual recognition of nursing registration and fee waiver for cross-border telenursing practice for a period of three years. These processes are described using a case study approach. Outcome: The aim of this case study was to achieve registration in every state and territory of Australia without paying multiple fees by using mutual recognition provisions and the cross-border fee waiver policy of the nurse regulatory authorities in order to practice telenursing. Results: Mutual recognition and fee waiver for cross-border practice were granted unconditionally in two states, Victoria (Vic) and Tasmania (Tas), and one territory, the Northern Territory (NT). The remaining Australian states and territories would grant only temporary registration for the period of the project, or none at all, due to policy restrictions or nurse regulatory authority (NRA) Board decisions. As a consequence of gaining fee waiver, the cost of registration was a maximum of $145 per annum, as opposed to the potential $959 for initial registration and $625 for annual renewal. Conclusions: Having eight individual Nurses Acts and NRAs for a population of 265,000 nurses would clearly indicate a case of over-regulation in this country. The structure of regulation of nursing in Australia is a barrier to the changing and evolving role of nurses in the 21st century and a significant factor when considering workforce planning.
Abstract:
Features derived from the trispectra of DFT magnitude slices are used for multi-font digit recognition. These features are insensitive to translation, rotation, or scaling of the input. They are also robust to noise. Classification accuracy tests were conducted on a common database of 256 × 256 pixel bilevel images of digits in 9 fonts. Randomly rotated and translated noisy versions were used for training and testing. The results indicate that the trispectral features are better than moment invariants and affine moment invariants. They achieve a classification accuracy of 95%, compared to about 81% for Hu's (1962) moment invariants and 39% for the Flusser and Suk (1994) affine moment invariants, on the same data in the presence of 1% impulse noise using a 1-NN classifier. For comparison, a multilayer perceptron with no normalization for rotations and translations yields 34% accuracy on 16 × 16 pixel low-pass filtered and decimated versions of the same data.
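The 1-NN classifier mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each image has already been reduced to a fixed-length feature vector; the three-dimensional vectors, the labels and the function name `classify_1nn` are hypothetical and are not taken from the paper, and no trispectral features are computed here.

```python
import math
from typing import List, Sequence, Tuple

# A training sample is a (feature vector, class label) pair.
Sample = Tuple[Sequence[float], str]


def classify_1nn(train: List[Sample], query: Sequence[float]) -> str:
    """Return the label of the training vector closest to `query`.

    Closeness is plain Euclidean distance; this is an assumption,
    since the abstract does not state which distance was used.
    """
    if not train:
        raise ValueError("training set is empty")
    _, label = min(train, key=lambda sample: math.dist(sample[0], query))
    return label


# Hypothetical feature vectors for two digit classes.
train = [
    ((0.10, 0.80, 0.30), "3"),
    ((0.12, 0.78, 0.33), "3"),
    ((0.90, 0.20, 0.60), "8"),
    ((0.88, 0.25, 0.58), "8"),
]

print(classify_1nn(train, (0.85, 0.22, 0.61)))
```

Because 1-NN has no training step beyond storing the samples, accuracy differences between feature sets on the same data reflect the features themselves, which is presumably why it suits a comparison of invariants.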
Abstract:
Investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. The use of temporal lip information for speaker identification has been performed previously (T.J. Wark et al., 1998); however, this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that an MSHMM is a valid structure for multi-modal speaker identification.
Abstract:
The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, it is not known whether this technique can outperform speech recognition that incorporates well-known acoustic enhancement techniques, such as spectral subtraction or multi-channel beamforming. This is an important question to answer, especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset, and results show that synchronous HMM-based audio-visual fusion can outperform traditional single-channel as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.
Abstract:
The gait energy image (GEI) and its variants form the basis of many recent appearance-based gait recognition systems. The GEI combines good recognition performance with a simple implementation, though it suffers problems inherent to appearance-based approaches, such as being highly view dependent. In this paper, we extend the concept of the GEI to 3D, to create what we call the gait energy volume, or GEV. A basic GEV implementation is tested on the CMU MoBo database, showing improvements over both the GEI baseline and a fused multi-view GEI approach. We also demonstrate the efficacy of this approach on partial volume reconstructions created from frontal depth images, which can be more practically acquired, for example, in biometric portals implemented with stereo cameras or other depth acquisition systems. Experiments on frontal depth images are evaluated on an in-house developed database captured using the Microsoft Kinect, and demonstrate the validity of the proposed approach.
Abstract:
Management (or perceived mismanagement) of large-scale, complex projects poses special problems and often results in spectacular failures, cost overruns, time blowouts and stakeholder dissatisfaction. While traditional project management responds with increasingly administrative constraints, we argue that leaders of such projects also need to display adaptive and enabling behaviours to foster adaptive processes, such as opportunity recognition, which requires an interaction of cognitive and affective processes of individual, project, and team leader attributes and behaviours. At the core of the model we propose is an interaction of cognitive flexibility, affect and emotional intelligence. The result of this interaction is enhanced leader opportunity recognition that, in turn, facilitates multilevel outcomes.
Abstract:
Audio-visual speech recognition, or the combination of visual lip-reading with traditional acoustic speech recognition, has been previously shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that present in an automotive cabin. The research presented in this paper will extend upon the established audio-visual speech recognition literature to show that further improvements in speech recognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visual speech recognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) are conducted on the four-camera AVICAR automotive audio-visual speech database. We study the relative contribution between the side and central orientated cameras in improving visual speech recognition accuracy. Finally, combination of the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy when compared to the acoustic-only approach in the noisiest conditions of the AVICAR database.
Abstract:
Project work has grown significantly in volume and recognition in recent decades as projects have ‘become a common form of work organization in all sectors of the economy’ (Lindgren & Packendorff, 2006: 841). This increase in project-based work is just one of the many changes that have been affecting the nature of work, the employment relationship and the associated conceptualization and experience of careers (Baruch, 2004b; Söderlund & Bredin, 2006). A career can be defined as a process of development along a path of work experience and roles in one or more organizations (Baruch & Rosenstein, 1992), and careers involving project-based work take place within multi-layered institutional settings. Projects are generally undertaken by small temporary organizations (Ekstedt, Lundin, Söderholm & Wirdenius, 1999; Pettigrew, 2003; Söderlund, 2012) which in turn may form part of larger, permanent entities; involve people drawn from a number of disciplines and organizations; or be formed as partnerships, joint ventures or strategic alliances between two or more organizations (Scott, 2007).
Abstract:
Automated crowd counting has become an active field of computer vision research in recent years. Existing approaches are scene-specific, as they are designed to operate in the single camera viewpoint that was used to train the system. Real-world camera networks often span multiple viewpoints within a facility, including many regions of overlap. This paper proposes a novel scene-invariant crowd counting algorithm that is designed to operate across multiple cameras. The approach uses camera calibration to normalise features between viewpoints and to compensate for regions of overlap. This compensation is performed by constructing an 'overlap map' which provides a measure of how much an object at one location is visible within other viewpoints. An investigation into the suitability of various feature types and regression models for scene-invariant crowd counting is also conducted. The features investigated include object size, shape, edges and keypoints. The regression models evaluated include neural networks, K-nearest neighbours, linear regression and Gaussian process regression. Our experiments demonstrate that accurate crowd counting was achieved across seven benchmark datasets, with optimal performance observed when all features were used and when Gaussian process regression was used. The combination of scene invariance and multi-camera crowd counting is evaluated by training the system on footage obtained from the QUT camera network and testing it on three cameras from the PETS 2009 database. Highly accurate crowd counting was observed, with a mean relative error of less than 10%. Our approach enables a pre-trained system to be deployed in a new environment without any additional training, bringing the field one step closer toward a 'plug and play' system.
Abstract:
Association rule mining is one technique that is widely used when querying databases, especially those that are transactional, in order to obtain useful associations or correlations among sets of items. Much work has been done focusing on efficiency, effectiveness and redundancy. There has also been a focus on the quality of rules from single-level datasets, with many interestingness measures proposed. However, with multi-level datasets now being common, there is a lack of interestingness measures developed for multi-level and cross-level rules. Single-level measures do not take into account the hierarchy found in a multi-level dataset. This leaves the Support-Confidence approach, which does not consider the hierarchy anyway and has other drawbacks, as one of the few measures available. In this chapter we propose two approaches which measure multi-level association rules to help evaluate their interestingness by considering the database’s underlying taxonomy. These measures of diversity and peculiarity can be used to help identify those rules from multi-level datasets that are potentially useful.
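For readers unfamiliar with the Support-Confidence approach referred to above, the following is a minimal sketch of the two standard measures for a rule A → B: support is the fraction of transactions containing A ∪ B, and confidence is support(A ∪ B) / support(A). The toy transactions and function names are illustrative assumptions, not data or code from the chapter, and the proposed diversity and peculiarity measures are not implemented here.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimate of P(consequent | antecedent) from the transactions."""
    a = frozenset(antecedent)
    support_a = support(transactions, a)
    if support_a == 0:
        return 0.0  # rule never fires; confidence is undefined, report 0
    return support(transactions, a | frozenset(consequent)) / support_a


# Hypothetical toy transactions; item names are flat strings.
transactions = [
    frozenset({"milk", "bread"}),
    frozenset({"milk", "butter"}),
    frozenset({"milk", "bread", "butter"}),
    frozenset({"bread"}),
]

print(support(transactions, {"milk", "bread"}))       # 2 of 4 transactions
print(confidence(transactions, {"milk"}, {"bread"}))  # 2 of the 3 milk transactions
```

Both functions treat items as opaque labels, so nothing in the calculation knows whether "milk" sits under a broader category such as dairy. That is the limitation the abstract points to when it says the approach does not consider the hierarchy.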
Abstract:
This paper presents a new multi-scale place recognition system inspired by the recent discovery of overlapping, multi-scale spatial maps stored in the rodent brain. By training a set of Support Vector Machines to recognize places at varying levels of spatial specificity, we are able to validate spatially specific place recognition hypotheses against broader place recognition hypotheses without sacrificing localization accuracy. We evaluate the system in a range of experiments using cameras mounted on a motorbike and a human in two different environments. At 100% precision, the multi-scale approach results in a 56% average improvement in recall rate across both datasets. We analyse the results and then discuss future work that may lead to improvements in both robotic mapping and our understanding of sensory processing and encoding in the mammalian brain.
Abstract:
Fusion techniques can be used in biometrics to achieve higher accuracy. When biometric systems are in operation and the threat level changes, controlling the trade-off between detection error rates can reduce the impact of an attack. In a fused system, varying a single threshold does not allow this to be achieved, but systematic adjustment of a set of parameters does. In this paper, fused decisions from a multi-part, multi-sample sequential architecture are investigated for that purpose in an iris recognition system. A specific implementation of the multi-part architecture is proposed and the effect of the number of parts and samples in the resultant detection error rate is analysed. The effectiveness of the proposed architecture is then evaluated under two specific cases of obfuscation attack: miosis and mydriasis. Results show that robustness to such obfuscation attacks is achieved, since lower error rates than in the case of the non-fused base system are obtained.
Abstract:
This paper presents a robust place recognition algorithm for mobile robots that can be used for planning and navigation tasks. The proposed framework combines nonlinear dimensionality reduction, nonlinear regression under noise, and Bayesian learning to create consistent probabilistic representations of places from images. These generative models are incrementally learnt from very small training sets and used for multi-class place recognition. Recognition can be performed in near real-time and accounts for complexity such as changes in illumination, occlusions, blurring and moving objects. The algorithm was tested with a mobile robot in indoor and outdoor environments with sequences of 1579 and 3820 images, respectively. This framework has several potential applications such as map building, autonomous navigation, search-rescue tasks and context recognition.
Abstract:
Traditional nearest points methods use all the samples in an image set to construct a single convex or affine hull model for classification. However, strong artificial features and noisy data may be generated from combinations of training samples when significant intra-class variations and/or noise occur in the image set. Existing multi-model approaches extract local models by clustering each image set individually only once, with fixed clusters used for matching with various image sets. This may not be optimal for discrimination, as undesirable environmental conditions (e.g. illumination and pose variations) may result in the two closest clusters representing different characteristics of an object (e.g. a frontal face being compared to a non-frontal face). To address the above problem, we propose a novel approach to enhance nearest points based methods by integrating affine/convex hull classification with an adapted multi-model approach. We first extract multiple local convex hulls from a query image set via maximum margin clustering to diminish the artificial variations and constrain the noise in local convex hulls. We then propose adaptive reference clustering (ARC) to constrain the clustering of each gallery image set by forcing the clusters to have resemblance to the clusters in the query image set. By applying ARC, noisy clusters in the query set can be discarded. Experiments on Honda, MoBo and ETH-80 datasets show that the proposed method outperforms single model approaches and other recent techniques, such as Sparse Approximated Nearest Points, Mutual Subspace Method and Manifold Discriminant Analysis.