993 resultados para handwritten recognition
Resumo:
Automatic Call Recognition is vital for environmental monitoring. Patten recognition has been applied in automatic species recognition for years. However, few studies have applied formal syntactic methods to species call structure analysis. This paper introduces a novel method to adopt timed and probabilistic automata in automatic species recognition based upon acoustic components as the primitives. We demonstrate this through one kind of birds in Australia: Eastern Yellow Robin.
The backfilled GEI : a cross-capture modality gait feature for frontal and side-view gait recognition
Resumo:
In this paper, we propose a novel direction for gait recognition research by proposing a new capture-modality independent, appearance-based feature which we call the Back-filled Gait Energy Image (BGEI). It can can be constructed from both frontal depth images, as well as the more commonly used side-view silhouettes, allowing the feature to be applied across these two differing capturing systems using the same enrolled database. To evaluate this new feature, a frontally captured depth-based gait dataset was created containing 37 unique subjects, a subset of which also contained sequences captured from the side. The results demonstrate that the BGEI can effectively be used to identify subjects through their gait across these two differing input devices, achieving rank-1 match rate of 100%, in our experiments. We also compare the BGEI against the GEI and GEV in their respective domains, using the CASIA dataset and our depth dataset, showing that it compares favourably against them. The experiments conducted were performed using a sparse representation based classifier with a locally discriminating input feature space, which show significant improvement in performance over other classifiers used in gait recognition literature, achieving state of the art results with the GEI on the CASIA dataset.
Resumo:
Spatio-Temporal interest points are the most popular feature representation in the field of action recognition. A variety of methods have been proposed to detect and describe local patches in video with several techniques reporting state of the art performance for action recognition. However, the reported results are obtained under different experimental settings with different datasets, making it difficult to compare the various approaches. As a result of this, we seek to comprehensively evaluate state of the art spatio- temporal features under a common evaluation framework with popular benchmark datasets (KTH, Weizmann) and more challenging datasets such as Hollywood2. The purpose of this work is to provide guidance for researchers, when selecting features for different applications with different environmental conditions. In this work we evaluate four popular descriptors (HOG, HOF, HOG/HOF, HOG3D) using a popular bag of visual features representation, and Support Vector Machines (SVM)for classification. Moreover, we provide an in-depth analysis of local feature descriptors and optimize the codebook sizes for different datasets with different descriptors. In this paper, we demonstrate that motion based features offer better performance than those that rely solely on spatial information, while features that combine both types of data are more consistent across a variety of conditions, but typically require a larger codebook for optimal performance.
Resumo:
Our contemporary public sphere has seen the 'emergence of new political rituals, which are concerned with the stains of the past, with self disclosure, and with ways of remembering once taboo and traumatic events' (Misztal, 2005). A recent case of this phenomenon occurred in Australia in 2009 with the apology to the 'Forgotten Australians': a group who suffered abuse and neglect after being removed from their parents – either in Australia or in the UK - and placed in Church and State run institutions in Australia between 1930 and 1970. This campaign for recognition by a profoundly marginalized group coincides with the decade in which the opportunities of Web 2.0 were seen to be diffusing throughout different social groups, and were considered a tool for social inclusion. This paper examines the case of the Forgotten Australians as an opportunity to investigate the role of the internet in cultural trauma and public apology. As such, it adds to recent scholarship on the role of digital web based technologies in commemoration and memorials (Arthur, 2009; Haskins, 2007; Cohen and Willis, 2004), and on digital storytelling in the context of trauma (Klaebe, 2011) by locating their role in a broader and emerging domain of social responsibility and political action (Alexander, 2004).
Resumo:
Modelling video sequences by subspaces has recently shown promise for recognising human actions. Subspaces are able to accommodate the effects of various image variations and can capture the dynamic properties of actions. Subspaces form a non-Euclidean and curved Riemannian manifold known as a Grassmann manifold. Inference on manifold spaces usually is achieved by embedding the manifolds in higher dimensional Euclidean spaces. In this paper, we instead propose to embed the Grassmann manifolds into reproducing kernel Hilbert spaces and then tackle the problem of discriminant analysis on such manifolds. To achieve efficient machinery, we propose graph-based local discriminant analysis that utilises within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, respectively. Experiments on KTH, UCF Sports, and Ballet datasets show that the proposed approach obtains marked improvements in discrimination accuracy in comparison to several state-of-the-art methods, such as the kernel version of affine hull image-set distance, tensor canonical correlation analysis, spatial-temporal words and hierarchy of discriminative space-time neighbourhood features.
Resumo:
Many state of the art vision-based Simultaneous Localisation And Mapping (SLAM) and place recognition systems compute the salience of visual features in their environment. As computing salience can be problematic in radically changing environments new low resolution feature-less systems have been introduced, such as SeqSLAM, all of which consider the whole image. In this paper, we implement a supervised classifier system (UCS) to learn the salience of image regions for place recognition by feature-less systems. SeqSLAM only slightly benefits from the results of training, on the challenging real world Eynsham dataset, as it already appears to filter less useful regions of a panoramic image. However, when recognition is limited to specific image regions performance improves by more than an order of magnitude by utilising the learnt image region saliency. We then investigate whether the region salience generated from the Eynsham dataset generalizes to another car-based dataset using a perspective camera. The results suggest the general applicability of an image region salience mask for optimizing route-based navigation applications.
Resumo:
In the field of face recognition, Sparse Representation (SR) has received considerable attention during the past few years. Most of the relevant literature focuses on holistic descriptors in closed-set identification applications. The underlying assumption in SR-based methods is that each class in the gallery has sufficient samples and the query lies on the subspace spanned by the gallery of the same class. Unfortunately, such assumption is easily violated in the more challenging face verification scenario, where an algorithm is required to determine if two faces (where one or both have not been seen before) belong to the same person. In this paper, we first discuss why previous attempts with SR might not be applicable to verification problems. We then propose an alternative approach to face verification via SR. Specifically, we propose to use explicit SR encoding on local image patches rather than the entire face. The obtained sparse signals are pooled via averaging to form multiple region descriptors, which are then concatenated to form an overall face descriptor. Due to the deliberate loss spatial relations within each region (caused by averaging), the resulting descriptor is robust to misalignment & various image deformations. Within the proposed framework, we evaluate several SR encoding techniques: l1-minimisation, Sparse Autoencoder Neural Network (SANN), and an implicit probabilistic technique based on Gaussian Mixture Models. Thorough experiments on AR, FERET, exYaleB, BANCA and ChokePoint datasets show that the proposed local SR approach obtains considerably better and more robust performance than several previous state-of-the-art holistic SR methods, in both verification and closed-set identification problems. The experiments also show that l1-minimisation based encoding has a considerably higher computational than the other techniques, but leads to higher recognition rates.
Resumo:
Abstract. In recent years, sparse representation based classification(SRC) has received much attention in face recognition with multipletraining samples of each subject. However, it cannot be easily applied toa recognition task with insufficient training samples under uncontrolledenvironments. On the other hand, cohort normalization, as a way of mea-suring the degradation effect under challenging environments in relationto a pool of cohort samples, has been widely used in the area of biometricauthentication. In this paper, for the first time, we introduce cohort nor-malization to SRC-based face recognition with insufficient training sam-ples. Specifically, a user-specific cohort set is selected to normalize theraw residual, which is obtained from comparing the test sample with itssparse representations corresponding to the gallery subject, using poly-nomial regression. Experimental results on AR and FERET databases show that cohort normalization can bring SRC much robustness against various forms of degradation factors for undersampled face recognition.
Resumo:
To recognize faces in video, face appearances have been widely modeled as piece-wise local linear models which linearly approximate the smooth yet non-linear low dimensional face appearance manifolds. The choice of representations of the local models is crucial. Most of the existing methods learn each local model individually meaning that they only anticipate variations within each class. In this work, we propose to represent local models as Gaussian distributions which are learned simultaneously using the heteroscedastic probabilistic linear discriminant analysis (PLDA). Each gallery video is therefore represented as a collection of such distributions. With the PLDA, not only the within-class variations are estimated during the training, the separability between classes is also maximized leading to an improved discrimination. The heteroscedastic PLDA itself is adapted from the standard PLDA to approximate face appearance manifolds more accurately. Instead of assuming a single global within-class covariance, the heteroscedastic PLDA learns different within-class covariances specific to each local model. In the recognition phase, a probe video is matched against gallery samples through the fusion of point-to-model distances. Experiments on the Honda and MoBo datasets have shown the merit of the proposed method which achieves better performance than the state-of-the-art technique.
Resumo:
Increasing awareness of the benefits of stimulating entrepreneurial behaviour in small and medium enterprises has fostered strong interest in innovation programs. Recently many western countries have invested in design innovation for better firm performance. This research presents some early findings from a study of companies that participated in a holistic approach to design innovation, where the outcomes include better business performance and better market positioning in global markets. Preliminary findings from in-depth semi-structured interviews indicate the importance of firm openness to new ways of working and to developing new processes of strategic entrepreneurship. Implications for theory and practice are discussed.
Resumo:
There is an army of bottom of the pyramid entrepreneurs (BOPE) who have the potential to transform developing economies, if they can identify and exploit business opportunities. BOPE could have unidentified resources that could lead to the recognition of radical new opportunities. This study paper asks how environmental factors and identification of resources affect Opportunity Recognition by BOP entrepreneurs in developing economies. To investigate this research question we conduct a literature review and plan semi-structured interviews of existing and nascent entrepreneurs in the largest and arguably the poorest country in Africa, the Democratic Republic of the Congo. In this paper we review the context of BOPE and describe the methodology we will use to gather and analyse data. Finally, we describe our access to suitable respondents for this study and how it will be conducted.
Resumo:
This paper investigates advanced channel compensation techniques for the purpose of improving i-vector speaker verification performance in the presence of high intersession variability using the NIST 2008 and 2010 SRE corpora. The performance of four channel compensation techniques: (a) weighted maximum margin criterion (WMMC), (b) source-normalized WMMC (SN-WMMC), (c) weighted linear discriminant analysis (WLDA), and; (d) source-normalized WLDA (SN-WLDA) have been investigated. We show that, by extracting the discriminatory information between pairs of speakers as well as capturing the source variation information in the development i-vector space, the SN-WLDA based cosine similarity scoring (CSS) i-vector system is shown to provide over 20% improvement in EER for NIST 2008 interview and microphone verification and over 10% improvement in EER for NIST 2008 telephone verification, when compared to SN-LDA based CSS i-vector system. Further, score-level fusion techniques are analyzed to combine the best channel compensation approaches, to provide over 8% improvement in DCF over the best single approach, (SN-WLDA), for NIST 2008 interview/ telephone enrolment-verification condition. Finally, we demonstrate that the improvements found in the context of CSS also generalize to state-of-the-art GPLDA with up to 14% relative improvement in EER for NIST SRE 2010 interview and microphone verification and over 7% relative improvement in EER for NIST SRE 2010 telephone verification.
Resumo:
Odours emitted by flowers are complex blends of volatile compounds. These odours are learnt by flower-visiting insect species, improving their recognition of rewarding flowers and thus foraging efficiency. We investigated the flexibility of floral odour learning by testing whether adult moths recognize single compounds common to flowers on which they forage. Dual choice preference tests on Helicoverpa armigera moths allowed free flying moths to forage on one of three flower species; Argyranthemum frutescens (federation daisy), Cajanus cajan (pigeonpea) or Nicotiana tabacum (tobacco). Results showed that, (i) a benzenoid (phenylacetaldehyde) and a monoterpene (linalool) were subsequently recognized after visits to flowers that emitted these volatile constituents, (ii) in a preference test, other monoterpenes in the flowers' odour did not affect the moths' ability to recognize the monoterpene linalool and (iii) relative preferences for two volatiles changed after foraging experience on a single flower species that emitted both volatiles. The importance of using free flying insects and real flowers to understand the mechanisms involved in floral odour learning in nature are discussed in the context of our findings.
Resumo:
In this paper we use the algorithm SeqSLAM to address the question, how little and what quality of visual information is needed to localize along a familiar route? We conduct a comprehensive investigation of place recognition performance on seven datasets while varying image resolution (primarily 1 to 512 pixel images), pixel bit depth, field of view, motion blur, image compression and matching sequence length. Results confirm that place recognition using single images or short image sequences is poor, but improves to match or exceed current benchmarks as the matching sequence length increases. We then present place recognition results from two experiments where low-quality imagery is directly caused by sensor limitations; in one, place recognition is achieved along an unlit mountain road by using noisy, long-exposure blurred images, and in the other, two single pixel light sensors are used to localize in an indoor environment. We also show failure modes caused by pose variance and sequence aliasing, and discuss ways in which they may be overcome. By showing how place recognition along a route is feasible even with severely degraded image sequences, we hope to provoke a re-examination of how we develop and test future localization and mapping systems.
Resumo:
Uncooperative iris identification systems at a distance suffer from poor resolution of the acquired iris images, which significantly degrades iris recognition performance. Super-resolution techniques have been employed to enhance the resolution of iris images and improve the recognition performance. However, most existing super-resolution approaches proposed for the iris biometric super-resolve pixel intensity values, rather than the actual features used for recognition. This paper thoroughly investigates transferring super-resolution of iris images from the intensity domain to the feature domain. By directly super-resolving only the features essential for recognition, and by incorporating domain specific information from iris models, improved recognition performance compared to pixel domain super-resolution can be achieved. A framework for applying super-resolution to nonlinear features in the feature-domain is proposed. Based on this framework, a novel feature-domain super-resolution approach for the iris biometric employing 2D Gabor phase-quadrant features is proposed. The approach is shown to outperform its pixel domain counterpart, as well as other feature domain super-resolution approaches and fusion techniques.