194 results for feature advertising
Abstract:
A new algorithm for extracting features from images for object recognition is described. The algorithm uses higher order spectra to provide desirable invariance properties, to provide noise immunity, and to incorporate nonlinearity into the feature extraction procedure, thereby allowing the use of simple classifiers. An image can be reduced to a set of 1D functions via the Radon transform, or alternatively, the Fourier transform of each 1D projection can be obtained from a radial slice of the 2D Fourier transform of the image according to the Fourier slice theorem. A triple product of Fourier coefficients, referred to as the deterministic bispectrum, is computed for each 1D function and is integrated along radial lines in bifrequency space. Phases of the integrated bispectra are shown to be translation- and scale-invariant. Rotation invariance is achieved by a regrouping of these invariants at a constant radius followed by a second stage of invariant extraction. Rotation invariance is thus converted to translation invariance in the second step. Results using synthetic and actual images show that isolated, compact clusters are formed in feature space. These clusters are linearly separable, indicating that the nonlinearity required in the mapping from the input space to the classification space is incorporated well into the feature extraction stage. The use of higher order spectra results in good noise immunity, as verified with synthetic and real images. Classification of images using the higher order spectra-based algorithm compares favorably to classification using the method of moment invariants.
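The Fourier slice theorem mentioned in this abstract can be checked numerically in its simplest case. The sketch below is only an illustration, not the authors' implementation: it assumes NumPy, handles only the 0-degree projection (summing over rows), and the function names `projection_spectrum` and `central_slice` are hypothetical.

```python
import numpy as np


def projection_spectrum(image):
    """1D FFT of the 0-degree projection (sum over rows) of a 2D image."""
    projection = np.asarray(image, dtype=float).sum(axis=0)
    return np.fft.fft(projection)


def central_slice(image):
    """Row of the 2D FFT at zero vertical frequency for the same image."""
    return np.fft.fft2(np.asarray(image, dtype=float))[0, :]


# Illustrative data only: a small random image.
rng = np.random.default_rng(0)
demo_image = rng.random((8, 8))
slices_agree = np.allclose(projection_spectrum(demo_image),
                           central_slice(demo_image))
```

For other angles, the slice through the 2D spectrum is no longer aligned with the sampling grid, so an interpolation step would be needed; that step is left out here.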
Abstract:
An approach to pattern recognition using invariant parameters based on higher-order spectra is presented. In particular, bispectral invariants are used to classify one-dimensional shapes. The bispectrum, which is translation invariant, is integrated along straight lines passing through the origin in bifrequency space. The phase of the integrated bispectrum is shown to be scale- and amplification-invariant. A minimal set of these invariants is selected as the feature vector for pattern classification. Pattern recognition using higher-order spectral invariants is fast, suited for parallel implementation, and works for signals corrupted by Gaussian noise. The classification technique is shown to distinguish two similar but different bolts given their one-dimensional profiles.
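A rough discrete sketch of the integrated-bispectrum phase may help make the idea concrete. It is a simplification under stated assumptions, not the authors' method: it uses the DFT of a finite sequence, approximates the line f2 = a * f1 by rounding to the nearest frequency bin (the abstract does not say how the line is sampled), and treats translation as a circular shift. The function name `bispectral_phase` and the slope parameter `a` are hypothetical.

```python
import numpy as np


def bispectral_phase(x, a):
    """Phase of the bispectrum summed along the line k2 = a * k1 (0 < a <= 1).

    The bispectrum is taken as B(k1, k2) = X[k1] * X[k2] * conj(X[k1 + k2]).
    Assumption: bins on the line are chosen by rounding a * k1.
    """
    spectrum = np.fft.fft(np.asarray(x, dtype=float))
    n = len(spectrum)
    # Keep k1 + k2 at or below the Nyquist bin.
    k_max = int((n // 2) / (1.0 + a))
    total = 0j
    for k1 in range(1, k_max + 1):
        k2 = int(round(a * k1))
        total += spectrum[k1] * spectrum[k2] * np.conj(spectrum[k1 + k2])
    return np.angle(total)
```

In this discrete setting a circular shift multiplies each triple product by a unit factor that cancels exactly, and a positive gain scales the sum by a positive real number, so the returned phase is unchanged in both cases. Scale invariance depends on how the line is interpolated and is not demonstrated by this sketch.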
Abstract:
Features derived from the trispectra of DFT magnitude slices are used for multi-font digit recognition. These features are insensitive to translation, rotation, or scaling of the input. They are also robust to noise. Classification accuracy tests were conducted on a common database of 256 × 256 pixel bilevel images of digits in 9 fonts. Randomly rotated and translated noisy versions were used for training and testing. The results indicate that the trispectral features are better than moment invariants and affine moment invariants. They achieve a classification accuracy of 95% compared to about 81% for Hu's (1962) moment invariants and 39% for the Flusser and Suk (1994) affine moment invariants on the same data in the presence of 1% impulse noise using a 1-NN classifier. For comparison, a multilayer perceptron with no normalization for rotations and translations yields 34% accuracy on 16 × 16 pixel low-pass filtered and decimated versions of the same data.
Abstract:
Gaussian mixture models (GMMs) have become an established means of modeling feature distributions in speaker recognition systems. It is useful for experimentation and practical implementation purposes to develop and test these models in an efficient manner, particularly when computational resources are limited. A method of combining vector quantization (VQ) with single multi-dimensional Gaussians is proposed to rapidly generate a robust model approximation to the Gaussian mixture model. A fast method of testing these systems is also proposed and implemented. Results on the NIST 1996 Speaker Recognition Database suggest verification performance comparable to, and in some cases better than, the traditional GMM-based analysis scheme. In addition, previous research on the task of speaker identification indicated similar system performance for the VQ-Gaussian-based technique and GMMs.
Abstract:
A system to segment and recognize Australian 4-digit postcodes from address labels on parcels is described. Images of address labels are preprocessed and adaptively thresholded to reduce noise. Projections are used to segment the line and then the characters comprising the postcode. Individual digits are recognized using bispectral features extracted from their parallel beam projections. These features are insensitive to translation, scaling and rotation, and robust to noise. Results on scanned images are presented. The system is currently being improved and implemented to work on-line.
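The projection-based segmentation step can be sketched as follows. This is a minimal illustration that assumes a clean binary image in which ink pixels are nonzero and characters are separated by at least one empty column; the actual system's preprocessing and thresholds are not described in the abstract, and the function name `segment_by_projection` is hypothetical.

```python
import numpy as np


def segment_by_projection(binary_image):
    """Return (start, end) column ranges, end exclusive, that contain ink.

    The vertical projection is the sum of each column; runs of nonzero
    columns are treated as separate characters.
    """
    profile = np.asarray(binary_image).sum(axis=0)
    has_ink = profile > 0
    segments = []
    start = None
    for col, ink in enumerate(has_ink):
        if ink and start is None:
            start = col                      # a new run of ink begins
        elif not ink and start is not None:
            segments.append((start, col))    # the run ended at this column
            start = None
    if start is not None:
        segments.append((start, len(has_ink)))
    return segments
```

Applying the same function to the transposed image gives row ranges, which is one way the line containing the postcode could be isolated before the characters are split.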
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
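One common way to combine two modalities is a weighted sum of their match scores, sketched below. This is an assumption made for illustration: the abstract does not state the fusion rule or how the weighting is determined, so `speech_weight`, `threshold`, and both function names are hypothetical.

```python
def fuse_scores(speech_score, lip_score, speech_weight):
    """Linear late fusion: a convex combination of the two modality scores."""
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must lie in [0, 1]")
    return speech_weight * speech_score + (1.0 - speech_weight) * lip_score


def accept_claim(speech_score, lip_score, speech_weight, threshold):
    """Accept the identity claim when the fused score reaches the threshold."""
    return fuse_scores(speech_score, lip_score, speech_weight) >= threshold
```

Under this rule, lowering `speech_weight` as acoustic noise increases shifts the decision toward the lip stream, which is the kind of behavior a noise-dependent weighting would aim for.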
Abstract:
This paper investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. Temporal lip information has been used for speaker identification previously (T.J. Wark et al., 1998); however, this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that an MSHMM is a valid structure for multi-modal speaker identification.
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. We have previously shown (Int. Conf. on Acoustics, Speech and Signal Proc., vol. 6, pp. 3693-3696, May 1998) that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
Abstract:
This special feature section of Journal of Management & Organization (Volume 17/1 - March 2011) sets out to widen understanding of the processes of stability and change in today's organizations, with a particular emphasis on the contribution of institutional approaches to organizational studies. Institutional perspectives on organization theory assume that rational, economic calculations, such as the maximization of profits or the optimization of resource allocation, are not sufficient to understand the behavior of organizations and their strategic choices. Institutionalists acknowledge the great uncertainty associated with the conduct of organizations and suggest that taken-for-granted values, beliefs and meanings within and outside organizations also play an important role in the determination of legitimate action.
Abstract:
Strategic communication is held to be a key process by which organisations respond to environmental uncertainty. In the received view articulated in the literatures of organisational communication and public relations, strategic communication results from collaborative efforts by organisational members to create shared understanding about environmental uncertainty and, as a result of this collective understanding, formulate appropriate communication responses. In this study, I explore how such collaborative efforts towards the development of strategic communication are derived from, and bounded by, culturally shared values and assumptions. Study of the influences of an organisation’s culture on the formulation of strategic communication is a fundamental conceptual challenge for public relations and, to date, a largely unaddressed area of research. This thesis responds to this challenge by describing a key property of organisational culture – the action of cultural selection (Durham, 1992). I integrate this property of cultural selection to extend and refine the descriptive range of Weick’s (1969, 1979) classic sociocultural model of organizing. From this integration I propose a new model, the Cultural Selection of Strategic Communication (CSSC). Underpinning the CSSC model is the central proposition that because of the action of cultural selection during organizing processes, the inherently conservative properties of an organisation’s culture constrain development of effective strategic communication in ways that may be unrelated to the outcomes of “environmental scanning” and other monitoring functions heralded by the public relations literature as central to organisational adaptation. Thus, by examining the development of strategic communication, I describe a central conservative influence on the social ecology of organisations.
This research also responds to Butschi and Steyn’s (2006) call for the development of theory focusing on strategic communication as well as Grunig (2006) and Sriramesh’s (2007) call for research to further understand the role of culture in public relations practice. In keeping with the explorative and descriptive goals of this study, I employ organisational ethnography to examine the influence of cultural selection on the development of strategic communication. In this methodological approach, I use the technique of progressive contextualisation to compare data from two related but distinct cultural settings. This approach provides a range of descriptive opportunities to permit a deeper understanding of the work of cultural selection. Findings of this study propose that culture, operating as a system of shared and socially transmitted social knowledge, acts through the property of cultural selection to influence decision making, and decrease conceptual variation within a group. The findings support the view that strategic communication, as a cultural product derived from the influence of cultural selection, is an essential feature to understand the social ecology of an organisation.
Abstract:
Gait recognition approaches continue to struggle with challenges including view-invariance, low-resolution data, robustness to unconstrained environments, and fluctuating gait patterns due to subjects carrying goods or wearing different clothes. Although computationally expensive, model-based techniques offer promise over appearance-based techniques for these challenges as they gather gait features and interpret gait dynamics in skeleton form. In this paper, we propose a fast 3D ellipsoidal-based gait recognition algorithm using a 3D voxel model derived from multi-view silhouette images. This approach directly addresses the limitations of view dependency and self-occlusion in existing ellipse-fitting model-based approaches. Voxel models are segmented into four components (left and right legs, above and below the knee), and ellipsoids are fitted to each region using eigenvalue decomposition. Features derived from the ellipsoid parameters are modeled using a Fourier representation to retain the temporal dynamic pattern for classification. We demonstrate the proposed approach using the CMU MoBo database and show that an improvement of 15-20% can be achieved over a 2D ellipse-fitting baseline.
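Fitting an ellipsoid to a voxel region by eigenvalue decomposition can be sketched as below. This is one plausible reading of that step, not the paper's implementation: it assumes the voxel centres are given as an (N, 3) array, uses the covariance matrix of those points, and reports axis scales as standard deviations (any further scaling to physical semi-axis lengths is left open). The function name `fit_ellipsoid` is hypothetical.

```python
import numpy as np


def fit_ellipsoid(points):
    """Fit an ellipsoid to 3D points via eigen-decomposition of the covariance.

    Returns (center, axis_scales, axes): the centroid, the standard deviation
    along each principal axis (largest first), and a 3x3 matrix whose columns
    are the matching unit axis directions.
    """
    pts = np.asarray(points, dtype=float)
    center = pts.mean(axis=0)
    covariance = np.cov(pts - center, rowvar=False)
    # eigh is used because the covariance matrix is symmetric.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    axis_scales = np.sqrt(np.clip(eigenvalues[order], 0.0, None))
    return center, axis_scales, eigenvectors[:, order]
```

Tracking these parameters frame by frame for each leg segment would give the time series that the abstract says is then modeled with a Fourier representation.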
Abstract:
Continuous user authentication with keystroke dynamics uses character sequences as features. Since users can type characters in any order, it is imperative to find character sequences (n-graphs) that are representative of user typing behavior. Contemporary feature selection approaches do not guarantee the selection of frequently typed features, which may lead to a less accurate statistical representation of the user. Furthermore, the selected features do not inherently reflect user typing behavior. We propose four statistical feature selection techniques that mitigate the limitations of existing approaches. The first technique selects the most frequently occurring features. The other three consider different user typing behaviors by selecting n-graphs that are typed quickly, n-graphs that are typed with consistent time, and n-graphs that have large time variance among users. We use Gunetti’s keystroke dataset and the k-means clustering algorithm for our experiments. The results show that, among the proposed techniques, the most-frequent feature selection technique can effectively find user-representative features. We further substantiate our results by comparing the most-frequent feature selection technique with three existing approaches (popular Italian words, common n-graphs, and least frequent n-graphs). We find that it performs better than the existing approaches after selecting a certain number of most-frequent n-graphs.
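The most-frequent selection technique can be illustrated with a short counting sketch. It assumes the typed input is available as a plain character string and ignores keystroke timing, which the other three techniques would need; the function name `most_frequent_ngraphs` and its parameters are hypothetical, not taken from the paper.

```python
from collections import Counter


def most_frequent_ngraphs(text, n=2, top_k=5):
    """Return the top_k most frequent n-graphs (length-n character sequences)."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return [ngraph for ngraph, _ in counts.most_common(top_k)]
```

For example, `most_frequent_ngraphs("abababcd", n=2, top_k=2)` returns `['ab', 'ba']`, since "ab" occurs three times and "ba" twice.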
Abstract:
The conventional manual power line corridor inspection processes that are used by most energy utilities are labor-intensive, time-consuming and expensive. Remote sensing technologies represent an attractive and cost-effective alternative approach to these monitoring activities. This paper presents a comprehensive investigation into automated remote sensing based power line corridor monitoring, focusing on recent innovations in the area of increased automation of fixed-wing platforms for aerial data collection, and automated data processing for object recognition using a feature fusion process. Airborne automation is achieved by using a novel approach that provides improved lateral control for tracking corridors and automatic real-time dynamic turning for flying between corridor segments; we call this approach PTAGS. Improved object recognition is achieved by fusing information from multi-sensor (LiDAR and imagery) data and multiple visual feature descriptors (color and texture). The results from our experiments and field survey illustrate the effectiveness of the proposed aircraft control and feature fusion approaches.
Abstract:
In this paper we present a novel algorithm for localization during navigation that performs matching over local image sequences. Instead of calculating the single location most likely to correspond to a current visual scene, the approach finds candidate matching locations within every section (subroute) of all learned routes. Through this approach, we reduce the demands upon the image processing front-end, requiring it to only be able to correctly pick the best matching image from within a short local image sequence, rather than globally. We applied this algorithm to a challenging downhill mountain biking visual dataset where there was significant perceptual or environment change between repeated traverses of the environment, and compared performance to applying the feature-based algorithm FAB-MAP. The results demonstrate the potential for localization using visual sequences, even when there are no visual features that can be reliably detected.
Abstract:
Pedestrians’ use of MP3 players or mobile phones can pose the risk of being hit by motor vehicles. We present an approach for detecting a crash risk level using the computing power and the microphone of mobile devices that can be used to alert the user in advance of an approaching vehicle so as to avoid a crash. A single feature extractor classifier is not usually able to deal with the diversity of risky acoustic scenarios. In this paper, we address the problem of detecting vehicles approaching a pedestrian with a novel, simple, non-resource-intensive acoustic method. The method uses a set of existing statistical tools to mine signal features. Audio features are adaptively thresholded for relevance and classified with a three-component heuristic. The resulting Acoustic Hazard Detection (AHD) system has a very low false positive detection rate. The results of this study could help mobile device manufacturers to embed the presented features into future portable devices and contribute to road safety.