152 results for Microscopic features
Abstract:
Compared to people with a high socioeconomic status, those with a lower socioeconomic status are more likely to perceive their neighbourhood as unattractive and unsafe, which is associated with their lower levels of physical activity. Agreement between objective and perceived environmental factors is often found to be moderate or low, so it is questionable to what extent ‘creating supportive neighbourhoods’ would change neighbourhood perceptions. This study among residents (N=814) of fourteen neighbourhoods in the city of Eindhoven (the Netherlands), investigated to what extent socioeconomic differences in perceived neighbourhood safety and perceived neighbourhood attractiveness can be explained by five domains of objective neighbourhood features (i.e. design, traffic safety, social safety, aesthetics, and destinations), and to what extent other factors may play a role. Unfavourable neighbourhood perceptions of low socioeconomic groups partly reflected their actual less aesthetic and less safe neighbourhoods, and partly their perceptions of low social neighbourhood cohesion and adverse psychosocial circumstances.
Abstract:
Robust image hashing seeks to transform a given input image into a shorter hashed version using a key-dependent non-invertible transform. These image hashes can be used for watermarking, image integrity authentication or image indexing for fast retrieval. This paper introduces a new method of generating image hashes based on extracting Higher Order Spectral features from the Radon projection of an input image. The feature extraction process is non-invertible and non-linear, and different hashes can be produced from the same image through the use of random permutations of the input. We show that the transform is robust to typical image transformations such as JPEG compression, noise, scaling, rotation, smoothing and cropping. We evaluate our system using a verification-style framework based on calculating false match and false non-match likelihoods using the publicly available Uncompressed Colour Image database (UCID) of 1320 images. We also compare our results to Swaminathan’s Fourier-Mellin based hashing method and obtain at least a 1% EER improvement under noise, scaling and sharpening.
Abstract:
In public venues, crowd size is a key indicator of crowd safety and stability. In this paper we propose a crowd counting algorithm that uses tracking and local features to count the number of people in each group as represented by a foreground blob segment, so that the total crowd estimate is the sum of the group sizes. Tracking is employed to improve the robustness of the estimate, by analysing the history of each group, including splitting and merging events. A simplified ground truth annotation strategy results in an approach with minimal setup requirements that is highly accurate.
Abstract:
The cascading appearance-based (CAB) feature extraction technique has established itself as the state of the art in extracting dynamic visual speech features for speech recognition. In this paper, we investigate the effectiveness of this technique for the related speaker verification application. By investigating the speaker verification ability of each stage of the cascade, we demonstrate that the same steps taken to reduce static speaker and environmental information for the visual speech recognition application also provide similar improvements for visual speaker recognition. A further study compares synchronous HMM (SHMM) based fusion of CAB visual features and traditional perceptual linear predictive (PLP) acoustic features, and shows that the higher complexity inherent in the SHMM approach does not appear to provide any improvement in the final audio-visual speaker verification system over simpler utterance-level score fusion.
Abstract:
The use of appropriate features to characterize an output class or object is critical for all classification problems. This paper evaluates the capability of several spectral and texture features for object-based vegetation classification at the species level using airborne high resolution multispectral imagery. Image objects, the basic classification unit, were generated through image segmentation. Statistical moments extracted from the original spectral bands and from vegetation index images are used as feature descriptors for image objects (i.e. tree crowns). Several state-of-the-art texture descriptors such as the Gray-Level Co-Occurrence Matrix (GLCM), Local Binary Patterns (LBP) and its extensions are also extracted for comparison. A Support Vector Machine (SVM) is employed for classification in the object-feature space. The experimental results showed that incorporating spectral vegetation indices improves classification accuracy over using the original spectral bands alone, and that moments of the Ratio Vegetation Index obtained the highest average classification accuracy in our experiment. The experiments also indicate that the spectral moment features outperform, or are at least comparable with, the state-of-the-art texture descriptors in terms of classification accuracy.
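As a rough illustration of the moment-based descriptors mentioned in this abstract, the sketch below computes a per-pixel Ratio Vegetation Index (taken here as near-infrared divided by red reflectance) for one image object and summarises it with its first three statistical moments. The function names, the choice of three moments, the population (divide-by-N) formulas and the pixel values are all assumptions made for illustration; the abstract does not specify them.

```python
def ratio_vegetation_index(nir, red):
    """Per-pixel RVI = NIR / Red; pixels with zero red reflectance are skipped."""
    return [n / r for n, r in zip(nir, red) if r != 0]


def moments(values):
    """Return (mean, variance, skewness) using population (divide-by-N) formulas."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0:
        # A constant object has no spread, so skewness is reported as zero.
        return mean, 0.0, 0.0
    skew = sum((v - mean) ** 3 for v in values) / n / var ** 1.5
    return mean, var, skew


# Hypothetical pixel values for a single tree-crown object.
nir_pixels = [4.0, 6.0, 8.0]
red_pixels = [2.0, 2.0, 2.0]

# One feature vector per image object; this is what a classifier would consume.
feature_vector = moments(ratio_vegetation_index(nir_pixels, red_pixels))
```

In an object-based workflow, one such vector would be computed per segmented object and passed to the classifier alongside, or instead of, moments of the raw bands.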
Abstract:
This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second system uses a novel combination of cross-correlation and zero-crossing rate of the normalised autocorrelation to approximate a measure of signal pitch and periodicity (CrossCorr) that is hypothesised to be noise robust. The scores output by the two systems are then merged using weighted sum fusion to create the proposed autocorrelation zero-crossing rate (AZR) VAD. The accuracy of AZR was compared to state-of-the-art and standardised VAD methods, and AZR was shown to outperform the best performing system with an average relative improvement of 24.8% in half-total error rate (HTER) on the QUT-NOISE-TIMIT database, created using real recordings from high-noise environments.
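To make the MaxPeak idea concrete, here is a minimal sketch of a maximum-peak feature computed from the normalised time-domain autocorrelation of a single frame. It is not the paper's implementation: the mean removal, the lag search range, the frame length and the function name are assumptions chosen for illustration, and the CrossCorr feature and the weighted-sum fusion are not shown.

```python
import math
import random


def max_autocorr_peak(frame, min_lag, max_lag):
    """Largest normalised autocorrelation value over lags min_lag..max_lag.

    The autocorrelation is divided by the zero-lag energy, so a strongly
    periodic (voiced-like) frame scores close to 1 and noise scores near 0.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return 0.0  # silent or constant frame: no periodicity evidence
    best = -1.0
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        best = max(best, r)
    return best


# Hypothetical test frames: a sinusoid with a 40-sample period and white noise.
periodic = [math.sin(2 * math.pi * i / 40) for i in range(400)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]

periodic_score = max_autocorr_peak(periodic, 20, 100)
noise_score = max_autocorr_peak(noise, 20, 100)
```

A detector built on this feature would compare the score against a threshold, or fuse it with a second score, to decide whether the frame contains voiced speech.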
Abstract:
Freeways are divided roadways designed to facilitate the uninterrupted movement of motor vehicles. However, many freeways now experience demand flows in excess of capacity, leading to recurrent congestion. The Highway Capacity Manual (TRB, 1994) uses empirical macroscopic relationships between speed, flow and density to quantify freeway operations and performance. Capacity may be predicted as the maximum uncongested flow achievable. Although they are effective tools for design and analysis, macroscopic models offer little understanding of the nature of the processes taking place in the system. Szwed and Smith (1972, 1974) and Makigami and Matsuo (1990) have shown that microscopic modelling is also applicable to freeway operations. Such models facilitate an understanding of the processes whilst providing for the assessment of performance, through measures of capacity and delay. However, these models are limited to only a few circumstances. The aim of this study was to produce more comprehensive and practical microscopic models. These models were required to accurately portray the mechanisms of freeway operations at the specific locations under consideration. The models needed to be capable of calibration using data acquired at these locations, and their output needed to be capable of validation with data acquired at these sites, so that the outputs would be truly descriptive of the performance of the facility. A theoretical basis, rather than the empiricism of the macroscopic models currently used, needed to underlie the form of these models. The models also needed to be adaptable to variable operating conditions, so that they may be applied, where possible, to other similar systems and facilities. It was not possible in this single study to produce a stand-alone model applicable to all facilities and locations; however, the scene has been set for the application of the models to a much broader range of operating conditions.
Opportunities for further development of the models were identified, and procedures provided for the calibration and validation of the models to a wide range of conditions. The models developed do, however, have limitations in their applicability. Only uncongested operations were studied and represented. Driver behaviour in Brisbane was applied to the models. Different mechanisms are likely in other locations due to variability in road rules and driving cultures. Not all manoeuvres evident were modelled; some unusual manoeuvres were not considered to warrant modelling. However, the models developed contain the principal processes of freeway operations: merging and lane changing. Gap acceptance theory was applied to these critical operations to assess freeway performance. Gap acceptance theory was found to be applicable to merging; however, the major stream (the kerb lane traffic) exercises only a limited priority over the minor stream (the on-ramp traffic). Theory was established to account for this activity. Kerb lane drivers were also found to change to the median lane where possible, to assist coincident mergers. The net limited priority model accounts for this by predicting a reduced major stream flow rate, which excludes lane changers. Cowan's M3 model was calibrated for both streams. On-ramp and total upstream flow are required as input. Relationships between the proportion of headways greater than 1 s and flow differed between on-ramps where traffic leaves signalised intersections and those where it leaves unsignalised intersections. Constant departure on-ramp metering was also modelled. Minimum follow-on times of 1 to 1.2 s were calibrated. Critical gaps were shown to lie between the minimum follow-on time and the sum of the minimum follow-on time and the 1 s minimum headway. Limited priority capacity and other boundary relationships were established by Troutbeck (1995).
The minimum average minor stream delay and corresponding proportion of drivers delayed were quantified theoretically in this study. A simulation model was constructed to predict intermediate minor and major stream delays across all minor and major stream flows. Pseudo-empirical relationships were established to predict average delays. Major stream average delays are limited to 0.5 s, insignificant compared with minor stream delays, which reach infinity at capacity. Minor stream delays were shown to be smaller when unsignalised rather than signalised intersections are located upstream of on-ramps, and smaller still when ramp metering is installed. Smaller delays correspond to improved merge area performance. A more tangible performance measure, the distribution of distances required to merge, was established by including design speeds. This distribution can be measured to validate the model. Merging probabilities can be predicted for given taper lengths, a most useful performance measure. This model was also shown to be applicable to lane changing. Tolerable limits to merging probabilities require calibration. From these, practical capacities can be estimated. Further calibration is required of traffic inputs, critical gap and minimum follow-on time, for both merging and lane changing. A general relationship to predict the proportion of drivers delayed requires development. These models can then be used to complement existing macroscopic models to assess performance, and provide further insight into the nature of operations.
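The gap acceptance quantities named in this abstract can be illustrated with a short sketch of Cowan's M3 headway model, in which a proportion alpha of vehicles are free (headways greater than the minimum headway delta) and free headways decay exponentially at rate lambda = alpha q / (1 - delta q). The capacity function below is the classical absolute-priority gap acceptance formula for M3 headways, included only as a reference point: it is not the limited priority model developed in the study, and every parameter value shown is a hypothetical choice within the ranges the abstract quotes.

```python
import math


def m3_decay_rate(q, alpha, delta):
    """Decay rate lambda of free headways for flow q (veh/s)."""
    if not 0.0 < q * delta < 1.0:
        raise ValueError("flow must satisfy 0 < q * delta < 1")
    return alpha * q / (1.0 - delta * q)


def m3_survival(t, q, alpha, delta):
    """Probability that a major stream headway exceeds t seconds."""
    if t < delta:
        return 1.0  # no headway can be shorter than the minimum headway
    lam = m3_decay_rate(q, alpha, delta)
    return alpha * math.exp(-lam * (t - delta))


def absolute_priority_capacity(q, alpha, delta, critical_gap, follow_on):
    """Minor stream capacity (veh/s) under classical absolute priority.

    Assumption: this is the textbook formula, not the study's limited
    priority relationship, which accounts for major stream adjustment.
    """
    lam = m3_decay_rate(q, alpha, delta)
    numerator = q * alpha * math.exp(-lam * (critical_gap - delta))
    return numerator / (1.0 - math.exp(-lam * follow_on))


# Hypothetical inputs: 900 veh/h in the kerb lane, 80% free vehicles,
# 1 s minimum headway, 2 s critical gap, 1.1 s follow-on time.
q_kerb = 900 / 3600
p_gap = m3_survival(2.0, q_kerb, alpha=0.8, delta=1.0)
capacity = absolute_priority_capacity(q_kerb, 0.8, 1.0, 2.0, 1.1)
```

Here `p_gap` is the chance that a given kerb lane headway exceeds the critical gap, and `capacity` falls as the kerb lane flow rises, which matches the qualitative behaviour described above.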
Abstract:
Paired speaking tests are now commonly used in both high-stakes testing and classroom assessment contexts. The co-construction of discourse by candidates is regarded as a strength of paired speaking tests, as candidates have the opportunity to display a wider range of interactional competencies, including turn taking, initiating topics and engaging in extended discourse with a partner, rather than an examiner. However, the impact of the interlocutor in such jointly negotiated discourse and the implications for assessing interactional competence are areas of concern. This article reports on the features of interactional competence that were salient to four trained raters of 12 paired speaking tests through the analysis of rater notes, stimulated verbal recalls and rater discussions. Findings enabled the identification of features of the performance noted by raters when awarding scores for interactional competence, and the particular features associated with higher and lower scores. A number of these features were seen by the raters as mutual achievements, which raises the issue of the extent to which it is possible to assess individual contributions to the co-constructed performance. The findings have implications for defining the construct of interactional competence in paired speaking tests and operationalising this in rating scales.
Abstract:
Facial expression is an important channel for human communication and can be applied in many real applications. One critical step for facial expression recognition (FER) is to accurately extract emotional features. Current approaches to FER in static images have not fully considered and utilized the features of facial element and muscle movements, which represent static and dynamic, as well as geometric and appearance, characteristics of facial expressions. This paper proposes an approach to address this limitation using ‘salient’ distance features, which are obtained by extracting patch-based 3D Gabor features, selecting the ‘salient’ patches, and performing patch matching operations. The experimental results demonstrate a high correct recognition rate (CRR), significant performance improvements due to the consideration of facial element and muscle movements, promising results under face registration errors, and fast processing time. Comparison with state-of-the-art performance confirms that the proposed approach achieves the highest CRR on the JAFFE database and is among the top performers on the Cohn-Kanade (CK) database.
Abstract:
Human facial expression is a complex process characterized by dynamic, subtle and regional emotional features. State-of-the-art approaches to facial expression recognition (FER) have not fully utilized these kinds of features to improve recognition performance. This paper proposes an approach to overcome this limitation using patch-based ‘salient’ Gabor features. A set of 3D patches is extracted to represent the subtle and regional features, and then input into patch matching operations to capture the dynamic features. Experimental results show a significant performance improvement of the proposed approach due to the use of the dynamic features. Performance comparison with previous work also confirms that the proposed approach achieves the highest CRR reported to date on the JAFFE database and a top-level performance on the Cohn-Kanade (CK) database.
Abstract:
Robust, affine-covariant feature extractors provide a means to extract correspondences between images captured by widely separated cameras. Advances in wide baseline correspondence extraction require looking beyond the robust feature extraction and matching approach. This study examines new techniques of extracting correspondences that take advantage of information contained in affine feature matches. Methods of improving the accuracy of a set of putative matches, eliminating incorrect matches and extracting large numbers of additional correspondences are explored. It is assumed that knowledge of the camera geometry is not available and not immediately recoverable. The new techniques are evaluated by means of an epipolar geometry estimation task. It is shown that these methods enable the computation of camera geometry in many cases where existing feature extractors cannot produce sufficient numbers of accurate correspondences.