926 resultados para Advanced signal processing
Resumo:
We present a novel method for improving hierarchical speaker clustering in the tasks of speaker diarization and speaker linking. In hierarchical clustering, a tree can be formed that demonstrates various levels of clustering. We propose a ratio that expresses the impact of each cluster on the formation of this tree and use this to rescale cluster scores. This provides score normalisation based on the impact of each cluster. We use a state-of-the-art speaker diarization and linking system across the SAIVT-BNEWS corpus to show that our proposed impact ratio can provide a relative improvement of 16% in diarization error rate (DER).
Resumo:
A field oriented control (FOC) algorithm is simulated and implemented for use with a permanent magnet synchronous motor (PMSM). Rotor position is sensed using Hall effect switches on the stator because other hardware position sensors attached to the rotor may not be desirable or cost effective for certain applications. This places a limit on the resolution of position sensing – only a few Hall effect switches can be placed. In this simulation, three sensors are used and the position information is obtained at higher resolution by estimating it from the rotor dynamics, as shown in literature previously. This study compares the performance of the method with an incremental encoder using simulations. The FOC algorithm is implemented using Digital Motor Control (DMC) and IQ Texas Instruments libraries from a Simulink toolbox called Embedded Coder, and downloaded into a TI microcontroller (TMS320F28335) known as the Piccolo via Code Composer Studio (CCS).
Resumo:
Novel computer vision techniques have been developed to automatically detect unusual events in crowded scenes from video feeds of surveillance cameras. The research is useful in the design of the next generation intelligent video surveillance systems. Two major contributions are the construction of a novel machine learning model for multiple instance learning through compressive sensing, and the design of novel feature descriptors in the compressed video domain.
Resumo:
Due to the popularity of security cameras in public places, it is of interest to design an intelligent system that can efficiently detect events automatically. This paper proposes a novel algorithm for multi-person event detection. To ensure greater than real-time performance, features are extracted directly from compressed MPEG video. A novel histogram-based feature descriptor that captures the angles between extracted particle trajectories is proposed, which allows us to capture motion patterns of multi-person events in the video. To alleviate the need for fine-grained annotation, we propose the use of Labelled Latent Dirichlet Allocation, a “weakly supervised” method that allows the use of coarse temporal annotations which are much simpler to obtain. This novel system is able to run at approximately ten times real-time, while preserving state-of-theart detection performance for multi-person events on a 100-hour real-world surveillance dataset (TRECVid SED).
Empirical vehicle-to-vehicle pathloss modeling in highway, suburban and urban environments at 5.8GHz
Resumo:
In this paper, we present a pathloss characterization for vehicle-to-vehicle (V2V) communications based on empirical data collected from extensive measurement campaign performed under line-of-sight (LOS), non-line-of-sight (NLOS) and varying traffic densities. The experiment was conducted in three different V2V propagation environments: highway, suburban and urban at 5.8GHz. We developed pathloss models for each of the three different V2V environments considered. Based on a log-distance power law model, the values for the pathloss exponent and the standard deviation of shadowing were reported. The average pathloss exponent ranges from 1.77 for highway, 1.68 for the urban to 1.53 for the suburban environment. The reported results can contribute to vehicular network (VANET) simulators and can be used by system designers to develop, evaluate and validate new protocols and system designs under realistic propagation conditions.
Resumo:
This paper presents new schemes for recursive estimation of the state transition probabilities for hidden Markov models (HMM's) via extended least squares (ELS) and recursive state prediction error (RSPE) methods. Local convergence analysis for the proposed RSPE algorithm is shown using the ordinary differential equation (ODE) approach developed for the more familiar recursive output prediction error (RPE) methods. The presented scheme converges and is relatively well conditioned compared with the ...
Resumo:
In this paper new online adaptive hidden Markov model (HMM) state estimation schemes are developed, based on extended least squares (ELS) concepts and recursive prediction error (RPE) methods. The best of the new schemes exploit the idempotent nature of Markov chains and work with a least squares prediction error index, using a posterior estimates, more suited to Markov models then traditionally used in identification of linear systems.
Resumo:
This paper develops maximum likelihood (ML) estimation schemes for finite-state semi-Markov chains in white Gaussian noise. We assume that the semi-Markov chain is characterised by transition probabilities of known parametric from with unknown parameters. We reformulate this hidden semi-Markov model (HSM) problem in the scalar case as a two-vector homogeneous hidden Markov model (HMM) problem in which the state consist of the signal augmented by the time to last transition. With this reformulation we apply the expectation Maximumisation (EM ) algorithm to obtain ML estimates of the transition probabilities parameters, Markov state levels and noise variance. To demonstrate our proposed schemes, motivated by neuro-biological applications, we use a damped sinusoidal parameterised function for the transition probabilities.
Resumo:
In this paper we propose and study low complexity algorithms for on-line estimation of hidden Markov model (HMM) parameters. The estimates approach the true model parameters as the measurement noise approaches zero, but otherwise give improved estimates, albeit with bias. On a nite data set in the high noise case, the bias may not be signi cantly more severe than for a higher complexity asymptotically optimal scheme. Our algorithms require O(N3) calculations per time instant, where N is the number of states. Previous algorithms based on earlier hidden Markov model signal processing methods, including the expectation-maximumisation (EM) algorithm require O(N4) calculations per time instant.
Resumo:
The echolocation calls of long-tailed bats (Chalinolobus tuberculatus) were recorded in the Eglinton Valley, Fjordland, New Zealand, and digitized for analysis with the signal-processing software. Univariate and multivariate analyses of measure features facilitated a quantitative classification of the calls. Cluster analysis was used to categorize calls into two groups equating to search and terminal buzz calls described qualitatively for other species. When moving from search to terminal phases, the calls decrease in bandwidth, maximum and minimum frequency of call, and duration. Search calls begin with a steep-downward FM sweep followed by a short, less-modulated component. Buzz calls are FM sweeps. Although not found quantitatively, a broad pre-buzz group of calls also was identified. Ambiguity analysis of calls from the three groups shows that search-phrase calls are well suited to resolving the velocity of targets, and hence, identifying moving targets in a stationary clutter. Pre-buzz and buzz calls are better suited to resolving range, a feature that may aid the bats in capture of evasive prey after it has been identified.
Resumo:
Selection of features that will permit accurate pattern classification is a difficult task. However, if a particular data set is represented by discrete valued features, it becomes possible to determine empirically the contribution that each feature makes to the discrimination between classes. This paper extends the discrimination bound method so that both the maximum and average discrimination expected on unseen test data can be estimated. These estimation techniques are the basis of a backwards elimination algorithm that can be use to rank features in order of their discriminative power. Two problems are used to demonstrate this feature selection process: classification of the Mushroom Database, and a real-world, pregnancy related medical risk prediction task - assessment of risk of perinatal death.
Resumo:
Non-rigid image registration is an essential tool required for overcoming the inherent local anatomical variations that exist between images acquired from different individuals or atlases. Furthermore, certain applications require this type of registration to operate across images acquired from different imaging modalities. One popular local approach for estimating this registration is a block matching procedure utilising the mutual information criterion. However, previous block matching procedures generate a sparse deformation field containing displacement estimates at uniformly spaced locations. This neglects to make use of the evidence that block matching results are dependent on the amount of local information content. This paper presents a solution to this drawback by proposing the use of a Reversible Jump Markov Chain Monte Carlo statistical procedure to optimally select grid points of interest. Three different methods are then compared to propagate the estimated sparse deformation field to the entire image including a thin-plate spline warp, Gaussian convolution, and a hybrid fluid technique. Results show that non-rigid registration can be improved by using the proposed algorithm to optimally select grid points of interest.
Resumo:
Speech recognition in car environments has been identified as a valuable means for reducing driver distraction when operating noncritical in-car systems. Under such conditions, however, speech recognition accuracy degrades significantly, and techniques such as speech enhancement are required to improve these accuracies. Likelihood-maximizing (LIMA) frameworks optimize speech enhancement algorithms based on recognized state sequences rather than traditional signal-level criteria such as maximizing signal-to-noise ratio. LIMA frameworks typically require calibration utterances to generate optimized enhancement parameters that are used for all subsequent utterances. Under such a scheme, suboptimal recognition performance occurs in noise conditions that are significantly different from that present during the calibration session – a serious problem in rapidly changing noise environments out on the open road. In this chapter, we propose a dialog-based design that allows regular optimization iterations in order to track the ever-changing noise conditions. Experiments using Mel-filterbank noise subtraction (MFNS) are performed to determine the optimization requirements for vehicular environments and show that minimal optimization is required to improve speech recognition, avoid over-optimization, and ultimately assist with semireal-time operation. It is also shown that the proposed design is able to provide improved recognition performance over frameworks incorporating a calibration session only.
Resumo:
Energy efficient embedded computing enables new application scenarios in mobile devices like software-defined radio and video processing. The hierarchical multiprocessor considered in this work may contain dozens or hundreds of resource efficient VLIW CPUs. Programming this number of CPU cores is a complex task requiring compiler support. The stream programming paradigm provides beneficial properties that help to support automatic partitioning. This work describes a compiler for streaming applications targeting the self-build hierarchical CoreVA-MPSoC multiprocessor platform. The compiler is supported by a programming model that is tailored to fit the streaming programming paradigm. We present a novel simulated-annealing (SA) based partitioning algorithm, called Smart SA. The overall speedup of Smart SA is 12.84 for an MPSoC with 16 CPU cores compared to a single CPU implementation. Comparison with a state of the art partitioning algorithm shows an average performance improvement of 34.07%.
Resumo:
One main challenge in developing a system for visual surveillance event detection is the annotation of target events in the training data. By making use of the assumption that events with security interest are often rare compared to regular behaviours, this paper presents a novel approach by using Kullback-Leibler (KL) divergence for rare event detection in a weakly supervised learning setting, where only clip-level annotation is available. It will be shown that this approach outperforms state-of-the-art methods on a popular real-world dataset, while preserving real time performance.