Digital Library

188 results for Gaussian

PLDA based speaker recognition on short utterances

Relevance:

10.00%

Publisher:

Abstract:

This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification systems in real-world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance. However, the robustness of HTPLDA to limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze speaker verification performance with regard to the duration of utterances used for both speaker evaluation (enrolment and verification) and score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development. The results presented in this paper, obtained on the NIST 2008 Speaker Recognition Evaluation dataset, suggest that the HTPLDA system can continue to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development.

Analytical and numerical solutions of the space and time fractional Bloch-Torrey equation

Relevance:

10.00%

Publisher:

Abstract:

Fractional order dynamics in physics, particularly when applied to diffusion, leads to an extension of the concept of Brownian motion through a generalization of the Gaussian probability function to what is termed anomalous diffusion. As MRI is applied with increasing temporal and spatial resolution, the spin dynamics are being examined more closely; such examinations extend our knowledge of biological materials through a detailed analysis of relaxation time distribution and water diffusion heterogeneity. Here the dynamic models become more complex as they attempt to correlate new data with a multiplicity of tissue compartments where processes are often anisotropic. Anomalous diffusion in the human brain has been investigated using fractional order calculus. Recently, a new diffusion model was proposed by solving the Bloch-Torrey equation using fractional order calculus with respect to time and space (see R.L. Magin et al., J. Magnetic Resonance, 190 (2008) 255-270). However, effective numerical methods and supporting error analyses for the fractional Bloch-Torrey equation are still limited. In this paper, the space and time fractional Bloch-Torrey equation (ST-FBTE) is considered. The time and space derivatives in the ST-FBTE are replaced by the Caputo and the sequential Riesz fractional derivatives, respectively. Firstly, we derive an analytical solution for the ST-FBTE with initial and boundary conditions on a finite domain. Secondly, we propose an implicit numerical method (INM) for the ST-FBTE and investigate its stability and convergence. We prove that the implicit numerical method for the ST-FBTE is unconditionally stable and convergent. Finally, we present some numerical results that support our theoretical analysis.
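For orientation, the two fractional derivatives named in this abstract are commonly defined as below. This is only a sketch of the standard textbook definitions: the orders $\alpha$ and $\beta$, their ranges, and the interval $[a, b]$ are assumptions, since the abstract does not state them, and the paper's "sequential" Riesz operator may be composed differently.

```latex
% Caputo time-fractional derivative, assuming 0 < \alpha < 1
{}^{C}_{0}D^{\alpha}_{t}\, u(t)
  = \frac{1}{\Gamma(1-\alpha)} \int_{0}^{t} \frac{u'(\tau)}{(t-\tau)^{\alpha}} \, d\tau

% Riesz space-fractional derivative on [a, b], assuming 1 < \beta \le 2,
% built from the left and right Riemann-Liouville derivatives
\frac{\partial^{\beta} u}{\partial |x|^{\beta}}
  = -\frac{1}{2\cos(\pi\beta/2)}
    \left( {}_{a}D^{\beta}_{x}\, u + {}_{x}D^{\beta}_{b}\, u \right)
```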

Real-time power line extraction from unmanned aerial system video images

Relevance:

10.00%

Publisher:

Abstract:

In this paper, a real-time vision-based power line extraction solution is investigated for active UAV guidance. The line extraction algorithm starts from ridge points detected by steerable filters. It is followed by a collinear line segment fitting algorithm that considers global and local information together with multiple collinear measurements. A GPU-boosted implementation of the algorithm is also investigated in the experiment. The experimental results show that the proposed algorithm outperforms two baseline line detection algorithms and is able to fit long collinear line segments. The low computational cost of the algorithm makes it suitable for real-time applications.

Wind-energy based path planning for unmanned aerial vehicles using Markov decision processes

Relevance:

10.00%

Publisher:

Abstract:

Exploiting wind energy is one possible way to extend flight duration for Unmanned Aerial Vehicles. Wind energy can also be used to minimise energy consumption for a planned path. In this paper, we consider uncertain time-varying wind fields and plan a path through them. A Gaussian distribution is used to model the uncertainty in the time-varying wind fields. We use a Markov Decision Process to plan a path based upon the uncertainty of the Gaussian distribution. Simulation results are presented that compare the direct line of flight between the start and target points with our planned path in terms of energy consumption and time of travel. The result is a robust path using the most visited cell while sampling the Gaussian distribution of the wind field in each cell.

Fast power line detection and localization using steerable filter for active UAV guidance

Relevance:

10.00%

Publisher:

Abstract:

In this paper we present a fast power line detection and localisation algorithm and propose a high-level guidance architecture for active vision-based Unmanned Aerial Vehicle (UAV) guidance. The detection stage is based on steerable filters for edge ridge detection, followed by a line fitting algorithm to refine candidate power lines in images. The guidance architecture assumes a UAV with an onboard gimbal camera. We first control the position of the gimbal such that the power line is in the field of view of the camera. Then its pose is used to generate the appropriate control commands such that the aircraft moves and flies above the lines. We present initial experimental results for the detection stage, which show that the proposed algorithm outperforms two state-of-the-art line detection algorithms for power line detection from aerial imagery.
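As a rough illustration of how a steerable filter can respond to ridges at any orientation, the second directional derivative of a Gaussian can be synthesised from three fixed basis filters. The abstract does not say which filter order or basis the authors use, so the choice of a second-derivative Gaussian and the symbols $G$, $\sigma$ and $\theta$ below are assumptions.

```latex
% Isotropic Gaussian kernel with (assumed) scale \sigma
G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left( -\frac{x^{2} + y^{2}}{2\sigma^{2}} \right)

% Second derivative of G along the direction (\cos\theta, \sin\theta),
% written as a linear combination of the basis filters G_{xx}, G_{xy}, G_{yy}
G^{\theta}_{2}(x, y) = \cos^{2}\theta \, G_{xx} + 2\cos\theta\sin\theta \, G_{xy} + \sin^{2}\theta \, G_{yy}
```

Because the response at any angle is a weighted sum of three convolutions computed once per image, the orientation of strongest ridge response can be found per pixel without re-filtering.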

Ergodic capacity of the slotted amplify and forward relay channel with finite relays

Relevance:

10.00%

Publisher:

Abstract:

In this paper we investigate the capacity of a general class of the slotted amplify and forward (SAF) relaying protocol in which multiple, though finitely many, relays may transmit in a given cooperative slot and the relay terminals, being half-duplex, have a finite slot memory capacity. We derive an expression for the capacity per channel use of this generalized SAF channel, assuming all source-to-relay, relay-to-destination and source-to-destination channel gains are independent and modeled as complex Gaussian. We show through the analysis of eigenvalue distributions that the increase in limiting capacity per channel use is marginal as the number of relay terminals increases.

Eigenvoice modeling for cross likelihood ratio based speaker clustering : a Bayesian approach

Relevance:

10.00%

Publisher:

Abstract:

This paper proposes the use of Bayesian approaches with the cross likelihood ratio (CLR) as a criterion for speaker clustering within a speaker diarization system, using eigenvoice modeling techniques. The CLR has previously been shown to be an effective decision criterion for speaker clustering using Gaussian mixture models. Recently, eigenvoice modeling has become an increasingly popular technique, due to its ability to adequately represent a speaker based on sparse training data, as well as to better capture differences in speaker characteristics. The integration of eigenvoice modeling into the CLR framework to capitalize on the advantages of both techniques has also been shown to be beneficial for the speaker clustering task. Building on that success, this paper proposes the use of Bayesian methods to compute the conditional probabilities used in the CLR, thus effectively combining the eigenvoice-CLR framework with the advantages of a Bayesian approach to the diarization problem. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show improved clustering performance, with a 33.5% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.
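For readers unfamiliar with the criterion, the CLR between two speech segments is commonly written in the form below. This is a sketch of the usual non-Bayesian definition, not necessarily the exact variant used in the paper: the segment data $x_i, x_j$, their frame counts $N_i, N_j$, the segment-adapted models $\lambda_i, \lambda_j$ and the background model $\lambda_B$ are all notation assumed here.

```latex
\mathrm{CLR}(x_i, x_j)
  = \frac{1}{N_i} \log \frac{p(x_i \mid \lambda_j)}{p(x_i \mid \lambda_B)}
  + \frac{1}{N_j} \log \frac{p(x_j \mid \lambda_i)}{p(x_j \mid \lambda_B)}
```

A larger value indicates that each segment is explained well by the other segment's model relative to the background model, which supports merging the two clusters.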

Mixed convection along a vertical flat plate in a non-absorbing medium

Relevance:

10.00%

Publisher:

Abstract:

Mixed convection boundary layer flow of a viscous fluid along a heated vertical semi-infinite plate is investigated in a non-absorbing medium. The relationship between convection and thermal radiation is established via a boundary condition of the second kind on the thermally radiating vertical surface. The governing boundary layer equations are transformed into dimensionless parabolic partial differential equations with the help of appropriate transformations, and the resulting system is solved numerically by applying a straightforward finite difference method along with the Gaussian elimination technique. It is worth noting that the Prandtl number, Pr, is taken to be small (<< 1), which is appropriate for liquid metals. Moreover, the numerical results are presented graphically, showing the effects of important physical parameters, namely the modified Richardson number (or mixed convection parameter), Ri*, and the surface radiation parameter, R, in terms of the local skin friction and local Nusselt number coefficients.
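Implicit finite difference schemes for parabolic equations often lead to a tridiagonal linear system at each marching step, which Gaussian elimination solves in linear time (the Thomas algorithm). The abstract does not state the structure of the authors' system, so the tridiagonal form, the function name `solve_tridiagonal` and the example coefficients below are assumptions made purely for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by Gaussian elimination (Thomas algorithm).

    All four lists have length n. lower[0] and upper[n-1] are ignored.
    Assumes no zero pivots arise (e.g. a diagonally dominant matrix).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal after forward elimination
    d = [0.0] * n  # modified right-hand side after forward elimination

    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        pivot = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / pivot if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / pivot

    # Back substitution
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Hypothetical 3x3 example with the classic (-1, 2, -1) stencil
example_solution = solve_tridiagonal(
    lower=[0.0, -1.0, -1.0],
    diag=[2.0, 2.0, 2.0],
    upper=[-1.0, -1.0, 0.0],
    rhs=[1.0, 0.0, 1.0],
)
```

The sketch omits pivoting for brevity, which is usually acceptable for the diagonally dominant matrices produced by implicit diffusion-type discretisations.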

On robust face recognition via sparse encoding : the good, the bad, and the ugly

Relevance:

10.00%

Publisher:

Abstract:

In the field of face recognition, Sparse Representation (SR) has received considerable attention during the past few years. Most of the relevant literature focuses on holistic descriptors in closed-set identification applications. The underlying assumption in SR-based methods is that each class in the gallery has sufficient samples and the query lies on the subspace spanned by the gallery of the same class. Unfortunately, such an assumption is easily violated in the more challenging face verification scenario, where an algorithm is required to determine if two faces (where one or both have not been seen before) belong to the same person. In this paper, we first discuss why previous attempts with SR might not be applicable to verification problems. We then propose an alternative approach to face verification via SR. Specifically, we propose to use explicit SR encoding on local image patches rather than the entire face. The obtained sparse signals are pooled via averaging to form multiple region descriptors, which are then concatenated to form an overall face descriptor. Due to the deliberate loss of spatial relations within each region (caused by averaging), the resulting descriptor is robust to misalignment and various image deformations. Within the proposed framework, we evaluate several SR encoding techniques: l1-minimisation, Sparse Autoencoder Neural Network (SANN), and an implicit probabilistic technique based on Gaussian Mixture Models. Thorough experiments on AR, FERET, exYaleB, BANCA and ChokePoint datasets show that the proposed local SR approach obtains considerably better and more robust performance than several previous state-of-the-art holistic SR methods, in both verification and closed-set identification problems. The experiments also show that l1-minimisation based encoding has a considerably higher computational cost than the other techniques, but leads to higher recognition rates.

Users segmentations for recommendation

Relevance:

10.00%

Publisher:

Abstract:

Traditional recommendation methods provide recommendations equally to all users. In this paper, a segmentation method using the Gaussian Mixture Model (GMM) is proposed to account for users’ differing needs in order to offer a specific recommendation strategy to each segment. An experiment is conducted using data from a live online dating network.
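To make the idea of GMM-based segmentation concrete, the sketch below fits a two-component, one-dimensional Gaussian mixture with the EM algorithm and assigns each user to the more likely component. Everything specific here is assumed for illustration: the abstract does not give the features, the number of segments, or the fitting procedure, and the single "activity score" per user is hypothetical data.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, n_iter=50):
    """Fit a two-component 1-D GMM by EM; returns (weights, means, variances)."""
    n = len(data)
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n or 1.0
    means = [min(data), max(data)]  # deterministic initialisation
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pk / total for pk in p])

        # M-step: re-estimate weights, means and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_k, 1e-6)  # variance floor avoids collapse
    return weights, means, variances


def assign_segment(x, weights, means, variances):
    """Index of the component with the highest weighted density at x."""
    scores = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
    return scores.index(max(scores))


# Hypothetical per-user activity scores forming two clear groups
activity = [1.0, 1.2, 0.8, 1.1, 9.0, 9.3, 8.7, 9.1]
gmm_weights, gmm_means, gmm_variances = fit_gmm_1d(activity)
segments = [assign_segment(x, gmm_weights, gmm_means, gmm_variances) for x in activity]
```

Once each user carries a segment label, a different recommendation strategy can be selected per label; how the strategies differ is outside what the abstract describes.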

Neutron Compton scattering from selectively deuterated acetanilide

Relevance:

10.00%

Publisher:

Abstract:

With the aim of developing the application of neutron Compton scattering (NCS) to molecular systems of biophysical interest, we are using the Compton spectrometer EVS at ISIS to characterize the momentum distribution of protons in peptide groups. In this contribution we present NCS measurements of the recoil peak (Compton profile) due to the amide proton in otherwise fully deuterated acetanilide (ACN), a widely studied model system for H-bonding and energy transfer in biomolecules. We obtain values for the average width of the potential well of the amide proton and its mean kinetic energy. Deviations from the Gaussian form of the Compton profile, analyzed on the basis of an expansion due to Sears, provide data relating to the Laplacian of the proton potential. (C) 1998 Elsevier Science B.V. All rights reserved.

Heteroscedastic probabilistic linear discriminant analysis for manifold learning in video-based face recognition

Relevance:

10.00%

Publisher:

Abstract:

To recognize faces in video, face appearances have been widely modeled as piece-wise local linear models which linearly approximate the smooth yet non-linear low dimensional face appearance manifolds. The choice of representations of the local models is crucial. Most of the existing methods learn each local model individually, meaning that they only anticipate variations within each class. In this work, we propose to represent local models as Gaussian distributions which are learned simultaneously using the heteroscedastic probabilistic linear discriminant analysis (PLDA). Each gallery video is therefore represented as a collection of such distributions. With the PLDA, not only are the within-class variations estimated during training, but the separability between classes is also maximized, leading to improved discrimination. The heteroscedastic PLDA itself is adapted from the standard PLDA to approximate face appearance manifolds more accurately. Instead of assuming a single global within-class covariance, the heteroscedastic PLDA learns different within-class covariances specific to each local model. In the recognition phase, a probe video is matched against gallery samples through the fusion of point-to-model distances. Experiments on the Honda and MoBo datasets have shown the merit of the proposed method, which achieves better performance than the state-of-the-art technique.

Speaker diarization : "who spoke when"

Relevance:

10.00%

Publisher:

Abstract:

Speaker diarization is the process of annotating an input audio with information that attributes temporal regions of the audio signal to their respective sources, which may include both speech and non-speech events. For speech regions, the diarization system also specifies the locations of speaker boundaries and assigns relative speaker labels to each homogeneous segment of speech. In short, speaker diarization systems effectively answer the question of ‘who spoke when’. There are several important applications for speaker diarization technology, such as facilitating speaker indexing systems to allow users to directly access the relevant segments of interest within a given audio, and assisting with other downstream processes such as summarizing and parsing. When combined with automatic speech recognition (ASR) systems, the metadata extracted from a speaker diarization system can provide complementary information for ASR transcripts including the location of speaker turns and relative speaker segment labels, making the transcripts more readable. Speaker diarization output can also be used to localize the instances of specific speakers to pool data for model adaptation, which in turn boosts transcription accuracy. Speaker diarization therefore plays an important role as a preliminary step in automatic transcription of audio data. The aim of this work is to improve the usefulness and practicality of speaker diarization technology through the reduction of diarization error rates. In particular, this research is focused on the segmentation and clustering stages within a diarization system. Although particular emphasis is placed on the broadcast news audio domain and systems developed throughout this work are also trained and tested on broadcast news data, the techniques proposed in this dissertation are also applicable to other domains including telephone conversations and meeting audio.
Three main research themes were pursued: heuristic rules for speaker segmentation, modelling uncertainty in speaker model estimates, and modelling uncertainty in eigenvoice speaker modelling. The use of heuristic approaches for the speaker segmentation task was first investigated, with emphasis placed on minimizing missed boundary detections. A set of heuristic rules was proposed to govern the detection and heuristic selection of candidate speaker segment boundaries. A second pass, using the same heuristic algorithm with a smaller window, was also proposed with the aim of improving detection of boundaries around short speaker segments. Compared to single threshold based methods, the proposed heuristic approach was shown to provide improved segmentation performance, leading to a reduction in the overall diarization error rate. Methods to model the uncertainty in speaker model estimates were developed to address the difficulties associated with making segmentation and clustering decisions with limited data in the speaker segments. The Bayes factor, derived specifically for multivariate Gaussian speaker modelling, was introduced to account for the uncertainty of the speaker model estimates. The use of the Bayes factor also enabled the incorporation of prior information regarding the audio to aid segmentation and clustering decisions. The idea of modelling uncertainty in speaker model estimates was also extended to the eigenvoice speaker modelling framework for the speaker clustering task. Building on the application of Bayesian approaches to the speaker diarization problem, the proposed approach takes into account the uncertainty associated with the explicit estimation of the speaker factors. The proposed decision criteria, based on Bayesian theory, were shown to generally outperform their non-Bayesian counterparts.

A quasi-maximum likelihood method for estimating the parameters of multivariate diffusions

Relevance:

10.00%

Publisher:

Abstract:

A quasi-maximum likelihood procedure for estimating the parameters of multi-dimensional diffusions is developed in which the transitional density is a multivariate Gaussian density with first and second moments approximating the true moments of the unknown density. For affine drift and diffusion functions, the moments are exactly those of the true transitional density and for nonlinear drift and diffusion functions the approximation is extremely good and is as effective as alternative methods based on likelihood approximations. The estimation procedure generalises to models with latent factors. A conditioning procedure is developed that allows parameter estimation in the absence of proxies.

A closed-form approximation for pricing temperature-based weather derivatives

Relevance:

10.00%

Publisher:

Abstract:

This paper develops analytical distributions of temperature indices on which temperature derivatives are written. If the deviations of daily temperatures from their expected values are modelled as an Ornstein-Uhlenbeck process with time-varying variance, then the distribution of the temperature index on which the derivative is written is that of a sum of truncated, correlated Gaussian deviates. The key result of this paper is to provide an analytical approximation to the distribution of this sum, thus allowing the accurate computation of payoffs without the need for any simulation. A data set comprising average daily temperature spanning over a hundred years for four Australian cities is used to demonstrate the efficacy of this approach for estimating the payoffs to temperature derivatives. It is demonstrated that expected payoffs computed directly from historical records are a particularly poor approach to the problem when there are trends in underlying average daily temperature. It is shown that the proposed analytical approach is superior to historical pricing.
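The modelling assumption in this abstract can be written down compactly. The sketch below uses assumed notation that does not come from the paper: $T_t$ for daily temperature, $\bar{T}_t$ for its expected value, $\kappa$ for the mean-reversion speed, $\sigma(t)$ for the time-varying volatility, and a hypothetical degree-day style index with base level $c$ to show where the truncation comes from.

```latex
% Deviation from the expected temperature follows an OU process
X_t = T_t - \bar{T}_t, \qquad dX_t = -\kappa X_t \, dt + \sigma(t) \, dW_t

% A degree-day style index over n days: a sum of truncated, serially correlated Gaussian terms
H = \sum_{t=1}^{n} \max\left( c - T_t, \, 0 \right)
```

Because each $T_t$ is Gaussian under this model and consecutive days are correlated through the mean reversion, $H$ is a sum of truncated, correlated Gaussian deviates, which is the quantity whose distribution the paper approximates analytically.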
