956 results for San Lorenzo (Church : Florence, Italy)


Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper develops a general theory of validation gating for non-linear non-Gaussian models. Validation gates are used in target tracking to cull very unlikely measurement-to-track associations, before remaining association ambiguities are handled by a more comprehensive (and expensive) data association scheme. The essential property of a gate is to accept a high percentage of correct associations, thus maximising track accuracy, but provide a sufficiently tight bound to minimise the number of ambiguous associations. For linear Gaussian systems, the ellipsoidal validation gate is standard, and possesses the statistical property whereby a given threshold will accept a certain percentage of true associations. This property does not hold for non-linear non-Gaussian models. As a system departs from linear-Gaussian, the ellipsoid gate tends to reject a higher than expected proportion of correct associations and permit an excess of false ones. In this paper, the concept of the ellipsoidal gate is extended to permit correct statistics for the non-linear non-Gaussian case. The new gate is demonstrated by a bearing-only tracking example.
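As a rough illustration of the standard linear-Gaussian case described above (not the extended gate the paper proposes), the sketch below tests whether a 2D measurement falls inside an ellipsoidal gate. The function names and the restriction to two dimensions are assumptions made for brevity; with two degrees of freedom the chi-square threshold has the closed form gamma = -2 ln(1 - P).

```python
import math


def gate_threshold_2d(accept_prob):
    """Chi-square quantile for 2 degrees of freedom: gamma = -2 ln(1 - P)."""
    return -2.0 * math.log(1.0 - accept_prob)


def mahalanobis_sq_2d(residual, cov):
    """Squared Mahalanobis distance r^T S^-1 r for a 2x2 covariance S."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance must be positive definite")
    rx, ry = residual
    # The inverse of a 2x2 matrix is (1/det) * [[d, -b], [-c, a]].
    return (rx * (d * rx - b * ry) + ry * (-c * rx + a * ry)) / det


def in_gate(measurement, predicted, cov, accept_prob=0.99):
    """Accept the measurement if it lies inside the ellipsoidal gate."""
    residual = (measurement[0] - predicted[0], measurement[1] - predicted[1])
    return mahalanobis_sq_2d(residual, cov) <= gate_threshold_2d(accept_prob)


identity = ((1.0, 0.0), (0.0, 1.0))
print(in_gate((1.0, 1.0), (0.0, 0.0), identity))
print(in_gate((4.0, 0.0), (0.0, 0.0), identity))
```

Under the linear-Gaussian assumption, this threshold accepts about 99% of true associations; the abstract's point is that this guarantee no longer holds once the model departs from that assumption.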

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper we extend the concept of speaker annotation within a single recording, or speaker diarization, to a collection-wide approach we call speaker attribution. Accordingly, speaker attribution is the task of clustering the presumably homogeneous inter-session clusters obtained using diarization according to common cross-recording identities. The result of attribution is a collection of spoken audio across multiple recordings attributed to speaker identities. In this paper, an attribution system is proposed using mean-only MAP adaptation of a combined-gender UBM to model clusters from a perfect diarization system, as well as a JFA-based system with session variability compensation. The normalized cross-likelihood ratio is calculated for each pair of clusters to construct an attribution matrix, and the complete linkage algorithm is employed to cluster the inter-session clusters. A matched cluster purity and coverage of 87.1% was obtained on the NIST 2008 SRE corpus.
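The complete linkage step can be sketched as follows. This is a minimal illustration, assuming the pairwise scores have already been converted into a symmetric distance matrix and that merging stops at a fixed threshold; both the conversion and the stopping rule are assumptions, as the abstract does not specify them.

```python
def complete_linkage(dist, threshold):
    """Agglomerative clustering with complete linkage.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Merging stops once the closest pair of clusters is farther apart
    than threshold. Returns a list of sorted index lists.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        # Complete linkage: the largest pairwise distance between members.
        return max(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = linkage(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        d, p, q = best
        if d > threshold:
            break
        clusters[p] = sorted(clusters[p] + clusters[q])
        del clusters[q]
    return clusters


# Hypothetical distances between four inter-session clusters.
example = [
    [0.0, 0.1, 0.9, 0.9],
    [0.1, 0.0, 0.9, 0.9],
    [0.9, 0.9, 0.0, 0.2],
    [0.9, 0.9, 0.2, 0.0],
]
print(complete_linkage(example, threshold=0.5))
```

Complete linkage merges two clusters only when every pair of members is close, which tends to keep distinct speakers apart at the cost of sometimes splitting one speaker across clusters.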

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. We have previously shown (Int. Conf. on Acoustics, Speech and Signal Proc., vol. 6, pp. 3693-3696, May 1998) that features extracted from a speaker's moving lips hold speaker dependencies which are complementary with speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular, due to their ability to adequately represent a speaker based on sparse training data, as well as their improved capture of differences in speaker characteristics. This paper hence proposes that it would be beneficial to capitalize on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show an improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.
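For orientation, the CLR between two clusters is commonly written in the form below. The notation is an assumption and is not taken from the abstract: $x_i$ denotes the data of cluster $i$ with $N_i$ frames, $\lambda_i$ its adapted speaker model, and $\lambda_{\mathrm{UBM}}$ the background model; the paper's exact definition may differ.

```latex
\mathrm{CLR}(i, j) =
  \frac{1}{N_i} \log \frac{p(x_i \mid \lambda_j)}{p(x_i \mid \lambda_{\mathrm{UBM}})}
  + \frac{1}{N_j} \log \frac{p(x_j \mid \lambda_i)}{p(x_j \mid \lambda_{\mathrm{UBM}})}
```

A larger value suggests that each cluster's data is explained well by the other cluster's model relative to the background model, which makes the pair a candidate for merging.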

Relevance:

100.00% 100.00%

Publisher:

Abstract:

At present, many approaches have been proposed for deformable face alignment with varying degrees of success. However, the common drawback to nearly all these approaches is inaccurate landmark registration. The registration errors which occur are predominantly heterogeneous (i.e. low error for some frames in a sequence and higher error for others). In this paper we propose an approach for simultaneously aligning an ensemble of deformable face images stemming from the same subject given noisy heterogeneous landmark estimates. We propose that these initial noisy landmark estimates can be used as an “anchor” in conjunction with known state-of-the-art objectives for unsupervised image ensemble alignment. Impressive alignment performance is obtained using well known deformable face fitting algorithms as “anchors”.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Image representations derived from simplified models of the primary visual cortex (V1), such as HOG and SIFT, elicit good performance in a myriad of visual classification tasks including object recognition/detection, pedestrian detection and facial expression classification. A central question in the vision, learning and neuroscience communities regards why these architectures perform so well. In this paper, we offer a unique perspective to this question by subsuming the role of V1-inspired features directly within a linear support vector machine (SVM). We demonstrate that a specific class of such features in conjunction with a linear SVM can be reinterpreted as inducing a weighted margin on the Kronecker basis expansion of an image. This new viewpoint on the role of V1-inspired features allows us to answer fundamental questions on the uniqueness and redundancies of these features, and offer substantial improvements in terms of computational and storage efficiency.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The selection of optimal camera configurations (camera locations, orientations, etc.) for multi-camera networks remains an unsolved problem. Previous approaches largely focus on proposing various objective functions to achieve different tasks. Most of them, however, do not generalize well to large scale networks. To tackle this, we introduce a statistical formulation of the optimal selection of camera configurations and propose a Trans-Dimensional Simulated Annealing (TDSA) algorithm to effectively solve the problem. We compare our approach with a state-of-the-art method based on Binary Integer Programming (BIP) and show that our approach offers similar performance on small scale problems. However, we also demonstrate the capability of our approach in dealing with large scale problems and show that our approach produces better results than two alternative heuristics designed to deal with the scalability issue of BIP.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper considers the problem of reconstructing the motion of a 3D articulated tree from 2D point correspondences subject to some temporal prior. Hitherto, smooth motion has been encouraged using a trajectory basis, yielding a hard combinatorial problem with time complexity growing exponentially in the number of frames. Branch and bound strategies have previously attempted to curb this complexity whilst maintaining global optimality. However, they provide no guarantee of being more efficient than exhaustive search. Inspired by recent work which reconstructs general trajectories using compact high-pass filters, we develop a dynamic programming approach which scales linearly in the number of frames, leveraging the intrinsically local nature of filter interactions. Extension to affine projection enables reconstruction without estimating cameras.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper analyses the probabilistic linear discriminant analysis (PLDA) speaker verification approach with limited development data. It investigates the use of the median as the central tendency of a speaker’s i-vector representation, and the effectiveness of weighted discriminative techniques on the performance of state-of-the-art length-normalised Gaussian PLDA (GPLDA) speaker verification systems. The analysis shows that the median (using a median Fisher discriminator (MFD)) provides a better representation of a speaker when the number of representative i-vectors available during development is reduced, and that the pair-wise weighting approach in weighted LDA and weighted MFD provides further improvement in limited development conditions. Best performance is obtained using a weighted MFD approach, which shows over 10% improvement in EER over the baseline GPLDA system on mismatched and interview-interview conditions.
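The intuition behind preferring the median when few i-vectors are available can be sketched with a toy example. The coordinate-wise median used here and the two-dimensional vectors are assumptions made for illustration only; the abstract does not say how the median is computed inside the MFD.

```python
from statistics import mean, median


def central_vector(vectors, how="median"):
    """Coordinate-wise mean or median of equal-length vectors."""
    reduce = median if how == "median" else mean
    return [reduce(column) for column in zip(*vectors)]


# Hypothetical i-vectors for one speaker; the last one is an outlier session.
speaker_vectors = [[1.0, 2.0], [1.2, 2.2], [9.0, -4.0]]

print(central_vector(speaker_vectors, how="median"))
print(central_vector(speaker_vectors, how="mean"))
```

With only three vectors, a single outlier pulls the mean far from the two consistent sessions, while the median stays with them.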

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does this on a single mid-range machine using efficient algorithms and compressed document representations. It is applied to two web-scale crawls covering tens of terabytes. ClueWeb09 and ClueWeb12 contain 500 and 733 million web pages and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine grained clustering has not been previously demonstrated. Previous approaches clustered a sample that limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection in clustering and produces several orders of magnitude more clusters than the existing algorithms. Fine grained clustering is necessary for meaningful clustering in massive collections where the number of distinct topics grows linearly with collection size. These fine-grained clusters show an improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing the quality of clusters where categorical labeling is unavailable and unfeasible.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The phase separation in fluids close to a critical point can be observed in the form of either an interconnected pattern (critical case) or a disconnected pattern (off-critical case). These two regimes have been investigated in different ways. First, a sharp change in pattern is shown to occur very close to the critical point when the composition is varied. No crossover has been observed between the $t^{1}$ behaviour (interconnected) and a $t^{1/3}$ behaviour (disconnected), where $t$ is time. This latter growth law, which occurs in the case of compact droplets, will be discussed. Second, it has been observed that a growing interconnected pattern leaves a signature in the form of small droplets. The origin of such a distribution will be discussed in terms of coalescence of domains. No distribution of this kind is observed in the off-critical case.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Localization of underwater acoustic sources is a problem of great interest in the area of ocean acoustics. There exist several algorithms for source localization based on array signal processing. It is of interest to know the theoretical performance limits of these estimators. In this paper we develop expressions for the Cramer-Rao bound (CRB) on the variance of direction-of-arrival (DOA) and range-depth estimators of underwater acoustic sources in a shallow range-independent ocean for the case of generalized Gaussian noise. We then study the performance of some of the popular source localization techniques, through simulations, for DOA/range-depth estimation of underwater acoustic sources in shallow ocean by comparing the variance of the estimators with the corresponding CRBs.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we consider the problem of designing a minimum mean squared error (MMSE) filterbank precoder and equalizer for multiple input multiple output (MIMO) frequency selective channels. We derive the conditions to be satisfied by the optimal precoder-equalizer pair, and provide an iterative algorithm for solving them. The optimal design is very general, in that it is not constrained by channel dimensions, channel order, channel rank, or the input constellation. We also discuss some pertinent differences between the filterbank approach and the space-time approach to the design of optimal precoder and equalizer. Simulation results demonstrate that the proposed design performs better than the space-time systems while supporting a higher data rate.