Digital Library

962 results for Battistero di San Giovanni (Florence, Italy)

Cross likelihood ratio based speaker clustering using eigenvoice models

Relevance:

100.00%

Abstract:

This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular, owing to their ability to adequately represent a speaker based on sparse training data and to better capture differences in speaker characteristics. This paper hence proposes that it would be beneficial to capitalize on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show an improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.
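As a rough illustration of the CLR criterion named in this abstract, the sketch below uses the form in which the CLR is commonly given in the diarization literature: each cluster's data is scored against the other cluster's model, normalised by a background model and by the number of frames. The function name, the argument names, and the use of a universal background model (UBM) for normalisation are assumptions made for this example; the abstract does not state the exact formulation or how the eigenvoice models enter it.

```python
def cross_likelihood_ratio(ll_i_given_j, ll_i_given_ubm, n_i,
                           ll_j_given_i, ll_j_given_ubm, n_j):
    """Symmetric CLR between clusters i and j from total log-likelihoods.

    ll_i_given_j   : log-likelihood of cluster i's frames under model j
    ll_i_given_ubm : log-likelihood of cluster i's frames under the background model
    n_i            : number of frames in cluster i (likewise for j)
    All names are hypothetical; the models themselves are not shown.
    """
    term_i = (ll_i_given_j - ll_i_given_ubm) / n_i
    term_j = (ll_j_given_i - ll_j_given_ubm) / n_j
    # A larger value suggests the two clusters are more likely the same speaker.
    return term_i + term_j
```

In a clustering loop, the pair of clusters with the highest CLR would typically be merged first, stopping once no pair exceeds a chosen threshold.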

‘Show Me Your Wiki and I’ll Show you Mine’: Using Online Interactive Media to Improve Academic Writing and Research in a Public Health Under-Graduate Cohort

Relevance:

100.00%

Abstract:

The number of Internet users in Australia has been steadily increasing, with over 10.9 million people currently subscribed to an internet provider (ABS, 2011). Over the past year, the most avid users of the Internet were 15 to 24 year olds, with approximately 95% accessing the internet on a regular basis (ABS, Social Trends, 2011). While the internet has been described as fundamental to higher education students, social and leisure internet tools are also increasingly being used by these students to generate and maintain their social and professional networks and interactions (Duffy & Bruns 2006). Rapid technological advancements have enabled greater and faster access to information for learning and education (Hemmi et al., 2009; Glassman and Kang, 2011). As such, we sought to integrate interactive, online social media into the assessment profile of a Public Health undergraduate cohort at the Queensland University of Technology (QUT). The aim of this exercise was to engage students to both develop and showcase their research on a range of complex, contemporary health issues within the online forum of Wikispaces (http://www.wikispaces.com/) for review and critique by their peers. We applied Bandura’s Social Learning Theory (SLT) to analyse the interactive processes from which students developed deeper and more sustained learning, and via which their overall academic writing standards were raised. This paper outlines the assessment task, and the students’ feedback on their learning outcomes in relation to the Attentional, Retentional, Motor Reproduction, and Motivational Processes outlined by Bandura in SLT. We conceptualise the findings in a theoretical model, and discuss the implications for this approach within the broader tertiary environment.

Anchored deformable face ensemble alignment

Relevance:

100.00%

Abstract:

At present, many approaches have been proposed for deformable face alignment with varying degrees of success. However, the common drawback of nearly all these approaches is inaccurate landmark registration. The registration errors which occur are predominantly heterogeneous (i.e. low error for some frames in a sequence and higher error for others). In this paper we propose an approach for simultaneously aligning an ensemble of deformable face images stemming from the same subject given noisy heterogeneous landmark estimates. We propose that these initial noisy landmark estimates can be used as an “anchor” in conjunction with known state-of-the-art objectives for unsupervised image ensemble alignment. Impressive alignment performance is obtained using well known deformable face fitting algorithms as “anchors”.

V1-inspired features induce a weighted margin in SVMs

Relevance:

100.00%

Abstract:

Image representations derived from simplified models of the primary visual cortex (V1), such as HOG and SIFT, elicit good performance in a myriad of visual classification tasks including object recognition/detection, pedestrian detection and facial expression classification. A central question in the vision, learning and neuroscience communities is why these architectures perform so well. In this paper, we offer a unique perspective on this question by subsuming the role of V1-inspired features directly within a linear support vector machine (SVM). We demonstrate that a specific class of such features in conjunction with a linear SVM can be reinterpreted as inducing a weighted margin on the Kronecker basis expansion of an image. This new viewpoint on the role of V1-inspired features allows us to answer fundamental questions on the uniqueness and redundancies of these features, and offer substantial improvements in terms of computational and storage efficiency.

On the statistical determination of optimal camera configurations in large scale surveillance networks

Relevance:

100.00%

Abstract:

The selection of optimal camera configurations (camera locations, orientations etc.) for multi-camera networks remains an unsolved problem. Previous approaches largely focus on proposing various objective functions to achieve different tasks. Most of them, however, do not generalize well to large scale networks. To tackle this, we introduce a statistical formulation of the optimal selection of camera configurations and propose a Trans-Dimensional Simulated Annealing (TDSA) algorithm to solve the problem effectively. We compare our approach with a state-of-the-art method based on Binary Integer Programming (BIP) and show that our approach offers similar performance on small scale problems. However, we also demonstrate the capability of our approach in dealing with large scale problems and show that our approach produces better results than two alternative heuristics designed to deal with the scalability issue of BIP.

Efficient articulated trajectory reconstruction using dynamic programming and filters

Relevance:

100.00%

Abstract:

This paper considers the problem of reconstructing the motion of a 3D articulated tree from 2D point correspondences subject to some temporal prior. Hitherto, smooth motion has been encouraged using a trajectory basis, yielding a hard combinatorial problem with time complexity growing exponentially in the number of frames. Branch and bound strategies have previously attempted to curb this complexity whilst maintaining global optimality. However, they provide no guarantee of being more efficient than exhaustive search. Inspired by recent work which reconstructs general trajectories using compact high-pass filters, we develop a dynamic programming approach which scales linearly in the number of frames, leveraging the intrinsically local nature of filter interactions. Extension to affine projection enables reconstruction without estimating cameras.
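The linear scaling described in this abstract can be illustrated with a generic chain dynamic program. The sketch below is not the paper's algorithm: it assumes each frame offers a small set of candidate values (here a value and its negation, a hypothetical stand-in for a per-frame ambiguity) and that smoothness is scored by a two-tap high-pass filter, the squared first difference, so each cost term couples only adjacent frames. The function name and the candidate representation are invented for the example.

```python
def smooth_chain(candidates):
    """Pick one value per frame so the sum of squared first differences is minimal.

    candidates[f] lists the possible values at frame f. Because the cost only
    couples neighbouring frames, one forward pass over the frames suffices:
    the work grows linearly with the number of frames.
    """
    # cost[k]: best total cost of any path ending at candidates[f][k]
    cost = [0.0] * len(candidates[0])
    back = []  # back[f - 1][k]: best predecessor index for state k at frame f
    for f in range(1, len(candidates)):
        new_cost, pointers = [], []
        for value in candidates[f]:
            options = [cost[j] + (value - prev) ** 2
                       for j, prev in enumerate(candidates[f - 1])]
            best = min(range(len(options)), key=options.__getitem__)
            new_cost.append(options[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)

    # Trace the optimal path backwards from the cheapest final state.
    k = min(range(len(cost)), key=cost.__getitem__)
    total = cost[k]
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return total, [candidates[f][k] for f, k in enumerate(path)]
```

An exhaustive search over the same problem would examine every combination of per-frame choices, which grows exponentially with the number of frames; the dynamic program reaches the same optimum by reusing the best partial cost at each frame.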

Improving PLDA speaker verification with limited development data

Relevance:

100.00%

Abstract:

This paper analyses the probabilistic linear discriminant analysis (PLDA) speaker verification approach with limited development data. It investigates the use of the median as the central tendency of a speaker’s i-vector representation, and the effectiveness of weighted discriminative techniques on the performance of state-of-the-art length-normalised Gaussian PLDA (GPLDA) speaker verification systems. The analysis shows that the median (using a median Fisher discriminator (MFD)) provides a better representation of a speaker when the number of representative i-vectors available during development is reduced, and that the pair-wise weighting approach in weighted LDA and weighted MFD provides further improvement in limited development conditions. Best performance is obtained using a weighted MFD approach, which shows over 10% improvement in EER over the baseline GPLDA system on mismatched and interview-interview conditions.
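The intuition behind using the median as a speaker's central tendency when few i-vectors are available can be shown with a toy comparison. The sketch below is only an illustration of element-wise median versus mean on made-up two-dimensional vectors; it is not the median Fisher discriminator itself, and the function name and data are assumptions for the example.

```python
from statistics import mean, median


def central_vector(vectors, reducer):
    """Apply `reducer` (e.g. mean or median) to each dimension across vectors."""
    return [reducer(column) for column in zip(*vectors)]


# Hypothetical toy data: three consistent vectors and one outlier session.
toy_vectors = [
    [1.0, 2.0],
    [1.2, 1.8],
    [0.8, 2.2],
    [9.0, -7.0],  # outlier
]

mean_vector = central_vector(toy_vectors, mean)
median_vector = central_vector(toy_vectors, median)
```

With only four vectors, the single outlier pulls the mean far from the consistent group, while the element-wise median stays close to it.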

Investigating design for disassembly through creative practice

Relevance:

100.00%

Abstract:

The construction industry is responsible for a significant part of the solid waste that industrialised nations dispose of each year. One reason for this is the inability to easily separate materials and components from each other and from the building structure. If buildings were designed for disassembly in the first instance, then future material and component recovery would be easier. This paper presents a number of principles for design for disassembly that have been tested and developed through a process of research through creative practice. A number of architectural designs have been used to trial the principles in practice.

Parallel streaming signature EM-tree: A clustering algorithm for web scale applications

Relevance:

100.00%

Abstract:

The proliferation of the web presents an unsolved problem of automatically analyzing billions of pages of natural language. We introduce a scalable algorithm that clusters hundreds of millions of web pages into hundreds of thousands of clusters. It does this on a single mid-range machine using efficient algorithms and compressed document representations. It is applied to two web-scale crawls covering tens of terabytes. ClueWeb09 and ClueWeb12 contain 500 and 733 million web pages and were clustered into 500,000 to 700,000 clusters. To the best of our knowledge, such fine grained clustering has not been previously demonstrated. Previous approaches clustered a sample that limits the maximum number of discoverable clusters. The proposed EM-tree algorithm uses the entire collection in clustering and produces several orders of magnitude more clusters than the existing algorithms. Fine grained clustering is necessary for meaningful clustering in massive collections where the number of distinct topics grows linearly with collection size. These fine-grained clusters show an improved cluster quality when assessed with two novel evaluations using ad hoc search relevance judgments and spam classifications for external validation. These evaluations solve the problem of assessing the quality of clusters where categorical labeling is unavailable and unfeasible.

Phase separation in critical and off-critical fluids: specific experimental behaviours at criticality

Relevance:

100.00%

Abstract:

The phase separation in fluids close to a critical point can be observed in the form of either an interconnected pattern (critical case) or a disconnected pattern (off-critical case). These two regimes have been investigated in different ways. First, a sharp change in pattern is shown to occur very close to the critical point when the composition is varied. No crossover has been observed between the $t^{1}$ behaviour (interconnected) and a $t^{1/3}$ behaviour (disconnected), where $t$ is time. This latter growth law, which occurs in the case of compact droplets, will be discussed. Second, it has been observed that a growing interconnected pattern leaves a signature in the form of small droplets. The origin of such a distribution will be discussed in terms of coalescence of domains. No distribution of this kind is observed in the off-critical case.

Cramer-Rao bounds for source localization in shallow ocean with generalized Gaussian noise

Relevance:

100.00%

Abstract:

Localization of underwater acoustic sources is a problem of great interest in the area of ocean acoustics. There exist several algorithms for source localization based on array signal processing. It is of interest to know the theoretical performance limits of these estimators. In this paper we develop expressions for the Cramer-Rao bound (CRB) on the variance of direction-of-arrival (DOA) and range-depth estimators of underwater acoustic sources in a shallow range-independent ocean for the case of generalized Gaussian noise. We then study, through simulations, the performance of some of the popular source localization techniques for DOA/range-depth estimation of underwater acoustic sources in shallow ocean by comparing the variance of the estimators with the corresponding CRBs.

Design of MMSE filterbank precoder and equalizer for MIMO frequency selective channels

Relevance:

100.00%

Abstract:

In this paper, we consider the problem of designing a minimum mean squared error (MMSE) filterbank precoder and equalizer for multiple input multiple output (MIMO) frequency selective channels. We derive the conditions to be satisfied by the optimal precoder-equalizer pair, and provide an iterative algorithm for solving them. The optimal design is very general, in that it is not constrained by channel dimensions, channel order, channel rank, or the input constellation. We also discuss some pertinent differences between the filterbank approach and the space-time approach to the design of optimal precoder and equalizer. Simulation results demonstrate that the proposed design performs better than the space-time systems while supporting a higher data rate.

Novel auditory motivated subband temporal envelope based fundamental frequency estimation algorithm

Relevance:

100.00%

Abstract:

We address the problem of estimating the fundamental frequency of voiced speech. We present a novel solution motivated by the importance of amplitude modulation in sound processing and speech perception. The new algorithm is based on a cumulative spectrum computed from the temporal envelope of various subbands. We provide theoretical analysis to derive the new pitch estimator based on the temporal envelope of the bandpass speech signal. We report extensive experimental performance for synthetic as well as natural vowels for both real-world noisy and noise-free data. Experimental results show that the new technique performs accurate pitch estimation and is robust to noise. We also show that the technique is superior to the autocorrelation technique for pitch estimation.
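For context, the autocorrelation technique that this abstract uses as a point of comparison can be sketched in a few lines. This is a minimal textbook version of that baseline, not the subband temporal envelope estimator the paper proposes; the function names, the 50 to 400 Hz search range, and the synthetic test tone are assumptions chosen for the example.

```python
import math


def autocorr_pitch(signal, sample_rate, f_min=50.0, f_max=400.0):
    """Estimate pitch in Hz from the lag with the largest autocorrelation.

    Only lags corresponding to the range [f_min, f_max] are searched.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalised autocorrelation at this lag.
        value = sum(signal[n] * signal[n + lag]
                    for n in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


def sine(freq, sample_rate, n_samples):
    """Synthetic noise-free test tone (a stand-in for a voiced frame)."""
    return [math.sin(2.0 * math.pi * freq * n / sample_rate)
            for n in range(n_samples)]


estimate = autocorr_pitch(sine(200.0, 8000, 800), 8000)
```

On a clean periodic signal the strongest peak falls at one pitch period; with noise or strong harmonics the peak can shift to a multiple or fraction of the period, which is the kind of weakness a more robust estimator aims to avoid.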

A hybrid pre-whitening technique for detection of additive spread spectrum watermarks in audio signals

Relevance:

100.00%

Abstract:

Pre-whitening techniques are employed in blind correlation detection of additive spread spectrum watermarks in audio signals to reduce the host signal interference. A direct deterministic whitening (DDW) scheme is derived in this paper from the frequency domain analysis of the time domain correlation process. Our experimental studies reveal that Savitzky-Golay Whitening (SGW), which is otherwise inferior to the DDW technique, performs better when the audio signal is predominantly lowpass. The novelty of this paper lies in exploiting the complementary nature of the two whitening techniques to obtain a hybrid whitening (HbW) scheme. In the hybrid scheme the DDW and SGW techniques are selectively applied, based on short time spectral characteristics of the audio signal. The hybrid scheme extends the reliability of watermark detection to a wider range of audio signals.

Time-varying signal adaptive transform and IHT recovery of compressive sensed speech

Relevance:

100.00%
