Digital Library

250 results for Funes, Dean

Investigating in-domain data requirements for PLDA training

Relevance: 10.00%

Abstract:

This paper analyzes the limitations upon the amount of in-domain (NIST SREs) data required for training a probabilistic linear discriminant analysis (PLDA) speaker verification system based on out-domain (Switchboard) total variability subspaces. By limiting the number of speakers, the number of sessions per speaker and the length of active speech per session available in the target domain for PLDA training, we investigated the relative effect of these three parameters on PLDA speaker verification performance in the NIST 2008 and NIST 2010 speaker recognition evaluation datasets. Experimental results indicate that while these parameters depend highly on each other, to beat out-domain PLDA training, more than 10 seconds of active speech should be available for at least 4 sessions/speaker for a minimum of 800 speakers. If further data is available, considerable improvement can be made over solely out-domain PLDA training.

Smart Cities, Social Capital, and Citizens at Play: A Critique and a Way Forward

Relevance: 10.00%

Abstract:

Digital transformations are not contained within the digital domain but are increasingly spilling over into the physical world. In this chapter, we analyse some of the transformations under way in cities today towards becoming smart cities. We offer a critique of smart cities and a way forward, divided into three parts: First, we explore the concept of Smart Citizens in terms of both localities, the move towards a hyperlocal network and also the citizen’s role in the creation and use of data. We use the ‘Smart London’ plan drawn up by the Mayor of London, as a way to illustrate our discussion. Second, we turn to the civic innovations enabled by digital transformations and their potential impact on citizens and citizenship. Specifically, we are interested in the notion of social capital as an alternative form of in-kind currency and its function as an indicator of value, in order to ask, can digital transformations give rise to ‘civic capital,’ and how can such a concept help, for instance, a local government invite more representative residents and community champions to participate in community engagement for better urban planning. Third, we introduce a hybrid, location-based game under development by design agency Preliminal Games in London, UK. This illustrative case critiques and highlights the current challenges to establishing a new economic model that bridges the digital / physical divide. The game provides a vehicle for us to explore how established principles and strategies in game design such as immersive storytelling and goal setting, can be employed to encourage players to think of the interconnections of their hybrid digital / physical environments in new ways.

Improving PLDA speaker verification using WMFD and linear-weighted approaches in limited microphone data conditions

Relevance: 10.00%

Abstract:

This paper proposes the addition of a weighted median Fisher discriminator (WMFD) projection prior to length-normalised Gaussian probabilistic linear discriminant analysis (GPLDA) modelling in order to compensate for additional session variation. In limited microphone data conditions, a linear-weighted approach is introduced to increase the influence of the microphone speech dataset. The linear-weighted WMFD-projected GPLDA system shows improvements in EER and DCF values over the pooled LDA- and WMFD-projected GPLDA systems in the interview-interview condition, as WMFD projection extracts more speaker-discriminant information with a limited number of sessions per speaker, and the linear-weighted GPLDA approach estimates reliable model parameters with limited microphone data.

Dataset-invariant covariance normalization for out-domain PLDA speaker verification

Relevance: 10.00%

Abstract:

In this paper we introduce a novel domain-invariant covariance normalization (DICN) technique to relocate both in-domain and out-domain i-vectors into a third dataset-invariant space, providing an improvement for out-domain PLDA speaker verification with a very small number of unlabelled in-domain adaptation i-vectors. By capturing the dataset variance from a global mean using both development out-domain i-vectors and limited unlabelled in-domain i-vectors, we could obtain domain-invariant representations of PLDA training data. The DICN-compensated out-domain PLDA system is shown to perform as well as in-domain PLDA training with as few as 500 unlabelled in-domain i-vectors for NIST-2010 SRE and 2000 unlabelled in-domain i-vectors for NIST-2008 SRE, and considerable relative improvement over both out-domain and in-domain PLDA development if more are available.

A cluster-voting approach for speaker diarization and linking of Australian broadcast news recordings

Relevance: 10.00%

Abstract:

We present a clustering-only approach to the problem of speaker diarization to eliminate the need for the commonly employed and computationally expensive Viterbi segmentation and realignment stage. We use multiple linear segmentations of a recording and carry out complete-linkage clustering within each segmentation scenario to obtain a set of clustering decisions for each case. We then collect all clustering decisions, across all cases, to compute a pairwise vote between the segments and conduct complete-linkage clustering to cluster them at a resolution equal to the minimum segment length used in the linear segmentations. We use our proposed cluster-voting approach to carry out speaker diarization and linking across the SAIVT-BNEWS corpus of Australian broadcast news data. We compare our technique to an equivalent baseline system with Viterbi realignment and show that our approach can outperform the baseline technique with respect to the diarization error rate (DER) and attribution error rate (AER).
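The voting procedure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar per-unit features, the absolute-difference distance, the thresholds, and the names `complete_linkage` and `cluster_vote` are all assumptions made for illustration, whereas a real diarization system would compare speaker models over audio segments.

```python
from itertools import combinations


def complete_linkage(n, dist, threshold):
    """Agglomerative complete-linkage clustering of items 0..n-1.

    dist(i, j) gives the dissimilarity between two items. Merging stops
    once the closest pair of clusters is farther apart than threshold.
    Returns one cluster label per item.
    """
    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: cluster distance is the largest pairwise distance.
            d = max(dist(i, j) for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]  # a < b, so index a is unaffected
    labels = [0] * n
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def cluster_vote(features, seg_lengths, seg_threshold, vote_threshold):
    """Toy cluster-voting over several linear segmentations.

    features: one scalar per minimum-length unit (an assumption; stands in
    for whatever speaker representation a real system would use).
    seg_lengths: segment lengths, in units, of each linear segmentation.
    """
    n = len(features)
    votes = [[0] * n for _ in range(n)]
    for length in seg_lengths:
        # Linear segmentation: cut the recording into equal-length pieces.
        bounds = [(s, min(s + length, n)) for s in range(0, n, length)]
        means = [sum(features[s:e]) / (e - s) for s, e in bounds]
        seg_labels = complete_linkage(
            len(means), lambda i, j: abs(means[i] - means[j]), seg_threshold
        )
        # Each minimum-length unit inherits the label of its segment.
        unit_labels = [0] * n
        for (s, e), label in zip(bounds, seg_labels):
            for u in range(s, e):
                unit_labels[u] = label
        # Pairwise vote: did this segmentation put units i and j together?
        for i in range(n):
            for j in range(n):
                if unit_labels[i] == unit_labels[j]:
                    votes[i][j] += 1
    total = len(seg_lengths)
    # Final clustering at the minimum resolution, using 1 - vote share.
    return complete_linkage(
        n, lambda i, j: 1 - votes[i][j] / total, vote_threshold
    )


if __name__ == "__main__":
    feats = [0.0, 0.1, 0.0, 0.1, 5.0, 5.1, 5.0, 5.1]
    print(cluster_vote(feats, [1, 2, 4], seg_threshold=1.0, vote_threshold=0.5))
```

In this toy run, the first four units and the last four units end up in two separate clusters, because every segmentation agrees on that split.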

Complete-linkage clustering for voice activity detection in audio and visual speech

Relevance: 10.00%

Abstract:

We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models: speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two respective likelihood scores for each segment. These scores are used to compute a dissimilarity measure between pairs of segments and to carry out complete-linkage clustering of the segments into speech and non-speech clusters. We compare the accuracy of our method against state-of-the-art and standardised VAD techniques to demonstrate an absolute improvement of 15% in half-total error rate (HTER) over the best performing baseline system and across the QUT-NOISE-TIMIT database. We then apply our approach to the Audio-Visual Database of American English (AVDBAE) to demonstrate the performance of our algorithm in using visual, audio-visual or a proposed fusion of these features.
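The pipeline in this abstract (score each segment against two models, compare segments by their scores, then cluster into two groups) can be sketched as follows. This is a simplified illustration rather than the published method: each "GMM" is reduced to a single one-dimensional Gaussian, the dissimilarity is assumed to be the Euclidean distance between score pairs, and all function names and model parameters are hypothetical.

```python
import math
from itertools import combinations


def gaussian_loglik(x, mean, var):
    """Log-likelihood of scalar x under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def score_segment(frames, model):
    """Average per-frame log-likelihood of a segment under (mean, var)."""
    mean, var = model
    return sum(gaussian_loglik(x, mean, var) for x in frames) / len(frames)


def complete_linkage_two(points):
    """Merge points with complete linkage until two clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 2:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: use the largest pairwise distance.
            d = max(
                math.dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]  # a < b, so index a is unaffected
    return clusters


def detect_speech(segments, speech_model, nonspeech_model):
    """Return one boolean per segment: True if clustered as speech."""
    # Two likelihood scores per segment, one against each model.
    scores = [
        (score_segment(s, speech_model), score_segment(s, nonspeech_model))
        for s in segments
    ]
    clusters = complete_linkage_two(scores)

    def mean_llr(cluster):
        # Average log-likelihood ratio (speech minus non-speech).
        return sum(scores[i][0] - scores[i][1] for i in cluster) / len(cluster)

    # Assumption: the cluster with the higher ratio is the speech cluster.
    speech = set(max(clusters, key=mean_llr))
    return [i in speech for i in range(len(segments))]


if __name__ == "__main__":
    segs = [[0.1, -0.2, 0.0], [4.8, 5.3, 5.1], [0.3, 0.1, -0.1], [5.2, 4.9, 5.0]]
    print(detect_speech(segs, speech_model=(5.0, 1.0), nonspeech_model=(0.0, 1.0)))
```

Clustering on the score pairs, rather than thresholding each segment's likelihood ratio independently, means the decision boundary adapts to the recording at hand, which is one plausible reading of why the abstract reports robustness in high noise.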

The QUT-NOISE-SRE protocol for the evaluation of noisy speaker recognition

Relevance: 10.00%

Abstract:

The QUT-NOISE-SRE protocol is designed to mix the large QUT-NOISE database, consisting of over 10 hours of background noise, collected across 10 unique locations covering 5 common noise scenarios, with commonly used speaker recognition datasets such as Switchboard, Mixer and the speaker recognition evaluation (SRE) datasets provided by NIST. By allowing common, clean, speech corpora to be mixed with a wide variety of noise conditions, environmental reverberant responses, and signal-to-noise ratios, this protocol provides a solid basis for the development, evaluation and benchmarking of robust speaker recognition algorithms, and is freely available to download alongside the QUT-NOISE database. In this work, we use the QUT-NOISE-SRE protocol to evaluate a state-of-the-art PLDA i-vector speaker recognition system, demonstrating the importance of designing voice-activity-detection front-ends specifically for speaker recognition, rather than aiming for perfect coherence with the true speech/non-speech boundaries.

Pull the plug or take the plunge: Multiple opportunities and the speed of venturing decisions in the Australian mining industry

Relevance: 10.00%

Abstract:

Effectively capturing opportunities requires rapid decision-making. We investigate the speed of opportunity evaluation decisions by focusing on firms' venture termination and venture advancement decisions. Experience, standard operating procedures, and confidence allow firms to make opportunity evaluation decisions faster; we propose that a firm's attentional orientation, as reflected in its project portfolio, limits the number of domains in which these speed-enhancing mechanisms can be developed. Hence firms' decision speed is likely to vary between different types of decisions. Using unique data on 3,269 mineral exploration ventures in the Australian mining industry, we find that firms with a higher degree of attention toward earlier-stage exploration activities are quicker to abandon potential opportunities in early development but slower to do so later, and that such firms are also slower to advance on potential opportunities at all stages compared to firms that focus their attention differently. Market dynamism moderates these relationships, but only with regard to initial evaluation decisions. Our study extends research on decision speed by showing that firms are not necessarily fast or slow regarding all the decisions they make, and by offering an opportunity evaluation framework that recognizes that decision makers can, in fact often do, pursue multiple potential opportunities simultaneously.

Acoustic adaptation in cross database audio visual SHMM training for phonetic spoken term detection

Relevance: 10.00%

Abstract:

Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross database training of synchronous hidden Markov models (SHMMs) to make use of external large and publicly available audio databases in addition to the relatively small given audio visual database. In this work, the cross database training approach is improved by performing an additional audio adaptation step, which enables audio visual SHMMs to benefit from audio observations of the external audio models before adding visual modality to them. The proposed approach outperforms the baseline cross database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.

Incorporating visual information for spoken term detection

Relevance: 10.00%

Abstract:

Spoken term detection (STD) is the task of looking up a spoken term in a large volume of speech segments. In order to provide fast search, speech segments are first indexed into an intermediate representation using speech recognition engines which provide multiple hypotheses for each speech segment. Approximate matching techniques are usually applied at the search stage to compensate for the poor performance of automatic speech recognition engines during indexing. Recently, using visual information in addition to audio information has been shown to improve phone recognition performance, particularly in noisy environments. In this paper, we make use of visual information in the form of lip movements of the speaker in the indexing stage and investigate its effect on STD performance. In particular, we investigate whether gains in phone recognition accuracy carry through the approximate matching stage to provide similar gains in the final audio-visual STD system over a traditional audio-only approach. We also investigate the effect of using visual information on STD performance in different noise environments.

Cross database training of audio-visual hidden Markov models for phone recognition

Relevance: 10.00%

Abstract:

Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and also internal audio models by 5.5% and 46% relative respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by the environmental noise.

The impact of snares on the continuity of adolescent-onset antisocial behaviour: A test of Moffitt's developmental taxonomy

Relevance: 10.00%

Abstract:

Moffitt’s dual typology of ‘life-course persistent’ and ‘adolescence limited’ offending has received extensive empirical attention, but the extent to which the antisocial behaviour of adolescence limited offenders is constrained to adolescence is relatively under-examined. Using data from the Australian Mater University Study of Pregnancy and its Outcomes, we explore Moffitt’s concept of snares, or those factors that may lead to an adolescent persisting in antisocial behaviour such as drug addiction, educational failure, and contact with the justice system. The Mater University Study of Pregnancy and its Outcomes is a longitudinal study of mother–child dyads from the pre-natal stage to 21 years of age. Findings show that one-third of individuals identified as having an adolescent onset of antisocial behaviour persisted with this antisocial behaviour as young adults. This continuity can, in part, be explained by snares and the research suggests that reducing exposure to snares may lead to less antisocial behaviour in adulthood.

Party On! A call for entrepreneurship research that is more interactive, activity based, cognitively hot, compassionate, and prosocial

Relevance: 10.00%

Abstract:

It is the Journal of Business Venturing's (JBV) 30th birthday. Although the community of entrepreneurship scholars deserves to celebrate JBV's achievements over the last 30 years (and congratulate the journal's parents, Ian Macmillan and S. Venkataraman), my focus is more on the future of entrepreneurship (and by extension JBV). A focus on entrepreneurship is both timeless and timely. On the one hand, entrepreneurship is timeless given the long-recognized importance of entrepreneurs to economies and societies (e.g., Jean Baptiste who supposedly coined the term in about 1800). On the other hand, a discussion of entrepreneurship is timely because now that the field of entrepreneurship has achieved legitimacy, it faces both opportunities and threats. It is thus timely to acknowledge the threats and think about opportunities to advance the field. A discussion of entrepreneurship is also timely because society faces a number of grand challenges (including the durability of poverty, environmental degradation [Dorado and Ventresca, 2013]), challenges well suited to entrepreneurial responses...

Solution chemistry impacts on the seawater neutralisation process: Benefits of nanofiltered seawater and reverse osmosis brine

Relevance: 10.00%

Abstract:

It is well known that the neutralisation of Bayer liquor with seawater causes the precipitation of stable alkaline products and a reduction in pH and dissolved metal concentrations in the effluent. However, there is limited information available on solution chemistry effects on the stability and reaction kinetics of these precipitates. This investigation shows the influence of reactive species (magnesium and calcium) in seawater on precipitate stabilities and volumetric efficiencies during the neutralisation of bauxite refinery residues. Correlations between synthetic seawater solutions and real samples of seawater (filtered seawater, nanofiltered seawater and reverse osmosis brine) have been made. These investigations have been used to confirm that alternative seawater sources can be used to increase the productivity potential of the neutralisation process with minimal implications for the composition and stability of precipitates formed. The volume efficiency of the neutralisation process using synthetic analogues has been shown to be almost directly proportional to the concentration of magnesium. This was further confirmed in the nanofiltered seawater and reverse osmosis brine, which showed increases in the efficiency of neutralisation by factors of 3 and 2 compared to seawater, corresponding to roughly the same increase in the concentration of magnesium in these alternative seawater sources. The chemical stability of the precipitates, volumetric efficiency, and discharge water quality have been assessed using numerous techniques, including pH, conductivity, inductively coupled plasma optical emission spectroscopy, infrared spectroscopy, thermogravimetric analysis coupled to mass spectrometry and X-ray diffraction. Correlations between synthetic solution compositions and alternative seawater sources have been used to determine if alternative seawater sources are potential substitutes for seawater based on improvements in productivity, implementation costs, savings to operations and environmental benefits.

Soluble mediators in platelet concentrates modulate dendritic cell inflammatory responses in an experimental model of transfusion

Relevance: 10.00%

Abstract:

The transfusion of platelet concentrates (PCs) is widely used to treat thrombocytopenia and severe trauma. Ex vivo storage of PCs is associated with a storage lesion characterized by partial platelet activation and the release of soluble mediators, such as soluble CD40 ligand (sCD40L), RANTES, and interleukin (IL)-8. An in vitro whole blood culture transfusion model was employed to assess whether mediators present in PC supernatants (PC-SNs) modulated dendritic cell (DC)-specific inflammatory responses (intracellular staining) and the overall inflammatory response (cytometric bead array). Lipopolysaccharide (LPS) was included in parallel cultures to model the impact of PC-SNs on cell responses following toll-like receptor-mediated pathogen recognition. The impact of both the PC dose (10%, 25%) and ex vivo storage period was investigated [day 2 (D2), day 5 (D5), day 7 (D7)]. PC-SNs alone had minimal impact on DC-specific inflammatory responses and the overall inflammatory response. However, in the presence of LPS, exposure to PC-SNs resulted in a significant dose-associated suppression of the production of DC IL-12, IL-6, IL-1α, tumor necrosis factor-α (TNF-α), and macrophage inflammatory protein (MIP)-1β and storage-associated suppression of the production of DC IL-10, TNF-α, and IL-8. For the overall inflammatory response, IL-6, TNF-α, MIP-1α, MIP-1β, and inflammatory protein (IP)-10 were significantly suppressed and IL-8, IL-10, and IL-1β significantly increased following exposure to PC-SNs in the presence of LPS. These data suggest that soluble mediators present in PCs significantly suppress DC function and modulate the overall inflammatory response, particularly in the presence of an infectious stimulus. Given the central role of DCs in the initiation and regulation of the immune response, these results suggest that modulation of the DC inflammatory profile is a probable mechanism contributing to transfusion-related complications.
