253 resultados para Dean Adami
Resumo:
Experimental studies have found that when the state-of-the-art probabilistic linear discriminant analysis (PLDA) speaker verification systems are trained using out-domain data, it significantly affects speaker verification performance due to the mismatch between development data and evaluation data. To overcome this problem we propose a novel unsupervised inter dataset variability (IDV) compensation approach to compensate the dataset mismatch. IDV-compensated PLDA system achieves over 10% relative improvement in EER values over out-domain PLDA system by effectively compensating the mismatch between in-domain and out-domain data.
Resumo:
BACKGROUND Quantification of the disease burden caused by different risks informs prevention by providing an account of health loss different to that provided by a disease-by-disease analysis. No complete revision of global disease burden caused by risk factors has been done since a comparative risk assessment in 2000, and no previous analysis has assessed changes in burden attributable to risk factors over time. METHODS We estimated deaths and disability-adjusted life years (DALYs; sum of years lived with disability [YLD] and years of life lost [YLL]) attributable to the independent effects of 67 risk factors and clusters of risk factors for 21 regions in 1990 and 2010. We estimated exposure distributions for each year, region, sex, and age group, and relative risks per unit of exposure by systematically reviewing and synthesising published and unpublished data. We used these estimates, together with estimates of cause-specific deaths and DALYs from the Global Burden of Disease Study 2010, to calculate the burden attributable to each risk factor exposure compared with the theoretical-minimum-risk exposure. We incorporated uncertainty in disease burden, relative risks, and exposures into our estimates of attributable burden. FINDINGS In 2010, the three leading risk factors for global disease burden were high blood pressure (7·0% [95% uncertainty interval 6·2-7·7] of global DALYs), tobacco smoking including second-hand smoke (6·3% [5·5-7·0]), and alcohol use (5·5% [5·0-5·9]). In 1990, the leading risks were childhood underweight (7·9% [6·8-9·4]), household air pollution from solid fuels (HAP; 7·0% [5·6-8·3]), and tobacco smoking including second-hand smoke (6·1% [5·4-6·8]). Dietary risk factors and physical inactivity collectively accounted for 10·0% (95% UI 9·2-10·8) of global DALYs in 2010, with the most prominent dietary risks being diets low in fruits and those high in sodium. Several risks that primarily affect childhood communicable diseases, including unimproved water and sanitation and childhood micronutrient deficiencies, fell in rank between 1990 and 2010, with unimproved water and sanitation accounting for 0·9% (0·4-1·6) of global DALYs in 2010. However, in most of sub-Saharan Africa childhood underweight, HAP, and non-exclusive and discontinued breastfeeding were the leading risks in 2010, while HAP was the leading risk in south Asia. The leading risk factor in Eastern Europe, most of Latin America, and southern sub-Saharan Africa in 2010 was alcohol use; in most of Asia, North Africa and Middle East, and central Europe it was high blood pressure. Despite declines, tobacco smoking including second-hand smoke remained the leading risk in high-income north America and western Europe. High body-mass index has increased globally and it is the leading risk in Australasia and southern Latin America, and also ranks high in other high-income regions, North Africa and Middle East, and Oceania. INTERPRETATION Worldwide, the contribution of different risk factors to disease burden has changed substantially, with a shift away from risks for communicable diseases in children towards those for non-communicable diseases in adults. These changes are related to the ageing population, decreased mortality among children younger than 5 years, changes in cause-of-death composition, and changes in risk factor exposures. New evidence has led to changes in the magnitude of key risks including unimproved water and sanitation, vitamin A and zinc deficiencies, and ambient particulate matter pollution. The extent to which the epidemiological shift has occurred and what the leading risks currently are varies greatly across regions. In much of sub-Saharan Africa, the leading risks are still those associated with poverty and those that affect children.
Resumo:
Introduction Poor medication adherence is common in children and adolescents with chronic illness, but there is uncertainty about the best way to enhance medication adherence in this group. The authors conducted a systematic review of controlled trials examining interventions that aim to improve medication adherence. Method A comprehensive literature search was undertaken to locate controlled trials that described specific interventions aiming to improve adherence to long-term medication, where participants were aged 18 years and under, medication adherence was reported as an outcome measure, and which could be implemented by individual health practitioners. Studies were reviewed for quality and outcome. Results 17 studies met inclusion criteria: seven studies examined educational strategies, seven studies examined behavioural interventions and three studies examined educational intervention combined with other forms of psychological therapies. Only two of seven studies reported a clear benefit for education on medication adherence, whereas four of seven trials indicated a benefit of behavioural approaches on medication adherence. One trial reported that combining education with behavioural management may be more effective than education alone. Studies which combined education with other non-medication specific psychological interventions failed to demonstrate a beneficial effect on medication adherence. Only two studies examined adherence-promoting interventions in young people with established adherence problems. Conclusion These findings suggest that education interventions alone are insufficient to promote adherence in children and adolescents, and that incorporating a behavioural component to adherence interventions may increase potential efficacy. Future research should examine interventions in high-risk groups.
Resumo:
This paper analyzes the limitations upon the amount of in- domain (NIST SREs) data required for training a probabilistic linear discriminant analysis (PLDA) speaker verification system based on out-domain (Switchboard) total variability subspaces. By limiting the number of speakers, the number of sessions per speaker and the length of active speech per session available in the target domain for PLDA training, we investigated the relative effect of these three parameters on PLDA speaker verification performance in the NIST 2008 and NIST 2010 speaker recognition evaluation datasets. Experimental results indicate that while these parameters depend highly on each other, to beat out-domain PLDA training, more than 10 seconds of active speech should be available for at least 4 sessions/speaker for a minimum of 800 speakers. If further data is available, considerable improvement can be made over solely out-domain PLDA training.
Resumo:
Digital transformations are not contained within the digital domain but are increasingly spilling over into the physical world. In this chapter, we analyse some of the transformations undergoing in cities today towards becoming smart cities. We offer a critique of smart cities and a way forward, divided into three parts: First, we explore the concept of Smart Citizens in terms of both localities, the move towards a hyperlocal network and also the citizen’s role in the creation and use of data. We use the ‘Smart London’ plan drawn up by the Mayor of London, as a way to illustrate our discussion. Second, we turn to the civic innovations enabled by digital transformations and their potential impact on citizens and citizenship. Specifically, we are interested in the notion of social capital as an alternative form of in-kind currency and its function as an indicator of value, in order to ask, can digital transformations give rise to ‘civic capital,’ and how can such a concept help, for instance, a local government invite more representative residents and community champions to participate in community engagement for better urban planning. Third, we introduce a hybrid, location-based game under development by design agency Preliminal Games in London, UK. This illustrative case critiques and highlights the current challenges to establishing a new economic model that bridges the digital / physical divide. The game provides a vehicle for us to explore how established principles and strategies in game design such as immersive storytelling and goal setting, can be employed to encourage players to think of the interconnections of their hybrid digital / physical environments in new ways.
Resumo:
This paper proposes the addition of a weighted median Fisher discriminator (WMFD) projection prior to length-normalised Gaussian probabilistic linear discriminant analysis (GPLDA) modelling in order to compensate the additional session variation. In limited microphone data conditions, a linear-weighted approach is introduced to increase the influence of microphone speech dataset. The linear-weighted WMFD-projected GPLDA system shows improvements in EER and DCF values over the pooled LDA- and WMFD-projected GPLDA systems in inter-view-interview condition as WMFD projection extracts more speaker discriminant information with limited number of sessions/ speaker data, and linear-weighted GPLDA approach estimates reliable model parameters with limited microphone data.
Resumo:
In this paper we introduce a novel domain-invariant covariance normalization (DICN) technique to relocate both in-domain and out-domain i-vectors into a third dataset-invariant space, providing an improvement for out-domain PLDA speaker verification with a very small number of unlabelled in-domain adaptation i-vectors. By capturing the dataset variance from a global mean using both development out-domain i-vectors and limited unlabelled in-domain i-vectors, we could obtain domain- invariant representations of PLDA training data. The DICN- compensated out-domain PLDA system is shown to perform as well as in-domain PLDA training with as few as 500 unlabelled in-domain i-vectors for NIST-2010 SRE and 2000 unlabelled in-domain i-vectors for NIST-2008 SRE, and considerable relative improvement over both out-domain and in-domain PLDA development if more are available.
Resumo:
We present a clustering-only approach to the problem of speaker diarization to eliminate the need for the commonly employed and computationally expensive Viterbi segmentation and realignment stage. We use multiple linear segmentations of a recording and carry out complete-linkage clustering within each segmentation scenario to obtain a set of clustering decisions for each case. We then collect all clustering decisions, across all cases, to compute a pairwise vote between the segments and conduct complete-linkage clustering to cluster them at a resolution equal to the minimum segment length used in the linear segmentations. We use our proposed cluster-voting approach to carry out speaker diarization and linking across the SAIVT-BNEWS corpus of Australian broadcast news data. We compare our technique to an equivalent baseline system with Viterbi realignment and show that our approach can outperform the baseline technique with respect to the diarization error rate (DER) and attribution error rate (AER).
Resumo:
We propose a novel technique for conducting robust voice activity detection (VAD) in high-noise recordings. We use Gaussian mixture modeling (GMM) to train two generic models; speech and non-speech. We then score smaller segments of a given (unseen) recording against each of these GMMs to obtain two respective likelihood scores for each segment. These scores are used to compute a dissimilarity measure between pairs of segments and to carry out complete-linkage clustering of the segments into speech and non-speech clusters. We compare the accuracy of our method against state-of-the-art and standardised VAD techniques to demonstrate an absolute improvement of 15% in half-total error rate (HTER) over the best performing baseline system and across the QUT-NOISE-TIMIT database. We then apply our approach to the Audio-Visual Database of American English (AVDBAE) to demonstrate the performance of our algorithm in using visual, audio-visual or a proposed fusion of these features.
Resumo:
The QUT-NOISE-SRE protocol is designed to mix the large QUT-NOISE database, consisting of over 10 hours of back- ground noise, collected across 10 unique locations covering 5 common noise scenarios, with commonly used speaker recognition datasets such as Switchboard, Mixer and the speaker recognition evaluation (SRE) datasets provided by NIST. By allowing common, clean, speech corpora to be mixed with a wide variety of noise conditions, environmental reverberant responses, and signal-to-noise ratios, this protocol provides a solid basis for the development, evaluation and benchmarking of robust speaker recognition algorithms, and is freely available to download alongside the QUT-NOISE database. In this work, we use the QUT-NOISE-SRE protocol to evaluate a state-of-the-art PLDA i-vector speaker recognition system, demonstrating the importance of designing voice-activity-detection front-ends specifically for speaker recognition, rather than aiming for perfect coherence with the true speech/non-speech boundaries.
Resumo:
Effectively capturing opportunities requires rapid decision-making. We investigate the speed of opportunity evaluation decisions by focusing on firms' venture termination and venture advancement decisions. Experience, standard operating procedures, and confidence allow firms to make opportunity evaluation decisions faster; we propose that a firm's attentional orientation, as reflected in its project portfolio, limits the number of domains in which these speed-enhancing mechanisms can be developed. Hence firms' decision speed is likely to vary between different types of decisions. Using unique data on 3,269 mineral exploration ventures in the Australian mining industry, we find that firms with a higher degree of attention toward earlier-stage exploration activities are quicker to abandon potential opportunities in early development but slower to do so later, and that such firms are also slower to advance on potential opportunities at all stages compared to firms that focus their attention differently. Market dynamism moderates these relationships, but only with regard to initial evaluation decisions. Our study extends research on decision speed by showing that firms are not necessarily fast or slow regarding all the decisions they make, and by offering an opportunity evaluation framework that recognizes that decision makers can, in fact often do, pursue multiple potential opportunities simultaneously.
Resumo:
Visual information in the form of lip movements of the speaker has been shown to improve the performance of speech recognition and search applications. In our previous work, we proposed cross database training of synchronous hidden Markov models (SHMMs) to make use of external large and publicly available audio databases in addition to the relatively small given audio visual database. In this work, the cross database training approach is improved by performing an additional audio adaptation step, which enables audio visual SHMMs to benefit from audio observations of the external audio models before adding visual modality to them. The proposed approach outperforms the baseline cross database training approach in clean and noisy environments in terms of phone recognition accuracy as well as spoken term detection (STD) accuracy.
Resumo:
Spoken term detection (STD) is the task of looking up a spoken term in a large volume of speech segments. In order to provide fast search, speech segments are first indexed into an intermediate representation using speech recognition engines which provide multiple hypotheses for each speech segment. Approximate matching techniques are usually applied at the search stage to compensate the poor performance of automatic speech recognition engines during indexing. Recently, using visual information in addition to audio information has been shown to improve phone recognition performance, particularly in noisy environments. In this paper, we will make use of visual information in the form of lip movements of the speaker in indexing stage and will investigate its effect on STD performance. Particularly, we will investigate if gains in phone recognition accuracy will carry through the approximate matching stage to provide similar gains in the final audio-visual STD system over a traditional audio only approach. We will also investigate the effect of using visual information on STD performance in different noise environments.
Resumo:
Speech recognition can be improved by using visual information in the form of lip movements of the speaker in addition to audio information. To date, state-of-the-art techniques for audio-visual speech recognition continue to use audio and visual data of the same database for training their models. In this paper, we present a new approach to make use of one modality of an external dataset in addition to a given audio-visual dataset. By so doing, it is possible to create more powerful models from other extensive audio-only databases and adapt them on our comparatively smaller multi-stream databases. Results show that the presented approach outperforms the widely adopted synchronous hidden Markov models (HMM) trained jointly on audio and visual data of a given audio-visual database for phone recognition by 29% relative. It also outperforms the external audio models trained on extensive external audio datasets and also internal audio models by 5.5% and 46% relative respectively. We also show that the proposed approach is beneficial in noisy environments where the audio source is affected by the environmental noise.
Resumo:
Moffitt’s dual typology of ‘life-course persistent’ and ‘adolescence limited’ offending has received extensive empirical attention, but the extent to which the antisocial behaviour of adolescence limited offenders is constrained to adolescence is relatively under-examined.Using data from the Australian Mater University Study of Pregnancy and its Outcomes, we explore Moffitt’s concept of snares, or those factors that may lead to an adolescent persisting in antisocial behaviour such as drug addiction, educational failure, and contact with the justice system. The Mater University Study of Pregnancy and its Outcomes is a longitudinal study of mother–child dyads from the pre-natal stage to 21 years of age. Findings show that one-third of individuals identified as having an adolescent onset of antisocial behaviour persisted with this antisocial behaviour as young adults. This continuity can, in part, be explained by snares and the research suggests that reducing exposure to snares may lead to less antisocial behaviour in adulthood.