204 results for Asymptotic behaviour, Bayesian methods, Mixture models, Overfitting, Posterior concentration


Relevance: 100.00%

Abstract:

The effectiveness of higher-order spectral (HOS) phase features in speaker recognition is investigated by comparison with Mel-cepstral features on the same speech data. Unlike Mel-frequency cepstral coefficients (MFCCs), HOS phase features retain phase information from the Fourier spectrum. Gaussian mixture models are constructed from Mel-cepstral features and HOS features, respectively, for the same data from various speakers in the Switchboard telephone speech corpus. Feature clusters, model parameters and classification performance are analyzed. HOS phase features on their own provide a correct identification rate of about 97% on the chosen subset of the corpus, the same level of accuracy as provided by MFCCs. Cluster plots and model parameters are compared to show that HOS phase features can provide complementary information to better discriminate between speakers.
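
As an illustration of the modelling pipeline this abstract describes, here is a minimal Python sketch of GMM-based speaker identification, assuming feature matrices (MFCC or HOS phase features) have already been extracted; the function names, component count and diagonal-covariance choice are illustrative, not taken from the paper.

```python
# Minimal sketch: one GMM per speaker over precomputed feature matrices,
# classification by maximum average frame log-likelihood.
from sklearn.mixture import GaussianMixture

def train_speaker_models(features_by_speaker, n_components=32):
    """features_by_speaker: dict of speaker id -> (n_frames, n_dims) array."""
    models = {}
    for speaker, feats in features_by_speaker.items():
        gmm = GaussianMixture(n_components=n_components, covariance_type="diag")
        models[speaker] = gmm.fit(feats)
    return models

def identify(models, test_feats):
    # Pick the speaker whose GMM assigns the highest average log-likelihood.
    return max(models, key=lambda spk: models[spk].score(test_feats))
```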

Relevance: 100.00%

Abstract:

Water-filled portable road safety barriers are a common fixture at road works; however, their use of water can be problematic, both in terms of the quantity of water used and the transportation of the water to the installation site. This project aims to develop a new design of portable road safety barrier which makes novel use of composite and foam materials to reduce the barrier's reliance on water for controlling errant vehicles. The project uses finite element (FE) techniques to simulate and evaluate design concepts. FE methods and models that have previously been tested and validated will be used in combination to provide the most accurate numerical simulations available to drive the project forward. LS-DYNA is a non-linear numerical solver for highly dynamic problems, commonly used in the automotive and road safety industries. Several complex materials and physical interactions are to be simulated throughout the course of the project, including aluminium foams, composite laminates and the water within the barrier during standardised impact tests. Techniques to be used include FE, smoothed particle hydrodynamics (SPH) and weighted multi-parameter optimisation. A detailed optimisation of several design parameters against specific design goals will be performed with LS-DYNA and LS-OPT, which will require a large number of high-accuracy simulations and advanced visualisation techniques. Supercomputing will play a central role in the project, enabling the numerous medium-element-count simulations needed to determine the optimal design parameters of the barrier. Supercomputing will also allow the development of useful methods of visualising results and the production of highly detailed simulations for end-product validation. Efforts thus far have been directed towards integrating various numerical methods (including FEM, SPH and advanced material models) in an efficient and accurate manner. Various designs of joining mechanisms have been developed and are currently being turned into FE models and simulations.
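
To make the optimisation step concrete, here is a hedged Python sketch of weighted multi-parameter design optimisation in the LS-OPT spirit: the expensive LS-DYNA impact simulation is stood in for by a hypothetical analytic surrogate `simulate`, and all parameter names, bounds and weights are invented for illustration.

```python
# Hedged sketch of a weighted-sum, multi-parameter design optimisation.
# `simulate` is a placeholder surrogate, NOT the project's FE model.
from scipy.optimize import minimize

def simulate(x):
    # Hypothetical responses (peak_deceleration, barrier_mass) for design
    # parameters x = [wall_thickness_mm, foam_density_kgm3].
    thickness, density = x
    return 120.0 / (thickness * density**0.5), 5.0 * thickness + 0.01 * density

def weighted_objective(x, w=(0.7, 0.3), scale=(10.0, 50.0)):
    # Combine normalised design goals into a single cost to minimise.
    responses = simulate(x)
    return sum(wi * ri / si for wi, ri, si in zip(w, responses, scale))

result = minimize(weighted_objective, x0=[3.0, 100.0],
                  bounds=[(1.0, 10.0), (30.0, 300.0)], method="L-BFGS-B")
print(result.x)  # candidate design parameters under the surrogate
```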

Relevance: 100.00%

Abstract:

The problem of impostor dataset selection for GMM-based speaker verification is addressed through the recently proposed data-driven background dataset refinement technique. The SVM-based refinement technique selects from a candidate impostor dataset those examples that are most frequently selected as support vectors when training a set of SVMs on a development corpus. This study demonstrates the versatility of dataset refinement in the task of selecting suitable impostor datasets for use in GMM-based speaker verification. The use of refined Z- and T-norm datasets provided performance gains of 15% in EER in the NIST 2006 SRE over the use of heuristically selected datasets. The refined datasets were shown to generalise well to the unseen data of the NIST 2008 SRE.
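
A minimal sketch of the support-vector-frequency idea behind this refinement technique follows, assuming precomputed feature vectors; the linear kernel, per-client binary SVMs and the `keep` cutoff are simplifying assumptions rather than the paper's exact configuration.

```python
# Sketch: candidate impostor examples repeatedly selected as support vectors
# across many client-vs-impostor SVMs are kept for the refined background set.
import numpy as np
from sklearn.svm import SVC

def refine_impostor_set(client_vectors, candidate_impostors, keep=500):
    """client_vectors: list of (n_i, d) arrays, one per development client.
    candidate_impostors: (m, d) array of candidate background examples."""
    counts = np.zeros(len(candidate_impostors))
    for client in client_vectors:
        X = np.vstack([client, candidate_impostors])
        y = np.r_[np.ones(len(client)), np.zeros(len(candidate_impostors))]
        svm = SVC(kernel="linear").fit(X, y)
        # Support vector indices that fall in the impostor half of X.
        imp_sv = svm.support_[svm.support_ >= len(client)] - len(client)
        counts[imp_sv] += 1
    return candidate_impostors[np.argsort(counts)[::-1][:keep]]
```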

Relevance: 100.00%

Abstract:

Perez-Losada et al. [1] analyzed 72 complete genomes corresponding to nine mammalian (67 strains) and two avian (5 strains) polyomavirus species using maximum likelihood and Bayesian methods of phylogenetic inference. Because the data for two of the genomes in their work are no longer available in GenBank, we analyze here the phylogenetic relationships of the remaining 70 complete genomes, corresponding to nine mammalian (65 strains) and two avian (5 strains) polyomavirus species, using a dynamical language model approach developed by our group (Yu et al. [26]). This distance method does not require sequence alignment, deriving species phylogeny from overall similarities of the complete genomes. Our best tree separates the bird polyomaviruses (avian polyomaviruses and goose hemorrhagic polyomaviruses) from the mammalian polyomaviruses, which supports the idea of splitting the genus into two subgenera. Such a split is consistent with the different viral life strategies of each group. In the mammalian polyomavirus subgenus, mouse polyomaviruses (MPV), simian virus 40 (SV40), BK viruses (BKV) and JC viruses (JCV) are grouped into different branches, as expected. The topology of our best tree is quite similar to that of the tree constructed by Perez-Losada et al.
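
For flavour, here is a generic alignment-free comparison in Python: k-mer composition profiles with a cosine distance and average-linkage clustering. This is a stand-in illustration only; the dynamical language model of Yu et al. differs in how word frequencies are modelled and how the distance is defined.

```python
# Generic alignment-free whole-genome comparison: k-mer composition profiles
# plus hierarchical clustering (illustrative stand-in for the actual method).
from itertools import product
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

def kmer_profile(seq, k=5):
    idx = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    v = np.zeros(len(idx))
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in idx:           # skips ambiguous bases such as 'N'
            v[idx[kmer]] += 1
    return v / max(v.sum(), 1)

def build_tree(genomes):
    """genomes: dict mapping taxon name -> genome sequence string."""
    names = list(genomes)
    profiles = np.array([kmer_profile(genomes[n]) for n in names])
    return names, linkage(pdist(profiles, metric="cosine"), method="average")
```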

Relevance: 100.00%

Abstract:

Automatic recognition of people is an active field of research with important forensic and security applications. In these applications, it is not always possible for the subject to be in close proximity to the system. Voice represents a human behavioural trait which can be used to recognise people in such situations. Automatic Speaker Verification (ASV) is the process of verifying a person's identity through the analysis of their speech, and enables recognition of a subject at a distance over a telephone channel, wired or wireless. A significant amount of research has focussed on the application of Gaussian mixture model (GMM) techniques to speaker verification systems, providing state-of-the-art performance. GMMs are generative classifiers trained to model the probability distribution of the features used to represent a speaker. Recently introduced to the field of ASV research is the support vector machine (SVM). An SVM is a discriminative classifier requiring examples from both positive and negative classes to train a speaker model. The SVM is based on margin maximisation, whereby a hyperplane attempts to separate classes in a high-dimensional space. SVMs applied to the task of speaker verification have shown high potential, particularly when used to complement current GMM-based techniques in hybrid systems. This work aims to improve the performance of ASV systems using novel and innovative SVM-based techniques. Research was divided into three main themes: session variability compensation for SVMs; unsupervised model adaptation; and impostor dataset selection. The first theme investigated the differences between the GMM and SVM domains for the modelling of session variability, an aspect crucial for robust speaker verification. Techniques developed to improve the robustness of GMM-based classification were shown to bring about similar benefits to discriminative SVM classification through their integration in the hybrid GMM mean supervector SVM classifier. Further, the domains for the modelling of session variation were contrasted to find a number of common factors; however, the SVM domain consistently provided marginally better session variation compensation. Minimal complementary information was found between the techniques due to the similarities in how they achieved their objectives. The second theme saw the proposal of a novel model for the purpose of session variation compensation in ASV systems. Continuous progressive model adaptation attempts to improve speaker models by retraining them after exploiting all encountered test utterances during normal use of the system. The introduction of the weight-based factor analysis model provided significant performance improvements of over 60% in an unsupervised scenario. SVM-based classification was then integrated into the progressive system, providing further benefits in performance over the GMM counterpart. Analysis demonstrated that SVMs also hold several characteristics beneficial to the task of unsupervised model adaptation, prompting further research in the area. In pursuing the final theme, an innovative background dataset selection technique was developed. This technique selects the most appropriate subset of examples from a large and diverse set of candidate impostor observations for use as the SVM background by exploiting the SVM training process. This selection was performed on a per-observation basis so as to overcome the shortcomings of the traditional heuristic-based approach to dataset selection.
Results demonstrate that the approach provides performance improvements over both the use of the complete candidate dataset and the best heuristically selected dataset, whilst being only a fraction of the size. The refined dataset was also shown to generalise well to unseen corpora and to be highly applicable to the selection of impostor cohorts required in alternative techniques for speaker verification.
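
As a concrete illustration of the hybrid GMM mean supervector SVM classifier mentioned in the first theme, the sketch below forms a fixed-length supervector per utterance via simplified relevance-MAP adaptation of a universal background model (UBM); the relevance factor and all names are illustrative assumptions, not the thesis's exact configuration.

```python
# Sketch of the GMM mean supervector representation: each utterance adapts
# the UBM means (simplified one-step relevance MAP) and the stacked means
# form a fixed-length vector suitable for SVM training.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

def mean_supervector(ubm, feats, relevance=16.0):
    post = ubm.predict_proba(feats)        # (n_frames, n_components)
    n_k = post.sum(axis=0)                 # soft counts per component
    f_k = post.T @ feats                   # first-order statistics
    alpha = (n_k / (n_k + relevance))[:, None]
    means = alpha * (f_k / np.maximum(n_k[:, None], 1e-8)) \
            + (1 - alpha) * ubm.means_
    return means.ravel()

# Usage sketch: fit the UBM on pooled background features, then train an
# SVC on supervectors of target and impostor utterances.
# ubm = GaussianMixture(512, covariance_type="diag").fit(background_feats)
# svm = SVC(kernel="linear").fit(supervectors, labels)
```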

Relevance: 100.00%

Abstract:

The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains, and sudden cardiac death continues to be a presenting feature for some subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia-related complications. Stress electrocardiography/exercise testing is predictive of 10-year risk of CVD events, and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to these data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages, and data summarisation and abstraction. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population, and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). Because of the unbalanced class distribution in the data, majority-class under-sampling and the Kappa statistic, together with misclassification rate and area under the ROC curve (AUC), are used for evaluation of models generated using different prediction algorithms. The performance of models derived from feature-reduced datasets reveals the filter method, Cfs subset evaluation, to be the most consistently effective, although Consistency-derived subsets tended to slightly increase accuracy at the cost of markedly increased complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This could be eliminated by consideration of the AUC or Kappa statistic, as well as by evaluation of subsets with an under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62, the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection reduces MR to between 9.8 and 10.16, with time-segmented summary data (dataset F) at MR 9.8 and raw time-series summary data (dataset A) at 9.92. However, for all datasets based on time-series data only, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series-only datasets, but models derived from these subsets consist of a single leaf only. MR values are consistent with the class distribution in the subset folds evaluated in the n-fold cross-validation method.
For models based on Cfs-selected time-series-derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36, with dataset RF_A (raw time-series data and RF) at 8.85 and dataset RF_F (time-segmented time-series variables and RF) at 9.09. The models based on counts of outliers and counts of data points outside the normal range (dataset RF_E), and on variables derived from time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (dataset RF_G), perform the least well, with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28, while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are the most comprehensible and clinically relevant. The predictive accuracy increase achieved by adding risk factor variables to time-series-variable-based models is significant. The addition of time-series-derived variables to models based on risk factor variables alone is associated with a trend towards improved performance. Data mining of feature-reduced anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables compared to risk factors alone is similar to recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used together as model input. In the absence of risk factor input, the use of time-series variables after outlier removal, and of time-series variables based on physiological values falling outside the accepted normal range, is associated with some improvement in model performance.
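
The imbalance-aware evaluation protocol described above can be sketched as follows in Python; logistic regression stands in for the study's algorithms, the data are hypothetical, and Cfs subset evaluation (a Weka method) has no direct sklearn equivalent, so feature selection is omitted here.

```python
# Sketch of class-imbalance-aware evaluation: majority-class under-sampling,
# then misclassification rate (MR), Kappa and AUC on a held-out split.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import cohen_kappa_score, roc_auc_score
from sklearn.model_selection import train_test_split

def undersample_majority(X, y, seed=0):
    rng = np.random.default_rng(seed)
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    maj, mino = (neg, pos) if len(neg) > len(pos) else (pos, neg)
    keep = rng.choice(maj, size=len(mino), replace=False)
    idx = np.concatenate([keep, mino])
    return X[idx], y[idx]

def evaluate(X, y):
    Xb, yb = undersample_majority(X, y)
    Xtr, Xte, ytr, yte = train_test_split(Xb, yb, stratify=yb, random_state=0)
    model = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    pred = model.predict(Xte)
    return {"MR": float(np.mean(pred != yte)),
            "kappa": cohen_kappa_score(yte, pred),
            "AUC": roc_auc_score(yte, model.predict_proba(Xte)[:, 1])}
```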

Relevance: 100.00%

Abstract:

This paper presents an extended study on the implementation of support vector machine (SVM) based speaker verification in systems that employ continuous progressive model adaptation using the weight-based factor analysis model. The weight-based factor analysis model compensates for session variations in unsupervised scenarios by incorporating trial confidence measures into the general statistics used in the inter-session variability modelling process. Employing weight-based factor analysis in Gaussian mixture models (GMM) was recently found to provide significant performance gains in unsupervised classification. Further improvements in performance were found through the integration of SVM-based classification in the system by means of GMM supervectors. This study focuses particularly on the way in which a client is represented in the SVM kernel space using single and multiple target supervectors. Experimental results indicate that training client SVMs using a single target supervector maximises performance while exhibiting a certain robustness to the inclusion of impostor training data in the model. Furthermore, the inclusion of low-scoring target trials in the adaptation process is investigated, and they were found to significantly aid performance.
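
A hedged sketch of the confidence-weighting idea follows: the sufficient statistics a trial contributes to inter-session variability modelling are scaled by a confidence weight derived from its verification score. The logistic score-to-confidence mapping and the parameter names are assumptions for illustration, not the paper's exact formulation.

```python
# Sketch: scale a trial's Baum-Welch statistics by a confidence weight
# derived from its verification score (logistic mapping assumed).
import numpy as np

def weighted_statistics(ubm_post, feats, score, slope=1.0, offset=0.0):
    """ubm_post: (n_frames, C) UBM posteriors; feats: (n_frames, d) features;
    score: the trial's verification score, mapped to a [0, 1] confidence."""
    w = 1.0 / (1.0 + np.exp(-(slope * score + offset)))
    n = w * ubm_post.sum(axis=0)      # weighted zeroth-order statistics
    f = w * (ubm_post.T @ feats)      # weighted first-order statistics
    return n, f
```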

Relevance: 100.00%

Abstract:

The Saffman-Taylor finger problem is to predict the shape and, in particular, the width of a finger of fluid travelling in a Hele-Shaw cell filled with a different, more viscous fluid. In experiments the width is dependent on the speed of propagation of the finger, tending to half the total cell width as the speed increases. To predict this result mathematically, nonlinear effects on the fluid interface must be considered; usually surface tension is included for this purpose. This makes the mathematical problem sufficiently difficult that asymptotic or numerical methods must be used. In this paper we adapt numerical methods used to solve the Saffman-Taylor finger problem with surface tension to instead include the effect of kinetic undercooling, a regularisation effect important in Stefan melting-freezing problems, for which Hele-Shaw flow serves as a leading-order approximation when the specific heat of a substance is much smaller than its latent heat. We find the existence of a solution branch where the finger width tends to zero as the propagation speed increases, disagreeing with some aspects of the asymptotic analysis of the same problem. We also find a second solution branch, supporting the idea of a countably infinite number of branches as with the surface tension problem.
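
For reference, a common nondimensional statement of the kinetic undercooling formulation reads as below; sign and scaling conventions vary between papers, so this should be read as indicative rather than as the exact system solved here.

```latex
% \phi is the velocity potential of the viscous fluid, v_n the normal
% velocity of the interface, and c the kinetic undercooling parameter.
\begin{aligned}
\nabla^2 \phi &= 0 && \text{in the viscous fluid,}\\
v_n &= \frac{\partial \phi}{\partial n} && \text{on the interface (kinematic condition),}\\
\phi &= c\, v_n && \text{on the interface (kinetic undercooling, in place of } \phi = \sigma\kappa\text{).}
\end{aligned}
```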

Relevance: 100.00%

Abstract:

Gaussian mixture models (GMMs) have become an established means of modeling feature distributions in speaker recognition systems. It is useful for experimentation and practical implementation purposes to develop and test these models in an efficient manner, particularly when computational resources are limited. A method of combining vector quantization (VQ) with single multi-dimensional Gaussians is proposed to rapidly generate a robust model approximation to the Gaussian mixture model. A fast method of testing these systems is also proposed and implemented. Results on the NIST 1996 Speaker Recognition Database suggest comparable, and in some cases improved, verification performance relative to the traditional GMM-based analysis scheme. In addition, previous research on the task of speaker identification indicated similar system performance between the VQ Gaussian-based technique and GMMs.
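
A minimal sketch of the VQ-Gaussian idea follows, under the assumption that it amounts to k-means partitioning followed by one diagonal Gaussian per cell; the paper's exact weighting scheme and fast testing method are not reproduced.

```python
# Sketch: k-means vector quantisation partitions the feature space, then a
# single diagonal Gaussian per cell approximates a GMM without running EM.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.cluster import KMeans

def train_vq_gaussian(feats, n_clusters=32):
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(feats)
    comps = []
    for k in range(n_clusters):
        cell = feats[km.labels_ == k]
        weight = len(cell) / len(feats)
        comps.append((weight, cell.mean(axis=0), cell.var(axis=0) + 1e-6))
    return comps

def log_likelihood(comps, x):
    # Mixture log-likelihood of one frame under the VQ-Gaussian model.
    dens = sum(w * multivariate_normal.pdf(x, mean=m, cov=np.diag(v))
               for w, m, v in comps)
    return np.log(dens + 1e-300)
```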

Relevance: 100.00%

Abstract:

This study examines consumer adoption of 3G mobile technology in China. The qualitative study involved 45 in-depth interviews undertaken in three major Chinese cities to explore the beliefs and attitudes which determine Chinese consumers’ acceptance of the mobile technological innovation. The findings are compared and contrasted against those reported in Western studies. The variations underpinning adoption of 3G between consumers in the three regional cities were identified. Specifically, it was found that the regions differed in terms of the relative importance of the identified adoption determinants, such as perceived social outcomes for using the innovation and the effects of social influence on the adoption. These findings provide subtle insight into the nature of Chinese consumers’ responses to new mobile technologies and a better understanding of variations among regional Chinese consumers.

Relevance: 100.00%

Abstract:

Building Web 2.0 sites does not necessarily ensure the success of the site. We aim to better understand what improves the success of a site by drawing insight from biologically inspired design patterns. Web 2.0 sites provide a mechanism for human interaction enabling powerful intercommunication between massive volumes of users. Early Web 2.0 site providers that were previously dominant are being succeeded by newer sites providing innovative social interaction mechanisms. Understanding which site traits contribute to this success drives research into Web site mechanics using models that describe the associated social networking behaviour. Some of these models attempt to show how the volume of users provides self-organisation and self-contextualisation of content. One model describing coordinated environments is stigmergy, a term originally describing coordinated insect behaviour. This paper explores how exploiting stigmergy can provide a valuable mechanism for identifying and analysing online user behaviour, specifically when considering that user freedom of choice is restricted by the provided Web site functionality. This will aid us in building better collaborative Web sites and improving collaborative processes.

Relevance: 100.00%

Abstract:

There is general agreement in the scientific community that entrepreneurship plays a central role in the growth and development of an economy in rapidly changing environments (Acs & Virgill 2010). In particular, when business activities are regarded as a vehicle for sustainable growth at large, one that goes beyond the mere economic returns of singular entities, encompasses social problems and relies heavily on collaborative actions, we fall more precisely into the domain of 'social entrepreneurship' (Robinson et al. 2009). In the entrepreneurship literature, prior studies demonstrated the role of intentionality as the best predictor of planned behaviour (Ajzen 1991), and assumed that the intention to start a business derives from the perception of desirability and feasibility and from a propensity to act upon an opportunity (Fishbein & Ajzen 1975). Recognising that starting a business is an intentional act (Krueger et al. 2000) and entrepreneurship is a planned behaviour (Katz & Gartner 1988), models of entrepreneurial intentions have substantial implications for intentionality research in entrepreneurship. The purpose of this paper is to explore the emerging practice of social entrepreneurship by comparing the determinants of entrepreneurial intention in general with those leading to startups with a social mission. Social entrepreneurial intentions clearly merit investigation, given that the opportunity identification process is an intentional process not only typical of for-profit start-ups, and yet there is a lack of research examining opportunity recognition in social entrepreneurship (Haugh 2005). The key argument is that intentionality in both traditional and social entrepreneurs during the decision-making process of new venture creation is influenced by an individual's perceptions of opportunities (Fishbein & Ajzen 1975). Besides opportunity recognition, at least two other aspects can substantially influence intentionality: human and social capital (Davidsson 2003). This paper sets out to establish whether, and to what extent, the social intentions of potential entrepreneurs are influenced at the cognitive level by opportunity recognition, human capital, and social capital. Applying established theoretical constructs, the paper draws comparisons between 'for-profit' and 'social' intentionality using two samples of students enrolled in Economics and Business Administration at the University G. d'Annunzio in Pescara, Italy. A questionnaire was administered to 310 potential entrepreneurs to test the robustness of the model. The collected data were used to measure the theoretical constructs of the paper. Reliability of the multi-item scale for each dimension was measured using Cronbach's alpha, and for all dimensions the measures of reliability are above 0.70. We empirically tested the model using structural equation modelling with AMOS. The results allow us to contribute empirically to the argument regarding the influence of human and social cognitive capital on social and non-social entrepreneurial intentions. Moreover, we highlight the importance for further research to look deeper into the determinants of traditional and social entrepreneurial intention, so that governments can one day define better policies and regulations that promote sustainable businesses with a social imprint rather than inhibit their formation and growth.
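
As a small worked example of the reliability check reported above: Cronbach's alpha for a K-item construct is alpha = K/(K-1) * (1 - sum of item variances / variance of the total score). The Python sketch below computes it for a generic score matrix; the variable names are illustrative.

```python
# Cronbach's alpha for one multi-item construct (scale reliability).
import numpy as np

def cronbach_alpha(items):
    """items: (n_respondents, K) array of item scores for one construct."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)
```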

Relevance: 100.00%

Abstract:

Background/objectives: This study estimates the economic outcomes of a nutrition intervention for at-risk patients compared with standard care in the prevention of pressure ulcers. Subjects/methods: Statistical models were developed to predict 'cases of pressure ulcer avoided', 'number of bed days gained' and 'change to economic costs' in public hospitals in 2002-2003 in Queensland, Australia. Input parameters were specified and appropriate probability distributions fitted for: number of discharges per annum; incidence rate of pressure ulcer; independent effect of pressure ulcer on length of stay; cost of a bed day; change in the risk of developing a pressure ulcer associated with nutrition support; and annual cost of the provision of a nutrition support intervention for at-risk patients. A total of 1000 random re-samples were made and the results expressed as output probability distributions. Results: The model predicts a mean of 2,896 (s.d. 632) cases of pressure ulcer avoided, 12,397 (s.d. 4,491) bed days released, and a corresponding mean economic cost saving of €2,869,526 (s.d. €2,078,715) with a nutrition support intervention, compared with standard care. Conclusion: Nutrition intervention is predicted to be a cost-effective approach to the prevention of pressure ulcers in at-risk patients.
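
The probabilistic model structure described here can be sketched as a short Monte Carlo simulation; every distribution and parameter value below is illustrative only and does not reproduce the study's fitted inputs.

```python
# Hedged sketch of the re-sampling model: draw input parameters from fitted
# distributions, resample 1000 times, report output distributions.
# All distribution parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(42)
N = 1000                                      # random re-samples

discharges = rng.normal(340_000, 10_000, N)   # discharges per annum
incidence = rng.beta(20, 440, N)              # pressure ulcer incidence rate
risk_reduction = rng.beta(25, 75, N)          # relative risk reduction, nutrition arm
extra_los = rng.gamma(2.0, 2.0, N)            # extra bed days per ulcer
bed_day_cost = rng.normal(600.0, 50.0, N)     # cost of one bed day (EUR)
programme_cost = rng.normal(8_000_000, 500_000, N)  # annual intervention cost

cases_avoided = discharges * incidence * risk_reduction
bed_days_gained = cases_avoided * extra_los
net_saving = bed_days_gained * bed_day_cost - programme_cost

print(f"mean cases avoided: {cases_avoided.mean():,.0f}")
print(f"mean net saving:    {net_saving.mean():,.0f} EUR")
```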

Relevance: 100.00%

Abstract:

Recent studies have started to explore context-awareness as a driver in the design of adaptable business processes. The emerging challenge of identifying and considering contextual drivers in the environment of a business process is well understood; however, typical methods and models for business process design do not yet consider this context. In this paper, we describe our work on the design of a method framework and appropriate models to enable a context-aware process design approach. We report on our ongoing work with an Australian insurance provider and describe the design science approach we employed to develop innovative and useful artifacts as part of a context-aware method framework. We discuss the utility of these artifacts in an application to the claims handling process at the case organization.