253 resultados para Data selection

em Queensland University of Technology - ePrints Archive


Relevância:

80.00% 80.00%

Publicador:

Resumo:

A data-driven background dataset refinement technique was recently proposed for SVM based speaker verification. This method selects a refined SVM background dataset from a set of candidate impostor examples after individually ranking examples by their relevance. This paper extends this technique to the refinement of the T-norm dataset for SVM-based speaker verification. The independent refinement of the background and T-norm datasets provides a means of investigating the sensitivity of SVM-based speaker verification performance to the selection of each of these datasets. Using refined datasets provided improvements of 13% in min. DCF and 9% in EER over the full set of impostor examples on the 2006 SRE corpus with the majority of these gains due to refinement of the T-norm dataset. Similar trends were observed for the unseen data of the NIST 2008 SRE.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The high degree of variability and inconsistency in cash flow study usage by property professionals demands improvement in knowledge and processes. Until recently limited research was being undertaken on the use of cash flow studies in property valuations but the growing acceptance of this approach for major investment valuations has resulted in renewed interest in this topic. Studies on valuation variations identify data accuracy, model consistency and bias as major concerns. In cash flow studies there are practical problems with the input data and the consistency of the models. This study will refer to the recent literature and identify the major factors in model inconsistency and data selection. A detailed case study will be used to examine the effects of changes in structure and inputs. The key variable inputs will be identified and proposals developed to improve the selection process for these key variables. The variables will be selected with the aid of sensitivity studies and alternative ways of quantifying the key variables explained. The paper recommends, with reservations, the use of probability profiles of the variables and the incorporation of this data in simulation exercises. The use of Monte Carlo simulation is demonstrated and the factors influencing the structure of the probability distributions of the key variables are outline. This study relates to ongoing research into functional performance of commercial property within an Australian Cooperative Research Centre.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

The problem of impostor dataset selection for GMM-based speaker verification is addressed through the recently proposed data-driven background dataset refinement technique. The SVM-based refinement technique selects from a candidate impostor dataset those examples that are most frequently selected as support vectors when training a set of SVMs on a development corpus. This study demonstrates the versatility of dataset refinement in the task of selecting suitable impostor datasets for use in GMM-based speaker verification. The use of refined Z- and T-norm datasets provided performance gains of 15% in EER in the NIST 2006 SRE over the use of heuristically selected datasets. The refined datasets were shown to generalise well to the unseen data of the NIST 2008 SRE.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Management of groundwater systems requires realistic conceptual hydrogeological models as a framework for numerical simulation modelling, but also for system understanding and communicating this to stakeholders and the broader community. To help overcome these challenges we developed GVS (Groundwater Visualisation System), a stand-alone desktop software package that uses interactive 3D visualisation and animation techniques. The goal was a user-friendly groundwater management tool that could support a range of existing real-world and pre-processed data, both surface and subsurface, including geology and various types of temporal hydrological information. GVS allows these data to be integrated into a single conceptual hydrogeological model. In addition, 3D geological models produced externally using other software packages, can readily be imported into GVS models, as can outputs of simulations (e.g. piezometric surfaces) produced by software such as MODFLOW or FEFLOW. Boreholes can be integrated, showing any down-hole data and properties, including screen information, intersected geology, water level data and water chemistry. Animation is used to display spatial and temporal changes, with time-series data such as rainfall, standing water levels and electrical conductivity, displaying dynamic processes. Time and space variations can be presented using a range of contouring and colour mapping techniques, in addition to interactive plots of time-series parameters. Other types of data, for example, demographics and cultural information, can also be readily incorporated. The GVS software can execute on a standard Windows or Linux-based PC with a minimum of 2 GB RAM, and the model output is easy and inexpensive to distribute, by download or via USB/DVD/CD. Example models are described here for three groundwater systems in Queensland, northeastern Australia: two unconfined alluvial groundwater systems with intensive irrigation, the Lockyer Valley and the upper Condamine Valley, and the Surat Basin, a large sedimentary basin of confined artesian aquifers. This latter example required more detail in the hydrostratigraphy, correlation of formations with drillholes and visualisation of simulation piezometric surfaces. Both alluvial system GVS models were developed during drought conditions to support government strategies to implement groundwater management. The Surat Basin model was industry sponsored research, for coal seam gas groundwater management and community information and consultation. The “virtual” groundwater systems in these 3D GVS models can be interactively interrogated by standard functions, plus production of 2D cross-sections, data selection from the 3D scene, rear end database and plot displays. A unique feature is that GVS allows investigation of time-series data across different display modes, both 2D and 3D. GVS has been used successfully as a tool to enhance community/stakeholder understanding and knowledge of groundwater systems and is of value for training and educational purposes. Projects completed confirm that GVS provides a powerful support to management and decision making, and as a tool for interpretation of groundwater system hydrological processes. A highly effective visualisation output is the production of short videos (e.g. 2–5 min) based on sequences of camera ‘fly-throughs’ and screen images. Further work involves developing support for multi-screen displays and touch-screen technologies, distributed rendering, gestural interaction systems. To highlight the visualisation and animation capability of the GVS software, links to related multimedia hosted online sites are included in the references.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Operational modal analysis (OMA) is prevalent in modal identifi cation of civil structures. It asks for response measurements of the underlying structure under ambient loads. A valid OMA method requires the excitation be white noise in time and space. Although there are numerous applications of OMA in the literature, few have investigated the statistical distribution of a measurement and the infl uence of such randomness to modal identifi cation. This research has attempted modifi ed kurtosis to evaluate the statistical distribution of raw measurement data. In addition, a windowing strategy employing this index has been proposed to select quality datasets. In order to demonstrate how the data selection strategy works, the ambient vibration measurements of a laboratory bridge model and a real cable-stayed bridge have been respectively considered. The analysis incorporated with frequency domain decomposition (FDD) as the target OMA approach for modal identifi cation. The modal identifi cation results using the data segments with different randomness have been compared. The discrepancy in FDD spectra of the results indicates that, in order to fulfi l the assumption of an OMA method, special care shall be taken in processing a long vibration measurement data. The proposed data selection strategy is easy-to-apply and verifi ed effective in modal analysis.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Automatic recognition of people is an active field of research with important forensic and security applications. In these applications, it is not always possible for the subject to be in close proximity to the system. Voice represents a human behavioural trait which can be used to recognise people in such situations. Automatic Speaker Verification (ASV) is the process of verifying a persons identity through the analysis of their speech and enables recognition of a subject at a distance over a telephone channel { wired or wireless. A significant amount of research has focussed on the application of Gaussian mixture model (GMM) techniques to speaker verification systems providing state-of-the-art performance. GMM's are a type of generative classifier trained to model the probability distribution of the features used to represent a speaker. Recently introduced to the field of ASV research is the support vector machine (SVM). An SVM is a discriminative classifier requiring examples from both positive and negative classes to train a speaker model. The SVM is based on margin maximisation whereby a hyperplane attempts to separate classes in a high dimensional space. SVMs applied to the task of speaker verification have shown high potential, particularly when used to complement current GMM-based techniques in hybrid systems. This work aims to improve the performance of ASV systems using novel and innovative SVM-based techniques. Research was divided into three main themes: session variability compensation for SVMs; unsupervised model adaptation; and impostor dataset selection. The first theme investigated the differences between the GMM and SVM domains for the modelling of session variability | an aspect crucial for robust speaker verification. Techniques developed to improve the robustness of GMMbased classification were shown to bring about similar benefits to discriminative SVM classification through their integration in the hybrid GMM mean supervector SVM classifier. Further, the domains for the modelling of session variation were contrasted to find a number of common factors, however, the SVM-domain consistently provided marginally better session variation compensation. Minimal complementary information was found between the techniques due to the similarities in how they achieved their objectives. The second theme saw the proposal of a novel model for the purpose of session variation compensation in ASV systems. Continuous progressive model adaptation attempts to improve speaker models by retraining them after exploiting all encountered test utterances during normal use of the system. The introduction of the weight-based factor analysis model provided significant performance improvements of over 60% in an unsupervised scenario. SVM-based classification was then integrated into the progressive system providing further benefits in performance over the GMM counterpart. Analysis demonstrated that SVMs also hold several beneficial characteristics to the task of unsupervised model adaptation prompting further research in the area. In pursuing the final theme, an innovative background dataset selection technique was developed. This technique selects the most appropriate subset of examples from a large and diverse set of candidate impostor observations for use as the SVM background by exploiting the SVM training process. This selection was performed on a per-observation basis so as to overcome the shortcoming of the traditional heuristic-based approach to dataset selection. Results demonstrate the approach to provide performance improvements over both the use of the complete candidate dataset and the best heuristically-selected dataset whilst being only a fraction of the size. The refined dataset was also shown to generalise well to unseen corpora and be highly applicable to the selection of impostor cohorts required in alternate techniques for speaker verification.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

As a part of vital infrastructure and transportation network, bridge structures must function safely at all times. Bridges are designed to have a long life span. At any point in time, however, some bridges are aged. The ageing of bridge structures, given the rapidly growing demand of heavy and fast inter-city passages and continuous increase of freight transportation, would require diligence on bridge owners to ensure that the infrastructure is healthy at reasonable cost. In recent decades, a new technique, structural health monitoring (SHM), has emerged to meet this challenge. In this new engineering discipline, structural modal identification and damage detection have formed a vital component. Witnessed by an increasing number of publications is that the change in vibration characteristics is widely and deeply investigated to assess structural damage. Although a number of publications have addressed the feasibility of various methods through experimental verifications, few of them have focused on steel truss bridges. Finding a feasible vibration-based damage indicator for steel truss bridges and solving the difficulties in practical modal identification to support damage detection motivated this research project. This research was to derive an innovative method to assess structural damage in steel truss bridges. First, it proposed a new damage indicator that relies on optimising the correlation between theoretical and measured modal strain energy. The optimisation is powered by a newly proposed multilayer genetic algorithm. In addition, a selection criterion for damage-sensitive modes has been studied to achieve more efficient and accurate damage detection results. Second, in order to support the proposed damage indicator, the research studied the applications of two state-of-the-art modal identification techniques by considering some practical difficulties: the limited instrumentation, the influence of environmental noise, the difficulties in finite element model updating, and the data selection problem in the output-only modal identification methods. The numerical (by a planer truss model) and experimental (by a laboratory through truss bridge) verifications have proved the effectiveness and feasibility of the proposed damage detection scheme. The modal strain energy-based indicator was found to be sensitive to the damage in steel truss bridges with incomplete measurement. It has shown the damage indicator's potential in practical applications of steel truss bridges. Lastly, the achievement and limitation of this study, and lessons learnt from the modal analysis have been summarised.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Vietnam has a unique culture which is revealed in the way that people have built and designed their traditional housing. Vietnamese dwellings reflect occupants’ activities in their everyday lives, while adapting to tropical climatic conditions impacted by seasoning monsoons. It is said that these characteristics of Vietnamese dwellings have remained unchanged until the economic reform in 1986, when Vietnam experienced an accelerated development based on the market-oriented economy. New housing types, including modern shop-houses, detached houses, and apartments, have been designed in many places, especially satisfying dwellers’ new lifestyles in Vietnamese cities. The contemporary housing, which has been mostly designed by architects, has reflected rules of spatial organisation so that occupants’ social activities are carried out. However, contemporary housing spaces seem unsustainable in relation to socio-cultural values because they has been influenced by globalism that advocates the use of homogeneous spatial patterns, modern technologies, materials and construction methods. This study investigates the rules of spaces in Vietnamese houses that were built before and after the reform to define the socio-cultural implications in Vietnamese housing design. Firstly, it describes occupants’ views of their current dwellings in terms of indoor comfort conditions and social activities in spaces. Then, it examines the use of spaces in pre-reform Vietnamese housing through occupants’ activities and material applications. Finally, it discusses the organisation of spaces in both pre- and post-reform housing to understand how Vietnamese housing has been designed for occupants to live, act, work, and conduct traditional activities. Understanding spatial organisation is a way to identify characteristics of the lived spaces of the occupants created from the conceived space, which is designed by designers. The characteristics of the housing spaces will inform the designers the way to design future Vietnamese housing in response to cultural contexts. The study applied an abductive approach for the investigation of housing spaces. It used a conceptual framework in relation to Henri Lefebvre’s (1991) theory to understand space as the main factor constituting the language of design, and the principles of semiotics to examine spatial structure in housing as a language used in the everyday life. The study involved a door-knocking survey to 350 households in four regional cities of Vietnam for interpretation of occupancy conditions and levels of occupants’ comfort. A statistical analysis was applied to interpret the survey data. The study also required a process of data selection and collection of fourteen cases of housing in three main climatic regions of the country for analysing spatial organisation and housing characteristics. The study found that there has been a shift in the relationship of spaces from the pre- to post-reform Vietnamese housing. It also indentified that the space for guest welcoming and family activity has been the central space of the Vietnamese housing. Based on the relationships of the central space with the others, theoretical models were proposed for three types of contemporary Vietnamese housing. The models will be significant in adapting to Vietnamese conditions to achieve socioenvironmental characteristics for housing design because it was developed from the occupants’ requirements for their social activities. Another contribution of the study is the use of methodological concepts to understand the language of living spaces. Further work will be needed to test future Vietnamese housing designs from the applications of the models.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Due to the health impacts caused by exposures to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become popular as an important topic in atmospheric and environmental research today. The knowledge on the dynamics and complexity of air pollutants behavior has made artificial intelligence models as a useful tool for a more accurate pollutant concentration prediction. This paper focuses on an innovative method of daily air pollution prediction using combination of Support Vector Machine (SVM) as predictor and Partial Least Square (PLS) as a data selection tool based on the measured values of CO concentrations. The CO concentrations of Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, have been used to test the effectiveness of this method. The hourly CO concentrations have been predicted using the SVM and the hybrid PLS–SVM models. Similarly, daily CO concentrations have been predicted based on the aforementioned four years measured data. Results demonstrated that both models have good prediction ability; however the hybrid PLS–SVM has better accuracy. In the analysis presented in this paper, statistic estimators including relative mean errors, root mean squared errors and the mean absolute relative error have been employed to compare performances of the models. It has been concluded that the errors decrease after size reduction and coefficients of determination increase from 56 to 81% for SVM model to 65–85% for hybrid PLS–SVM model respectively. Also it was found that the hybrid PLS–SVM model required lower computational time than SVM model as expected, hence supporting the more accurate and faster prediction ability of hybrid PLS–SVM model.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The recently proposed data-driven background dataset refinement technique provides a means of selecting an informative background for support vector machine (SVM)-based speaker verification systems. This paper investigates the characteristics of the impostor examples in such highly-informative background datasets. Data-driven dataset refinement individually evaluates the suitability of candidate impostor examples for the SVM background prior to selecting the highest-ranking examples as a refined background dataset. Further, the characteristics of the refined dataset were analysed to investigate the desired traits of an informative SVM background. The most informative examples of the refined dataset were found to consist of large amounts of active speech and distinctive language characteristics. The data-driven refinement technique was shown to filter the set of candidate impostor examples to produce a more disperse representation of the impostor population in the SVM kernel space, thereby reducing the number of redundant and less-informative examples in the background dataset. Furthermore, data-driven refinement was shown to provide performance gains when applied to the difficult task of refining a small candidate dataset that was mis-matched to the evaluation conditions.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This study assesses the recently proposed data-driven background dataset refinement technique for speaker verification using alternate SVM feature sets to the GMM supervector features for which it was originally designed. The performance improvements brought about in each trialled SVM configuration demonstrate the versatility of background dataset refinement. This work also extends on the originally proposed technique to exploit support vector coefficients as an impostor suitability metric in the data-driven selection process. Using support vector coefficients improved the performance of the refined datasets in the evaluation of unseen data. Further, attempts are made to exploit the differences in impostor example suitability measures from varying features spaces to provide added robustness.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Selection criteria and misspecification tests for the intra-cluster correlation structure (ICS) in longitudinal data analysis are considered. In particular, the asymptotical distribution of the correlation information criterion (CIC) is derived and a new method for selecting a working ICS is proposed by standardizing the selection criterion as the p-value. The CIC test is found to be powerful in detecting misspecification of the working ICS structures, while with respect to the working ICS selection, the standardized CIC test is also shown to have satisfactory performance. Some simulation studies and applications to two real longitudinal datasets are made to illustrate how these criteria and tests might be useful.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A modeling paradigm is proposed for covariate, variance and working correlation structure selection for longitudinal data analysis. Appropriate selection of covariates is pertinent to correct variance modeling and selecting the appropriate covariates and variance function is vital to correlation structure selection. This leads to a stepwise model selection procedure that deploys a combination of different model selection criteria. Although these criteria find a common theoretical root based on approximating the Kullback-Leibler distance, they are designed to address different aspects of model selection and have different merits and limitations. For example, the extended quasi-likelihood information criterion (EQIC) with a covariance penalty performs well for covariate selection even when the working variance function is misspecified, but EQIC contains little information on correlation structures. The proposed model selection strategies are outlined and a Monte Carlo assessment of their finite sample properties is reported. Two longitudinal studies are used for illustration.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The ability to accurately predict the lifetime of building components is crucial to optimizing building design, material selection and scheduling of required maintenance. This paper discusses a number of possible data mining methods that can be applied to do the lifetime prediction of metallic components and how different sources of service life information could be integrated to form the basis of the lifetime prediction model