989 results for Statistical variance
Abstract:
A classical condition for fast learning rates is the margin condition, first introduced by Mammen and Tsybakov. We tackle in this paper the problem of adaptivity to this condition in the context of model selection, in a general learning framework. Actually, we consider a weaker version of this condition that allows one to take into account that learning within a small model can be much easier than within a large one. Requiring this “strong margin adaptivity” makes the model selection problem more challenging. We first prove, in a general framework, that some penalization procedures (including local Rademacher complexities) exhibit this adaptivity when the models are nested. Contrary to previous results, this holds with penalties that only depend on the data. Our second main result is that strong margin adaptivity is not always possible when the models are not nested: for every model selection procedure (even a randomized one), there is a problem for which it does not demonstrate strong margin adaptivity.
An approach to statistical lip modelling for speaker identification via chromatic feature extraction
Abstract:
This paper presents a novel technique for the tracking of moving lips for the purpose of speaker identification. In our system, a model of the lip contour is formed directly from chromatic information in the lip region. Iterative refinement of contour point estimates is not required. Colour features are extracted from the lips via concatenated profiles taken around the lip contour. Reduction of order in lip features is obtained via principal component analysis (PCA) followed by linear discriminant analysis (LDA). Statistical speaker models are built from the lip features based on the Gaussian mixture model (GMM). Identification experiments performed on the M2VTS database show encouraging results.
Abstract:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals’ seldom studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it has become habitual and involves minimal intention or decision making. Key variables investigated are the activity initialised timestamp and cell tower location as well as the activity type and usage quantity (e.g., voice call with duration in seconds); and the research focuses on customers’ spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs) which are fitted with the use of the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals’ highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM which corresponds to how each of them uses the products/services spatially in their daily lives; this is essentially their likely lifestyle and occupational traits.
Other significant research contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high dimensional data based on the use of VB-GMM.
Abstract:
It is important to promote a sustainable development approach to ensure that economic, environmental and social developments are maintained in balance. Sustainable development and its implications are not just a global concern; they also affect Australia. In particular, rural Australian communities are facing various economic, environmental and social challenges. Thus, the need for sustainable development in rural regions is becoming increasingly important. To promote sustainable development, proper frameworks, along with the associated tools optimised for the specific regions, need to be developed. This will ensure that the decisions made for sustainable development are evidence based rather than based on subjective opinions. To address these issues, Queensland University of Technology (QUT), through an Australian Research Council (ARC) linkage grant, has initiated research into the development of a Rural Statistical Sustainability Framework (RSSF) to aid sustainable decision making in rural Queensland. This particular branch of the research developed a decision support tool that will become the integrating component of the RSSF. This tool is developed on a web-based platform to allow easy dissemination and quick maintenance and to minimise compatibility issues. The tool is based on MapGuide Open Source and follows a three-tier architecture: Client tier, Web tier and Server tier. The developed tool is interactive and behaves similarly to a familiar desktop-based application. It has the capability to handle and display vector-based spatial data and can give further visual outputs using charts and tables. The data used in this tool were obtained from the QUT research team. Overall, the tool implements four tasks to help in the decision-making process: Locality Classification, Trend Display, Impact Assessment, and Data Entry and Update.
The developed tool utilises open source and freely available software and accounts for easy extensibility and long-term sustainability.
Abstract:
Continuous user authentication with keystroke dynamics uses character sequences as features. Since users can type characters in any order, it is imperative to find character sequences (n-graphs) that are representative of user typing behavior. Contemporary feature selection approaches do not guarantee selecting frequently typed features, which may cause less accurate statistical user representation. Furthermore, the selected features do not inherently reflect user typing behavior. We propose four statistically based feature selection techniques that mitigate the limitations of existing approaches. The first technique selects the most frequently occurring features. The other three consider different user typing behaviors by selecting: n-graphs that are typed quickly; n-graphs that are typed with consistent time; and n-graphs that have large time variance among users. We use Gunetti’s keystroke dataset and the k-means clustering algorithm for our experiments. The results show that among the proposed techniques, the most-frequent feature selection technique can effectively find user-representative features. We further substantiate our results by comparing the most-frequent feature selection technique with three existing approaches (popular Italian words, common n-graphs, and least frequent n-graphs). We find that it performs better than the existing approaches after selecting a certain number of most-frequent n-graphs.
Abstract:
There are many applications in aeronautical/aerospace engineering where some values of the design parameters cannot be provided or determined accurately. These values can be related to the geometry (wingspan, length, angles) and/or to operational flight conditions that vary due to the presence of uncertainty parameters (Mach, angle of attack, air density and temperature, etc.). These uncertain design parameters cannot be ignored in engineering design and must be taken into account in the optimisation task to produce more realistic and reliable solutions. In this paper, a robust/uncertainty design method with statistical constraints is introduced to produce a set of reliable solutions which have high performance and low sensitivity. The robust design concept coupled with Multi Objective Evolutionary Algorithms (MOEAs) is defined by applying two statistical sampling formulas, the mean and the variance/standard deviation, to the optimisation fitness/objective functions. The methodology is based on a canonical evolution strategy and incorporates the concepts of hierarchical topology, parallel computing and asynchronous evaluation. It is implemented for two practical Unmanned Aerial System (UAS) design problems: the first case considers robust multi-objective (single disciplinary: aerodynamics) design optimisation and the second considers a robust multidisciplinary (aero structures) design optimisation. Numerical results show that the solutions obtained by the robust design method with statistical constraints have a more reliable performance and sensitivity in both aerodynamics and structures when compared to the baseline design.
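The idea of attaching a mean and a standard deviation to each objective can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the objective function is a made-up stand-in for an aerodynamic solver, the uncertain parameter is a normally distributed Mach number, and the sample size is arbitrary. It does not reproduce the evolutionary algorithm itself.

```python
import random
from statistics import mean, pstdev


def drag_like_objective(wing_sweep, mach):
    """Hypothetical stand-in for an expensive aerodynamic analysis."""
    return (wing_sweep - 30.0 * mach) ** 2 + 0.5 * wing_sweep


def robust_fitness(design, objective, condition_samples):
    """Return (mean, standard deviation) of the objective over sampled conditions."""
    values = [objective(design, c) for c in condition_samples]
    return mean(values), pstdev(values)


# Assumed uncertainty model: Mach number scattered around 0.7.
rng = random.Random(0)
mach_samples = [rng.gauss(0.7, 0.02) for _ in range(200)]

for sweep in (15.0, 21.0, 27.0):
    mu, sigma = robust_fitness(sweep, drag_like_objective, mach_samples)
    print(f"sweep={sweep:5.1f}  mean={mu:8.3f}  std={sigma:7.3f}")
```

In a robust multi-objective setting, the two returned values would be treated as separate objectives to minimise: a low mean indicates high performance, and a low standard deviation indicates low sensitivity to the uncertain conditions.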
Abstract:
Many studies into construction procurement methods reveal evidence of a need to change the culture and attitude in the construction industry, moving from traditional adversarial relationships to cooperative and collaborative relationships. At the same time there is also increasing concern and discussion on alternative procurement methods, involving a movement away from traditional procurement systems. Relational contracting approaches, such as partnering and relationship management, are business strategies that align the objectives of clients, commercial participants and stakeholders. They provide a collaborative environment and a framework for all participants to adapt their behaviour to project objectives and allow for engagement of those subcontractors and suppliers down the supply chain. The efficacy of relationship management in the client and contractor groups is proven and well documented. However, the industry has a history of slow implementation of relational contracting down the supply chain. Furthermore, there exists little research on relationship management conducted in the supply chain context. This research aims to explore the association between relational contracting structures and processes and supply chain sustainability in the civil engineering construction industry. It endeavours to shed light on the practices and prerequisites for relationship management implementation success and for supply sustainability to develop. The research methodology is a triangulated approach based on Cheung’s (2006) earlier research where questionnaire survey, interviews and case studies were conducted. This new research includes a face-to-face questionnaire survey that was carried out with 100 professionals from 27 contracting organisations in Queensland from June 2008 to January 2009. A follow-up survey sub-questionnaire, further examining project participants’ perspectives, was sent to another group of professionals (as identified in the main questionnaire survey). Statistical analyses including multiple regression, correlation, principal component factor analysis and analysis of variance were used to identify the underlying dimensions and test the relationships among variables. Interviews and case studies were conducted to assist in providing a deeper understanding as well as explaining findings of the quantitative study. The qualitative approaches also gave the opportunity to critique and validate the research findings. This research presents the implementation of relationship management from the contractor’s perspective. Findings show that the adoption of the relational contracting approach in the supply chain is limited; contractors still prefer to keep the suppliers and subcontractors at arm’s length. This research shows that the degree of match and mismatch between organisational structuring and organisational process has an impact on staff’s commitment level and performance effectiveness. Key issues affecting performance effectiveness and relationship effectiveness include total influence between parties, access to information, personal acquaintance, communication process, risk identification, timely problem solving and commercial framework. Findings also indicate that alliance and Early Contractor Involvement (ECI) projects achieve higher performance effectiveness at both short-term and long-term levels compared to projects with either no or partial relationship management adopted.
Abstract:
Background: It is predicted that China will have the largest number of cases of dementia in the world by 2025 (Ferri et al., 2005). Research has demonstrated that caring for family members with dementia can be a long-term, burdensome activity resulting in physical and emotional distress and impairment (Pinquart & Sorensen, 2003b). The establishment of family caregiver supportive services in China can be considered urgent; and the knowledge of the caregiving experience and related influencing factors is necessary to inform such services. Nevertheless, in the context of rapid demographic and socioeconomic change, the impact of caregiving for rural and urban Chinese adult-child caregivers may be different, and different needs in supportive services may therefore be expected. Objectives: The aims of this research were 1) to examine the potential differences existing in the caregiving experience between rural and urban adult-child caregivers caring for parents with dementia in China; and 2) to examine the potential differences existing in the influencing factors of the caregiving experience for rural as compared with urban adult-child caregivers caring for parents with dementia in China. Based on the literature review and Kramer’s (1997) caregiver adaptation model, six concepts and their relationships of caregiving experience were studied: severity of the care receivers’ dementia, caregivers’ appraisal of role strain and role gain, negative and positive well-being outcomes, and health related quality of life. Furthermore, four influencing factors (i.e., filial piety, social support, resilience, and personal mastery) were studied respectively. Methods: A cross-sectional, comparative design was used to achieve the aims of the study.
A questionnaire, which was designed based on the literature review and on Kramer’s (1997) caregiver adaptation model, was completed by 401 adult-child caregivers caring for their parents with dementia from the mental health outpatient departments in five hospitals in the Yunnan province, P.R. China. Structural equation modelling (SEM) was employed as the main statistical technique for data analyses. Other statistical techniques (e.g., t-tests and Chi-Square tests) were also conducted to compare the demographic characteristics and the measured variables between rural and urban groups. Results: For the first research aim, the results indicated that urban adult-child caregivers in China experienced significantly greater strain and negative well-being outcomes than their rural peers; whereas, the difference on the appraisal of role gain and positive outcomes was nonsignificant between the two groups. The results also indicated that the amounts of severity of care receivers’ dementia and caregivers’ health related quality of life do not have the same meanings between the two groups. Thus, the levels of these two concepts were not comparable between the rural and urban groups in this study. Moreover, the results also demonstrated that the negative direct effect of gain on negative outcomes in urban caregivers was stronger than that in rural caregivers, suggesting that the urban caregivers tended to use appraisal of role gain to protect themselves from negative well-being outcomes to a greater extent. In addition, the unexplained variance in strain in the urban group was significantly more than that in the rural group, suggesting that there were other unmeasured variables besides the severity of care receivers’ dementia which would predict strain in urban caregivers compared with their rural peers.
For the second research aim, the results demonstrated that rural adult-child caregivers reported a significantly higher level of filial piety and more social support than their urban counterparts, although the two groups did not significantly differ on the levels of their resilience and personal mastery. Furthermore, although the mediation effects of these four influencing factors on both positive and negative aspects remained constant across rural and urban adult-child caregivers, urban caregivers tended to be more effective in using personal mastery to protect themselves from role strain than rural caregivers, which in turn protects them more from the negative well-being outcomes than was the case with their rural peers. Conclusions: The study extends the application of Kramer’s caregiving adaptation process model (Kramer, 1997) to a sample of adult-child caregivers in China by demonstrating that both positive and negative aspects of caregiving may impact on the caregiver’s health related quality of life, suggesting that both aspects should be targeted in supportive interventions for Chinese family caregivers. Moreover, by demonstrating partial mediation effects, the study provides four influencing factors (i.e., filial piety, social support, resilience, and personal mastery) as specific targets for clinical interventions. Furthermore, the study found evidence that urban adult-child caregivers had more negative but similar positive experience compared to their rural peers, suggesting that the establishment of supportive services for urban caregivers may be more urgent at the present stage in China. Additionally, since urban caregivers tended to use appraisal of role gain and personal mastery to protect themselves from negative well-being outcomes to a greater extent than rural caregivers, interventions targeting utility of gain and/or personal mastery to decrease negative outcomes might be more effective in urban caregivers than in rural caregivers.
On the other hand, as cultural expectations and expression of filial piety tend to be more traditional in rural areas, interventions targeting filial piety could be more effective among rural caregivers. Last but not least, as rural adult-child caregivers have more existing natural social support than their urban counterparts, mobilising existing natural social support resources may be more beneficial for rural caregivers, whereas, formal supports (e.g., counselling services, support groups and adult day care centres) should be enhanced for urban caregivers.
Abstract:
The research objectives of this thesis were to contribute to Bayesian statistical methodology by contributing to risk assessment statistical methodology, and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations.
This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals, for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models were used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model, the applied contribution the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals.
These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term by term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
Abstract:
Twin studies offer the opportunity to determine the relative contribution of genes versus environment in traits of interest. Here, we investigate the extent to which variance in brain structure is reduced in monozygous twins with identical genetic make-up. We investigate whether using twins as compared to a control population reduces variability in a number of common magnetic resonance (MR) structural measures, and we investigate the location of areas under major genetic influences. This is fundamental to understanding the benefit of using twins in studies where structure is the phenotype of interest. Twenty-three pairs of healthy MZ twins were compared to matched control pairs. Volume, T2 and diffusion MR imaging were performed as well as spectroscopy (MRS). Images were compared using (i) global measures of standard deviation and effect size, (ii) voxel-based analysis of similarity and (iii) intra-pair correlation. Global measures indicated a consistent increase in structural similarity in twins. The voxel-based and correlation analyses indicated a widespread pattern of increased similarity in twin pairs, particularly in frontal and temporal regions. The areas of increased similarity were most widespread for the diffusion trace and least widespread for T2. MRS showed consistent reduction in metabolite variation that was significant in the temporal lobe N-acetylaspartate (NAA). This study has shown the distribution and magnitude of reduced variability in brain volume, diffusion, T2 and metabolites in twins. The data suggest that evaluation of twins discordant for disease is indeed a valid way to attribute genetic or environmental influences to observed abnormalities in patients since evidence is provided for the underlying assumption of decreased variability in twins.
Abstract:
In this paper, spatially offset Raman spectroscopy (SORS) is demonstrated for non-invasively investigating the composition of drug mixtures inside an opaque plastic container. The mixtures consisted of three components including a target drug (acetaminophen or phenylephrine hydrochloride) and two diluents (glucose and caffeine). The target drug concentrations ranged from 5% to 100%. After conducting SORS analysis to ascertain the Raman spectra of the concealed mixtures, principal component analysis (PCA) was performed on the SORS spectra to reveal trends within the data. Partial least squares (PLS) regression was used to construct models that predicted the concentration of each target drug, in the presence of the other two diluents. The PLS models were able to predict the concentration of acetaminophen in the validation samples with a root-mean-square error of prediction (RMSEP) of 3.8% and the concentration of phenylephrine hydrochloride with an RMSEP of 4.6%. This work demonstrates the potential of SORS, used in conjunction with multivariate statistical techniques, to perform non-invasive, quantitative analysis on mixtures inside opaque containers. This has applications for pharmaceutical analysis, such as monitoring the degradation of pharmaceutical products on the shelf, in forensic investigations of counterfeit drugs, and for the analysis of illicit drug mixtures which may contain multiple components.
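The root-mean-square error of prediction quoted above is the square root of the mean squared difference between reference and predicted concentrations over the validation samples. A minimal sketch follows; the function name and the example concentrations are hypothetical and are not taken from the study.

```python
from math import sqrt


def rmsep(reference, predicted):
    """Root-mean-square error of prediction over paired validation samples."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Made-up validation concentrations in percent, for illustration only.
reference_pct = [10.0, 20.0, 30.0]
predicted_pct = [12.0, 18.0, 30.0]
print(round(rmsep(reference_pct, predicted_pct), 3))
```

Because the errors are squared before averaging, a few badly predicted samples raise the RMSEP more than many slightly mispredicted ones.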
Abstract:
Although the design-build (DB) system has been demonstrated to be an effective delivery method and has gained popularity worldwide, it has not gained the same popularity in the construction market of China. The objective of this study was, therefore, to investigate the barriers to entry in the DB market. A total of 22 entry barriers were first identified through an open-ended questionnaire survey with 15 top construction professionals in the construction market of China. A broad questionnaire survey was further conducted to prioritize these entry barriers. Statistical analysis of responses shows that the most dominant barriers to entry into the DB market are lack of design expertise, lack of interest from owners, lack of suitable organization structure, lack of DB specialists, and lack of a credit record system. Analysis of variance indicates that there is no difference of opinion among the respondent groups of academia, government departments, state-owned companies, and private companies, at the 5% significance level, on most of the barriers to entry. Finally, the underlying dimensions of barriers to entry in the DB market were investigated through factor analysis. The results indicate that there are six major underlying dimensions of entry barriers in the DB market, namely the competence of design-builders, difficulty in project procurement, characteristics of DB projects, lack of support from public sectors, the competence of DB owners, and the immaturity of the DB market. These findings are useful for both potential and incumbent design-builders to understand and analyze the DB market in China.
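The analysis of variance mentioned above compares the spread between group means with the spread inside each group. The sketch below computes only the one-way F statistic and its degrees of freedom; the ratings are invented, the function name is hypothetical, and the comparison against a critical value at the 5% level is left out because it needs the F distribution.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual ratings around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented ratings of one barrier by three respondent groups.
ratings = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova_f(ratings))
```

A small F value means the group means differ by little relative to the scatter within groups, which is the situation the abstract reports for most barriers.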
Abstract:
Purpose. To create a binocular statistical eye model based on previously measured ocular biometric data. Methods. Thirty-nine parameters were determined for a group of 127 healthy subjects (37 male, 90 female; 96.8% Caucasian) with an average age of 39.9 ± 12.2 years and spherical equivalent refraction of −0.98 ± 1.77 D. These parameters described the biometry of both eyes and the subjects' age. Missing parameters were complemented by data from a previously published study. After confirmation of the Gaussian shape of their distributions, these parameters were used to calculate their mean and covariance matrices. These matrices were then used to calculate a multivariate Gaussian distribution. From this, any amount of random biometric data could be generated, which were then randomly selected to create a realistic population of random eyes. Results. All parameters had Gaussian distributions, with the exception of the parameters that describe total refraction (i.e., three parameters per eye). After these non-Gaussian parameters were omitted from the model, the generated data were found to be statistically indistinguishable from the original data for the remaining 33 parameters (TOST [two one-sided t tests]; P < 0.01). Parameters derived from the generated data were also not significantly different from those calculated with the original data (P > 0.05). The only exception to this was the lens refractive index, for which the generated data had a significantly larger SD. Conclusions. A statistical eye model can describe the biometric variations found in a population and is a useful addition to the classic eye models.
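Generating synthetic data from a mean vector and a covariance matrix, as described in the Methods, can be sketched with a Cholesky factorisation: independent standard normal draws are multiplied by the lower-triangular factor and shifted by the mean. The two parameters, their means and their covariance below are illustrative assumptions, not values from the study, and the function names are hypothetical.

```python
import random
from math import sqrt


def cholesky(cov):
    """Lower-triangular L with L * L^T = cov (cov must be positive definite)."""
    n = len(cov)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = sqrt(cov[i][i] - partial)
            else:
                lower[i][j] = (cov[i][j] - partial) / lower[j][j]
    return lower


def sample_multivariate_gaussian(mean_vec, cov, n, seed=0):
    """Draw n correlated samples: mean + L z, with z standard normal."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    draws = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in mean_vec]
        draws.append([
            m + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean_vec)
        ])
    return draws


# Assumed toy parameters: axial length and corneal radius, both in mm.
mean_vec = [23.7, 7.8]
cov = [[0.81, 0.09], [0.09, 0.0625]]
synthetic_eyes = sample_multivariate_gaussian(mean_vec, cov, 5)
for eye in synthetic_eyes:
    print([round(v, 2) for v in eye])
```

Sampling all parameters jointly keeps their correlations, which is what separates this approach from drawing each parameter independently from its own distribution.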
Abstract:
Concerns regarding groundwater contamination with nitrate and the long-term sustainability of groundwater resources have prompted the development of a multi-layered three-dimensional (3D) geological model to characterise the aquifer geometry of the Wairau Plain, Marlborough District, New Zealand. The 3D geological model, which consists of eight litho-stratigraphic units, has been subsequently used to synthesise hydrogeological and hydrogeochemical data for different aquifers in an approach that aims to demonstrate how integration of water chemistry data within the physical framework of a 3D geological model can help to better understand and conceptualise groundwater systems in complex geological settings. Multivariate statistical techniques (e.g. Principal Component Analysis and Hierarchical Cluster Analysis) were applied to groundwater chemistry data to identify hydrochemical facies which are characteristic of distinct evolutionary pathways and a common hydrologic history of groundwaters. Principal Component Analysis on hydrochemical data demonstrated that natural water-rock interactions, redox potential and human agricultural impact are the key controls of groundwater quality in the Wairau Plain. Hierarchical Cluster Analysis revealed distinct hydrochemical water quality groups in the Wairau Plain groundwater system. Visualisation of the results of the multivariate statistical analyses and distribution of groundwater nitrate concentrations in the context of aquifer lithology highlighted the link between groundwater chemistry and the lithology of host aquifers. The methodology followed in this study can be applied in a variety of hydrogeological settings to synthesise geological, hydrogeological and hydrochemical data and present them in a format readily understood by a wide range of stakeholders. This enables a more efficient communication of the results of scientific studies to the wider community.