167 results for Statistical Foundations
Abstract:
A classical condition for fast learning rates is the margin condition, first introduced by Mammen and Tsybakov. We tackle in this paper the problem of adaptivity to this condition in the context of model selection, in a general learning framework. Actually, we consider a weaker version of this condition that allows one to take into account that learning within a small model can be much easier than within a large one. Requiring this “strong margin adaptivity” makes the model selection problem more challenging. We first prove, in a general framework, that some penalization procedures (including local Rademacher complexities) exhibit this adaptivity when the models are nested. Contrary to previous results, this holds with penalties that only depend on the data. Our second main result is that strong margin adaptivity is not always possible when the models are not nested: for every model selection procedure (even a randomized one), there is a problem for which it does not demonstrate strong margin adaptivity.
An approach to statistical lip modelling for speaker identification via chromatic feature extraction
Abstract:
This paper presents a novel technique for tracking moving lips for the purpose of speaker identification. In our system, a model of the lip contour is formed directly from chromatic information in the lip region. Iterative refinement of contour point estimates is not required. Colour features are extracted from the lips via concatenated profiles taken around the lip contour. The dimensionality of the lip features is reduced via principal component analysis (PCA) followed by linear discriminant analysis (LDA). Statistical speaker models are built from the lip features based on the Gaussian mixture model (GMM). Identification experiments performed on the M2VTS1 database show encouraging results.
Abstract:
Multivariate volatility forecasts are an important input in many financial applications, in particular portfolio optimisation problems. Given the number of models available and the range of loss functions to discriminate between them, selecting the optimal forecasting model is challenging. The aim of this thesis is to thoroughly investigate how effective many commonly used statistical (MSE and QLIKE) and economic (portfolio variance and portfolio utility) loss functions are at discriminating between competing multivariate volatility forecasts. An analytical investigation of the loss functions is performed to determine whether they identify the correct forecast as the best forecast. This is followed by an extensive simulation study that examines the ability of the loss functions to consistently rank forecasts, and their statistical power within tests of predictive ability. For the tests of predictive ability, the model confidence set (MCS) approach of Hansen, Lunde and Nason (2003, 2011) is employed. In addition, an empirical study investigates whether the simulation findings hold in a realistic setting. In light of these earlier studies, a major empirical study seeks to identify the set of superior multivariate volatility forecasting models from 43 models that use either daily squared returns or realised volatility to generate forecasts. This study also assesses how the choice of volatility proxy affects the ability of the statistical loss functions to discriminate between forecasts. Analysis of the loss functions shows that QLIKE, MSE and portfolio variance can discriminate between multivariate volatility forecasts, while portfolio utility cannot. An examination of the effective loss functions shows that they can all identify the correct forecast at a point in time; however, their ability to discriminate between competing forecasts does vary.
That is, QLIKE is identified as the most effective loss function, followed by portfolio variance which is then followed by MSE. The major empirical analysis reports that the optimal set of multivariate volatility forecasting models includes forecasts generated from daily squared returns and realised volatility. Furthermore, it finds that the volatility proxy affects the statistical loss functions’ ability to discriminate between forecasts in tests of predictive ability. These findings deepen our understanding of how to choose between competing multivariate volatility forecasts.
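As a rough illustration of the loss functions named in this abstract, the sketch below scores a covariance forecast against a volatility proxy for two assets. The exact definitions used in the thesis are not given here, so the forms chosen are assumptions: MSE as the squared Frobenius distance, QLIKE as log|H| + tr(H^-1 S), and the portfolio variance loss as the proxy variance of the global minimum variance portfolio built from the forecast. All function names and numbers are hypothetical.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix (assumes a non-zero determinant)."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def mse_loss(forecast, proxy):
    """Assumed form: squared Frobenius distance between forecast and proxy."""
    return sum((forecast[i][j] - proxy[i][j]) ** 2
               for i in range(2) for j in range(2))

def qlike_loss(forecast, proxy):
    """Assumed form: log|H| + trace(H^-1 S), H the forecast, S the proxy."""
    h_inv = inv2(forecast)
    trace = sum(h_inv[i][k] * proxy[k][i] for i in range(2) for k in range(2))
    return math.log(det2(forecast)) + trace

def portfolio_variance_loss(forecast, proxy):
    """Proxy variance of the minimum variance portfolio implied by the forecast."""
    h_inv = inv2(forecast)
    raw = [h_inv[i][0] + h_inv[i][1] for i in range(2)]  # H^-1 times a vector of ones
    total = sum(raw)
    w = [r / total for r in raw]                          # weights sum to one
    return sum(w[i] * proxy[i][j] * w[j] for i in range(2) for j in range(2))

# Hypothetical proxy and two competing forecasts.
proxy = [[1.0, 0.3], [0.3, 2.0]]
good_forecast = [[1.0, 0.3], [0.3, 2.0]]   # equal to the proxy
poor_forecast = [[2.0, 0.0], [0.0, 1.0]]   # variances swapped, no correlation

for name, loss in [("MSE", mse_loss), ("QLIKE", qlike_loss),
                   ("portfolio variance", portfolio_variance_loss)]:
    print(name, loss(good_forecast, proxy), loss(poor_forecast, proxy))
```

Under these assumed definitions, all three losses are smaller for the forecast that matches the proxy, which is the sense in which a loss function "identifies the correct forecast".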
Abstract:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals’ seldom studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it has become habitual and involves minimal intention or decision making. The key variables investigated are the activity initialised timestamp and cell tower location, as well as the activity type and usage quantity (e.g., voice call with duration in seconds); the research focuses on customers’ spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs), which are fitted with the use of the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals’ highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM, which corresponds to how each of them uses the products/services spatially in their daily lives; this is essentially their likely lifestyle and occupational traits.
Other significant research contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high dimensional data based on the use of VB-GMM.
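To illustrate why temporal usage data is treated as circular, the sketch below maps hours of the day onto the unit circle and averages them there. It is not the VB-GMM method of the thesis, only a minimal example of the circular-data idea; the function names and call times are hypothetical.

```python
import math

def hour_to_angle(hour):
    """Map an hour of day in [0, 24) to an angle in radians on the unit circle."""
    return 2.0 * math.pi * (hour % 24) / 24.0

def circular_mean_hour(hours):
    """Average hours of day on the circle, returning an hour in [0, 24)."""
    s = sum(math.sin(hour_to_angle(h)) for h in hours)
    c = sum(math.cos(hour_to_angle(h)) for h in hours)
    angle = math.atan2(s, c) % (2.0 * math.pi)
    return angle * 24.0 / (2.0 * math.pi)

def circular_distance_hours(a, b):
    """Shortest distance between two hours of day, going around the clock."""
    diff = abs(a - b) % 24
    return min(diff, 24 - diff)

# Hypothetical call times just before and just after midnight.
late_calls = [23.0, 1.0]
naive_mean = sum(late_calls) / len(late_calls)   # arithmetic mean lands at midday
circ_mean = circular_mean_hour(late_calls)       # circular mean lands at midnight
print(naive_mean, circular_distance_hours(circ_mean, 0.0))
```

The arithmetic mean places two late-night calls at midday, while the circular mean places them at midnight, which is why a model for time-of-day behaviour needs to respect the wrap-around.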
Abstract:
It is important to promote a sustainable development approach to ensure that economic, environmental and social developments are maintained in balance. Sustainable development and its implications are not just a global concern; they also affect Australia. In particular, rural Australian communities are facing various economic, environmental and social challenges. Thus, the need for sustainable development in rural regions is becoming increasingly important. To promote sustainable development, proper frameworks, along with associated tools optimised for the specific regions, need to be developed. This will ensure that decisions made for sustainable development are based on evidence rather than subjective opinion. To address these issues, Queensland University of Technology (QUT), through an Australian Research Council (ARC) linkage grant, has initiated research into the development of a Rural Statistical Sustainability Framework (RSSF) to aid sustainable decision making in rural Queensland. This particular branch of the research developed a decision support tool that will become the integrating component of the RSSF. The tool is developed on a web-based platform to allow easy dissemination and quick maintenance and to minimise compatibility issues. It is based on MapGuide Open Source and follows a three-tier architecture: Client tier, Web tier and Server tier. The developed tool is interactive and behaves similarly to a familiar desktop-based application. It can handle and display vector-based spatial data and can give further visual outputs using charts and tables. The data used in this tool were obtained from the QUT research team. Overall, the tool implements four tasks to help in the decision-making process: Locality Classification, Trend Display, Impact Assessment, and Data Entry and Update.
The developed tool uses open source and freely available software and allows for easy extensibility and long-term sustainability.
Abstract:
Purpose – In the 21st century, as knowledge, technology and education are widely accepted to play key roles in local economic development, the importance of making space and place for knowledge production is on the rise, resulting in many city administrations and urban policy-makers worldwide restructuring their cities to become highly competitive and creative. Consequently, this has led to a new type of city form, the knowledge city, and a new approach to its development, knowledge-based urban development. In this context, knowledge-based foundations of universities are regarded as one of the key elements for knowledge-based urban development and knowledge city formation due to their ability to provide a strong platform for knowledge generation, marketing and transfer. This paper aims to investigate the role and importance of universities and their knowledge-based foundations in the context of developing countries, particularly Malaysia, in building prosperous knowledge cities in the era of the knowledge economy. Design/Methodology/Approach – The main methodological techniques employed in this research include: a thorough review of the literature on the role of universities in the spatial and socio-economic development of cities; a best practice analysis and policy review of urban and regional development policies that aim to use university clusters to leverage knowledge-based development; and a case study in Malaysia with a review of various policy documents and strategic plans of the local universities and local and state authorities, interviews with key actors, and a trend analysis of local socio-economic and spatial changes. Originality/Value – This paper reports the findings of pioneering research examining the role and impact of universities and their knowledge-based foundations, in the context of Malaysia, in building knowledge cities in the era of the knowledge economy.
By undertaking a case study investigation in Bandar Seri Iskandar, a newly emerging Malaysian knowledge city located in Perak, Malaysia, the paper sheds light on an important issue of the 21st century: how universities contribute to the knowledge-based development of cities. Practical Implications – Universities with their rich knowledge-based foundations are increasingly being recognised as knowledge hubs, exercising a strong influence on the intellectual vitality of the city where they are embedded. This paper reveals that universities, in joint action with business and society at large, are necessary prerequisites for constructing and maintaining knowledge societies and, therefore, building prosperous knowledge cities. In light of the literature and case findings, the paper sheds light on the contribution of knowledge-based foundations of universities in knowledge city formation and provides generic recommendations for cities and regions seeking knowledge city transformation.
Abstract:
The research objectives of this thesis were to contribute to Bayesian statistical methodology by contributing to risk assessment statistical methodology, and to spatial and spatio-temporal methodology, by modelling error structures using complex hierarchical models. Specifically, I hoped to consider two applied areas, and use these applications as a springboard for developing new statistical methods as well as undertaking analyses which might give answers to particular applied questions. Thus, this thesis considers a series of models, firstly in the context of risk assessments for recycled water, and secondly in the context of water usage by crops. The research objective was to model error structures using hierarchical models in two problems, namely risk assessment analyses for wastewater, and secondly, in a four-dimensional dataset, assessing differences between cropping systems over time and over three spatial dimensions. The aim was to use the simplicity and insight afforded by Bayesian networks to develop appropriate models for risk scenarios, and again to use Bayesian hierarchical models to explore the necessarily complex modelling of four-dimensional agricultural data. The specific objectives of the research were to develop a method for the calculation of credible intervals for the point estimates of Bayesian networks; to develop a model structure to incorporate all the experimental uncertainty associated with various constants, thereby allowing the calculation of more credible credible intervals for a risk assessment; to model a single day’s data from the agricultural dataset which satisfactorily captured the complexities of the data; to build a model for several days’ data, in order to consider how the full data might be modelled; and finally to build a model for the full four-dimensional dataset and to consider the time-varying nature of the contrast of interest, having satisfactorily accounted for possible spatial and temporal autocorrelations.
This work forms five papers, two of which have been published, with two submitted, and the final paper still in draft. The first two objectives were met by recasting the risk assessments as directed acyclic graphs (DAGs). In the first case, we elicited uncertainty for the conditional probabilities needed by the Bayesian net, incorporated these into a corresponding DAG, and used Markov chain Monte Carlo (MCMC) to find credible intervals for all the scenarios and outcomes of interest. In the second case, we incorporated the experimental data underlying the risk assessment constants into the DAG, and also treated some of that data as needing to be modelled as an ‘errors-in-variables’ problem [Fuller, 1987]. This illustrated a simple method for the incorporation of experimental error into risk assessments. In considering one day of the three-dimensional agricultural data, it became clear that geostatistical models or conditional autoregressive (CAR) models over the three dimensions were not the best way to approach the data. Instead, CAR models were used with neighbours only in the same depth layer. This gave flexibility to the model, allowing both the spatially structured and non-structured variances to differ at all depths. We call this model the CAR layered model. Given the experimental design, the fixed part of the model could have been modelled as a set of means by treatment and by depth, but doing so allows little insight into how the treatment effects vary with depth. Hence, a number of essentially non-parametric approaches were taken to see the effects of depth on treatment, with the model of choice incorporating an errors-in-variables approach for depth in addition to a non-parametric smooth. The statistical contribution here was the introduction of the CAR layered model; the applied contribution was the analysis of moisture over depth and estimation of the contrast of interest together with its credible intervals.
These models were fitted using WinBUGS [Lunn et al., 2000]. The work in the fifth paper deals with the fact that with large datasets, the use of WinBUGS becomes more problematic because of its highly correlated term-by-term updating. In this work, we introduce a Gibbs sampler with block updating for the CAR layered model. The Gibbs sampler was implemented by Chris Strickland using pyMCMC [Strickland, 2010]. This framework is then used to consider five days’ data, and we show that moisture in the soil for all the various treatments reaches levels particular to each treatment at a depth of 200 cm and thereafter stays constant, albeit with increasing variances with depth. In an analysis across three spatial dimensions and across time, there are many interactions of time and the spatial dimensions to be considered. Hence, we chose to use a daily model and to repeat the analysis at all time points, effectively creating an interaction model of time by the daily model. Such an approach allows great flexibility. However, this approach does not allow insight into the way in which the parameter of interest varies over time. Hence, a two-stage approach was also used, with estimates from the first stage being analysed as a set of time series. We see this spatio-temporal interaction model as being a useful approach to data measured across three spatial dimensions and time, since it does not assume additivity of the random spatial or temporal effects.
Abstract:
From the viewpoint of fundraisers, life for the roughly 11,000 grant-making foundations in the UK may appear carefree. Grant-making foundations 'merely' have to dispense funds rather than raise them, and surely spending money has to be easier than getting it? So, with around £2 billion to spend each year largely as they choose, what could possibly keep foundations awake at night? Before attempting to answer that question, it is important to distinguish between different types of grant-makers. To fundraisers, grant-making foundations may appear to be very much alike, but not all foundations are grant-makers - some operate their own programmes - and not all grant-makers are endowed foundations.
Abstract:
The advocacy for inquiry-based learning in contemporary curricula assumes the principle that students learn in their own way by drawing on direct experience fostered by the teacher. That students should be able to discover answers themselves through active engagement with new experiences was central to the thinking of eminent educators such as Pestalozzi, Dewey and Montessori. However, even after many years of research and practice, inquiry learning as a referent for teaching still struggles to find expression in the average teacher's pedagogy. This study drew on interview data from 20 primary teachers. A phenomenographic analysis revealed three conceptions of teaching that support inquiry learning in science in the primary years of schooling: (a) the Experience-centred conception, where teachers focused on providing interesting sensory experiences to students; (b) the Problem-centred conception, where teachers focused on challenging students with engaging problems; and (c) the Question-centred conception, where teachers focused on helping students to ask and answer their own questions. Understanding teachers' conceptions of teaching has implications for both the enactment of inquiry teaching in the classroom and the uptake of new teaching behaviours during professional development, with enhanced outcomes for engaging students in STEM.
Abstract:
In this paper, spatially offset Raman spectroscopy (SORS) is demonstrated for non-invasively investigating the composition of drug mixtures inside an opaque plastic container. The mixtures consisted of three components including a target drug (acetaminophen or phenylephrine hydrochloride) and two diluents (glucose and caffeine). The target drug concentrations ranged from 5% to 100%. After conducting SORS analysis to ascertain the Raman spectra of the concealed mixtures, principal component analysis (PCA) was performed on the SORS spectra to reveal trends within the data. Partial least squares (PLS) regression was used to construct models that predicted the concentration of each target drug, in the presence of the other two diluents. The PLS models were able to predict the concentration of acetaminophen in the validation samples with a root-mean-square error of prediction (RMSEP) of 3.8% and the concentration of phenylephrine hydrochloride with an RMSEP of 4.6%. This work demonstrates the potential of SORS, used in conjunction with multivariate statistical techniques, to perform non-invasive, quantitative analysis on mixtures inside opaque containers. This has applications for pharmaceutical analysis, such as monitoring the degradation of pharmaceutical products on the shelf, in forensic investigations of counterfeit drugs, and for the analysis of illicit drug mixtures which may contain multiple components.
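The RMSEP figures quoted in this abstract follow the usual root-mean-square error of prediction formula, the square root of the mean squared difference between predicted and reference concentrations over the validation samples. A minimal sketch is given below; the function name and the concentration values are hypothetical and are not taken from the paper.

```python
import math

def rmsep(predicted, reference):
    """Root-mean-square error of prediction over paired validation samples."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical validation set: target drug concentration in percent.
reference_pct = [5.0, 25.0, 50.0, 75.0, 100.0]
predicted_pct = [8.0, 22.0, 54.0, 71.0, 97.0]
print(rmsep(predicted_pct, reference_pct))
```

Because the error is expressed in the same units as the concentration, an RMSEP of a few percent can be read directly as the typical size of a prediction error on unseen samples.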
Abstract:
Philanthropic foundations in Australia have traditionally been labelled ‘icebergs’. Much of what they do and who they are is not apparent on the surface. Many are unknown and apart from an occasional biography, almost all are sparsely documented in terms of the very personal decisions behind establishing them. Practically and academically, scant data exist on the decision journeys people make into formalised philanthropy. This study seeks to fill that gap. It is believed to be the largest such study of foundation decision-making ever undertaken in this country. It is the latest in a series of ACPNS research into types of considered (versus spontaneous) giving in Australia. This research has been supported by the Perpetual Foundation, the EF and SL Gluyas Trust and the Edward Corbould Charitable Trust under the management of Perpetual Trustee Company Ltd.
Abstract:
Purpose. To create a binocular statistical eye model based on previously measured ocular biometric data. Methods. Thirty-nine parameters were determined for a group of 127 healthy subjects (37 male, 90 female; 96.8% Caucasian) with an average age of 39.9 ± 12.2 years and spherical equivalent refraction of −0.98 ± 1.77 D. These parameters described the biometry of both eyes and the subjects' age. Missing parameters were complemented by data from a previously published study. After confirmation of the Gaussian shape of their distributions, these parameters were used to calculate their mean and covariance matrices. These matrices were then used to calculate a multivariate Gaussian distribution. From this, an amount of random biometric data could be generated, which were then randomly selected to create a realistic population of random eyes. Results. All parameters had Gaussian distributions, with the exception of the parameters that describe total refraction (i.e., three parameters per eye). After these non-Gaussian parameters were omitted from the model, the generated data were found to be statistically indistinguishable from the original data for the remaining 33 parameters (TOST [two one-sided t tests]; P < 0.01). Parameters derived from the generated data were also significantly indistinguishable from those calculated with the original data (P > 0.05). The only exception to this was the lens refractive index, for which the generated data had a significantly larger SD. Conclusions. A statistical eye model can describe the biometric variations found in a population and is a useful addition to the classic eye models.
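The generation step described in the Methods (estimate a mean vector and covariance matrix, then draw random eyes from the resulting multivariate Gaussian) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses only two made-up parameters instead of the model's full set, and draws samples through a Cholesky factor of the covariance matrix, which is one standard way to do it. All names and values are hypothetical.

```python
import math
import random

def mean_vector(rows):
    """Column-wise mean of a list of equally long measurement rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of the measurement rows."""
    n = len(rows)
    mu = mean_vector(rows)
    d = len(mu)
    return [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]

def cholesky(a):
    """Lower-triangular L with L L^T = a (a must be positive definite)."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def sample_mvn(mu, cov, n, rng):
    """Draw n vectors from N(mu, cov) as mu + L z, with z standard normal."""
    low = cholesky(cov)
    d = len(mu)
    samples = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        samples.append([mu[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                        for i in range(d)])
    return samples

# Made-up measurements for two illustrative parameters per subject.
measured = [[23.1, 7.7], [24.0, 7.9], [23.6, 7.6], [22.8, 7.8], [24.4, 8.0]]
mu = mean_vector(measured)
cov = covariance_matrix(measured)
synthetic = sample_mvn(mu, cov, 1000, random.Random(0))
print(mu, mean_vector(synthetic))
```

With enough draws, the mean and covariance of the synthetic population approach those estimated from the measured data, which is the property the abstract's equivalence tests examine.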
Abstract:
A key issue in the economic development and performance of organizations is the existence of standards. Their definition and control are sources of power and it is important to understand their concept, as it gives standards their direction and their legitimacy, and to explore how they are represented and applied. The difficulties posed by classical micro-economics in establishing a theory of standardization that is compatible with its fundamental axiomatic are acknowledged. We propose to reconsider the problem by taking the opposite perspective in questioning its theoretical base and by reformulating assumptions about the independent and autonomous decisions taken by actors. The Theory of Conventions will offer us a theoretical framework and tools enabling us to understand the systemic dimension and dynamic structure of standards. These will be seen as a special case of conventions. This work aims to provide a sound basis and promote a better consciousness in the development of global project management standards. It aims also to emphasize that social construction is not a matter of copyright but a matter of open minds, collective cognitive process and freedom for the common wealth.