984 results for Mathematical statistics.
Abstract:
A classical condition for fast learning rates is the margin condition, first introduced by Mammen and Tsybakov. We tackle in this paper the problem of adaptivity to this condition in the context of model selection, in a general learning framework. Actually, we consider a weaker version of this condition that allows one to take into account that learning within a small model can be much easier than within a large one. Requiring this “strong margin adaptivity” makes the model selection problem more challenging. We first prove, in a general framework, that some penalization procedures (including local Rademacher complexities) exhibit this adaptivity when the models are nested. Contrary to previous results, this holds with penalties that only depend on the data. Our second main result is that strong margin adaptivity is not always possible when the models are not nested: for every model selection procedure (even a randomized one), there is a problem for which it does not demonstrate strong margin adaptivity.
Abstract:
The measurement error model is a well-established statistical method for regression problems in the medical sciences, although it is rarely used in ecological studies. While the situations in which it is appropriate may be less common in ecology, there are instances in which its use may benefit prediction and the estimation of parameters of interest. We have chosen to explore this topic using a conditional independence model in a Bayesian framework with a Gibbs sampler, as this gives a great deal of flexibility, allowing us to analyse a number of different models without losing generality. Using simulations and two examples, we show how the conditional independence model can be used in ecology, and when it is appropriate.
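As a rough illustration of why measurement error matters in regression, the sketch below simulates the classical attenuation effect: regressing a response on an error-prone covariate shrinks the estimated slope. This is not the Bayesian conditional independence model or the Gibbs sampler from the abstract; the function names, sample size, and variances are assumptions chosen only for the illustration.

```python
import random

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (model with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n=20000, beta=2.0, sd_x=1.0, sd_u=1.0, sd_e=0.5, seed=0):
    """Return (slope using the true x, slope using the noisy w = x + u).

    All parameter values are hypothetical and not taken from the paper.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sd_x) for _ in range(n)]          # true covariate
    w = [xi + rng.gauss(0.0, sd_u) for xi in x]           # observed with error
    y = [beta * xi + rng.gauss(0.0, sd_e) for xi in x]    # response
    return ols_slope(x, y), ols_slope(w, y)

slope_true, slope_naive = simulate()
# Classical attenuation factor: var(x) / (var(x) + var(u)) = 0.5 here,
# so the naive slope should sit near beta * 0.5 rather than near beta.
print(slope_true, slope_naive)
```

A measurement error model aims to recover the slope on the true covariate rather than the attenuated one.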
Abstract:
In the multi-view approach to semi-supervised learning, we choose one predictor from each of multiple hypothesis classes, and we co-regularize our choices by penalizing disagreement among the predictors on the unlabeled data. We examine the co-regularization method used in the co-regularized least squares (CoRLS) algorithm, in which the views are reproducing kernel Hilbert spaces (RKHSs), and the disagreement penalty is the average squared difference in predictions. The final predictor is the pointwise average of the predictors from each view. We call the set of predictors that can result from this procedure the co-regularized hypothesis class. Our main result is a tight bound on the Rademacher complexity of the co-regularized hypothesis class in terms of the kernel matrices of each RKHS. We find that co-regularization reduces the Rademacher complexity by an amount that depends on the distance between the two views, as measured by a data-dependent metric. We then use standard techniques to bound the gap between training error and test error for the CoRLS algorithm. Experimentally, we find that the amount of reduction in complexity introduced by co-regularization correlates with the amount of improvement that co-regularization gives in the CoRLS algorithm.
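The co-regularization idea described above can be sketched with a toy two-view problem. This is a minimal sketch, assuming each view is a one-parameter linear predictor instead of an RKHS, and assuming one particular form of the objective (squared loss on the averaged predictor, ridge penalties, and the average squared disagreement on unlabeled points). The data, the weights `g1` and `g2`, and all function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def disagreement(w1, w2, c, d):
    """Average squared difference of the two views' predictions on unlabeled data."""
    return mean([(w1 * p - w2 * q) ** 2 for p, q in zip(c, d)])

def objective(w1, w2, a, b, y, c, d, lam, g1, g2):
    """Loss of the averaged predictor + ridge penalties + lam * disagreement."""
    fit = mean([((w1 * p + w2 * q) / 2 - t) ** 2 for p, q, t in zip(a, b, y)])
    return fit + g1 * w1 ** 2 + g2 * w2 ** 2 + lam * disagreement(w1, w2, c, d)

def fit_coreg(a, b, y, c, d, lam, g1=0.1, g2=0.1):
    """Minimise the objective over (w1, w2) by solving the 2x2 normal equations.

    a, b: the two views of the labeled points; y: labels;
    c, d: the two views of the unlabeled points.
    """
    saa = mean([p * p for p in a])
    sbb = mean([q * q for q in b])
    sab = mean([p * q for p, q in zip(a, b)])
    say = mean([p * t for p, t in zip(a, y)])
    sby = mean([q * t for q, t in zip(b, y)])
    scc = mean([p * p for p in c])
    sdd = mean([q * q for q in d])
    scd = mean([p * q for p, q in zip(c, d)])
    # Setting half the gradient to zero gives a symmetric 2x2 linear system.
    m11 = saa / 4 + g1 + lam * scc
    m12 = sab / 4 - lam * scd
    m22 = sbb / 4 + g2 + lam * sdd
    r1, r2 = say / 2, sby / 2
    det = m11 * m22 - m12 * m12
    w1 = (r1 * m22 - m12 * r2) / det   # Cramer's rule
    w2 = (m11 * r2 - m12 * r1) / det
    return w1, w2

# Hypothetical toy data: three labeled and four unlabeled points.
A, B, Y = [1.0, 2.0, 3.0], [2.0, 1.0, 3.0], [1.0, 1.5, 3.0]
C, D = [0.5, 1.5, 2.5, 3.5], [1.0, 1.0, 2.0, 4.0]

for lam in (0.0, 1.0, 10.0):
    w1, w2 = fit_coreg(A, B, Y, C, D, lam)
    print(lam, disagreement(w1, w2, C, D))
```

Increasing `lam` pushes the two views toward agreement on the unlabeled points, which is the mechanism by which co-regularization shrinks the effective hypothesis class.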
Abstract:
We study the problem of allocating stocks to dark pools. We propose and analyze an approach that is optimal when continuous-valued allocations are allowed. We also propose a modification for the case when only integer-valued allocations are possible. We extend the previous work on this problem to adversarial scenarios, while also improving on its results in the i.i.d. setup. The resulting algorithms are efficient, and perform well in simulations under stochastic and adversarial inputs.
Abstract:
The mission of AIMS is to promote Mathematics and Science in Africa and to provide a focal point for university Mathematics training in Africa. It offers scholarships for up to 50 students to come and study for a period of nine months. About 15 of the 50 positions are reserved for women. In the 2006/2007 intake there were over 250 applicants. The students are housed and fed, and their return travel from their home town is fully funded. Lecturers also stay at AIMS and share their meals with the students, so that a rapport quickly develops. The students are away from their families and friends for nine months and are absolutely committed to the discipline of Mathematics. When they first arrive, some of them have little ability in English, but since all tuition is in English they learn quickly. Some find the transition difficult, but they all support one another, and by the end of their time their English skills are very good. The students take a series of subjects that last about three weeks each, consisting of 30 contact hours, as well as a thesis/project. Each course has a number of assignments associated with it, and these are evaluated. AIMS has seven or eight teaching assistants who help with the tutorials, marking, and advice, and who are a vital component of AIMS.
Abstract:
In this study, we consider how fractional differential equations (FDEs) can be used to study travelling wave phenomena in parabolic equations. Applying the approach to highly crowded intracellular environments, we find a simple relationship between the travelling wave speed and the obstacle density.
Abstract:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals’ seldom studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it becomes habitual and involves minimal intention or decision making. Key variables investigated are the activity initialised timestamp and cell tower location as well as the activity type and usage quantity (e.g., voice call with duration in seconds); and the research focuses on customers’ spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs) which are fitted with the use of the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals’ highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM, which corresponds to how each of them uses the products/services spatially in their daily lives; this is essentially their likely lifestyle and occupational traits.
Other significant research contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high-dimensional data based on the use of VB-GMM.
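To see why time-of-day usage is treated as circular data, consider averaging activity times around midnight. The sketch below is only an illustration of circular averaging, not the VB-GMM method of the thesis; the function names and the example hours are assumptions.

```python
import math

def circular_mean_hour(hours):
    """Mean time of day, treating the 24-hour clock as a circle."""
    angles = [2 * math.pi * h / 24 for h in hours]
    s = sum(math.sin(t) for t in angles) / len(angles)
    c = sum(math.cos(t) for t in angles) / len(angles)
    # Direction of the mean resultant vector, mapped back to [0, 24].
    return (math.atan2(s, c) * 24 / (2 * math.pi)) % 24

def circular_distance_hours(h1, h2):
    """Shortest distance between two times of day, in hours."""
    diff = abs(h1 - h2) % 24
    return min(diff, 24 - diff)

late_calls = [23.0, 1.0]                      # hypothetical call times
arithmetic = sum(late_calls) / len(late_calls)
circular = circular_mean_hour(late_calls)
# The arithmetic mean lands at noon, while the circular mean
# lies at (or numerically next to) midnight.
print(arithmetic, circular_distance_hours(circular, 0.0))
```

A mixture model fitted to such data needs component distributions that respect this wrap-around, which an ordinary Gaussian on the raw hour values does not.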
Abstract:
In this paper we construct a mathematical model for the genetic regulatory network of the lactose operon. The model includes the transcription and translation of the lactose permease (LacY) and a reporter gene, GFP. The probability of transcription of LacY is determined by 14 binding states out of all 50 possible binding states of the lactose operon, based on the quasi-steady-state assumption for the binding reactions. For the reporter gene GFP, we calculate the probability of transcription from 5 binding states out of 19 possible binding states, because the binding site O2 is missing for this reporter gene. We have tested different mechanisms for the transport of thio-methylgalactoside (TMG) and the effect of different Hill coefficients on the simulated LacY expression levels. Using this mathematical model, we have reproduced one of the experimental results with different LacY concentrations, which are induced by different concentrations of TMG.
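For readers unfamiliar with Hill coefficients, the generic Hill function can be sketched as follows. The abstract does not say where the Hill term enters the model, so this is only the textbook form; the half-saturation constant `k`, the coefficient `n`, and the concentrations are hypothetical values.

```python
def hill(x, k, n):
    """Generic Hill function: fraction of maximal response at concentration x.

    k is the concentration giving a half-maximal response and n is the
    Hill coefficient; both are illustrative, not values from the paper.
    """
    return x ** n / (k ** n + x ** n)

# A larger Hill coefficient makes the response more switch-like around x = k.
for n in (1, 2, 4):
    print(n, [round(hill(x, 1.0, n), 3) for x in (0.5, 1.0, 2.0)])
```

Below `k` a larger `n` suppresses the response, and above `k` it saturates faster, which is why the choice of coefficient can change simulated expression levels noticeably.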
Abstract:
Statistics of the estimates of tricoherence are obtained analytically for nonlinear harmonic random processes with known true tricoherence. Expressions are presented for the bias, variance, and probability distributions of estimates of tricoherence as functions of the true tricoherence and the number of realizations averaged in the estimates. The expressions are applicable to arbitrary higher order coherence and arbitrary degree of interaction between modes. Theoretical results are compared with those obtained from numerical simulations of nonlinear harmonic random processes. Estimation of true values of tricoherence given observed values is also discussed.
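The dependence of the estimate on the number of realizations can be illustrated numerically. The sketch below assumes one common normalization of squared tricoherence (the squared magnitude of the averaged product X1·X2·X3·conj(X4), divided by the averaged powers of X1·X2·X3 and of X4, where X4 is the Fourier coefficient at the sum frequency). The abstract does not spell out its definition, so this normalization, the unit-amplitude phasor model, and all names are assumptions.

```python
import cmath
import math
import random

def squared_tricoherence(x1, x2, x3, x4):
    """Estimate squared tricoherence from M realizations of four coefficients."""
    m = len(x1)
    triple = [a * b * c for a, b, c in zip(x1, x2, x3)]
    num = abs(sum(t * z.conjugate() for t, z in zip(triple, x4)) / m) ** 2
    den = (sum(abs(t) ** 2 for t in triple) / m) * (sum(abs(z) ** 2 for z in x4) / m)
    return num / den

def random_phasor(rng):
    """Unit-amplitude complex number with a uniformly random phase."""
    return cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))

def simulate(m, coupled, seed=0):
    """Estimate for m realizations; coupled=True locks the fourth phase."""
    rng = random.Random(seed)
    x1 = [random_phasor(rng) for _ in range(m)]
    x2 = [random_phasor(rng) for _ in range(m)]
    x3 = [random_phasor(rng) for _ in range(m)]
    if coupled:
        x4 = [a * b * c for a, b, c in zip(x1, x2, x3)]   # full phase coupling
    else:
        x4 = [random_phasor(rng) for _ in range(m)]       # independent phase
    return squared_tricoherence(x1, x2, x3, x4)

# With no true coupling the estimate is still positive for finite m,
# and it tends to shrink as more realizations are averaged.
for m in (10, 100, 1000):
    print(m, simulate(m, coupled=False), simulate(m, coupled=True))
```

The non-zero values in the uncoupled case are the kind of finite-sample bias that the analytical expressions in the abstract describe.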
Abstract:
To address issues of divisive ideologies in the Mathematics Education community and to subsequently advance educational practice, an alternative theoretical framework and operational model is proposed which represents a consilience of constructivist learning theories whilst acknowledging the objective but improvable nature of domain knowledge. Based upon Popper’s three-world model of knowledge, the proposed theory supports the differentiation and explicit modelling of both shared domain knowledge and idiosyncratic personal understanding using a visual nomenclature. The visual nomenclature embodies Piaget’s notion of reflective abstraction and so may support an individual’s experience-based transformation of personal understanding with regard to shared domain knowledge. Using the operational model and visual nomenclature, seminal literature regarding early-number counting and addition was analysed and described. Exemplars of the resultant visual artefacts demonstrate the proposed theory’s viability as a tool with which to characterise the reflective abstraction-based organisation of a domain’s shared knowledge. Utilising such a description of knowledge, future research needs to consider the refinement of the operational model and visual nomenclature to include the analysis, description and scaffolded transformation of personal understanding. A detailed model of knowledge and understanding may then underpin the future development of educational software tools such as computer-mediated teaching and learning environments.
Abstract:
Goldin (2003) and McDonald, Yanchar, and Osguthorpe (2005) have called for mathematics learning theory that reconciles the chasm between ideologies, and which may advance mathematics teaching and learning practice. This paper discusses the theoretical underpinnings of a recently completed PhD study that draws upon Popper’s (1978) three-world model of knowledge as a lens through which to reconsider a variety of learning theories, including Piaget’s reflective abstraction. Based upon this consideration of theories, an alternative theoretical framework and complementary operational model was synthesised, the viability of which was demonstrated by its use to analyse the domain of early-number counting, addition and subtraction.
Abstract:
As the development of ICD-11 progresses, the Australian Bureau of Statistics is beginning to consider what will be required to successfully implement the new version of the classification. This paper will present early thoughts on the following: building understanding amongst the user community of upcoming changes and the implications of those changes; the need for training of coders and data users; development of analytical methods and conduct of comparability studies; processes to test, accept and implement new or updated coding software; assessment of coding quality; changes to data analyses and reporting processes; updates to regular publications; and assessing the resources required for successful implementation.
Abstract:
An introduction to elicitation of experts' probabilities, which illustrates common problems with reasoning and how to circumvent them during elicitation.
Abstract:
An introduction to designing the elicitation of knowledge from experts.
Abstract:
An introduction to eliciting a conditional probability table (CPT) in a Bayesian network model, highlighting three efficient methods for populating a CPT.