934 resultados para RANDOM PERMUTATION MODEL


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The FANOVA (or “Sobol’-Hoeffding”) decomposition of multivariate functions has been used for high-dimensional model representation and global sensitivity analysis. When the objective function f has no simple analytic form and is costly to evaluate, computing FANOVA terms may be unaffordable due to numerical integration costs. Several approximate approaches relying on Gaussian random field (GRF) models have been proposed to alleviate these costs, where f is substituted by a (kriging) predictor or by conditional simulations. Here we focus on FANOVA decompositions of GRF sample paths, and we notably introduce an associated kernel decomposition into 4 d 4d terms called KANOVA. An interpretation in terms of tensor product projections is obtained, and it is shown that projected kernels control both the sparsity of GRF sample paths and the dependence structure between FANOVA effects. Applications on simulated data show the relevance of the approach for designing new classes of covariance kernels dedicated to high-dimensional kriging.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Clay mineral assemblages at ODP Site 1146 in the northern South China Sea are used to investigate sediment source and transport processes and to evaluate the evolution of the East Asian monsoon over the past 2 Myr. Clay minerals consist mainly of illite (22-43%) and smectite (12-48%), with associated chlorite (10-30%), kaolinite (2-18%), and random mixed-layer clays (5-22%). Hydrodynamic and mineralogical studies indicate that illite and chlorite sources include Taiwan and the Yangtze River, that smectite and mixed-layer clays originate predominantly from Luzon and Indonesia, and that kaolinite is primarily derived from the Pearl River. Mineral assemblages indicate strong glacial-interglacial cyclicity, with high illite, chlorite, and kaolinite content during glacials and high smectite and mixed-layer clay content during interglacials. During interglacials, summer enhanced monsoon (southwesterly) currents transport more smectite and mixed-layer clays to Site 1146 whereas during glacials, enhanced winter monsoon (northerly) currents transport more illite and chlorite from Taiwan and the Yangtze River. The ratio (smectite+mixed layers)/(illite+chlorite) was adopted as a proxy for East Asian monsoon variability. Higher ratios indicate strengthened summer-monsoon winds and weakened winter-monsoon winds during interglacials. In contrast, lower ratios indicate a strongly intensified winter monsoon and weakened summer monsoon during glacials. Spectral analysis indicates the mineral ratio was dominantly forced by monsoon variability prior to the development of large-scale glaciation at 1.2 Myr and by both monsoon variability and the effects of changing sea level in the interval 1.2 Myr to present.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-06

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Mixture models implemented via the expectation-maximization (EM) algorithm are being increasingly used in a wide range of problems in pattern recognition such as image segmentation. However, the EM algorithm requires considerable computational time in its application to huge data sets such as a three-dimensional magnetic resonance (MR) image of over 10 million voxels. Recently, it was shown that a sparse, incremental version of the EM algorithm could improve its rate of convergence. In this paper, we show how this modified EM algorithm can be speeded up further by adopting a multiresolution kd-tree structure in performing the E-step. The proposed algorithm outperforms some other variants of the EM algorithm for segmenting MR images of the human brain. (C) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We investigate whether relative contributions of genetic and shared environmental factors are associated with an increased risk in melanoma. Data from the Queensland Familial Melanoma Project comprising 15,907 subjects arising from 1912 families were analyzed to estimate the additive genetic, common and unique environmental contributions to variation in the age at onset of melanoma. Two complementary approaches for analyzing correlated time-to-onset family data were considered: the generalized estimating equations (GEE) method in which one can estimate relationship-specific dependence simultaneously with regression coefficients that describe the average population response to changing covariates; and a subject-specific Bayesian mixed model in which heterogeneity in regression parameters is explicitly modeled and the different components of variation may be estimated directly. The proportional hazards and Weibull models were utilized, as both produce natural frameworks for estimating relative risks while adjusting for simultaneous effects of other covariates. A simple Markov Chain Monte Carlo method for covariate imputation of missing data was used and the actual implementation of the Bayesian model was based on Gibbs sampling using the free ware package BUGS. In addition, we also used a Bayesian model to investigate the relative contribution of genetic and environmental effects on the expression of naevi and freckles, which are known risk factors for melanoma.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

When studying genotype X environment interaction in multi-environment trials, plant breeders and geneticists often consider one of the effects, environments or genotypes, to be fixed and the other to be random. However, there are two main formulations for variance component estimation for the mixed model situation, referred to as the unconstrained-parameters (UP) and constrained-parameters (CP) formulations. These formulations give different estimates of genetic correlation and heritability as well as different tests of significance for the random effects factor. The definition of main effects and interactions and the consequences of such definitions should be clearly understood, and the selected formulation should be consistent for both fixed and random effects. A discussion of the practical outcomes of using the two formulations in the analysis of balanced data from multi-environment trials is presented. It is recommended that the CP formulation be used because of the meaning of its parameters and the corresponding variance components. When managed (fixed) environments are considered, users will have more confidence in prediction for them but will not be overconfident in prediction in the target (random) environments. Genetic gain (predicted response to selection in the target environments from the managed environments) is independent of formulation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper investigates a Bayesian hierarchical model for the analysis of categorical longitudinal data from a large social survey of immigrants to Australia. Data for each subject are observed on three separate occasions, or waves, of the survey. One of the features of the data set is that observations for some variables are missing for at least one wave. A model for the employment status of immigrants is developed by introducing, at the first stage of a hierarchical model, a multinomial model for the response and then subsequent terms are introduced to explain wave and subject effects. To estimate the model, we use the Gibbs sampler, which allows missing data for both the response and the explanatory variables to be imputed at each iteration of the algorithm, given some appropriate prior distributions. After accounting for significant covariate effects in the model, results show that the relative probability of remaining unemployed diminished with time following arrival in Australia.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is well known that even slight changes in nonuniform illumination lead to a large image variability and are crucial for many visual tasks. This paper presents a new ICA related probabilistic model where the number of sources exceeds the number of sensors to perform an image segmentation and illumination removal, simultaneously. We model illumination and reflectance in log space by a generalized autoregressive process and Hidden Gaussian Markov random field, respectively. The model ability to deal with segmentation of illuminated images is compared with a Canny edge detector and homomorphic filtering. We apply the model to two problems: synthetic image segmentation and sea surface pollution detection from intensity images.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Computer models, or simulators, are widely used in a range of scientific fields to aid understanding of the processes involved and make predictions. Such simulators are often computationally demanding and are thus not amenable to statistical analysis. Emulators provide a statistical approximation, or surrogate, for the simulators accounting for the additional approximation uncertainty. This thesis develops a novel sequential screening method to reduce the set of simulator variables considered during emulation. This screening method is shown to require fewer simulator evaluations than existing approaches. Utilising the lower dimensional active variable set simplifies subsequent emulation analysis. For random output, or stochastic, simulators the output dispersion, and thus variance, is typically a function of the inputs. This work extends the emulator framework to account for such heteroscedasticity by constructing two new heteroscedastic Gaussian process representations and proposes an experimental design technique to optimally learn the model parameters. The design criterion is an extension of Fisher information to heteroscedastic variance models. Replicated observations are efficiently handled in both the design and model inference stages. Through a series of simulation experiments on both synthetic and real world simulators, the emulators inferred on optimal designs with replicated observations are shown to outperform equivalent models inferred on space-filling replicate-free designs in terms of both model parameter uncertainty and predictive variance.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Properties of computing Boolean circuits composed of noisy logical gates are studied using the statistical physics methodology. A formula-growth model that gives rise to random Boolean functions is mapped onto a spin system, which facilitates the study of their typical behavior in the presence of noise. Bounds on their performance, derived in the information theory literature for specific gates, are straightforwardly retrieved, generalized and identified as the corresponding macroscopic phase transitions. The framework is employed for deriving results on error-rates at various function-depths and function sensitivity, and their dependence on the gate-type and noise model used. These are difficult to obtain via the traditional methods used in this field.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

When constructing and using environmental models, it is typical that many of the inputs to the models will not be known perfectly. In some cases, it will be possible to make observations, or occasionally physics-based uncertainty propagation, to ascertain the uncertainty on these inputs. However, such observations are often either not available or even possible, and another approach to characterising the uncertainty on the inputs must be sought. Even when observations are available, if the analysis is being carried out within a Bayesian framework then prior distributions will have to be specified. One option for gathering or at least estimating this information is to employ expert elicitation. Expert elicitation is well studied within statistics and psychology and involves the assessment of the beliefs of a group of experts about an uncertain quantity, (for example an input / parameter within a model), typically in terms of obtaining a probability distribution. One of the challenges in expert elicitation is to minimise the biases that might enter into the judgements made by the individual experts, and then to come to a consensus decision within the group of experts. Effort is made in the elicitation exercise to prevent biases clouding the judgements through well-devised questioning schemes. It is also important that, when reaching a consensus, the experts are exposed to the knowledge of the others in the group. Within the FP7 UncertWeb project (http://www.uncertweb.org/), there is a requirement to build a Webbased tool for expert elicitation. In this paper, we discuss some of the issues of building a Web-based elicitation system - both the technological aspects and the statistical and scientific issues. In particular, we demonstrate two tools: a Web-based system for the elicitation of continuous random variables and a system designed to elicit uncertainty about categorical random variables in the setting of landcover classification uncertainty. The first of these examples is a generic tool developed to elicit uncertainty about univariate continuous random variables. It is designed to be used within an application context and extends the existing SHELF method, adding a web interface and access to metadata. The tool is developed so that it can be readily integrated with environmental models exposed as web services. The second example was developed for the TREES-3 initiative which monitors tropical landcover change through ground-truthing at confluence points. It allows experts to validate the accuracy of automated landcover classifications using site-specific imagery and local knowledge. Experts may provide uncertainty information at various levels: from a general rating of their confidence in a site validation to a numerical ranking of the possible landcover types within a segment. A key challenge in the web based setting is the design of the user interface and the method of interacting between the problem owner and the problem experts. We show the workflow of the elicitation tool, and show how we can represent the final elicited distributions and confusion matrices using UncertML, ready for integration into uncertainty enabled workflows.We also show how the metadata associated with the elicitation exercise is captured and can be referenced from the elicited result, providing crucial lineage information and thus traceability in the decision making process.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we present a novel method for emulating a stochastic, or random output, computer model and show its application to a complex rabies model. The method is evaluated both in terms of accuracy and computational efficiency on synthetic data and the rabies model. We address the issue of experimental design and provide empirical evidence on the effectiveness of utilizing replicate model evaluations compared to a space-filling design. We employ the Mahalanobis error measure to validate the heteroscedastic Gaussian process based emulator predictions for both the mean and (co)variance. The emulator allows efficient screening to identify important model inputs and better understanding of the complex behaviour of the rabies model.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis studied the effect of (i) the number of grating components and (ii) parameter randomisation on root-mean-square (r.m.s.) contrast sensitivity and spatial integration. The effectiveness of spatial integration without external spatial noise depended on the number of equally spaced orientation components in the sum of gratings. The critical area marking the saturation of spatial integration was found to decrease when the number of components increased from 1 to 5-6 but increased again at 8-16 components. The critical area behaved similarly as a function of the number of grating components when stimuli consisted of 3, 6 or 16 components with different orientations and/or phases embedded in spatial noise. Spatial integration seemed to depend on the global Fourier structure of the stimulus. Spatial integration was similar for sums of two vertical cosine or sine gratings with various Michelson contrasts in noise. The critical area for a grating sum was found to be a sum of logarithmic critical areas for the component gratings weighted by their relative Michelson contrasts. The human visual system was modelled as a simple image processor where the visual stimuli is first low-pass filtered by the optical modulation transfer function of the human eye and secondly high-pass filtered, up to the spatial cut-off frequency determined by the lowest neural sampling density, by the neural modulation transfer function of the visual pathways. The internal noise is then added before signal interpretation occurs in the brain. The detection is mediated by a local spatially windowed matched filter. The model was extended to include complex stimuli and its applicability to the data was found to be successful. The shape of spatial integration function was similar for non-randomised and randomised simple and complex gratings. However, orientation and/or phase randomised reduced r.m.s contrast sensitivity by a factor of 2. The effect of parameter randomisation on spatial integration was modelled under the assumption that human observers change the observer strategy from cross-correlation (i.e., a matched filter) to auto-correlation detection when uncertainty is introduced to the task. The model described the data accurately.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis includes analysis of disordered spin ensembles corresponding to Exact Cover, a multi-access channel problem, and composite models combining sparse and dense interactions. The satisfiability problem in Exact Cover is addressed using a statistical analysis of a simple branch and bound algorithm. The algorithm can be formulated in the large system limit as a branching process, for which critical properties can be analysed. Far from the critical point a set of differential equations may be used to model the process, and these are solved by numerical integration and exact bounding methods. The multi-access channel problem is formulated as an equilibrium statistical physics problem for the case of bit transmission on a channel with power control and synchronisation. A sparse code division multiple access method is considered and the optimal detection properties are examined in typical case by use of the replica method, and compared to detection performance achieved by interactive decoding methods. These codes are found to have phenomena closely resembling the well-understood dense codes. The composite model is introduced as an abstraction of canonical sparse and dense disordered spin models. The model includes couplings due to both dense and sparse topologies simultaneously. The new type of codes are shown to outperform sparse and dense codes in some regimes both in optimal performance, and in performance achieved by iterative detection methods in finite systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We consider the random input problem for a nonlinear system modeled by the integrable one-dimensional self-focusing nonlinear Schrödinger equation (NLSE). We concentrate on the properties obtained from the direct scattering problem associated with the NLSE. We discuss some general issues regarding soliton creation from random input. We also study the averaged spectral density of random quasilinear waves generated in the NLSE channel for two models of the disordered input field profile. The first model is symmetric complex Gaussian white noise and the second one is a real dichotomous (telegraph) process. For the former model, the closed-form expression for the averaged spectral density is obtained, while for the dichotomous real input we present the small noise perturbative expansion for the same quantity. In the case of the dichotomous input, we also obtain the distribution of minimal pulse width required for a soliton generation. The obtained results can be applied to a multitude of problems including random nonlinear Fraunhoffer diffraction, transmission properties of randomly apodized long period Fiber Bragg gratings, and the propagation of incoherent pulses in optical fibers.