837 results for Robust stochastic approximation


Relevance:

30.00%

Publisher:

Abstract:

This paper considers the global synchronisation of a stochastic version of coupled map lattices networks through an innovative stochastic adaptive linear quadratic pinning control methodology. In a stochastic network, each node receives only noisy measurements of its neighbours' states. For such networks we derive a generalised Riccati solution that quantifies and incorporates uncertainty of the forward dynamics and inverse controller in the derivation of the stochastic optimal control law. The generalised Riccati solution is derived using the Lyapunov approach. A probabilistic approximation type algorithm is employed to estimate the conditional distributions of the state and inverse controller from historical data and to quantify model uncertainties. The theoretical derivation is complemented by its validation on a set of representative examples.
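The generalised stochastic Riccati solution referred to above is not reproduced here, but the standard deterministic discrete-time Riccati recursion that it generalises can serve as a reference point. The sketch below iterates that recursion to obtain an LQR gain; the matrices are purely illustrative and the uncertainty terms of the stochastic version are omitted.

# A minimal reference sketch: the standard (deterministic) discrete-time
# Riccati recursion for an LQR problem x_{t+1} = A x_t + B u_t with cost
# sum x'Qx + u'Ru.  The paper's generalised stochastic Riccati solution
# additionally folds in state/controller uncertainty, which is not shown here.
import numpy as np

def riccati_recursion(A, B, Q, R, horizon=200, tol=1e-10):
    """Iterate the discrete-time Riccati difference equation to (near) convergence."""
    P = Q.copy()
    for _ in range(horizon):
        BtPB = B.T @ P @ B
        K = np.linalg.solve(R + BtPB, B.T @ P @ A)      # feedback gain u = -K x
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    return P, K

# Toy example (values are illustrative only)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[0.5]])
P, K = riccati_recursion(A, B, Q, R)
print("Riccati solution P:\n", P, "\ngain K:", K)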

Relevance:

30.00%

Publisher:

Abstract:

Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, even though uncertainty quantification remains central in the sciences, where the number of parameters to estimate often exceeds the sample size despite the huge increases in n typically seen in many fields. The tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n = all" is therefore of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.

Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.

One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.

Latent class models for the joint distribution of multivariate categorical data, such as the PARAFAC decomposition, play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.

In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis--Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis--Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.
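The optimal Gaussian approximation derived in Chapter 4 is specific to Diaconis--Ylvisaker priors and is not reproduced here. As a rough illustration of the general idea of replacing an intractable posterior by a Gaussian, the sketch below computes a generic Laplace approximation at the mode of a Poisson log-linear posterior under a simple Gaussian prior; the prior and model choices are assumptions made only for this example.

# A generic Laplace-style Gaussian approximation to a log-linear posterior,
# shown only to illustrate the idea of replacing an intractable posterior by a
# Gaussian; it is NOT the optimal approximation derived in Chapter 4, and the
# prior below (a simple Gaussian) stands in for the Diaconis--Ylvisaker prior.
import numpy as np

def laplace_approx_poisson_loglinear(X, y, prior_var=10.0, iters=50):
    """Newton iterations for a Poisson log-linear model y ~ Poisson(exp(X beta))."""
    n, p = X.shape
    beta = np.zeros(p)
    prior_prec = np.eye(p) / prior_var
    for _ in range(iters):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu) - prior_prec @ beta      # gradient of the log posterior
        hess = -(X.T * mu) @ X - prior_prec            # Hessian = -X' diag(mu) X - prior precision
        step = np.linalg.solve(hess, grad)
        beta = beta - step                             # Newton update
        if np.max(np.abs(step)) < 1e-10:
            break
    cov = np.linalg.inv(-hess)                         # Gaussian covariance at the mode
    return beta, cov                                   # N(beta, cov) approximates the posterior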

Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.

The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo. The Markov chain Monte Carlo method is the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.
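One of the kernel approximations mentioned above is based on random subsets of data. As a naive illustration of that kind of approximate kernel (not the framework of Chapter 6 itself), the sketch below runs a Metropolis-Hastings chain whose log-likelihood ratio is evaluated on a random subsample and rescaled by n/m; the Gaussian model, flat prior, and tuning constants are assumptions made for the example.

# A naive approximate MCMC kernel built from a random subset of the data:
# each Metropolis-Hastings step evaluates the log-likelihood on m << n
# observations and rescales by n/m.  Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def subsampled_mh(data, n_iter=5000, subsample=100, step=0.05):
    n = len(data)
    theta = 0.0
    chain = np.empty(n_iter)
    for t in range(n_iter):
        prop = theta + step * rng.standard_normal()
        idx = rng.choice(n, size=subsample, replace=False)
        x = data[idx]
        # Subsampled log-likelihood ratio for a N(theta, 1) model, scaled by n/m.
        loglik_ratio = (n / subsample) * np.sum(
            -0.5 * (x - prop) ** 2 + 0.5 * (x - theta) ** 2
        )
        if np.log(rng.uniform()) < loglik_ratio:   # flat prior assumed
            theta = prop
        chain[t] = theta
    return chain

chain = subsampled_mh(rng.normal(1.5, 1.0, size=100_000))
print("posterior mean estimate:", chain[1000:].mean())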

Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.
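For concreteness, the truncated-Normal data augmentation sampler mentioned above is the classical Albert-Chib scheme for probit regression; a minimal sketch is given below, with the Gaussian prior variance and iteration count chosen only for illustration. Running it on data with a large n and only a handful of successes is a simple way to observe the slow mixing described in Chapter 7.

# A minimal Albert-Chib data augmentation Gibbs sampler for probit regression.
# Prior, data, and iteration counts are illustrative.
import numpy as np
from scipy.stats import truncnorm

def probit_da_gibbs(X, y, n_iter=2000, prior_var=100.0, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    V = np.linalg.inv(X.T @ X + np.eye(p) / prior_var)   # posterior covariance (fixed)
    L = np.linalg.cholesky(V)
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        # 1. Sample latent utilities z_i | beta, y_i from truncated normals.
        mean = X @ beta
        lower = np.where(y == 1, -mean, -np.inf)          # z_i > 0 when y_i = 1
        upper = np.where(y == 1, np.inf, -mean)           # z_i < 0 when y_i = 0
        z = mean + truncnorm.rvs(lower, upper, random_state=rng)
        # 2. Sample beta | z from its Gaussian full conditional.
        m = V @ (X.T @ z)
        beta = m + L @ rng.standard_normal(p)
        draws[t] = beta
    return draws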

Relevance:

30.00%

Publisher:

Abstract:

Subspaces and manifolds are two powerful models for high dimensional signals. Subspaces model linear correlation and are a good fit to signals generated by physical systems, such as frontal images of human faces and multiple sources impinging at an antenna array. Manifolds model sources that are not linearly correlated, but where signals are determined by a small number of parameters. Examples are images of human faces under different poses or expressions, and handwritten digits with varying styles. However, there will always be some degree of model mismatch between the subspace or manifold model and the true statistics of the source. This dissertation exploits subspace and manifold models as prior information in various signal processing and machine learning tasks.

A near-low-rank Gaussian mixture model measures proximity to a union of linear or affine subspaces. This simple model can effectively capture the signal distribution when each class is near a subspace. This dissertation studies how the pairwise geometry between these subspaces affects classification performance. When model mismatch is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the model mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. Reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes principal angles. This linear transformation, termed TRAIT, also preserves some specific features in each class, being complementary to a recently developed Low Rank Transform (LRT). Moreover, when the model mismatch is more significant, TRAIT shows superior performance compared to LRT.
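The classification bounds above are stated in terms of the sines of the principal angles between subspaces. The short sketch below computes these angles in the standard way, from the singular values of Q1'Q2 for orthonormal bases Q1 and Q2; the random matrices are illustrative only.

# Principal angles between two subspaces, computed via the SVD of Q1'Q2.
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of A and B."""
    Q1, _ = np.linalg.qr(A)
    Q2, _ = np.linalg.qr(B)
    sigma = np.linalg.svd(Q1.T @ Q2, compute_uv=False)
    return np.arccos(np.clip(sigma, -1.0, 1.0))

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 3))
B = rng.standard_normal((50, 3))
theta = principal_angles(A, B)
print("sines of principal angles:", np.sin(theta))
print("product-of-sines term:", np.prod(np.sin(theta)))
print("sum-of-squared-sines term:", np.sum(np.sin(theta) ** 2))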

The manifold model enforces a constraint on the freedom of data variation. Learning features that are robust to data variation is very important, especially when the size of the training set is small. A learning machine with a large number of parameters, e.g., a deep neural network, can describe a very complicated data distribution well. However, it is also more likely to be sensitive to small perturbations of the data, and to suffer from degraded performance when generalizing to unseen (test) data.

From the perspective of the complexity of function classes, such a learning machine has a huge capacity (complexity), which tends to overfit. The manifold model provides us with a way of regularizing the learning machine, so as to reduce the generalization error and therefore mitigate overfitting. Two different overfitting-prevention approaches are proposed, one from the perspective of data variation, the other from capacity/complexity control. In the first approach, the learning machine is encouraged to make decisions that vary smoothly for data points in local neighborhoods on the manifold. In the second approach, a graph adjacency matrix is derived for the manifold, and the learned features are encouraged to be aligned with the principal components of this adjacency matrix. Experimental results on benchmark datasets demonstrate a clear advantage of the proposed approaches when the training set is small.
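A minimal sketch of the first regularizer described above is given below: a k-nearest-neighbour adjacency matrix with Gaussian weights is built on the manifold samples, and decision/feature variation across neighbouring points is penalised through the graph Laplacian. The graph construction and penalty form are standard choices assumed here for illustration, not necessarily those used in the dissertation.

# Manifold smoothness regularizer: penalise feature variation across
# neighbouring points on a k-NN graph.  Sketch only; parameters illustrative.
import numpy as np
from scipy.spatial import cKDTree

def knn_adjacency(X, k=10, bandwidth=1.0):
    """Symmetric k-NN adjacency matrix with Gaussian edge weights."""
    tree = cKDTree(X)
    dist, idx = tree.query(X, k=k + 1)          # first neighbour is the point itself
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        for d, j in zip(dist[i, 1:], idx[i, 1:]):
            W[i, j] = W[j, i] = np.exp(-d ** 2 / (2 * bandwidth ** 2))
    return W

def manifold_smoothness_penalty(F, W):
    """sum_ij W_ij ||f_i - f_j||^2 = 2 * trace(F' L F), with L the graph Laplacian."""
    L = np.diag(W.sum(axis=1)) - W
    return 2.0 * np.trace(F.T @ L @ F)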

Stochastic optimization makes it possible to track a slowly varying subspace underlying streaming data. By approximating local neighborhoods using affine subspaces, a slowly varying manifold can be efficiently tracked as well, even with corrupted and noisy data. The more local neighborhoods are used, the better the approximation, but the higher the computational complexity. A multiscale approximation scheme is proposed, where the local approximating subspaces are organized in a tree structure. Splitting and merging of the tree nodes then allows efficient control of the number of neighborhoods. Deviation (of each datum) from the learned model is estimated, yielding a series of statistics for anomaly detection. This framework extends the classical changepoint detection technique, which only works for one-dimensional signals. Simulations and experiments highlight the robustness and efficacy of the proposed approach in detecting an abrupt change in an otherwise slowly varying low-dimensional manifold.
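As a much-simplified, single-subspace illustration of the tracking idea above (the multiscale tree of local affine subspaces is not reproduced), the sketch below updates an orthonormal basis with a stochastic gradient step for each incoming sample and uses the projection residual as an anomaly statistic; the step size, dimensions, and test stream are assumptions.

# Simplified streaming subspace tracking with a residual-based anomaly statistic.
import numpy as np

def track_subspace(stream, d=3, step=0.1):
    """Yield the residual norm of each sample w.r.t. the tracked d-dim subspace."""
    U = None
    for x in stream:
        if U is None:
            U = np.linalg.qr(np.random.default_rng(0).standard_normal((x.size, d)))[0]
        w = U.T @ x                      # coefficients in the current subspace
        r = x - U @ w                    # residual orthogonal to the subspace
        yield np.linalg.norm(r)          # large residual => candidate anomaly/changepoint
        U, _ = np.linalg.qr(U + step * np.outer(r, w))   # rank-one update, re-orthonormalised

# Toy usage: a stream generated near a fixed 3-dimensional subspace.
rng = np.random.default_rng(1)
basis = np.linalg.qr(rng.standard_normal((20, 3)))[0]
stream = (basis @ rng.standard_normal(3) + 0.01 * rng.standard_normal(20) for _ in range(500))
residuals = list(track_subspace(stream))
print("final residuals are small:", residuals[-5:])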

Relevance:

30.00%

Publisher:

Abstract:

Strategic supply chain optimization (SCO) problems are often modelled as a two-stage optimization problem, in which the first-stage variables represent decisions on the development of the supply chain and the second-stage variables represent decisions on the operations of the supply chain. When uncertainty is explicitly considered, the problem becomes an intractable infinite-dimensional optimization problem, which is usually solved approximately via a scenario or a robust approach. This paper proposes a novel synergy of the scenario and robust approaches for strategic SCO under uncertainty. Two formulations are developed, namely, the naïve robust scenario formulation and the affinely adjustable robust scenario formulation. It is shown that both formulations can be reformulated into tractable deterministic optimization problems if the uncertainty is bounded by the infinity-norm, and that the uncertain equality constraints can be reformulated into deterministic constraints without any assumption on the uncertainty region. Case studies of a classical farm planning problem and an energy and bioproduct SCO problem demonstrate the advantages of the proposed formulations over the classical scenario formulation. The proposed formulations not only can generate solutions with guaranteed feasibility or indicate infeasibility of a problem, but also can achieve optimal expected economic performance with smaller numbers of scenarios.
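To see why infinity-norm-bounded uncertainty leads to tractable deterministic counterparts, consider the standard worked example below (generic notation, not taken from the paper): an uncertain linear inequality that must hold for every perturbation with bounded infinity-norm reduces to an ordinary constraint with a 1-norm term, which can then be linearised.

% Generic notation; rho bounds the perturbation delta in the infinity-norm.
\[
(a+\delta)^\top x \le b \ \ \forall\, \|\delta\|_\infty \le \rho
\;\Longleftrightarrow\;
a^\top x + \max_{\|\delta\|_\infty \le \rho} \delta^\top x \le b
\;\Longleftrightarrow\;
a^\top x + \rho\,\|x\|_1 \le b,
\]
% since the worst-case perturbation sets delta_i = rho * sign(x_i).  The 1-norm
% can then be linearised with auxiliary variables t_i >= x_i, t_i >= -x_i,
% recovering a finite deterministic linear program.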

Relevance:

30.00%

Publisher:

Abstract:

Incidents and rolling stock breakdowns are commonplace in rapid transit rail systems and may disrupt system performance, imposing deviations from planned operations. A network design model is proposed for reducing the effect of such disruptions by making them less likely to occur. Failure probabilities are considered functions of the number of services and of the rolling stock's routing on the designed network, so they cannot be calculated a priori but result from the design process itself. A two-stage stochastic programming model with recourse is formulated in which the failure probabilities are an implicit function of the number of services and the routing of the transit lines.

Relevance:

30.00%

Publisher:

Abstract:

A classical approach for dealing with two- and multi-stage optimization problems under uncertainty is scenario analysis. To this end, the uncertainty in some of the problem data is modelled by random vectors with finite, stage-specific supports. Each of these realizations represents a scenario. Using scenarios, it is possible to study simpler versions (subproblems) of the original problem. As a scenario decomposition technique, the progressive hedging algorithm is one of the most popular methods for solving multi-stage stochastic programming problems. Despite the complete decomposition by scenario, the efficiency of the progressive hedging method is very sensitive to certain practical aspects, such as the choice of the penalty parameter and the handling of the quadratic term in the augmented Lagrangian objective function. For the choice of the penalty parameter, we examine some of the popular methods and propose a new adaptive strategy that aims to follow the progress of the algorithm more closely. Numerical experiments on instances of multi-stage stochastic linear problems suggest that most existing techniques may exhibit premature convergence to a suboptimal solution or converge to the optimal solution but at a very slow rate. In contrast, the new strategy appears robust and efficient: it converged to optimality in all our experiments and was the fastest in most cases. As for the handling of the quadratic term, we review the existing techniques and propose the idea of replacing the quadratic term with a linear one. Although our method still remains to be tested, we expect that it will reduce some of the numerical and theoretical difficulties of the progressive hedging method.
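As a toy illustration of where the penalty parameter enters the progressive hedging iteration discussed above, the sketch below runs the algorithm on a scalar first-stage decision with quadratic scenario costs; the scenario data are invented and the penalty parameter is held fixed, whereas the work above proposes adapting it during the run.

# Toy progressive hedging loop with scenario costs f_s(x) = (x - xi_s)^2.
# The penalty parameter r is fixed here; the thesis proposes adapting it.
import numpy as np

def progressive_hedging(xi, prob, r=1.0, n_iter=100):
    S = len(xi)
    w = np.zeros(S)                       # scenario dual multipliers
    x_bar = float(prob @ xi)              # initial implementable decision
    for _ in range(n_iter):
        # Scenario subproblems: argmin_x (x - xi_s)^2 + w_s x + (r/2)(x - x_bar)^2
        x = (2 * xi - w + r * x_bar) / (2 + r)
        x_bar = float(prob @ x)           # project onto the nonanticipativity constraint
        w = w + r * (x - x_bar)           # multiplier update
    return x_bar

xi = np.array([1.0, 2.0, 4.0])
prob = np.array([0.5, 0.3, 0.2])
print(progressive_hedging(xi, prob))      # converges to the probability-weighted mean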


Relevance:

30.00%

Publisher:

Abstract:

Mathematical skills that we acquire during formal education mostly entail exact numerical processing. Besides this specifically human faculty, an additional system exists to represent and manipulate quantities in an approximate manner. We share this innate approximate number system (ANS) with other nonhuman animals and are able to use it to process large numerosities long before we can master the formal algorithms taught in school. Dehaene's (1992) Triple Code Model (TCM) states that also after the onset of formal education, approximate processing is carried out in this analogue magnitude code no matter if the original problem was presented nonsymbolically or symbolically. Despite the wide acceptance of the model, most research only uses nonsymbolic tasks to assess ANS acuity. Due to this silent assumption that genuine approximation can only be tested with nonsymbolic presentations, up to now important implications in research domains of high practical relevance remain unclear, and existing potential is not fully exploited. For instance, it has been found that nonsymbolic approximation can predict math achievement one year later (Gilmore, McCarthy, & Spelke, 2010), that it is robust against the detrimental influence of learners' socioeconomic status (SES), and that it is suited to foster performance in exact arithmetic in the short term (Hyde, Khanum, & Spelke, 2014). We provided evidence that symbolic approximation might be equally and in some cases even better suited to generate predictions and foster more formal math skills, independently of SES. In two longitudinal studies, we realized exact and approximate arithmetic tasks in both a nonsymbolic and a symbolic format. With first graders, we demonstrated that performance in symbolic approximation at the beginning of term was the only measure consistently not varying according to children's SES, and among both approximate tasks it was the better predictor of math achievement at the end of first grade. In part, the strong connection seems to come about from mediation through ordinal skills. In two further experiments, we tested the suitability of both approximation formats for inducing an arithmetic principle in elementary school children. We found that symbolic approximation was as effective as direct instruction in making children exploit the additive law of commutativity in a subsequent formal task. Nonsymbolic approximation, on the other hand, had no beneficial effect. The positive influence of the symbolic approximate induction was strongest in children just starting school and decreased with age. However, even third graders still profited from the induction. The results show that symbolic problems can also be processed as genuine approximation, and that beyond this they have their own specific value with regard to didactic-educational concerns. Our findings furthermore demonstrate that the two often confounded factors 'format' and 'demanded accuracy' cannot be disentangled easily in first graders' numerical understanding, and that children's SES also influences existing interrelations between the different abilities tested here.

Relevance:

20.00%

Publisher:

Abstract:

Super elastic nitinol (NiTi) wires were exploited as highly robust supports for three distinct crosslinked polymeric ionic liquid (PIL)-based coatings in solid-phase microextraction (SPME). The oxidation of NiTi wires in a boiling (30% w/w) H2O2 solution and subsequent derivatization in vinyltrimethoxysilane (VTMS) allowed vinyl moieties to be appended to the surface of the support. UV-initiated on-fiber copolymerization of the vinyl-substituted NiTi support with monocationic ionic liquid (IL) monomers and dicationic IL crosslinkers produced a crosslinked PIL-based network that was covalently attached to the NiTi wire. This alteration alleviated receding of the coating from the support, which was observed for an analogous crosslinked PIL applied on unmodified NiTi wires. A series of demanding extraction conditions, including extreme pH, pre-exposure to pure organic solvents, and high temperatures, were applied to investigate the versatility and robustness of the fibers. Acceptable precision for the model analytes was obtained for all fibers under these conditions. Method validation by examining the relative recovery of a homologous group of phthalate esters (PAEs) was performed in drip-brewed coffee (maintained at 60 °C) by direct immersion SPME. Acceptable recoveries were obtained for most PAEs at the part-per-billion level, even in this exceedingly harsh and complex matrix.

Relevance:

20.00%

Publisher:

Abstract:

Evolving interfaces were initially applied to scientific problems in Fluid Dynamics. With the advent of the more robust modeling provided by the Level Set method, their original boundaries of applicability were extended. In the Geometric Modeling area specifically, work published to date relating Level Set methods to three-dimensional surface reconstruction has centered on reconstruction from a point cloud dispersed in space; the approach based on parallel planar slices transversal to the object to be reconstructed is still incipient. Motivated by this, the present work analyses the feasibility of the Level Set method for three-dimensional reconstruction, offering a methodology that integrates ideas already shown to be efficient in the literature with proposals for handling limitations of the method not yet treated satisfactorily, in particular the excessive smoothing of fine contour features under Level Set evolution. In this regard, the Particle Level Set variant is suggested as a solution, given its proven intrinsic capability to preserve the mass of dynamic fronts. Finally, synthetic and real data sets are used to qualitatively evaluate the proposed three-dimensional surface reconstruction methodology.


Relevance:

20.00%

Publisher:

Abstract:

Universidade Estadual de Campinas. Faculdade de Educação Física

Relevance:

20.00%

Publisher:

Abstract:

Our purpose is to analyze the effect of explicit diffusion processes in a predator-prey stochastic lattice model. More precisely, we wish to investigate the possible effects of diffusion upon the thresholds of coexistence of species, i.e., the possible changes in the transition between the active state and the absorbing state devoid of predators. To accomplish this task we have performed time-dependent simulations and dynamic mean-field approximations. Our results indicate that the diffusive process can enhance species coexistence.

Relevance:

20.00%

Publisher:

Abstract:

Consider N sites randomly and uniformly distributed in a d-dimensional hypercube. A walker explores this disordered medium by going to the nearest site that has not been visited in the last mu (memory) steps. The walker trajectory is composed of a transient part and a periodic part (cycle). For one-dimensional systems, walkers may or may not explore all the available space, giving rise to a crossover between localized and extended regimes at the critical memory mu_1 = log_2 N. The deterministic rule can be softened to consider more realistic situations with the inclusion of a stochastic parameter T (temperature). In this case, the walker movement is driven by a probability density function parameterized by T and a cost function. The cost function increases with the distance between two sites and favors hops to closer sites. As the temperature increases, the walker can escape from cycles that are reminiscent of the deterministic nature and extend the exploration. Here, we report an analytical model and numerical studies of the influence of the temperature and the critical memory on the exploration of one-dimensional disordered systems.
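A direct simulation of the walk described above is straightforward; the sketch below implements the deterministic rule (move to the nearest site not visited in the last mu steps) together with a simple temperature-softened variant using an exponential cost, with parameter values chosen only for illustration.

# Deterministic partially self-avoiding ("tourist") walk on 1-d random sites,
# with an optional temperature-softened move rule.  Parameters illustrative.
import numpy as np

def tourist_walk(sites, mu=1, T=0.0, n_steps=200, seed=0):
    rng = np.random.default_rng(seed)
    dist = np.abs(sites[:, None] - sites[None, :])      # 1-d pairwise distances
    np.fill_diagonal(dist, np.inf)
    current, trajectory = 0, [0]
    for _ in range(n_steps):
        forbidden = trajectory[-mu:] if mu > 0 else []
        allowed = np.setdiff1d(np.arange(len(sites)), forbidden)
        d = dist[current, allowed]
        if T == 0.0:
            nxt = allowed[np.argmin(d)]                  # deterministic rule
        else:
            w = np.exp(-d / T)                           # cost grows with distance
            nxt = rng.choice(allowed, p=w / w.sum())     # temperature-driven rule
        trajectory.append(int(nxt))
        current = nxt
    return trajectory

sites = np.sort(np.random.default_rng(1).uniform(size=64))
print(tourist_walk(sites, mu=2, T=0.0)[:20])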

Relevance:

20.00%

Publisher:

Abstract:

Consider a random medium consisting of N points randomly distributed so that there is no correlation among the distances separating them. This is the random link model, which is the high-dimensionality limit (mean-field approximation) of the Euclidean random point structure. In the random link model, at discrete time steps, a walker moves to the nearest point that has not been visited in the last mu steps (memory), producing a deterministic partially self-avoiding walk (the tourist walk). We have analytically obtained the distribution of the number n of points explored by the walker with memory mu = 2, as well as the joint distribution of the transient and the period. This result enables us to explain the abrupt change in the exploratory behavior between the cases mu = 1 (memoryless walker, driven by extreme value statistics) and mu = 2 (walker with memory, driven by combinatorial statistics). In the mu = 1 case, the mean number of newly visited points in the thermodynamic limit (N >> 1) is just <n> = e = 2.72..., while in the mu = 2 case, the mean number <n> of visited points grows proportionally to N^(1/2). This result also allows us to establish an equivalence between the random link model with mu = 2 and the random map (uncorrelated back and forth distances) with mu = 0, and to explain the abrupt change between the probabilities for null transient time and for subsequent ones.