10 results for Stochastic adding machine

in CaltechTHESIS


Relevance:

20.00%

Publisher:

Abstract:

The problem of "exit against a flow" for dynamical systems subject to small Gaussian white noise excitation is studied. Here the word "flow" refers to the behavior in phase space of the unperturbed system's state variables. "Exit against a flow" occurs if a perturbation causes the phase point to leave a phase space region within which it would normally be confined. In particular, there are two components of the problem of exit against a flow:

i) the mean exit time

ii) the phase-space distribution of exit locations.

When the noise perturbing the dynamical systems is small, the solution of each component of the problem of exit against a flow is, in general, the solution of a singularly perturbed, degenerate elliptic-parabolic boundary value problem.

Singular perturbation techniques are used to express the asymptotic solution in terms of an unknown parameter. The unknown parameter is determined using the solution of the adjoint boundary value problem.

The problem of exit against a flow for several dynamical systems of physical interest is considered, and the mean exit times and distributions of exit positions are calculated. The systems are then simulated numerically, using Monte Carlo techniques, in order to determine the validity of the asymptotic solutions.
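The Monte Carlo check described above can be sketched as follows. This is a minimal illustration, not the thesis's code: the one-dimensional system dx = -x dt + sqrt(eps) dW, the confining interval (-1, 1), the Euler-Maruyama scheme, and every name and parameter value are assumptions chosen for the example.

```python
import numpy as np

def simulate_exit(eps=0.5, dt=1e-3, n_paths=2000, t_max=50.0, seed=0):
    """Euler-Maruyama paths of dx = -x dt + sqrt(eps) dW started at x = 0.

    The deterministic flow pushes x toward 0, so leaving (-1, 1) is an
    "exit against the flow". Returns exit times (NaN if a path has not
    left by t_max) and exit sides (-1 left, +1 right, 0 no exit).
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_paths)
    exit_time = np.full(n_paths, np.nan)
    exit_side = np.zeros(n_paths, dtype=int)
    alive = np.ones(n_paths, dtype=bool)
    n_steps = int(t_max / dt)
    for k in range(1, n_steps + 1):
        idx = np.flatnonzero(alive)
        if idx.size == 0:
            break  # every path has already exited
        noise = rng.standard_normal(idx.size)
        x[idx] += -x[idx] * dt + np.sqrt(eps * dt) * noise
        hit = idx[np.abs(x[idx]) >= 1.0]
        exit_time[hit] = k * dt
        exit_side[hit] = np.sign(x[hit]).astype(int)
        alive[hit] = False
    return exit_time, exit_side

if __name__ == "__main__":
    times, sides = simulate_exit()
    done = ~np.isnan(times)
    print("estimated mean exit time:", times[done].mean())
    print("fraction exiting on the right:", (sides[done] == 1).mean())
```

The sample mean of the exit times and the empirical split between the two boundaries are the quantities one would compare against an asymptotic prediction. Shrinking `eps` makes exits exponentially rarer, so `t_max` has to grow accordingly.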

Relevance:

20.00%

Publisher:

Abstract:

A theory of two-point boundary value problems is developed, analogous to the theory of initial value problems for stochastic ordinary differential equations whose solutions form Markov processes. The theory of initial value problems consists of three main parts: the proof that the solution process is Markovian and diffusive; the construction of the Kolmogorov or Fokker-Planck equation of the process; and the proof that the transition probability density of the process is a unique solution of the Fokker-Planck equation.

It is assumed here that the stochastic differential equation under consideration has, as an initial value problem, a diffusive Markovian solution process. When a given boundary value problem for this stochastic equation almost surely has unique solutions, we show that the solution process of the boundary value problem is also a diffusive Markov process. Since a boundary value problem, unlike an initial value problem, has no preferred direction for the parameter set, we find that there are two Fokker-Planck equations, one for each direction. It is shown that the density of the solution process of the boundary value problem is the unique simultaneous solution of this pair of Fokker-Planck equations.
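For orientation, the Fokker-Planck equation of the initial value problem can be written out in the simplest setting. The scalar Itô form below and the symbols a, b, p are assumptions made for illustration; the thesis may use different notation and a more general (vector) equation.

```latex
% Assumed scalar Ito equation: dX_t = a(X_t, t)\,dt + b(X_t, t)\,dW_t
% p(x, t \mid x_0, t_0) is the transition probability density.
\[
  \frac{\partial p}{\partial t}
  = -\frac{\partial}{\partial x}\bigl[a(x,t)\,p\bigr]
    + \frac{1}{2}\,\frac{\partial^{2}}{\partial x^{2}}\bigl[b(x,t)^{2}\,p\bigr],
  \qquad
  p(x, t_0 \mid x_0, t_0) = \delta(x - x_0).
\]
```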

This theory is then applied to the problem of a vibrating string with stochastic density.

Relevance:

20.00%

Publisher:

Abstract:

Partial differential equations (PDEs) with multiscale coefficients are very difficult to solve due to the wide range of scales in the solutions. In this thesis, we propose some efficient numerical methods for both deterministic and stochastic PDEs based on the model reduction technique.

For the deterministic PDEs, the main purpose of our method is to derive an effective equation for the multiscale problem. An essential ingredient is to decompose the harmonic coordinate into a smooth part and a highly oscillatory part whose magnitude is small. Such a decomposition plays a key role in our construction of the effective equation. We show that the solution to the effective equation is smooth and can be resolved on a regular coarse mesh grid. Furthermore, we provide error analysis and show that the solution to the effective equation plus a correction term is close to the original multiscale solution.

For the stochastic PDEs, we propose the model reduction based data-driven stochastic method and multilevel Monte Carlo method. In the multiquery setting, and on the assumption that the ratio of the smallest scale and largest scale is not too small, we propose the multiscale data-driven stochastic method. We construct a data-driven stochastic basis and solve the coupled deterministic PDEs to obtain the solutions. For the tougher problems, we propose the multiscale multilevel Monte Carlo method. We apply the multilevel scheme to the effective equations and assemble the stiffness matrices efficiently on each coarse mesh grid. In both methods, the Karhunen-Loève (KL) expansion plays an important role in extracting the main parts of some stochastic quantities.
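How a KL expansion extracts the "main parts" of a stochastic quantity can be sketched in its discrete form: eigendecompose a covariance matrix and keep the leading modes. This is a generic illustration, not the thesis's method; the function names, the 95% energy threshold, and the Gaussian coefficients are all assumptions.

```python
import numpy as np

def kl_modes(cov, energy=0.95):
    """Leading eigenpairs of a covariance matrix capturing `energy` of the variance."""
    vals, vecs = np.linalg.eigh(cov)            # ascending eigenvalues
    order = np.argsort(vals)[::-1]              # reorder to descending
    vals = np.clip(vals[order], 0.0, None)      # guard against tiny negative round-off
    vecs = vecs[:, order]
    frac = np.cumsum(vals) / np.sum(vals)
    m = min(int(np.searchsorted(frac, energy)) + 1, vals.size)
    return vals[:m], vecs[:, :m]

def sample_field(mean, vals, vecs, n_samples, seed=0):
    """Draw samples mean + sum_k sqrt(lambda_k) * xi_k * phi_k with xi_k ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal((n_samples, vals.size))
    return mean + (xi * np.sqrt(vals)) @ vecs.T

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 50)
    cov = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.5)  # assumed exponential kernel
    vals, vecs = kl_modes(cov)
    print("modes kept:", vals.size, "of", x.size)
```

The truncation trades accuracy for dimension: fewer retained modes mean fewer random variables to sample or to couple in a deterministic system.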

For both the deterministic and stochastic PDEs, numerical results are presented to demonstrate the accuracy and robustness of the methods. We also show the computational time cost reduction in the numerical examples.

Relevance:

20.00%

Publisher:

Abstract:

The Hamilton-Jacobi-Bellman (HJB) equation is central to stochastic optimal control (SOC) theory, yielding the optimal solution to general problems specified by known dynamics and a specified cost functional. Given the assumption of quadratic cost on the control input, it is well known that the HJB reduces to a particular partial differential equation (PDE). While powerful, this reduction is not commonly used as the PDE is of second order, is nonlinear, and examples exist where the problem may not have a solution in a classical sense. Furthermore, each state of the system appears as another dimension of the PDE, giving rise to the curse of dimensionality. Since the number of degrees of freedom required to solve the optimal control problem grows exponentially with dimension, the problem becomes intractable for systems with all but modest dimension.

In the last decade researchers have found that under certain, fairly non-restrictive structural assumptions, the HJB may be transformed into a linear PDE, with an interesting analogue in the discretized domain of Markov Decision Processes (MDP). The work presented in this thesis uses the linearity of this particular form of the HJB PDE to push the computational boundaries of stochastic optimal control.
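The abstract does not state the transformation itself. The sketch below shows the logarithmic change of variables commonly used in the linearly solvable control literature; treating it as the form meant here is an assumption, as are the symbols f, G, B, R, q, the noise covariance, and the scalar lambda.

```latex
% Assumed setup: dynamics dx = (f + G u)\,dt + B\,d\omega with noise covariance
% \Sigma_\epsilon, running cost q(x) + \tfrac12 u^\top R u, and \Sigma_t = B \Sigma_\epsilon B^\top.
% Minimizing over u gives u^* = -R^{-1} G^\top \nabla V and the nonlinear HJB:
\[
  -\partial_t V = q + f^\top \nabla V
  - \tfrac12\, \nabla V^\top G R^{-1} G^\top \nabla V
  + \tfrac12\, \operatorname{Tr}\!\bigl(\Sigma_t \nabla^2 V\bigr).
\]
% Structural assumption: \lambda\, G R^{-1} G^\top = \Sigma_t for some \lambda > 0.
% Substituting V = -\lambda \log \Psi, the two quadratic terms in \nabla\Psi cancel,
% leaving a PDE that is linear in \Psi:
\[
  -\partial_t \Psi = -\frac{1}{\lambda}\, q\, \Psi + f^\top \nabla \Psi
  + \tfrac12\, \operatorname{Tr}\!\bigl(\Sigma_t \nabla^2 \Psi\bigr).
\]
```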

This is done by crafting together previously disjoint lines of research in computation. The first of these is the use of Sum of Squares (SOS) techniques for synthesis of control policies. A candidate polynomial with variable coefficients is proposed as the solution to the stochastic optimal control problem. An SOS relaxation is then taken to the partial differential constraints, leading to a hierarchy of semidefinite relaxations with improving sub-optimality gap. The resulting approximate solutions are shown to be guaranteed over- and under-approximations for the optimal value function. It is shown that these results extend to arbitrary parabolic and elliptic PDEs, yielding a novel method for Uncertainty Quantification (UQ) of systems governed by partial differential constraints. Domain decomposition techniques are also made available, allowing for such problems to be solved via parallelization and low-order polynomials.

The optimization-based SOS technique is then contrasted with the Separated Representation (SR) approach from the applied mathematics community. The technique allows for systems of equations to be solved through a low-rank decomposition that results in algorithms that scale linearly with dimensionality. Its application in stochastic optimal control allows for previously uncomputable problems to be solved quickly, scaling to such complex systems as the Quadcopter and VTOL aircraft. This technique may be combined with the SOS approach, yielding not only a numerical technique, but also an analytical one that allows for entirely new classes of systems to be studied and for stability properties to be guaranteed.

The analysis of the linear HJB is completed by the study of its implications in application. It is shown that the HJB and a popular technique in robotics, the use of navigation functions, sit on opposite ends of a spectrum of optimization problems, upon which tradeoffs may be made in problem complexity. Analytical solutions to the HJB in these settings are available in simplified domains, yielding guidance towards optimality for approximation schemes. Finally, the use of HJB equations in temporal multi-task planning problems is investigated. It is demonstrated that such problems are reducible to a sequence of SOC problems linked via boundary conditions. The linearity of the PDE allows us to pre-compute control policy primitives and then compose them, at essentially zero cost, to satisfy a complex temporal logic specification.

Relevance:

20.00%

Publisher:

Abstract:

There is a growing interest in taking advantage of possible patterns and structures in data so as to extract the desired information and overcome the curse of dimensionality. In a wide range of applications, including computer vision, machine learning, medical imaging, and social networks, the signal that gives rise to the observations can be modeled to be approximately sparse and exploiting this fact can be very beneficial. This has led to an immense interest in the problem of efficiently reconstructing a sparse signal from limited linear observations. More recently, low-rank approximation techniques have become prominent tools to approach problems arising in machine learning, system identification and quantum tomography.

In sparse and low-rank estimation problems, the challenge is the inherent intractability of the objective function, and one needs efficient methods to capture the low-dimensionality of these models. Convex optimization is often a promising tool to attack such problems. An intractable problem with a combinatorial objective can often be "relaxed" to obtain a tractable but almost as powerful convex optimization problem. This dissertation studies convex optimization techniques that can take advantage of low-dimensional representations of the underlying high-dimensional data. We provide provable guarantees that ensure that the proposed algorithms will succeed under reasonable conditions, and answer questions of the following flavor:

  • For a given number of measurements, can we reliably estimate the true signal?
  • If so, how good is the reconstruction as a function of the model parameters?

More specifically, i) Focusing on linear inverse problems, we generalize the classical error bounds known for the least-squares technique to the lasso formulation, which incorporates the signal model. ii) We show that intuitive convex approaches do not perform as well as expected when it comes to signals that have multiple low-dimensional structures simultaneously. iii) Finally, we propose convex relaxations for the graph clustering problem and give sharp performance guarantees for a family of graphs arising from the so-called stochastic block model. We pay particular attention to the following aspects. For i) and ii), we aim to provide a general geometric framework, in which the results on sparse and low-rank estimation can be obtained as special cases. For i) and iii), we investigate the precise performance characterization, which yields the right constants in our bounds and the true dependence between the problem parameters.
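The lasso formulation mentioned in i) can be made concrete with a short solver. This is a minimal sketch using the iterative soft-thresholding algorithm (ISTA), which is a standard generic method and not necessarily what the dissertation uses; the function names and the test problem are assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=500):
    """Approximately minimize 0.5 * ||A x - y||_2^2 + lam * ||x||_1 by ISTA."""
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth part's gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)         # gradient of the least-squares term
        x = soft_threshold(x - grad / L, lam / L)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((40, 100)) / np.sqrt(40)   # fewer measurements than unknowns
    x_true = np.zeros(100)
    x_true[[3, 30, 70]] = [2.0, -3.0, 1.5]             # assumed sparse signal
    x_hat = lasso_ista(A, A @ x_true, lam=0.05)
    print("largest entries at indices:", np.argsort(-np.abs(x_hat))[:3])
```

The l1 penalty is the convex relaxation of a sparsity (cardinality) constraint, which is what lets the estimator exploit the signal model when the number of measurements is smaller than the ambient dimension.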

Relevance:

20.00%

Publisher:

Abstract:

In the first part of the thesis we explore three fundamental questions that arise naturally when we conceive a machine learning scenario where the training and test distributions can differ. Contrary to conventional wisdom, we show that mismatched training and test distributions can in fact yield better out-of-sample performance. This optimal performance can be obtained by training with the dual distribution. This optimal training distribution depends on the test distribution set by the problem, but not on the target function that we want to learn. We show how to obtain this distribution in both discrete and continuous input spaces, as well as how to approximate it in a practical scenario. Benefits of using this distribution are exemplified in both synthetic and real data sets.

In order to apply the dual distribution in the supervised learning scenario where the training data set is fixed, it is necessary to use weights to make the sample appear as if it came from the dual distribution. We explore the negative effect that weighting a sample can have. The theoretical decomposition of the use of weights regarding its effect on the out-of-sample error is easy to understand but not actionable in practice, as the quantities involved cannot be computed. Hence, we propose the Targeted Weighting algorithm that determines if, for a given set of weights, the out-of-sample performance will improve or not in a practical setting. This is necessary as the setting assumes there are no labeled points distributed according to the test distribution, only unlabeled samples.
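The idea of weighting a fixed sample so that it appears to come from another distribution, and the cost of doing so, can be sketched with plain importance weights. This is generic density-ratio weighting, not the Targeted Weighting algorithm; the Gaussian densities, function names, and the effective-sample-size diagnostic are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def importance_weights(x, target_pdf, train_pdf):
    """Weights w_i proportional to target(x_i) / train(x_i), normalized to mean 1."""
    w = target_pdf(x) / train_pdf(x)
    return w / w.mean()

def effective_sample_size(w):
    """Kish effective sample size: equals len(w) for uniform weights, smaller otherwise."""
    return w.sum() ** 2 / (w ** 2).sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(20000)                       # training sample from N(0, 1)
    w = importance_weights(
        x,
        target_pdf=lambda v: gaussian_pdf(v, 0.5, 1.0),  # assumed desired distribution
        train_pdf=lambda v: gaussian_pdf(v, 0.0, 1.0),
    )
    print("weighted mean:", np.average(x, weights=w))
    print("effective sample size:", effective_sample_size(w))
```

The weighted sample matches the desired distribution in expectation, but uneven weights reduce the effective sample size, which is one way to see the negative effect of weighting discussed above.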

Finally, we propose a new class of matching algorithms that can be used to match the training set to a desired distribution, such as the dual distribution (or the test distribution). These algorithms can be applied to very large datasets, and we show how they lead to improved performance in a large real dataset such as the Netflix dataset. Their computational complexity is the main reason for their advantage over previous algorithms proposed in the covariate shift literature.

In the second part of the thesis we apply Machine Learning to the problem of behavior recognition. We develop a specific behavior classifier to study fly aggression, and we develop a system that allows analyzing behavior in videos of animals with minimal supervision. The system, which we call CUBA (Caltech Unsupervised Behavior Analysis), allows detecting movemes, actions, and stories from time series describing the position of animals in videos. The method summarizes the data and provides biologists with a mathematical tool to test new hypotheses. Other benefits of CUBA include finding classifiers for specific behaviors without the need for annotation, as well as providing means to discriminate groups of animals, for example, according to their genetic line.

Relevance:

20.00%

Publisher:

Abstract:

A general review of stochastic processes is given in the introduction; definitions, properties and a rough classification are presented together with the position and scope of the author's work as it fits into the general scheme.

The first section presents a brief summary of the pertinent analytical properties of continuous stochastic processes and their probability-theoretic foundations which are used in the sequel.

The remaining two sections (II and III), comprising the body of the work, are the author's contribution to the theory. It turns out that a very inclusive class of continuous stochastic processes is characterized by a fundamental partial differential equation and its adjoint (the Fokker-Planck equations). The coefficients appearing in those equations assimilate, in a most concise way, all the salient properties of the process, freed from boundary value considerations. The writer's work consists in characterizing the processes through these coefficients without recourse to solving the partial differential equations.

First, a class of coefficients leading to a unique, continuous process is presented, and several facts are proven to show why this class is restricted. Then, in terms of the coefficients, the unconditional statistics are deduced, these being the mean, variance and covariance. The most general class of coefficients leading to the Gaussian distribution is deduced, and a complete characterization of these processes is presented. By specializing the coefficients, all the known stochastic processes may be readily studied, and some examples of these are presented; viz. the Einstein process, Bachelier process, Ornstein-Uhlenbeck process, etc. The calculations are effectively reduced down to ordinary first order differential equations, and in addition to giving a comprehensive characterization, the derivations are materially simplified over the solution to the original partial differential equations.
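As an illustration of how the statistics reduce to ordinary first-order differential equations, consider the Ornstein-Uhlenbeck process in modern SDE notation. The notation (theta, sigma, m, v) is an assumption and is not taken from the thesis.

```latex
% Assumed form: dX_t = -\theta X_t\,dt + \sigma\,dW_t, with mean m(t) = E[X_t]
% and variance v(t) = \operatorname{Var}[X_t].
\[
  \frac{dm}{dt} = -\theta\, m,
  \qquad
  \frac{dv}{dt} = -2\theta\, v + \sigma^{2},
\]
% Both are linear first-order ODEs with closed-form solutions:
\[
  m(t) = m(0)\, e^{-\theta t},
  \qquad
  v(t) = v(0)\, e^{-2\theta t} + \frac{\sigma^{2}}{2\theta}\bigl(1 - e^{-2\theta t}\bigr).
\]
```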

In the last section the properties of the integral process are presented. After an expository section on the definition, meaning, and importance of the integral process, a particular example is carried through starting from the basic definition. This illustrates the fundamental properties and an inherent paradox. Next the basic coefficients of the integral process are studied in terms of the original coefficients, and the integral process is uniquely characterized. It is shown that the integral process, with a slight modification, is a continuous Markoff process.

The elementary statistics of the integral process are deduced: means, variances, and covariances, in terms of the original coefficients. It is shown that an integral process is never temporally homogeneous in a non-degenerate process.

Finally, in terms of the original class of admissible coefficients, the statistics of the integral process are explicitly presented, and the integral processes of all known continuous processes are specified.

Relevance:

20.00%

Publisher:

Abstract:

H. J. Kushner has obtained the differential equation satisfied by the optimal feedback control law for a stochastic control system in which the plant dynamics and observations are perturbed by independent additive Gaussian white noise processes. However, the differentiation includes the first and second functional derivatives and, except for a restricted set of systems, is too complex to solve with present techniques.

This investigation studies the optimal control law for the open loop system and incorporates it in a suboptimal feedback control law. This suboptimal control law's performance is at least as good as that of the optimal control function and satisfies a differential equation involving only the first functional derivative. The solution of this equation is equivalent to solving two two-point boundary value integro-partial differential equations. An approximate solution has advantages over the conventional approximate solution of Kushner's equation.

As a result of this study, well known results of deterministic optimal control are deduced from the analysis of optimal open loop control.

Relevance:

20.00%

Publisher:

Abstract:

Optical Coherence Tomography (OCT) is a popular, rapidly growing imaging technique with an increasing number of biomedical applications due to its noninvasive nature. However, there are three major challenges in understanding and improving an OCT system: (1) Obtaining an OCT image is not easy. It either takes a real medical experiment or requires days of computer simulation. Without much data, it is difficult to study the physical processes underlying OCT imaging of different objects simply because there aren't many imaged objects. (2) Interpretation of an OCT image is also hard. This challenge is more profound than it appears. For instance, it would require a trained expert to tell from an OCT image of human skin whether there is a lesion or not. This is expensive in its own right, but even the expert cannot be sure about the exact size of the lesion or the width of the various skin layers. The take-away message is that analyzing an OCT image even from a high level would usually require a trained expert, and pixel-level interpretation is simply unrealistic. The reason is simple: we have OCT images but not their underlying ground-truth structure, so there is nothing to learn from. (3) The imaging depth of OCT is very limited (millimeter or sub-millimeter on human tissues). While OCT utilizes infrared light for illumination to stay noninvasive, the downside is that photons at such long wavelengths can only penetrate a limited depth into the tissue before getting back-scattered. To image a particular region of a tissue, photons first need to reach that region. As a result, OCT signals from deeper regions of the tissue are both weak (since few photons reach there) and distorted (due to multiple scatterings of the contributing photons). This fact alone makes OCT images very hard to interpret.

This thesis addresses the above challenges by developing an advanced Monte Carlo simulation platform which is 10000 times faster than the state-of-the-art simulator in the literature, bringing down the simulation time from 360 hours to a single minute. This powerful simulation tool not only enables us to efficiently generate as many OCT images of objects with arbitrary structure and shape as we want on a common desktop computer, but it also provides us with the underlying ground-truth of the simulated images at the same time, because we dictate them at the beginning of the simulation. This is one of the key contributions of this thesis. What allows us to build such a powerful simulation tool includes a thorough understanding of the signal formation process, clever implementation of the importance sampling/photon splitting procedure, efficient use of a voxel-based mesh system in determining photon-mesh interception, and a parallel computation of the different A-scans that constitute a full OCT image, among other programming and mathematical tricks, which will be explained in detail later in the thesis.

Next we aim at the inverse problem: given an OCT image, predict/reconstruct its ground-truth structure on a pixel level. By solving this problem we would be able to interpret an OCT image completely and precisely without help from a trained expert. It turns out that we can do much better. For simple structures we are able to reconstruct the ground-truth of an OCT image more than 98% correctly, and for more complicated structures (e.g., a multi-layered brain structure) we are looking at 93%. We achieved this through extensive use of Machine Learning. The success of the Monte Carlo simulation already puts us in a great position by providing us with a great deal of data (effectively unlimited), in the form of (image, truth) pairs. Through a transformation of the high-dimensional response variable, we convert the learning task into a multi-output multi-class classification problem and a multi-output regression problem. We then build a hierarchical architecture of machine learning models (committee of experts) and train different parts of the architecture with specifically designed data sets. In prediction, an unseen OCT image first goes through a classification model to determine its structure (e.g., the number and the types of layers present in the image); then the image is handed to a regression model that is trained specifically for that particular structure to predict the length of the different layers and by doing so reconstruct the ground-truth of the image. We also demonstrate that ideas from Deep Learning can be useful to further improve the performance.

It is worth pointing out that solving the inverse problem automatically improves the imaging depth, since the lower half of an OCT image (i.e., greater depth), which previously could hardly be seen, now becomes fully resolved. Interestingly, although the OCT signals constituting the lower half of the image are weak, messy, and uninterpretable to human eyes, they still carry enough information that a well-trained machine learning model, when fed these signals, outputs precisely the true structure of the object being imaged. This is another case where Artificial Intelligence (AI) outperforms humans. To the best knowledge of the author, this thesis is not only a success but also the first attempt to reconstruct an OCT image at a pixel level. Even attempting this kind of task would require fully annotated OCT images, and a lot of them (hundreds or even thousands). This is clearly impossible without a powerful simulation tool like the one developed in this thesis.