8 resultados para Expectation Maximization Estimation
em CaltechTHESIS
Resumo:
The brain is perhaps the most complex system to have ever been subjected to rigorous scientific investigation. The scale is staggering: over 10^11 neurons, each making an average of 10^3 synapses, with computation occurring on scales ranging from a single dendritic spine, to an entire cortical area. Slowly, we are beginning to acquire experimental tools that can gather the massive amounts of data needed to characterize this system. However, to understand and interpret these data will also require substantial strides in inferential and statistical techniques. This dissertation attempts to meet this need, extending and applying the modern tools of latent variable modeling to problems in neural data analysis.
It is divided into two parts. The first begins with an exposition of the general techniques of latent variable modeling. A new, extremely general, optimization algorithm is proposed - called Relaxation Expectation Maximization (REM) - that may be used to learn the optimal parameter values of arbitrary latent variable models. This algorithm appears to alleviate the common problem of convergence to local, sub-optimal, likelihood maxima. REM leads to a natural framework for model size selection; in combination with standard model selection techniques the quality of fits may be further improved, while the appropriate model size is automatically and efficiently determined. Next, a new latent variable model, the mixture of sparse hidden Markov models, is introduced, and approximate inference and learning algorithms are derived for it. This model is applied in the second part of the thesis.
The second part brings the technology of part I to bear on two important problems in experimental neuroscience. The first is known as spike sorting; this is the problem of separating the spikes from different neurons embedded within an extracellular recording. The dissertation offers the first thorough statistical analysis of this problem, which then yields the first powerful probabilistic solution. The second problem addressed is that of characterizing the distribution of spike trains recorded from the same neuron under identical experimental conditions. A latent variable model is proposed. Inference and learning in this model leads to new principled algorithms for smoothing and clustering of spike data.
Resumo:
Be it a physical object or a mathematical model, a nonlinear dynamical system can display complicated aperiodic behavior, or "chaos." In many cases, this chaos is associated with motion on a strange attractor in the system's phase space. And the dimension of the strange attractor indicates the effective number of degrees of freedom in the dynamical system.
In this thesis, we investigate numerical issues involved with estimating the dimension of a strange attractor from a finite time series of measurements on the dynamical system.
Of the various definitions of dimension, we argue that the correlation dimension is the most efficiently calculable and we remark further that it is the most commonly calculated. We are concerned with the practical problems that arise in attempting to compute the correlation dimension. We deal with geometrical effects (due to the inexact self-similarity of the attractor), dynamical effects (due to the nonindependence of points generated by the dynamical system that defines the attractor), and statistical effects (due to the finite number of points that sample the attractor). We propose a modification of the standard algorithm, which eliminates a specific effect due to autocorrelation, and a new implementation of the correlation algorithm, which is computationally efficient.
Finally, we apply the algorithm to chaotic data from the Caltech tokamak and the Texas tokamak (TEXT); we conclude that plasma turbulence is not a low- dimensional phenomenon.
Resumo:
This thesis presents a novel framework for state estimation in the context of robotic grasping and manipulation. The overall estimation approach is based on fusing various visual cues for manipulator tracking, namely appearance and feature-based, shape-based, and silhouette-based visual cues. Similarly, a framework is developed to fuse the above visual cues, but also kinesthetic cues such as force-torque and tactile measurements, for in-hand object pose estimation. The cues are extracted from multiple sensor modalities and are fused in a variety of Kalman filters.
A hybrid estimator is developed to estimate both a continuous state (robot and object states) and discrete states, called contact modes, which specify how each finger contacts a particular object surface. A static multiple model estimator is used to compute and maintain this mode probability. The thesis also develops an estimation framework for estimating model parameters associated with object grasping. Dual and joint state-parameter estimation is explored for parameter estimation of a grasped object's mass and center of mass. Experimental results demonstrate simultaneous object localization and center of mass estimation.
Dual-arm estimation is developed for two arm robotic manipulation tasks. Two types of filters are explored; the first is an augmented filter that contains both arms in the state vector while the second runs two filters in parallel, one for each arm. These two frameworks and their performance is compared in a dual-arm task of removing a wheel from a hub.
This thesis also presents a new method for action selection involving touch. This next best touch method selects an available action for interacting with an object that will gain the most information. The algorithm employs information theory to compute an information gain metric that is based on a probabilistic belief suitable for the task. An estimation framework is used to maintain this belief over time. Kinesthetic measurements such as contact and tactile measurements are used to update the state belief after every interactive action. Simulation and experimental results are demonstrated using next best touch for object localization, specifically a door handle on a door. The next best touch theory is extended for model parameter determination. Since many objects within a particular object category share the same rough shape, principle component analysis may be used to parametrize the object mesh models. These parameters can be estimated using the action selection technique that selects the touching action which best both localizes and estimates these parameters. Simulation results are then presented involving localizing and determining a parameter of a screwdriver.
Lastly, the next best touch theory is further extended to model classes. Instead of estimating parameters, object class determination is incorporated into the information gain metric calculation. The best touching action is selected in order to best discern between the possible model classes. Simulation results are presented to validate the theory.
Resumo:
Signal processing techniques play important roles in the design of digital communication systems. These include information manipulation, transmitter signal processing, channel estimation, channel equalization and receiver signal processing. By interacting with communication theory and system implementing technologies, signal processing specialists develop efficient schemes for various communication problems by wisely exploiting various mathematical tools such as analysis, probability theory, matrix theory, optimization theory, and many others. In recent years, researchers realized that multiple-input multiple-output (MIMO) channel models are applicable to a wide range of different physical communications channels. Using the elegant matrix-vector notations, many MIMO transceiver (including the precoder and equalizer) design problems can be solved by matrix and optimization theory. Furthermore, the researchers showed that the majorization theory and matrix decompositions, such as singular value decomposition (SVD), geometric mean decomposition (GMD) and generalized triangular decomposition (GTD), provide unified frameworks for solving many of the point-to-point MIMO transceiver design problems.
In this thesis, we consider the transceiver design problems for linear time invariant (LTI) flat MIMO channels, linear time-varying narrowband MIMO channels, flat MIMO broadcast channels, and doubly selective scalar channels. Additionally, the channel estimation problem is also considered. The main contributions of this dissertation are the development of new matrix decompositions, and the uses of the matrix decompositions and majorization theory toward the practical transmit-receive scheme designs for transceiver optimization problems. Elegant solutions are obtained, novel transceiver structures are developed, ingenious algorithms are proposed, and performance analyses are derived.
The first part of the thesis focuses on transceiver design with LTI flat MIMO channels. We propose a novel matrix decomposition which decomposes a complex matrix as a product of several sets of semi-unitary matrices and upper triangular matrices in an iterative manner. The complexity of the new decomposition, generalized geometric mean decomposition (GGMD), is always less than or equal to that of geometric mean decomposition (GMD). The optimal GGMD parameters which yield the minimal complexity are derived. Based on the channel state information (CSI) at both the transmitter (CSIT) and receiver (CSIR), GGMD is used to design a butterfly structured decision feedback equalizer (DFE) MIMO transceiver which achieves the minimum average mean square error (MSE) under the total transmit power constraint. A novel iterative receiving detection algorithm for the specific receiver is also proposed. For the application to cyclic prefix (CP) systems in which the SVD of the equivalent channel matrix can be easily computed, the proposed GGMD transceiver has K/log_2(K) times complexity advantage over the GMD transceiver, where K is the number of data symbols per data block and is a power of 2. The performance analysis shows that the GGMD DFE transceiver can convert a MIMO channel into a set of parallel subchannels with the same bias and signal to interference plus noise ratios (SINRs). Hence, the average bit rate error (BER) is automatically minimized without the need for bit allocation. Moreover, the proposed transceiver can achieve the channel capacity simply by applying independent scalar Gaussian codes of the same rate at subchannels.
In the second part of the thesis, we focus on MIMO transceiver design for slowly time-varying MIMO channels with zero-forcing or MMSE criterion. Even though the GGMD/GMD DFE transceivers work for slowly time-varying MIMO channels by exploiting the instantaneous CSI at both ends, their performance is by no means optimal since the temporal diversity of the time-varying channels is not exploited. Based on the GTD, we develop space-time GTD (ST-GTD) for the decomposition of linear time-varying flat MIMO channels. Under the assumption that CSIT, CSIR and channel prediction are available, by using the proposed ST-GTD, we develop space-time geometric mean decomposition (ST-GMD) DFE transceivers under the zero-forcing or MMSE criterion. Under perfect channel prediction, the new system minimizes both the average MSE at the detector in each space-time (ST) block (which consists of several coherence blocks), and the average per ST-block BER in the moderate high SNR region. Moreover, the ST-GMD DFE transceiver designed under an MMSE criterion maximizes Gaussian mutual information over the equivalent channel seen by each ST-block. In general, the newly proposed transceivers perform better than the GGMD-based systems since the super-imposed temporal precoder is able to exploit the temporal diversity of time-varying channels. For practical applications, a novel ST-GTD based system which does not require channel prediction but shares the same asymptotic BER performance with the ST-GMD DFE transceiver is also proposed.
The third part of the thesis considers two quality of service (QoS) transceiver design problems for flat MIMO broadcast channels. The first one is the power minimization problem (min-power) with a total bitrate constraint and per-stream BER constraints. The second problem is the rate maximization problem (max-rate) with a total transmit power constraint and per-stream BER constraints. Exploiting a particular class of joint triangularization (JT), we are able to jointly optimize the bit allocation and the broadcast DFE transceiver for the min-power and max-rate problems. The resulting optimal designs are called the minimum power JT broadcast DFE transceiver (MPJT) and maximum rate JT broadcast DFE transceiver (MRJT), respectively. In addition to the optimal designs, two suboptimal designs based on QR decomposition are proposed. They are realizable for arbitrary number of users.
Finally, we investigate the design of a discrete Fourier transform (DFT) modulated filterbank transceiver (DFT-FBT) with LTV scalar channels. For both cases with known LTV channels and unknown wide sense stationary uncorrelated scattering (WSSUS) statistical channels, we show how to optimize the transmitting and receiving prototypes of a DFT-FBT such that the SINR at the receiver is maximized. Also, a novel pilot-aided subspace channel estimation algorithm is proposed for the orthogonal frequency division multiplexing (OFDM) systems with quasi-stationary multi-path Rayleigh fading channels. Using the concept of a difference co-array, the new technique can construct M^2 co-pilots from M physical pilot tones with alternating pilot placement. Subspace methods, such as MUSIC and ESPRIT, can be used to estimate the multipath delays and the number of identifiable paths is up to O(M^2), theoretically. With the delay information, a MMSE estimator for frequency response is derived. It is shown through simulations that the proposed method outperforms the conventional subspace channel estimator when the number of multipaths is greater than or equal to the number of physical pilots minus one.
Resumo:
A central objective in signal processing is to infer meaningful information from a set of measurements or data. While most signal models have an overdetermined structure (the number of unknowns less than the number of equations), traditionally very few statistical estimation problems have considered a data model which is underdetermined (number of unknowns more than the number of equations). However, in recent times, an explosion of theoretical and computational methods have been developed primarily to study underdetermined systems by imposing sparsity on the unknown variables. This is motivated by the observation that inspite of the huge volume of data that arises in sensor networks, genomics, imaging, particle physics, web search etc., their information content is often much smaller compared to the number of raw measurements. This has given rise to the possibility of reducing the number of measurements by down sampling the data, which automatically gives rise to underdetermined systems.
In this thesis, we provide new directions for estimation in an underdetermined system, both for a class of parameter estimation problems and also for the problem of sparse recovery in compressive sensing. There are two main contributions of the thesis: design of new sampling and statistical estimation algorithms for array processing, and development of improved guarantees for sparse reconstruction by introducing a statistical framework to the recovery problem.
We consider underdetermined observation models in array processing where the number of unknown sources simultaneously received by the array can be considerably larger than the number of physical sensors. We study new sparse spatial sampling schemes (array geometries) as well as propose new recovery algorithms that can exploit priors on the unknown signals and unambiguously identify all the sources. The proposed sampling structure is generic enough to be extended to multiple dimensions as well as to exploit different kinds of priors in the model such as correlation, higher order moments, etc.
Recognizing the role of correlation priors and suitable sampling schemes for underdetermined estimation in array processing, we introduce a correlation aware framework for recovering sparse support in compressive sensing. We show that it is possible to strictly increase the size of the recoverable sparse support using this framework provided the measurement matrix is suitably designed. The proposed nested and coprime arrays are shown to be appropriate candidates in this regard. We also provide new guarantees for convex and greedy formulations of the support recovery problem and demonstrate that it is possible to strictly improve upon existing guarantees.
This new paradigm of underdetermined estimation that explicitly establishes the fundamental interplay between sampling, statistical priors and the underlying sparsity, leads to exciting future research directions in a variety of application areas, and also gives rise to new questions that can lead to stand-alone theoretical results in their own right.
Resumo:
Complexity in the earthquake rupture process can result from many factors. This study investigates the origin of such complexity by examining several recent, large earthquakes in detail. In each case the local tectonic environment plays an important role in understanding the source of the complexity.
Several large shallow earthquakes (Ms > 7.0) along the Middle American Trench have similarities and differences between them that may lead to a better understanding of fracture and subduction processes. They are predominantly thrust events consistent with the known subduction of the Cocos plate beneath N. America. Two events occurring along this subduction zone close to triple junctions show considerable complexity. This may be attributable to a more heterogeneous stress environment in these regions and as such has implications for other subduction zone boundaries.
An event which looks complex but is actually rather simple is the 1978 Bermuda earthquake (Ms ~ 6). It is located predominantly in the mantle. Its mechanism is one of pure thrust faulting with a strike N 20°W and dip 42°NE. Its apparent complexity is caused by local crustal structure. This is an important event in terms of understanding and estimating seismic hazard on the eastern seaboard of N. America.
A study of several large strike-slip continental earthquakes identifies characteristics which are common to them and may be useful in determining what to expect from the next great earthquake on the San Andreas fault. The events are the 1976 Guatemala earthquake on the Motagua fault and two events on the Anatolian fault in Turkey (the 1967, Mudurnu Valley and 1976, E. Turkey events). An attempt to model the complex P-waveforms of these events results in good synthetic fits for the Guatemala and Mudurnu Valley events. However, the E. Turkey event proves to be too complex as it may have associated thrust or normal faulting. Several individual sources occurring at intervals of between 5 and 20 seconds characterize the Guatemala and Mudurnu Valley events. The maximum size of an individual source appears to be bounded at about 5 x 1026 dyne-cm. A detailed source study including directivity is performed on the Guatemala event. The source time history of the Mudurnu Valley event illustrates its significance in modeling strong ground motion in the near field. The complex source time series of the 1967 event produces amplitudes greater by a factor of 2.5 than a uniform model scaled to the same size for a station 20 km from the fault.
Three large and important earthquakes demonstrate an important type of complexity --- multiple-fault complexity. The first, the 1976 Philippine earthquake, an oblique thrust event, represents the first seismological evidence for a northeast dipping subduction zone beneath the island of Mindanao. A large event, following the mainshock by 12 hours, occurred outside the aftershock area and apparently resulted from motion on a subsidiary fault since the event had a strike-slip mechanism.
An aftershock of the great 1960 Chilean earthquake on June 6, 1960, proved to be an interesting discovery. It appears to be a large strike-slip event at the main rupture's southern boundary. It most likely occurred on the landward extension of the Chile Rise transform fault, in the subducting plate. The results for this event suggest that a small event triggered a series of slow events; the duration of the whole sequence being longer than 1 hour. This is indeed a "slow earthquake".
Perhaps one of the most complex of events is the recent Tangshan, China event. It began as a large strike-slip event. Within several seconds of the mainshock it may have triggered thrust faulting to the south of the epicenter. There is no doubt, however, that it triggered a large oblique normal event to the northeast, 15 hours after the mainshock. This event certainly contributed to the great loss of life-sustained as a result of the Tangshan earthquake sequence.
What has been learned from these studies has been applied to predict what one might expect from the next great earthquake on the San Andreas. The expectation from this study is that such an event would be a large complex event, not unlike, but perhaps larger than, the Guatemala or Mudurnu Valley events. That is to say, it will most likely consist of a series of individual events in sequence. It is also quite possible that the event could trigger associated faulting on neighboring fault systems such as those occurring in the Transverse Ranges. This has important bearing on the earthquake hazard estimation for the region.
Resumo:
The following work explores the processes individuals utilize when making multi-attribute choices. With the exception of extremely simple or familiar choices, most decisions we face can be classified as multi-attribute choices. In order to evaluate and make choices in such an environment, we must be able to estimate and weight the particular attributes of an option. Hence, better understanding the mechanisms involved in this process is an important step for economists and psychologists. For example, when choosing between two meals that differ in taste and nutrition, what are the mechanisms that allow us to estimate and then weight attributes when constructing value? Furthermore, how can these mechanisms be influenced by variables such as attention or common physiological states, like hunger?
In order to investigate these and similar questions, we use a combination of choice and attentional data, where the attentional data was collected by recording eye movements as individuals made decisions. Chapter 1 designs and tests a neuroeconomic model of multi-attribute choice that makes predictions about choices, response time, and how these variables are correlated with attention. Chapter 2 applies the ideas in this model to intertemporal decision-making, and finds that attention causally affects discount rates. Chapter 3 explores how hunger, a common physiological state, alters the mechanisms we utilize as we make simple decisions about foods.
Resumo:
Techniques are developed for estimating activity profiles in fixed bed reactors and catalyst deactivation parameters from operating reactor data. These techniques are applicable, in general, to most industrial catalytic processes. The catalytic reforming of naphthas is taken as a broad example to illustrate the estimation schemes and to signify the physical meaning of the kinetic parameters of the estimation equations. The work is described in two parts. Part I deals with the modeling of kinetic rate expressions and the derivation of the working equations for estimation. Part II concentrates on developing various estimation techniques.
Part I: The reactions used to describe naphtha reforming are dehydrogenation and dehydroisomerization of cycloparaffins; isomerization, dehydrocyclization and hydrocracking of paraffins; and the catalyst deactivation reactions, namely coking on alumina sites and sintering of platinum crystallites. The rate expressions for the above reactions are formulated, and the effects of transport limitations on the overall reaction rates are discussed in the appendices. Moreover, various types of interaction between the metallic and acidic active centers of reforming catalysts are discussed as characterizing the different types of reforming reactions.
Part II: In catalytic reactor operation, the activity distribution along the reactor determines the kinetics of the main reaction and is needed for predicting the effect of changes in the feed state and the operating conditions on the reactor output. In the case of a monofunctional catalyst and of bifunctional catalysts in limiting conditions, the cumulative activity is sufficient for predicting steady reactor output. The estimation of this cumulative activity can be carried out easily from measurements at the reactor exit. For a general bifunctional catalytic system, the detailed activity distribution is needed for describing the reactor operation, and some approximation must be made to obtain practicable estimation schemes. This is accomplished by parametrization techniques using measurements at a few points along the reactor. Such parametrization techniques are illustrated numerically with a simplified model of naphtha reforming.
To determine long term catalyst utilization and regeneration policies, it is necessary to estimate catalyst deactivation parameters from the the current operating data. For a first order deactivation model with a monofunctional catalyst or with a bifunctional catalyst in special limiting circumstances, analytical techniques are presented to transform the partial differential equations to ordinary differential equations which admit more feasible estimation schemes. Numerical examples include the catalytic oxidation of butene to butadiene and a simplified model of naphtha reforming. For a general bifunctional system or in the case of a monofunctional catalyst subject to general power law deactivation, the estimation can only be accomplished approximately. The basic feature of an appropriate estimation scheme involves approximating the activity profile by certain polynomials and then estimating the deactivation parameters from the integrated form of the deactivation equation by regression techniques. Different bifunctional systems must be treated by different estimation algorithms, which are illustrated by several cases of naphtha reforming with different feed or catalyst composition.