10 resultados para State-space modeling

em Duke University


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The advances in three related areas of state-space modeling, sequential Bayesian learning, and decision analysis are addressed, with the statistical challenges of scalability and associated dynamic sparsity. The key theme that ties the three areas is Bayesian model emulation: solving challenging analysis/computational problems using creative model emulators. This idea defines theoretical and applied advances in non-linear, non-Gaussian state-space modeling, dynamic sparsity, decision analysis and statistical computation, across linked contexts of multivariate time series and dynamic networks studies. Examples and applications in financial time series and portfolio analysis, macroeconomics and internet studies from computational advertising demonstrate the utility of the core methodological innovations.

Chapter 1 summarizes the three areas/problems and the key idea of emulating in those areas. Chapter 2 discusses the sequential analysis of latent threshold models with use of emulating models that allows for analytical filtering to enhance the efficiency of posterior sampling. Chapter 3 examines the emulator model in decision analysis, or the synthetic model, that is equivalent to the loss function in the original minimization problem, and shows its performance in the context of sequential portfolio optimization. Chapter 4 describes the method for modeling the steaming data of counts observed on a large network that relies on emulating the whole, dependent network model by independent, conjugate sub-models customized to each set of flow. Chapter 5 reviews those advances and makes the concluding remarks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A class of multi-process models is developed for collections of time indexed count data. Autocorrelation in counts is achieved with dynamic models for the natural parameter of the binomial distribution. In addition to modeling binomial time series, the framework includes dynamic models for multinomial and Poisson time series. Markov chain Monte Carlo (MCMC) and Po ́lya-Gamma data augmentation (Polson et al., 2013) are critical for fitting multi-process models of counts. To facilitate computation when the counts are high, a Gaussian approximation to the P ́olya- Gamma random variable is developed.

Three applied analyses are presented to explore the utility and versatility of the framework. The first analysis develops a model for complex dynamic behavior of themes in collections of text documents. Documents are modeled as a “bag of words”, and the multinomial distribution is used to characterize uncertainty in the vocabulary terms appearing in each document. State-space models for the natural parameters of the multinomial distribution induce autocorrelation in themes and their proportional representation in the corpus over time.

The second analysis develops a dynamic mixed membership model for Poisson counts. The model is applied to a collection of time series which record neuron level firing patterns in rhesus monkeys. The monkey is exposed to two sounds simultaneously, and Gaussian processes are used to smoothly model the time-varying rate at which the neuron’s firing pattern fluctuates between features associated with each sound in isolation.

The third analysis presents a switching dynamic generalized linear model for the time-varying home run totals of professional baseball players. The model endows each player with an age specific latent natural ability class and a performance enhancing drug (PED) use indicator. As players age, they randomly transition through a sequence of ability classes in a manner consistent with traditional aging patterns. When the performance of the player significantly deviates from the expected aging pattern, he is identified as a player whose performance is consistent with PED use.

All three models provide a mechanism for sharing information across related series locally in time. The models are fit with variations on the P ́olya-Gamma Gibbs sampler, MCMC convergence diagnostics are developed, and reproducible inference is emphasized throughout the dissertation.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Multi-output Gaussian processes provide a convenient framework for multi-task problems. An illustrative and motivating example of a multi-task problem is multi-region electrophysiological time-series data, where experimentalists are interested in both power and phase coherence between channels. Recently, the spectral mixture (SM) kernel was proposed to model the spectral density of a single task in a Gaussian process framework. This work develops a novel covariance kernel for multiple outputs, called the cross-spectral mixture (CSM) kernel. This new, flexible kernel represents both the power and phase relationship between multiple observation channels. The expressive capabilities of the CSM kernel are demonstrated through implementation of 1) a Bayesian hidden Markov model, where the emission distribution is a multi-output Gaussian process with a CSM covariance kernel, and 2) a Gaussian process factor analysis model, where factor scores represent the utilization of cross-spectral neural circuits. Results are presented for measured multi-region electrophysiological data.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This dissertation examined the response to termination of CO2 enrichment of a forest ecosystem exposed to long-term elevated atmospheric CO2 condition, and aimed at investigating responses and their underlying mechanisms of two important factors of carbon cycle in the ecosystem, stomatal conductance and soil respiration. Because the contribution of understory vegetation to the entire ecosystem grew with time, we first investigated the effect of elevated CO2 on understory vegetation. Potential growth enhancing effect of elevated CO2 were not observed, and light seemed to be a limiting factor. Secondly, we examined the importance of aerodynamic conductance to determine canopy conductance, and found that its effect can be negligible. Responses of stomatal conductance and soil respiration were assessed using Bayesian state space model. In two years after the termination of CO2 enrichment, stomatal conductance in formerly elevated CO2 returned to ambient level, while soil respiration became smaller than ambient level and did not recovered to ambient in two years.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Free energy calculations are a computational method for determining thermodynamic quantities, such as free energies of binding, via simulation.

Currently, due to computational and algorithmic limitations, free energy calculations are limited in scope.

In this work, we propose two methods for improving the efficiency of free energy calculations.

First, we expand the state space of alchemical intermediates, and show that this expansion enables us to calculate free energies along lower variance paths.

We use Q-learning, a reinforcement learning technique, to discover and optimize paths at low computational cost.

Second, we reduce the cost of sampling along a given path by using sequential Monte Carlo samplers.

We develop a new free energy estimator, pCrooks (pairwise Crooks), a variant on the Crooks fluctuation theorem (CFT), which enables decomposition of the variance of the free energy estimate for discrete paths, while retaining beneficial characteristics of CFT.

Combining these two advancements, we show that for some test models, optimal expanded-space paths have a nearly 80% reduction in variance relative to the standard path.

Additionally, our free energy estimator converges at a more consistent rate and on average 1.8 times faster when we enable path searching, even when the cost of path discovery and refinement is considered.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new modality for preventing HIV transmission is emerging in the form of topical microbicides. Some clinical trials have shown some promising results of these methods of protection while other trials have failed to show efficacy. Due to the relatively novel nature of microbicide drug transport, a rigorous, deterministic analysis of that transport can help improve the design of microbicide vehicles and understand results from clinical trials. This type of analysis can aid microbicide product design by helping understand and organize the determinants of drug transport and the potential efficacies of candidate microbicide products.

Microbicide drug transport is modeled as a diffusion process with convection and reaction effects in appropriate compartments. This is applied here to vaginal gels and rings and a rectal enema, all delivering the microbicide drug Tenofovir. Although the focus here is on Tenofovir, the methods established in this dissertation can readily be adapted to other drugs, given knowledge of their physical and chemical properties, such as the diffusion coefficient, partition coefficient, and reaction kinetics. Other dosage forms such as tablets and fiber meshes can also be modeled using the perspective and methods developed here.

The analyses here include convective details of intravaginal flows by both ambient fluid and spreading gels with different rheological properties and applied volumes. These are input to the overall conservation equations for drug mass transport in different compartments. The results are Tenofovir concentration distributions in time and space for a variety of microbicide products and conditions. The Tenofovir concentrations in the vaginal and rectal mucosal stroma are converted, via a coupled reaction equation, to concentrations of Tenofovir diphosphate, which is the active form of the drug that functions as a reverse transcriptase inhibitor against HIV. Key model outputs are related to concentrations measured in experimental pharmacokinetic (PK) studies, e.g. concentrations in biopsies and blood. A new measure of microbicide prophylactic functionality, the Percent Protected, is calculated. This is the time dependent volume of the entire stroma (and thus fraction of host cells therein) in which Tenofovir diphosphate concentrations equal or exceed a target prophylactic value, e.g. an EC50.

Results show the prophylactic potentials of the studied microbicide vehicles against HIV infections. Key design parameters for each are addressed in application of the models. For a vaginal gel, fast spreading at small volume is more effective than slower spreading at high volume. Vaginal rings are shown to be most effective if inserted and retained as close to the fornix as possible. Because of the long half-life of Tenofovir diphosphate, temporary removal of the vaginal ring (after achieving steady state) for up to 24h does not appreciably diminish Percent Protected. However, full steady state (for the entire stromal volume) is not achieved until several days after ring insertion. Delivery of Tenofovir to the rectal mucosa by an enema is dominated by surface area of coated mucosa and whether the interiors of rectal crypts are filled with the enema fluid. For the enema 100% Percent Protected is achieved much more rapidly than for vaginal products, primarily because of the much thinner epithelial layer of the mucosa. For example, 100% Percent Protected can be achieved with a one minute enema application, and 15 minute wait time.

Results of these models have good agreement with experimental pharmacokinetic data, in animals and clinical trials. They also improve upon traditional, empirical PK modeling, and this is illustrated here. Our deterministic approach can inform design of sampling in clinical trials by indicating time periods during which significant changes in drug concentrations occur in different compartments. More fundamentally, the work here helps delineate the determinants of microbicide drug delivery. This information can be the key to improved, rational design of microbicide products and their dosage regimens.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Light rainfall is the baseline input to the annual water budget in mountainous landscapes through the tropics and at mid-latitudes. In the Southern Appalachians, the contribution from light rainfall ranges from 50-60% during wet years to 80-90% during dry years, with convective activity and tropical cyclone input providing most of the interannual variability. The Southern Appalachians is a region characterized by rich biodiversity that is vulnerable to land use/land cover changes due to its proximity to a rapidly growing population. Persistent near surface moisture and associated microclimates observed in this region has been well documented since the colonization of the area in terms of species health, fire frequency, and overall biodiversity. The overarching objective of this research is to elucidate the microphysics of light rainfall and the dynamics of low level moisture in the inner region of the Southern Appalachians during the warm season, with a focus on orographically mediated processes. The overarching research hypothesis is that physical processes leading to and governing the life cycle of orographic fog, low level clouds, and precipitation, and their interactions, are strongly tied to landform, land cover, and the diurnal cycles of flow patterns, radiative forcing, and surface fluxes at the ridge-valley scale. The following science questions will be addressed specifically: 1) How do orographic clouds and fog affect the hydrometeorological regime from event to annual scale and as a function of terrain characteristics and land cover?; 2) What are the source areas, governing processes, and relevant time-scales of near surface moisture convergence patterns in the region?; and 3) What are the four dimensional microphysical and dynamical characteristics, including variability and controlling factors and processes, of fog and light rainfall? The research was conducted with two major components: 1) ground-based high-quality observations using multi-sensor platforms and 2) interpretive numerical modeling guided by the analysis of the in situ data collection. Findings illuminate a high level of spatial – down to the ridge scale - and temporal – from event to annual scale - heterogeneity in observations, and a significant impact on the hydrological regime as a result of seeder-feeder interactions among fog, low level clouds, and stratiform rainfall that enhance coalescence efficiency and lead to significantly higher rainfall rates at the land surface. Specifically, results show that enhancement of an event up to one order of magnitude in short-term accumulation can occur as a result of concurrent fog presence. Results also show that events are modulated strongly by terrain characteristics including elevation, slope, geometry, and land cover. These factors produce interactions between highly localized flows and gradients of temperature and moisture with larger scale circulations. Resulting observations of DSD and rainfall patterns are stratified by region and altitude and exhibit clear diurnal and seasonal cycles.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The full-scale base-isolated structure studied in this dissertation is the only base-isolated building in South Island of New Zealand. It sustained hundreds of earthquake ground motions from September 2010 and well into 2012. Several large earthquake responses were recorded in December 2011 by NEES@UCLA and by GeoNet recording station nearby Christchurch Women's Hospital. The primary focus of this dissertation is to advance the state-of-the art of the methods to evaluate performance of seismic-isolated structures and the effects of soil-structure interaction by developing new data processing methodologies to overcome current limitations and by implementing advanced numerical modeling in OpenSees for direct analysis of soil-structure interaction.

This dissertation presents a novel method for recovering force-displacement relations within the isolators of building structures with unknown nonlinearities from sparse seismic-response measurements of floor accelerations. The method requires only direct matrix calculations (factorizations and multiplications); no iterative trial-and-error methods are required. The method requires a mass matrix, or at least an estimate of the floor masses. A stiffness matrix may be used, but is not necessary. Essentially, the method operates on a matrix of incomplete measurements of floor accelerations. In the special case of complete floor measurements of systems with linear dynamics, real modes, and equal floor masses, the principal components of this matrix are the modal responses. In the more general case of partial measurements and nonlinear dynamics, the method extracts a number of linearly-dependent components from Hankel matrices of measured horizontal response accelerations, assembles these components row-wise and extracts principal components from the singular value decomposition of this large matrix of linearly-dependent components. These principal components are then interpolated between floors in a way that minimizes the curvature energy of the interpolation. This interpolation step can make use of a reduced-order stiffness matrix, a backward difference matrix or a central difference matrix. The measured and interpolated floor acceleration components at all floors are then assembled and multiplied by a mass matrix. The recovered in-service force-displacement relations are then incorporated into the OpenSees soil structure interaction model.

Numerical simulations of soil-structure interaction involving non-uniform soil behavior are conducted following the development of the complete soil-structure interaction model of Christchurch Women's Hospital in OpenSees. In these 2D OpenSees models, the superstructure is modeled as two-dimensional frames in short span and long span respectively. The lead rubber bearings are modeled as elastomeric bearing (Bouc Wen) elements. The soil underlying the concrete raft foundation is modeled with linear elastic plane strain quadrilateral element. The non-uniformity of the soil profile is incorporated by extraction and interpolation of shear wave velocity profile from the Canterbury Geotechnical Database. The validity of the complete two-dimensional soil-structure interaction OpenSees model for the hospital is checked by comparing the results of peak floor responses and force-displacement relations within the isolation system achieved from OpenSees simulations to the recorded measurements. General explanations and implications, supported by displacement drifts, floor acceleration and displacement responses, force-displacement relations are described to address the effects of soil-structure interaction.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Recent research into resting-state functional magnetic resonance imaging (fMRI) has shown that the brain is very active during rest. This thesis work utilizes blood oxygenation level dependent (BOLD) signals to investigate the spatial and temporal functional network information found within resting-state data, and aims to investigate the feasibility of extracting functional connectivity networks using different methods as well as the dynamic variability within some of the methods. Furthermore, this work looks into producing valid networks using a sparsely-sampled sub-set of the original data.

In this work we utilize four main methods: independent component analysis (ICA), principal component analysis (PCA), correlation, and a point-processing technique. Each method comes with unique assumptions, as well as strengths and limitations into exploring how the resting state components interact in space and time.

Correlation is perhaps the simplest technique. Using this technique, resting-state patterns can be identified based on how similar the time profile is to a seed region’s time profile. However, this method requires a seed region and can only identify one resting state network at a time. This simple correlation technique is able to reproduce the resting state network using subject data from one subject’s scan session as well as with 16 subjects.

Independent component analysis, the second technique, has established software programs that can be used to implement this technique. ICA can extract multiple components from a data set in a single analysis. The disadvantage is that the resting state networks it produces are all independent of each other, making the assumption that the spatial pattern of functional connectivity is the same across all the time points. ICA is successfully able to reproduce resting state connectivity patterns for both one subject and a 16 subject concatenated data set.

Using principal component analysis, the dimensionality of the data is compressed to find the directions in which the variance of the data is most significant. This method utilizes the same basic matrix math as ICA with a few important differences that will be outlined later in this text. Using this method, sometimes different functional connectivity patterns are identifiable but with a large amount of noise and variability.

To begin to investigate the dynamics of the functional connectivity, the correlation technique is used to compare the first and second halves of a scan session. Minor differences are discernable between the correlation results of the scan session halves. Further, a sliding window technique is implemented to study the correlation coefficients through different sizes of correlation windows throughout time. From this technique it is apparent that the correlation level with the seed region is not static throughout the scan length.

The last method introduced, a point processing method, is one of the more novel techniques because it does not require analysis of the continuous time points. Here, network information is extracted based on brief occurrences of high or low amplitude signals within a seed region. Because point processing utilizes less time points from the data, the statistical power of the results is lower. There are also larger variations in DMN patterns between subjects. In addition to boosted computational efficiency, the benefit of using a point-process method is that the patterns produced for different seed regions do not have to be independent of one another.

This work compares four unique methods of identifying functional connectivity patterns. ICA is a technique that is currently used by many scientists studying functional connectivity patterns. The PCA technique is not optimal for the level of noise and the distribution of the data sets. The correlation technique is simple and obtains good results, however a seed region is needed and the method assumes that the DMN regions is correlated throughout the entire scan. Looking at the more dynamic aspects of correlation changing patterns of correlation were evident. The last point-processing method produces a promising results of identifying functional connectivity networks using only low and high amplitude BOLD signals.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Bayesian nonparametric models, such as the Gaussian process and the Dirichlet process, have been extensively applied for target kinematics modeling in various applications including environmental monitoring, traffic planning, endangered species tracking, dynamic scene analysis, autonomous robot navigation, and human motion modeling. As shown by these successful applications, Bayesian nonparametric models are able to adjust their complexities adaptively from data as necessary, and are resistant to overfitting or underfitting. However, most existing works assume that the sensor measurements used to learn the Bayesian nonparametric target kinematics models are obtained a priori or that the target kinematics can be measured by the sensor at any given time throughout the task. Little work has been done for controlling the sensor with bounded field of view to obtain measurements of mobile targets that are most informative for reducing the uncertainty of the Bayesian nonparametric models. To present the systematic sensor planning approach to leaning Bayesian nonparametric models, the Gaussian process target kinematics model is introduced at first, which is capable of describing time-invariant spatial phenomena, such as ocean currents, temperature distributions and wind velocity fields. The Dirichlet process-Gaussian process target kinematics model is subsequently discussed for modeling mixture of mobile targets, such as pedestrian motion patterns.

Novel information theoretic functions are developed for these introduced Bayesian nonparametric target kinematics models to represent the expected utility of measurements as a function of sensor control inputs and random environmental variables. A Gaussian process expected Kullback Leibler divergence is developed as the expectation of the KL divergence between the current (prior) and posterior Gaussian process target kinematics models with respect to the future measurements. Then, this approach is extended to develop a new information value function that can be used to estimate target kinematics described by a Dirichlet process-Gaussian process mixture model. A theorem is proposed that shows the novel information theoretic functions are bounded. Based on this theorem, efficient estimators of the new information theoretic functions are designed, which are proved to be unbiased with the variance of the resultant approximation error decreasing linearly as the number of samples increases. Computational complexities for optimizing the novel information theoretic functions under sensor dynamics constraints are studied, and are proved to be NP-hard. A cumulative lower bound is then proposed to reduce the computational complexity to polynomial time.

Three sensor planning algorithms are developed according to the assumptions on the target kinematics and the sensor dynamics. For problems where the control space of the sensor is discrete, a greedy algorithm is proposed. The efficiency of the greedy algorithm is demonstrated by a numerical experiment with data of ocean currents obtained by moored buoys. A sweep line algorithm is developed for applications where the sensor control space is continuous and unconstrained. Synthetic simulations as well as physical experiments with ground robots and a surveillance camera are conducted to evaluate the performance of the sweep line algorithm. Moreover, a lexicographic algorithm is designed based on the cumulative lower bound of the novel information theoretic functions, for the scenario where the sensor dynamics are constrained. Numerical experiments with real data collected from indoor pedestrians by a commercial pan-tilt camera are performed to examine the lexicographic algorithm. Results from both the numerical simulations and the physical experiments show that the three sensor planning algorithms proposed in this dissertation based on the novel information theoretic functions are superior at learning the target kinematics with

little or no prior knowledge