979 results for Markov Processes
Abstract:
Markov chain Monte Carlo (MCMC) estimation provides a solution to the complex integration problems that arise in the Bayesian analysis of statistical problems. Implementing MCMC algorithms is, however, code intensive and time consuming. We have developed a Python package, called PyMCMC, that aids in the construction of MCMC samplers, helps to substantially reduce the likelihood of coding errors, and minimises repetitive code. PyMCMC contains classes for Gibbs, Metropolis-Hastings, independent Metropolis-Hastings, random walk Metropolis-Hastings, orientational bias Monte Carlo and slice samplers, as well as specific modules for common models, such as a module for Bayesian regression analysis. PyMCMC is straightforward to optimise, taking advantage of the Python libraries NumPy and SciPy, and is readily extensible with C or Fortran.
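As a rough illustration of one sampler type named above, the following is a minimal sketch of a random walk Metropolis-Hastings sampler written directly in NumPy. It does not use PyMCMC or reproduce its API; the function name `random_walk_metropolis`, its parameters, and the standard normal target are assumptions chosen for the example.

```python
import numpy as np

def random_walk_metropolis(log_target, x0, n_samples, step=1.0, seed=0):
    """Random walk Metropolis-Hastings for a 1-D target given by its log density.

    Hypothetical helper for illustration only; not part of PyMCMC.
    """
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    x = float(x0)
    log_p = log_target(x)
    accepted = 0
    for i in range(n_samples):
        # Gaussian proposal centred on the current state (symmetric).
        proposal = x + step * rng.normal()
        log_p_new = log_target(proposal)
        # Symmetric proposal: accept with probability min(1, p(new) / p(old)).
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            accepted += 1
        samples[i] = x
    return samples, accepted / n_samples

def log_std_normal(x):
    # Log density of N(0, 1) up to an additive constant.
    return -0.5 * x * x

samples, acc_rate = random_walk_metropolis(
    log_std_normal, x0=0.0, n_samples=20000, step=2.4
)
```

After discarding an initial burn-in, the sample mean and standard deviation should be close to 0 and 1 respectively; the step size controls the trade-off between acceptance rate and how far each move travels.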
Abstract:
Log-linear and maximum-margin models are two commonly-used methods in supervised machine learning, and are frequently used in structured prediction problems. Efficient learning of parameters in these models is therefore an important problem, and becomes a key factor when learning from very large data sets. This paper describes exponentiated gradient (EG) algorithms for training such models, where EG updates are applied to the convex dual of either the log-linear or max-margin objective function; the dual in both the log-linear and max-margin cases corresponds to minimizing a convex function with simplex constraints. We study both batch and online variants of the algorithm, and provide rates of convergence for both cases. In the max-margin case, O(1/ε) EG updates are required to reach a given accuracy ε in the dual; in contrast, for log-linear models only O(log(1/ε)) updates are required. For both the max-margin and log-linear cases, our bounds suggest that the online EG algorithm requires a factor of n less computation to reach a desired accuracy than the batch EG algorithm, where n is the number of training examples. Our experiments confirm that the online algorithms are much faster than the batch algorithms in practice. We describe how the EG updates factor in a convenient way for structured prediction problems, allowing the algorithms to be efficiently applied to problems such as sequence learning or natural language parsing. We perform extensive evaluation of the algorithms, comparing them to L-BFGS and stochastic gradient descent for log-linear models, and to SVM-Struct for max-margin models. The algorithms are applied to a multi-class problem as well as to a more complex large-scale parsing task. In all these settings, the EG algorithms presented here outperform the other methods.
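The generic exponentiated gradient step on a simplex-constrained problem can be sketched as follows. This is only the basic multiplicative update applied to a toy convex objective, not the dual objectives or the structured factorisation studied in the paper; the function name `eg_minimize`, the learning rate, and the quadratic test objective are assumptions for the example.

```python
import numpy as np

def eg_minimize(grad, u0, eta=1.0, n_iters=500):
    """Minimise a convex function over the probability simplex with EG updates.

    Hypothetical helper: `grad` returns the gradient at a point on the simplex.
    """
    u = np.asarray(u0, dtype=float)
    for _ in range(n_iters):
        g = grad(u)
        # Multiplicative update; shifting by g.min() only rescales all weights
        # by a common factor, so the normalised result is unchanged.
        w = u * np.exp(-eta * (g - g.min()))
        # Renormalise so the iterate stays on the simplex.
        u = w / w.sum()
    return u

# Toy objective f(u) = 0.5 * ||u - c||^2 with c inside the simplex,
# so the constrained minimiser is c itself.
c = np.array([0.5, 0.3, 0.2])
u_star = eg_minimize(lambda u: u - c, np.full(3, 1.0 / 3.0))
```

Because the update is multiplicative and followed by normalisation, every iterate stays strictly positive and sums to one, which is why no explicit projection onto the simplex is needed.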
Abstract:
The Upper Roper River is one of Australia’s unique tropical rivers that have been largely untouched by development. The Upper Roper River catchment comprises the sub-catchments of the Waterhouse River and Roper Creek, the two tributaries of the Roper River. The geological setting is complex, with different aquifer types. In this seasonal system, close interaction between surface water and groundwater contributes to streamflow and sustains ecosystems. The interaction is highly variable between seasons. A conceptual hydrogeological model was developed to investigate the different hydrological processes and geochemical parameters, and to determine the baseline characteristics of the water resources of this pristine catchment. In the catchment, long-term average rainfall is around 850 mm and is summer dominant, which significantly influences the total hydrological system. The difference between seasons is pronounced, with high rainfall of up to 600 mm/month in the wet season and negligible rainfall in the dry season. Canopy interception significantly reduces the amount of effective rainfall because of the native vegetation cover in the pristine catchment. Evaporation exceeds rainfall for the majority of the year. Due to elevated evaporation and high temperatures in the tropics, at least 600 mm of annual rainfall is required to generate potential recharge. Analysis of the trend in 120 years of rainfall data helped define “wet” and “dry” periods: a decreasing trend corresponds to dry periods, and an increasing trend to wet periods. The period from 1900 to 1970 was considered Dry period 1, when there were years with no effective rainfall and, in years when effective rainfall did occur, its intensity was around 300 mm. The period 1970 – 1985 was identified as Wet period 1, when positive effective rainfall occurred in almost every year and the intensity reached up to 700 mm. The period 1985 – 1995 was Dry period 2, with characteristics similar to those of Dry period 1.
Finally, the last decade was Wet period 2, with effective rainfall intensity up to 800 mm. This variability in rainfall over decades increased or decreased recharge and discharge, improving or reducing surface water and groundwater quantity and quality in the different wet and dry periods. The stream discharge follows the rainfall pattern. In the wet season, the aquifer is replenished, groundwater levels and groundwater discharge are high, and surface runoff is the dominant component of streamflow. The Waterhouse River contributes two thirds and Roper Creek one third of the Roper River flow. As the dry season progresses, surface runoff depletes and groundwater becomes the main component of streamflow. Flow in the Waterhouse River becomes negligible and Roper Creek dries up, but the Roper River maintains its flow throughout the year. This is due to groundwater and spring discharge from the highly permeable Tindall Limestone and tufa aquifers. Rainfall seasonality and the lithology of both the catchment and the aquifers are shown to influence water chemistry. In the wet season, dilution of water bodies by rainwater is the main process. In the dry season, when groundwater provides baseflow to the streams, their chemical composition reflects the lithology of the aquifers, in particular the karstic areas. Water chemistry distinguishes four types of aquifer materials: alluvium, sandstone, limestone and tufa. Surface water in the headwaters of the Waterhouse River, Roper Creek and their tributaries is fresh and reflects the alluvium and sandstone aquifers. At and downstream of the confluence of the Roper River, river water chemistry indicates the influence of rainfall dilution in the wet season, and the signature of the Tindall Limestone and tufa aquifers in the dry season. Rainbow Spring on the Waterhouse River and Bitter Spring on the Little Roper River (known as Roper Creek at the headwaters) discharge from the Tindall Limestone.
Botanic Walk Spring and Fig Tree Spring discharge into the Roper River from tufa. The source of water was identified from the chemical composition of the springs, surface water and groundwater. The mechanisms controlling surface water chemistry were examined to determine whether precipitation, evaporation or rock weathering dominates the chemical composition of the water. Simple water balance models for the catchment have been developed. The important aspects to be considered in water resource planning for this total system are the naturally high salinity in the region, especially in the downstream sections, and how unpredictable climate variation may affect the natural seasonal variability of water volumes and surface-subsurface interaction.
Abstract:
The uniformization method (also known as randomization) is a numerically stable algorithm for computing transient distributions of a continuous time Markov chain. When the solution is needed after a long run or when the convergence is slow, the uniformization method involves a large number of matrix-vector products. Despite this, the method remains very popular due to its ease of implementation and its reliability in many practical circumstances. Because calculating the matrix-vector product is the most time-consuming part of the method, overall efficiency in solving large-scale problems can be significantly enhanced if the matrix-vector product is made more economical. In this paper, we incorporate a new relaxation strategy into the uniformization method to compute the matrix-vector products only approximately. We analyze the error introduced by these inexact matrix-vector products and discuss strategies for refining the accuracy of the relaxation while reducing the execution cost. Numerical experiments drawn from computer systems and biological systems are given to show that significant computational savings are achieved in practical applications.
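For orientation, the standard uniformization recursion can be sketched as below, using exact matrix-vector products. It does not implement the relaxation strategy with inexact products that the paper proposes; the function name `uniformization`, the truncation tolerance, and the two-state test chain are assumptions for the example.

```python
import numpy as np

def uniformization(Q, pi0, t, tol=1e-10, max_terms=100000):
    """Transient distribution pi(t) of a CTMC with generator Q (rows sum to 0).

    Hypothetical helper. Uses pi(t) = sum_k Poisson(k; lam*t) * pi0 @ P^k with
    P = I + Q / lam. Note: exp(-lam * t) underflows when lam * t is very large,
    so this naive form suits moderate lam * t only.
    """
    Q = np.asarray(Q, dtype=float)
    lam = np.max(-np.diag(Q))           # uniformization rate >= max exit rate
    if lam == 0.0:
        return np.array(pi0, dtype=float)
    P = np.eye(Q.shape[0]) + Q / lam    # DTMC of the uniformized chain
    v = np.array(pi0, dtype=float)      # holds pi0 @ P^k
    weight = np.exp(-lam * t)           # Poisson weight for k = 0
    result = weight * v
    total = weight
    k = 0
    # Stop once the neglected Poisson tail mass is below tol.
    while 1.0 - total > tol and k < max_terms:
        k += 1
        v = v @ P                       # one vector-matrix product per term
        weight *= lam * t / k
        result += weight * v
        total += weight
    return result

# Two-state chain with rates a = 2 (0 -> 1) and b = 1 (1 -> 0).
Q = np.array([[-2.0, 2.0], [1.0, -1.0]])
pi_t = uniformization(Q, pi0=[1.0, 0.0], t=0.5)
```

Each loop iteration costs one vector-matrix product, which is the step the abstract identifies as dominating the run time for large chains.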
Abstract:
In this reflection on research processes a humanities researcher begins to ask questions about the cultural materialist dimensions of research activities. At the center of this exploration are questions relating to the ways in which personal histories and experiences inform particular research processes and the ways in which a researcher's habits of collecting and working with data are regulated by cultural and social practice. The reflection on personal research processes is located in terms of the ethics work of Michel Foucault that provides reminders about the role of modern bureaucracy in governing what appear to be personal processes.
Abstract:
Experimental and theoretical studies have shown the importance of stochastic processes in genetic regulatory networks and cellular processes. Cellular networks and genetic circuits often involve small numbers of key proteins such as transcription factors and signaling proteins. In recent years stochastic models have been used successfully for studying noise in biological pathways, and stochastic modelling of biological systems has become a very important research field in computational biology. One of the challenging problems in this field is reducing the huge computing time of stochastic simulations. Based on the mitogen-activated protein kinase cascade that is activated by epidermal growth factor, this work gives a parallel implementation using OpenMP and parallelism across the simulation. Special attention is paid to the independence of the random numbers generated in parallel computing, which is a key criterion for the success of stochastic simulations. Numerical results indicate that parallel computers can be used as an efficient tool for simulating the dynamics of large-scale genetic regulatory networks and cellular processes.
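The idea of running independent replicas with independent random streams can be sketched as follows. The paper uses OpenMP and a MAP kinase cascade model; this sketch instead uses Python threads, NumPy's `SeedSequence.spawn` for the streams, and a toy birth-death process simulated with the Gillespie algorithm. The function names, rate constants, and seed are all assumptions for the example.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def simulate_birth_death(seed_seq, k_prod=10.0, k_deg=1.0, t_end=20.0):
    """Gillespie simulation of 0 -> X (rate k_prod), X -> 0 (rate k_deg * x).

    Hypothetical toy model; returns the copy number at time t_end.
    """
    rng = np.random.default_rng(seed_seq)  # one private generator per replica
    t, x = 0.0, 0
    while True:
        birth, death = k_prod, k_deg * x
        total = birth + death
        t += rng.exponential(1.0 / total)  # waiting time to the next reaction
        if t > t_end:
            return x
        # Choose the reaction in proportion to its propensity.
        if rng.uniform() * total < birth:
            x += 1
        else:
            x -= 1

def run_replicas(n_replicas, root_seed=12345, n_workers=4):
    # spawn() derives child seeds designed to give statistically
    # independent streams, one per replica.
    streams = np.random.SeedSequence(root_seed).spawn(n_replicas)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(simulate_birth_death, streams))

finals = run_replicas(200)
```

Because each replica owns its generator, the results do not depend on how the replicas are scheduled across workers, so a run with the same root seed is reproducible. For this toy model the stationary mean copy number is `k_prod / k_deg`.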
Abstract:
We consider a robust filtering problem for uncertain discrete-time, homogeneous, first-order, finite-state hidden Markov models (HMMs). The class of uncertain HMMs considered is described by a conditional relative entropy constraint on measures perturbed from a nominal regular conditional probability distribution given the previous posterior state distribution and the latest measurement. Under this class of perturbations, a robust infinite-horizon filtering problem is first formulated as a constrained optimization problem and then transformed via variational results into an unconstrained optimization problem; the latter can be elegantly solved using risk-sensitive information-state based filtering.