869 results for Markov chains hidden Markov models Viterbi algorithm Forward-Backward algorithm maximum likelihood
Abstract:
In most treatments of the regression problem it is assumed that the distribution of target data can be described by a deterministic function of the inputs, together with additive Gaussian noise having constant variance. The use of maximum likelihood to train such models then corresponds to the minimization of a sum-of-squares error function. In many applications a more realistic model would allow the noise variance itself to depend on the input variables. However, the use of maximum likelihood to train such models would give highly biased results. In this paper we show how a Bayesian treatment can allow for an input-dependent variance while overcoming the bias of maximum likelihood.
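For reference, the correspondence between maximum likelihood under constant-variance Gaussian noise and the sum-of-squares error can be written out explicitly; the notation below is illustrative and not taken from the paper.

```latex
% Gaussian noise model with constant variance sigma^2:
%   t_n = y(x_n; w) + eps_n,   eps_n ~ N(0, sigma^2)
\begin{align}
  -\ln p(\{t_n\} \mid \{x_n\}, w, \sigma^2)
    = \frac{1}{2\sigma^2} \sum_{n=1}^{N} \bigl(t_n - y(x_n; w)\bigr)^2
      + \frac{N}{2}\ln\sigma^2 + \frac{N}{2}\ln 2\pi .
\end{align}
% For fixed sigma^2, maximising the likelihood over w is therefore equivalent to
% minimising the sum-of-squares error E(w) = (1/2) \sum_n (t_n - y(x_n; w))^2.
```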
Abstract:
It is well known that one of the obstacles to effective forecasting of exchange rates is heteroscedasticity (non-stationary conditional variance). The autoregressive conditional heteroscedastic (ARCH) model and its variants have been used to estimate a time dependent variance for many financial time series. However, such models are essentially linear in form and we can ask whether a non-linear model for variance can improve results just as non-linear models (such as neural networks) for the mean have done. In this paper we consider two neural network models for variance estimation. Mixture Density Networks (Bishop 1994, Nix and Weigend 1994) combine a Multi-Layer Perceptron (MLP) and a mixture model to estimate the conditional data density. They are trained using a maximum likelihood approach. However, it is known that maximum likelihood estimates are biased and lead to a systematic under-estimate of variance. More recently, a Bayesian approach to parameter estimation has been developed (Bishop and Qazaz 1996) that shows promise in removing the maximum likelihood bias. However, up to now, this model has not been used for time series prediction. Here we compare these algorithms with two other models to provide benchmark results: a linear model (from the ARIMA family), and a conventional neural network trained with a sum-of-squares error function (which estimates the conditional mean of the time series with a constant variance noise model). This comparison is carried out on daily exchange rate data for five currencies.
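A minimal sketch of the Mixture Density Network likelihood referred to above may help fix ideas: an MLP's outputs are mapped to the mixing coefficients, means and variances of a Gaussian mixture, and training minimises the negative log-likelihood. The single-output setting, array shapes and variable names below are assumptions for illustration, not the architecture of the cited papers.

```python
# Sketch of the Gaussian-mixture negative log-likelihood at the heart of an MDN.
import numpy as np

def mdn_nll(z, t, n_components):
    """Negative log-likelihood of 1-D targets t under a Gaussian mixture whose
    parameters are read off the network output z (shape: N x 3K)."""
    K = n_components
    logits, mu, log_sigma = z[:, :K], z[:, K:2*K], z[:, 2*K:3*K]
    pi = np.exp(logits - logits.max(axis=1, keepdims=True))
    pi /= pi.sum(axis=1, keepdims=True)                     # softmax mixing coefficients
    sigma = np.exp(log_sigma)                               # positive standard deviations
    # Gaussian kernel of each component, targets broadcast against components
    phi = np.exp(-0.5 * ((t[:, None] - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)
    return -np.mean(np.log(np.sum(pi * phi, axis=1) + 1e-12))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, K = 8, 3
    z = rng.standard_normal((N, 3 * K))   # stand-in for MLP outputs
    t = rng.standard_normal(N)            # stand-in for exchange-rate targets
    print(mdn_nll(z, t, K))
```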
Abstract:
Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximum-likelihood estimation of parameters in a latent variable model closely related to factor analysis. We consider the properties of the associated likelihood function, giving an EM algorithm for estimating the principal subspace iteratively, and discuss the advantages conveyed by the definition of a probability density function for PCA.
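The EM algorithm mentioned in the abstract can be sketched compactly, since for probabilistic PCA the E- and M-steps combine into closed-form updates of the loading matrix and the isotropic noise variance. The variable names, random initialisation and fixed iteration count below are illustrative assumptions.

```python
# Hedged sketch of EM for probabilistic PCA: t = W z + mu + noise, isotropic sigma^2.
import numpy as np

def ppca_em(T, q, n_iter=200):
    """Fit a q-dimensional principal subspace to data T (N x d) by EM."""
    N, d = T.shape
    mu = T.mean(axis=0)
    S = np.cov(T, rowvar=False, bias=True)       # sample covariance (d x d)
    rng = np.random.default_rng(0)
    W = rng.standard_normal((d, q))              # initial loading matrix
    sigma2 = 1.0
    for _ in range(n_iter):
        M = W.T @ W + sigma2 * np.eye(q)         # q x q
        Minv = np.linalg.inv(M)
        SW = S @ W
        W_new = SW @ np.linalg.inv(sigma2 * np.eye(q) + Minv @ W.T @ SW)
        sigma2 = np.trace(S - SW @ Minv @ W_new.T) / d
        W = W_new
    return W, sigma2, mu

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.standard_normal((500, 5)) @ rng.standard_normal((5, 5))
    W, sigma2, mu = ppca_em(data, q=2)
    print(W.shape, sigma2)
```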
Abstract:
Automatically generating maps of a measured variable of interest can be problematic. In this work we focus on the monitoring network context where observations are collected and reported by a network of sensors, and are then transformed into interpolated maps for use in decision making. Using traditional geostatistical methods, estimating the covariance structure of data collected in an emergency situation can be difficult. Variogram determination, whether by method-of-moment estimators or by maximum likelihood, is very sensitive to extreme values. Even when a monitoring network is in a routine mode of operation, sensors can sporadically malfunction and report extreme values. If this extreme data destabilises the model, causing the covariance structure of the observed data to be incorrectly estimated, the generated maps will be of little value, and the uncertainty estimates in particular will be misleading. Marchant and Lark [2007] propose a REML estimator for the covariance, which is shown to work on small data sets with a manual selection of the damping parameter in the robust likelihood. We show how this can be extended to allow treatment of large data sets together with an automated approach to all parameter estimation. The projected process kriging framework of Ingram et al. [2007] is extended to allow the use of robust likelihood functions, including the two-component Gaussian and the Huber function. We show how our algorithm is further refined to reduce the computational complexity while at the same time minimising any loss of information. To show the benefits of this method, we use data collected from radiation monitoring networks across Europe. We compare our results to those obtained from traditional kriging methodologies and include comparisons with Box-Cox transformations of the data. We discuss the issue of whether to treat or ignore extreme values, making the distinction between the robust methods which ignore outliers and transformation methods which treat them as part of the (transformed) process. Using a case study based on an extreme radiological event over a large area, we show how radiation data collected from monitoring networks can be analysed automatically and then used to generate reliable maps to inform decision making. We show the limitations of the methods and discuss potential extensions to remedy these.
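As a small illustration of the robust likelihoods discussed above, the Huber function down-weights large standardised residuals by growing linearly rather than quadratically in the tails. The threshold value and its direct application to plain residuals below are assumptions for illustration, not the paper's projected process kriging implementation.

```python
# Huber loss: quadratic near zero, linear in the tails, so outliers count less.
import numpy as np

def huber_loss(residuals, delta=1.345):
    """delta = 1.345 is a common default giving high efficiency under Gaussian noise."""
    r = np.abs(residuals)
    quad = 0.5 * r ** 2
    lin = delta * (r - 0.5 * delta)
    return np.where(r <= delta, quad, lin)

if __name__ == "__main__":
    res = np.array([-0.2, 0.5, 1.0, 4.0, 10.0])   # a gross outlier at 10.0
    print(huber_loss(res))
```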
Abstract:
When making predictions with complex simulators it can be important to quantify the various sources of uncertainty. Errors in the structural specification of the simulator, for example due to missing processes or incorrect mathematical specification, can be a major source of uncertainty, but are often ignored. We introduce a methodology for inferring the discrepancy between the simulator and the system in discrete-time dynamical simulators. We assume a structural form for the discrepancy function, and show how to infer the maximum-likelihood parameter estimates using a particle filter embedded within a Monte Carlo expectation maximization (MCEM) algorithm. We illustrate the method on a conceptual rainfall-runoff simulator (logSPM) used to model the Abercrombie catchment in Australia. We assess the simulator and discrepancy model on the basis of their predictive performance using proper scoring rules. This article has supplementary material online. © 2011 International Biometric Society.
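The E-step of such an MCEM scheme relies on a particle filter; a generic bootstrap filter for a one-dimensional discrete-time state-space model is sketched below. The transition and observation functions and the noise scales are placeholders, not the logSPM rainfall-runoff simulator or its fitted discrepancy model.

```python
# Minimal bootstrap particle filter for x_t = f(x_{t-1}) + process noise,
# y_t = g(x_t) + observation noise, with Gaussian noises.
import numpy as np

def bootstrap_filter(y, f, g, q_std, r_std, n_particles=500, seed=0):
    """Return filtered state means and the log-likelihood estimate."""
    rng = np.random.default_rng(seed)
    T = len(y)
    x = rng.standard_normal(n_particles)                  # initial particle cloud
    loglik = 0.0
    filtered_means = np.empty(T)
    for t in range(T):
        x = f(x) + q_std * rng.standard_normal(n_particles)           # propagate
        logw = -0.5 * ((y[t] - g(x)) / r_std) ** 2 - np.log(r_std)    # unnormalised Gaussian obs. weight
        w = np.exp(logw - logw.max())
        loglik += np.log(w.mean()) + logw.max()
        w /= w.sum()
        idx = rng.choice(n_particles, size=n_particles, p=w)          # multinomial resampling
        x = x[idx]
        filtered_means[t] = x.mean()
    return filtered_means, loglik

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    true_x = np.cumsum(rng.standard_normal(100) * 0.1)
    obs = true_x + 0.3 * rng.standard_normal(100)
    means, ll = bootstrap_filter(obs, f=lambda x: x, g=lambda x: x, q_std=0.1, r_std=0.3)
    print(ll)
```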
Abstract:
The subject of this thesis is the n-tuple network (RAMnet). The major advantage of RAMnets is their speed and the simplicity with which they can be implemented in parallel hardware. On the other hand, this method is not a universal approximator and the training procedure does not involve the minimisation of a cost function. Hence RAMnets are potentially sub-optimal. It is important to understand the source of this sub-optimality and to develop the analytical tools that allow us to quantify the generalisation cost of using this model for any given data. We view RAMnets as classifiers and function approximators and try to determine how critical their lack of universality and optimality is. In order to understand better the inherent restrictions of the model, we review RAMnets, showing their relationship to a number of well established general models such as: Associative Memories, Kanerva's Sparse Distributed Memory, Radial Basis Functions, General Regression Networks and Bayesian Classifiers. We then benchmark the binary RAMnet model against 23 other algorithms using real-world data from the StatLog Project. This large scale experimental study indicates that RAMnets are often capable of delivering results which are competitive with those obtained by more sophisticated, computationally expensive models. The Frequency Weighted version is also benchmarked and shown to perform worse than the binary RAMnet for large values of the tuple size n. We demonstrate that the main issue in Frequency Weighted RAMnets is adequate probability estimation, and propose Good-Turing estimates in place of the more commonly used Maximum Likelihood estimates. Having established the viability of the method numerically, we focus on providing an analytical framework that allows us to quantify the generalisation cost of RAMnets for a given dataset. For the classification network we provide a semi-quantitative argument which is based on the notion of tuple distance. It gives a good indication of whether the network will fail for the given data. A rigorous Bayesian framework with Gaussian process prior assumptions is given for the regression n-tuple net. We show how to calculate the generalisation cost of this net and verify the results numerically for one-dimensional noisy interpolation problems. We conclude that the n-tuple method of classification based on memorisation of random features can be a powerful alternative to slower cost-driven models. The speed of the method is at the expense of its optimality. RAMnets will fail for certain datasets, but the cases when they do so are relatively easy to determine with the analytical tools we provide.
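As an aside on the probability-estimation point, the basic (unsmoothed) Good-Turing adjustment replaces an observed count r with r* = (r + 1) N_{r+1} / N_r, where N_r is the number of events seen exactly r times. The toy sketch below illustrates only this idea; it is not the thesis's estimator for Frequency Weighted RAMnets, and practical schemes smooth the N_r values.

```python
# Unsmoothed Good-Turing adjusted counts and the mass reserved for unseen events.
from collections import Counter

def good_turing_probabilities(counts):
    """Map raw event counts to Good-Turing probability estimates."""
    total = sum(counts.values())
    freq_of_freq = Counter(counts.values())               # N_r
    probs = {}
    for event, r in counts.items():
        n_r, n_r1 = freq_of_freq[r], freq_of_freq.get(r + 1, 0)
        r_star = (r + 1) * n_r1 / n_r if n_r1 else r       # fall back to the ML count
        probs[event] = r_star / total
    p_unseen = freq_of_freq.get(1, 0) / total              # probability mass for unseen events
    return probs, p_unseen

if __name__ == "__main__":
    counts = Counter("aaaaabbbccdde")                      # toy feature counts
    probs, p0 = good_turing_probabilities(counts)
    print(probs, p0)
```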
Abstract:
Urban regions present some of the most challenging areas for the remote sensing community. Many different types of land cover have similar spectral responses, making them difficult to distinguish from one another. Traditional per-pixel classification techniques suffer particularly badly because they only use these spectral properties to determine a class, and no other properties of the image, such as context. This project presents the results of the classification of a deeply urban area of Dudley, West Midlands, using four methods: Supervised Maximum Likelihood, SMAP, ECHO and Unsupervised Maximum Likelihood. An accuracy assessment method is then developed to allow a fair representation of each procedure and a direct comparison between them. Subsequently, a classification procedure is developed that makes use of the context in the image, through a per-polygon classification. The imagery is broken up into a series of polygons extracted from the Marr-Hildreth zero-crossing edge detector. These polygons are then refined using a region-growing algorithm, and then classified according to the mean class of the fine polygons. The imagery produced by this technique is shown to be of better quality and of a higher accuracy than that of other conventional methods. Further refinements are suggested and examined to improve the aesthetic appearance of the imagery. Finally a comparison with the results produced from a previous study of the James Bridge catchment, in Darlaston, West Midlands, is made, showing that the polygon-classified ATM imagery performs significantly better than the Maximum Likelihood classified videography used in the initial study, despite the presence of geometric correction errors.
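For concreteness, the per-pixel supervised Maximum Likelihood baseline can be sketched as follows: each land-cover class is modelled as a multivariate Gaussian over the spectral bands, and each pixel is assigned to the class with the highest log-likelihood. The band count, class labels and random data below are placeholders, not the Dudley ATM imagery.

```python
# Per-pixel Gaussian maximum likelihood classification sketch.
import numpy as np

def fit_class_gaussians(X_train, y_train):
    """Estimate a mean vector and covariance matrix per class."""
    params = {}
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        params[c] = (Xc.mean(axis=0), np.cov(Xc, rowvar=False))
    return params

def ml_classify(pixels, params):
    """Assign each pixel (row of spectral values) to the most likely class."""
    classes = sorted(params)
    scores = np.empty((len(pixels), len(classes)))
    for j, c in enumerate(classes):
        mu, cov = params[c]
        diff = pixels - mu
        inv, logdet = np.linalg.inv(cov), np.linalg.slogdet(cov)[1]
        # Gaussian log-likelihood up to a constant: -(Mahalanobis^2 + log|cov|)/2
        scores[:, j] = -0.5 * (np.einsum("ij,jk,ik->i", diff, inv, diff) + logdet)
    return np.array(classes)[scores.argmax(axis=1)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(3, 1, (50, 4))])
    y = np.array([0] * 50 + [1] * 50)
    print(ml_classify(X[:5], fit_class_gaussians(X, y)))
```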
Abstract:
The so-called “Plural Uncertainty Model” is considered, in which statistical, maxmin, interval and fuzzy models of uncertainty are embedded. For the fuzzy case, external and internal contradictions of the theory are investigated, and a modified definition of fuzzy sets is proposed to overcome the difficulties of L. Zadeh's classical fuzzy subsets. General variants of logit and probit regression serve as models of the modified fuzzy sets, and the modified theory allows one to speak of observations. The concept of a “situation” is introduced within the modified fuzzy theory, and the corresponding classification problem is considered. A classification algorithm for situations is proposed as an analogue of the statistical maximum likelihood method (MLM). An example concerning the possible observation of a distribution drawn from a collection of distributions is considered.
Abstract:
2000 Mathematics Subject Classification: 62J12.
Abstract:
2000 Mathematics Subject Classification: Primary 62F35; Secondary 62P99
Abstract:
This study is focused on the comparison and modification of different estimates arising in branching processes. Simulations of models with and without migration are carried out. Due to the complexity of the computations, the algorithms are implemented in the technical computing language MATLAB. Using the simulations, estimates of the offspring mean of the generated processes are calculated. It is well known in the literature that under certain conditions the asymptotic distribution of the estimates is proved to be normal. Using the asymptotic normality, a modified method of maximum likelihood is proposed. The aim is to obtain trimmed maximum likelihood estimates based on several sample paths with the same number of generations. Thus, in a natural way, the observations inconsistent with the prior information about asymptotic normality are excluded from the model. The computation of the standard error allows the comparison of different types of estimates.
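A hedged sketch of the trimming idea is given below: the offspring mean is estimated on each simulated path by a standard ratio (Harris-type) estimator, and the most extreme per-path estimates are discarded before averaging. The choice of estimator and the symmetric trimming fraction are illustrative assumptions, not the thesis's exact modified maximum likelihood procedure; Python stands in for the MATLAB implementation.

```python
# Trimmed estimation of the offspring mean from several simulated branching-process paths.
import numpy as np

def harris_estimate(path):
    """Ratio estimator of the offspring mean from one path Z_0, ..., Z_n."""
    z = np.asarray(path, dtype=float)
    return z[1:].sum() / z[:-1].sum()

def trimmed_offspring_mean(paths, trim_frac=0.1):
    """Average the per-path estimates after dropping the extreme tails."""
    est = np.sort([harris_estimate(p) for p in paths])
    k = int(len(est) * trim_frac)
    kept = est[k:len(est) - k] if k > 0 else est
    return kept.mean(), est.std(ddof=1) / np.sqrt(len(est))   # estimate and a crude standard error

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    def simulate(m=1.2, n=20):
        z = [10]
        for _ in range(n):
            z.append(rng.poisson(m * z[-1]))   # Poisson offspring, no migration
        return z
    print(trimmed_offspring_mean([simulate() for _ in range(100)]))
```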
Abstract:
In this dissertation, I investigate three related topics on asset pricing: the consumption-based asset pricing under long-run risks and fat tails, the pricing of VIX (CBOE Volatility Index) options and the market price of risk embedded in stock returns and stock options. These three topics are fully explored in Chapters II through IV. Chapter V summarizes the main conclusions. In Chapter II, I explore the effects of fat tails on the equilibrium implications of the long run risks model of asset pricing by introducing innovations with dampened power law to consumption and dividends growth processes. I estimate the structural parameters of the proposed model by maximum likelihood. I find that the stochastic volatility model with fat tails can, without resorting to high risk aversion, generate implied risk premium, expected risk free rate and their volatilities comparable to the magnitudes observed in data. In Chapter III, I examine the pricing performance of VIX option models. The contention that simpler-is-better is supported by the empirical evidence using actual VIX option market data. I find that no model has small pricing errors over the entire range of strike prices and times to expiration. In general, Whaley’s Black-like option model produces the best overall results, supporting the simpler-is-better contention. However, the Whaley model does under/overprice out-of-the-money call/put VIX options, which is contrary to the behavior of stock index option pricing models. In Chapter IV, I explore risk pricing through a model of time-changed Lévy processes based on the joint evidence from individual stock options and underlying stocks. I specify a pricing kernel that prices idiosyncratic and systematic risks. This approach to examining risk premia on stocks deviates from existing studies. The empirical results show that the market pays positive premia for idiosyncratic and market jump-diffusion risk, and idiosyncratic volatility risk. However, there is no consensus on the premium for market volatility risk. It can be positive or negative. The positive premium on idiosyncratic risk runs contrary to the implications of traditional capital asset pricing theory.
Abstract:
The number of dividend paying firms has been on the decline since the popularity of stock repurchases in the 1980s, and the recent financial crisis has brought about a wave of dividend reductions and omissions. This dissertation examined the U.S. firms and American Depository Receipts that are listed on the U.S. equity exchanges according to their dividend paying history in the previous twelve quarters. While accounting for the state of the economy, the firm’s size, profitability, earned equity, and growth opportunities, it determined whether or not the firm will pay a dividend in the next quarter. It also examined the likelihood of a dividend change. Further, returns of firms were examined according to their dividend paying history and the state of the economy using the Fama-French three-factor model. Using forward, backward, and step-wise selection logistic regressions, the results show that firms with a history of regular and uninterrupted dividend payments are likely to continue to pay dividends, while firms that do not have a history of regular dividend payments are not likely to begin to pay dividends or continue to do so. The results of a set of generalized polytomous logistic regressions imply that dividend paying firms are more likely to reduce dividend payments during economic expansions, as opposed to recessions. Also, the analysis of returns using the Fama-French three-factor model reveals that dividend paying firms are earning significant abnormal positive returns. As a special case, a similar analysis of dividend payment and dividend change was applied to American Depository Receipts that trade on the NYSE, NASDAQ, and AMEX exchanges and are issued by the Bank of New York Mellon. Returns of American Depository Receipts were examined using the Fama-French two-factor model for international firms. The results of the generalized polytomous logistic regression analyses indicate that dividend paying status and economic conditions are also important for dividend level change of American Depository Receipts, and Fama-French two-factor regressions alone do not adequately explain returns for these securities.
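The abnormal-return test based on the Fama-French three-factor model can be sketched as a regression of excess returns on the market, size (SMB) and value (HML) factors, with a significantly positive intercept (alpha) indicating abnormal positive returns. The column names and placeholder data below are assumptions for illustration.

```python
# Fama-French three-factor regression sketch: alpha and its t-statistic.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def three_factor_alpha(excess_ret, factors):
    """Regress excess returns on MKT_RF, SMB and HML; return alpha and its t-stat."""
    X = sm.add_constant(factors[["MKT_RF", "SMB", "HML"]])
    fit = sm.OLS(excess_ret, X).fit()
    return fit.params["const"], fit.tvalues["const"]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    factors = pd.DataFrame(rng.standard_normal((250, 3)) * 0.01,
                           columns=["MKT_RF", "SMB", "HML"])
    # Placeholder excess returns with a small built-in alpha of 0.0005 per day
    excess_ret = 0.0005 + factors @ np.array([1.0, 0.3, 0.2]) + 0.01 * rng.standard_normal(250)
    alpha, t_alpha = three_factor_alpha(excess_ret, factors)
    print(alpha, t_alpha)
```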
Abstract:
My dissertation investigates the financial linkages and transmission of economic shocks between the US and the smallest emerging markets (frontier markets). The first chapter sets up an empirical model that examines the impact of US market returns and conditional volatility on the returns and conditional volatilities of twenty-one frontier markets. The model is estimated via maximum likelihood, utilizes the GARCH model of errors, and is applied to daily country data from MSCI Barra. We find limited but statistically significant exposure of frontier markets to shocks from the US. Our results suggest that it is not the lagged US market returns that have impact; rather, it is the expected US market returns that influence frontier market returns. The second chapter sets up an empirical time-varying parameter (TVP) model to explore the time-variation in the impact of mean US returns on mean frontier market returns. The model utilizes the Kalman filter algorithm as well as the GARCH model of errors and is applied to daily country data from MSCI Barra. The TVP model detects statistically significant time-variation in the impact of US returns and low, but statistically and quantitatively important, impact of US market conditional volatility. The third chapter studies the risk-return relationship in twenty frontier country stock markets by setting up an international version of the intertemporal capital asset pricing model. The systematic risk in this model comes from covariance of frontier market stock index returns with world returns. Both the systematic risk and the risk premium are time-varying in our model. We also incorporate own country variances as additional determinants of frontier country returns. Our results suggest statistically significant impact of both world and own country risk in explaining frontier country returns. Time-variation in the world risk premium is also found to be statistically significant for most frontier market returns. However, own country risk is found to be quantitatively more important.
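The first chapter's estimation strategy, maximum likelihood with GARCH errors, can be sketched for the simplest GARCH(1,1) case as follows; the parameterisation, starting values and optimiser choice below are illustrative assumptions rather than the dissertation's specification.

```python
# Maximum-likelihood estimation of a GARCH(1,1) model for demeaned daily returns.
import numpy as np
from scipy.optimize import minimize

def garch11_neg_loglik(params, returns):
    """Gaussian negative log-likelihood under sigma2_t = omega + alpha*r_{t-1}^2 + beta*sigma2_{t-1}."""
    omega, alpha, beta = params
    T = len(returns)
    sigma2 = np.empty(T)
    sigma2[0] = np.var(returns)          # initialise with the sample variance
    for t in range(1, T):
        sigma2[t] = omega + alpha * returns[t - 1] ** 2 + beta * sigma2[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + returns ** 2 / sigma2)

def fit_garch11(returns):
    """Estimate (omega, alpha, beta) by numerical maximum likelihood."""
    x0 = np.array([0.1 * np.var(returns), 0.05, 0.90])
    bounds = [(1e-8, None), (0.0, 1.0), (0.0, 1.0)]
    res = minimize(garch11_neg_loglik, x0, args=(returns,),
                   bounds=bounds, method="L-BFGS-B")
    return res.x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r = rng.standard_normal(1000) * 0.01   # placeholder daily returns
    print(fit_garch11(r))
```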