989 results for Joint conditional distributions
Abstract:
In this paper, we extend the debate concerning Credit Default Swap valuation to include time-varying correlations and covariances. Traditional multivariate techniques treat the correlations between covariates as constant over time; however, this view is not supported by the data. Secondly, since financial data do not follow a normal distribution because of their heavy tails, modeling the data using a Generalized Linear Model (GLM) incorporating copulas emerges as a more robust technique than traditional approaches. This paper also includes an empirical analysis of the regime-switching dynamics of credit risk in the presence of liquidity, following the general practice of assuming that credit and market risk follow a Markov process. The study was based on Credit Default Swap data obtained from Bloomberg spanning the period January 1, 2004 to August 8, 2006. The empirical examination of the regime-switching tendencies provided quantitative support for the anecdotal view that liquidity decreases as credit quality deteriorates. The analysis also examined the joint probability distribution of the credit risk determinants across credit quality through the use of a copula function, which disaggregates the behavior embedded in the marginal gamma distributions so as to isolate the level of dependence captured in the copula function. The results suggest that the time-varying joint correlation matrix performed far better than the constant correlation matrix, the centerpiece of linear regression models.
Abstract:
Spatial characterization of non-Gaussian attributes in earth sciences and engineering commonly requires the estimation of their conditional distribution. The indicator and probability kriging approaches of current nonparametric geostatistics provide approximations for estimating conditional distributions. They do not, however, provide results similar to those in the cumbersome implementation of simultaneous cokriging of indicators. This paper presents a new formulation termed successive cokriging of indicators that avoids the classic simultaneous solution and related computational problems, while obtaining equivalent results to the impractical simultaneous solution of cokriging of indicators. A successive minimization of the estimation variance of probability estimates is performed, as additional data are successively included into the estimation process. In addition, the approach leads to an efficient nonparametric simulation algorithm for non-Gaussian random functions based on residual probabilities.
Abstract:
We introduce a novel inversion-based neuro-controller for solving control problems involving uncertain nonlinear systems that could also compensate for multi-valued systems. The approach uses recent developments in neural networks, especially in the context of modelling statistical distributions, which are applied to forward and inverse plant models. Provided that certain conditions are met, an estimate of the intrinsic uncertainty for the outputs of neural networks can be obtained using the statistical properties of networks. More generally, multicomponent distributions can be modelled by the mixture density network. In this work a novel robust inverse control approach is obtained based on importance sampling from these distributions. This importance sampling provides a structured and principled approach to constrain the complexity of the search space for the ideal control law. The performance of the new algorithm is illustrated through simulations with example systems.
Abstract:
A better understanding of stock price changes is important in guiding many economic activities. Since prices often do not change without good reason, the search for related explanatory variables has attracted many enthusiasts. This book seeks answers from prices per se by relating price changes to their conditional moments. This is based on the belief that prices are the products of a complex psychological and economic process and that their conditional moments derive ultimately from these psychological and economic shocks. Using information about conditional moments is hence an attractive alternative to using other selected financial variables to explain price changes. The first paper examines the relation between the conditional mean and the conditional variance using information about moments in three types of conditional distributions; it finds that the significance of the estimated mean and variance ratio can be affected by the assumed distributions and the time variations in skewness. The second paper decomposes the conditional industry volatility into a concurrent market component and an industry-specific component; it finds that market volatility is on average responsible for a rather small share of total industry volatility: 6 to 9 percent in the UK and 2 to 3 percent in Germany. The third paper looks at the heteroskedasticity in stock returns through an ARCH process supplemented with a set of conditioning information variables; it finds that stock returns exhibit several forms of heteroskedasticity, including deterministic changes in variances due to seasonal factors, random adjustments in variances due to market and macro factors, and ARCH processes with past information. The fourth paper examines the role of higher moments, especially skewness and kurtosis, in determining expected returns; it finds that total skewness and total kurtosis are more relevant non-beta risk measures and that they are costly to diversify away, owing either to the possible elimination of their desirable parts or to the unsustainability of diversification strategies based on them.
Abstract:
The Dirichlet family owes its privileged status within simplex distributions to its ease of interpretation and good mathematical properties. In particular, we recall fundamental properties for the analysis of compositional data such as closure under amalgamation and subcomposition. From a probabilistic point of view, it is characterised (uniquely) by a variety of independence relationships, which make it indisputably the reference model for expressing the non-trivial idea of substantial independence for compositions. Indeed, its well-known inadequacy as a general model for compositional data stems from this independence structure together with the poverty of its parametrisation. In this paper a new class of distributions (called Flexible Dirichlet), capable of handling various dependence structures and containing the Dirichlet as a special case, is presented. The new model exhibits a considerably richer parametrisation which, for example, allows the means and (part of) the variance-covariance matrix to be modelled separately. Moreover, the model preserves some good mathematical properties of the Dirichlet, i.e. closure under amalgamation and subcomposition, with new parameters simply related to the parent composition parameters. Furthermore, the joint and conditional distributions of subcompositions and relative totals can be expressed as simple mixtures of two Flexible Dirichlet distributions. The basis generating the Flexible Dirichlet, though keeping compositional invariance, shows a dependence structure which allows various forms of partitional dependence to be contemplated by the model (e.g. non-neutrality, subcompositional dependence and subcompositional non-invariance), with independence cases identified by suitable parameter configurations. In particular, within this model substantial independence among subsets of components of the composition naturally occurs when the subsets have a Dirichlet distribution.
Abstract:
Two variables define the topological state of closed double-stranded DNA: the knot type, K, and ΔLk, the linking number difference from relaxed DNA. The equilibrium distribution of probabilities of these states, P(ΔLk, K), is related to two conditional distributions, P(ΔLk|K), the distribution of ΔLk for a particular K, and P(K|ΔLk), and also to two simple distributions, P(ΔLk), the distribution of ΔLk irrespective of K, and P(K). We explored the relationships between these distributions. P(ΔLk, K), P(ΔLk), and P(K|ΔLk) were calculated from the simulated distributions of P(ΔLk|K) and of P(K). The calculated distributions agreed with previous experimental and theoretical results and extended them considerably. Our major focus was on P(K|ΔLk), the distribution of knot types for a particular value of ΔLk, which had not been evaluated previously. We found that unknotted circular DNA is not the most probable state beyond small values of ΔLk. Highly chiral knotted DNA has a lower free energy because it has less torsional deformation. Surprisingly, even at |ΔLk| > 12, only one or two knot types dominate the P(K|ΔLk) distribution despite the huge number of knots of comparable complexity. A large fraction of the knots found belong to the small family of torus knots. The relationship between supercoiling and knotting in vivo is discussed.
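The relationships among the five distributions named in this abstract follow from the standard product and sum rules of probability. The identities below are a reminder of those general rules, not a reproduction of the authors' derivation; the sum is assumed to run over all knot types K considered.

```latex
\begin{align}
P(\Delta Lk, K) &= P(\Delta Lk \mid K)\, P(K), \\
P(\Delta Lk)    &= \sum_{K} P(\Delta Lk \mid K)\, P(K), \\
P(K \mid \Delta Lk) &= \frac{P(\Delta Lk \mid K)\, P(K)}{P(\Delta Lk)}.
\end{align}
```

Read this way, knowing P(ΔLk|K) and P(K) is enough to obtain the other three distributions, which is consistent with the calculation the abstract describes.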
Abstract:
This paper considers a model-based approach to the clustering of tissue samples of a very large number of genes from microarray experiments. It is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. Frequently in practice, there are also clinical data available on those cases on which the tissue samples have been obtained. Here we investigate how to use the clinical data in conjunction with the microarray gene expression data to cluster the tissue samples. We propose two mixture model-based approaches in which the number of components in the mixture model corresponds to the number of clusters to be imposed on the tissue samples. One approach specifies the components of the mixture model to be the conditional distributions of the microarray data given the clinical data with the mixing proportions also conditioned on the latter data. Another takes the components of the mixture model to represent the joint distributions of the clinical and microarray data. The approaches are demonstrated on some breast cancer data, as studied recently in van't Veer et al. (2002).
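The two mixture formulations described in this abstract can be written schematically as follows. The notation is an assumption introduced here for illustration: x denotes the microarray data, z the clinical data, g the number of components (clusters), π_i the mixing proportions and f_i the component densities. The paper's own notation and component forms may differ.

```latex
\begin{align}
\text{conditional approach:}\quad
  f(x \mid z) &= \sum_{i=1}^{g} \pi_i(z)\, f_i(x \mid z), \\
\text{joint approach:}\quad
  f(x, z) &= \sum_{i=1}^{g} \pi_i\, f_i(x, z).
\end{align}
```

In the first form both the mixing proportions and the component densities depend on the clinical data; in the second, the clinical and microarray data are modelled together within each component.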
Abstract:
In this paper, we present the application of a non-linear dimensionality reduction technique to the learning and probabilistic classification of hyperspectral images. Hyperspectral image spectroscopy is an emerging technique for geological investigations from airborne or orbital sensors. It gives much greater information content per pixel than a normal colour image. This should greatly help robotic vehicles with the autonomous identification of natural and man-made objects in unfamiliar terrain. However, the large information content of such data makes interpretation of hyperspectral images time-consuming and user-intensive. We propose the use of Isomap, a non-linear manifold learning technique, combined with Expectation Maximisation in graphical probabilistic models for learning and classification. Isomap is used to find the underlying manifold of the training data. This low-dimensional representation of the hyperspectral data facilitates the learning of a Gaussian Mixture Model representation, whose joint probability distributions can be calculated offline. The learnt model is then applied to the hyperspectral image at runtime, and data classification can be performed.
Abstract:
We derive a new method for determining size-transition matrices (STMs) that eliminates probabilities of negative growth and accounts for individual variability. STMs are an important part of size-structured models, which are used in the stock assessment of aquatic species. The elements of STMs represent the probability of growth from one size class to another, given a time step. The growth increment over this time step can be modelled with a variety of methods, but when a population construct is assumed for the underlying growth model, the resulting STM may contain entries that predict negative growth. To solve this problem, we use a maximum likelihood method that incorporates individual variability in the asymptotic length, relative age at tagging, and measurement error to obtain von Bertalanffy growth model parameter estimates. The statistical moments for the future length given an individual's previous length measurement and time at liberty are then derived. We moment match the true conditional distributions with skewed-normal distributions and use these to accurately estimate the elements of the STMs. The method is investigated with simulated tag-recapture data and tag-recapture data gathered from the Australian eastern king prawn (Melicertus plebejus).
Abstract:
Guo and Nixon proposed a feature selection method based on maximizing I(x; Y), the multidimensional mutual information between feature vector x and class variable Y. Because computing I(x; Y) can be difficult in practice, Guo and Nixon proposed an approximation of I(x; Y) as the criterion for feature selection. We show that Guo and Nixon's criterion originates from approximating the joint probability distributions in I(x; Y) by second-order product distributions. We remark on the limitations of the approximation and discuss computationally attractive alternatives to compute I(x; Y).
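To make the quantity I(x; Y) concrete, here is a minimal Python sketch of the plain plug-in estimate of mutual information from paired discrete samples. It is not Guo and Nixon's approximation or any alternative discussed in the paper; the function name `mutual_information` and the toy data are hypothetical. Treating each feature vector as a tuple shows why the direct estimate becomes impractical: the number of distinct tuples grows quickly with the number of features.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples.

    xs may hold tuples (feature vectors); ys holds class labels.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))   # counts of (x, y) pairs
    count_x = Counter(xs)          # marginal counts of x
    count_y = Counter(ys)          # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x, y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y)
        mi += p_xy * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi

# Toy data (illustrative only): two binary features and a binary class.
features = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels_dependent = [0, 0, 1, 1]      # class equals the first feature
labels_independent = [0, 1, 0, 1]    # class equals the second feature

mi_full = mutual_information(features, labels_dependent)
mi_first = mutual_information([f[0] for f in features], labels_dependent)
mi_second = mutual_information([f[1] for f in features], labels_dependent)
```

In this toy case the first feature alone carries all the information about the class, while the second carries none, which is the kind of distinction a feature selection criterion based on I(x; Y) is meant to detect.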
Abstract:
Semi-qualitative probabilistic networks (SQPNs) merge two important graphical model formalisms: Bayesian networks and qualitative probabilistic networks. They provide a very general modeling framework by allowing the combination of numeric and qualitative assessments over a discrete domain, and can be compactly encoded by exploiting the same factorization of joint probability distributions that underlies Bayesian networks. This paper explores the computational complexity of semi-qualitative probabilistic networks, and takes polytree-shaped networks as its main target. We show that the inference problem is coNP-complete for binary polytrees with multiple observed nodes. We also show that inferences can be performed in linear time if there is a single observed node, which is a relevant practical case. Because our proof is constructive, we obtain an efficient linear-time algorithm for SQPNs under such assumptions. To the best of our knowledge, this is the first exact polynomial-time algorithm for SQPNs. Together these results provide a clear picture of the inferential complexity in polytree-shaped SQPNs.
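The factorization of a joint distribution that Bayesian networks rely on can be illustrated with a short Python sketch. This is a purely numeric toy network, not an SQPN, and it uses brute-force enumeration rather than the linear-time algorithm the abstract refers to. The chain A -> B -> C, all probability values, and the names `joint` and `posterior_a_given_c` are hypothetical.

```python
from itertools import product

# Hypothetical binary chain A -> B -> C (a simple polytree).
# All numbers are illustrative assumptions.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.4, 1: 0.6}}    # p_b_given_a[a][b]
p_c_given_b = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.25, 1: 0.75}}  # p_c_given_b[b][c]

def joint(a, b, c):
    """Joint probability as the product of the local conditional tables."""
    return p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]

def posterior_a_given_c(c):
    """P(A | C = c) with a single observed node, by enumerating B."""
    unnorm = {a: sum(joint(a, b, c) for b in (0, 1)) for a in (0, 1)}
    z = sum(unnorm.values())
    return {a: v / z for a, v in unnorm.items()}

# The factorized joint sums to one over all eight configurations.
total = sum(joint(a, b, c) for a, b, c in product((0, 1), repeat=3))
posterior = posterior_a_given_c(1)
```

Three small tables encode the eight joint probabilities here; the saving grows with network size, which is the compactness the abstract attributes to the shared factorization.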
Abstract:
A credal network is a graph-theoretic model that represents imprecision in joint probability distributions. An inference in a credal net aims at computing an interval for the probability of an event of interest. Algorithms for inference in credal networks can be divided into exact and approximate. The selection of an algorithm is based on a trade-off that weighs how much time one wants to spend on a particular calculation against the quality of the computed values. This paper presents an algorithm, called IDS, that combines exact and approximate methods for computing inferences in polytree-shaped credal networks. The algorithm provides an approach to trading time for precision when making inferences in credal nets.
Abstract:
Exercises and solutions in PDF