Biblioteca Digital

991 resultados para Random variables

Mapping an uncertainty zone between interpolated types of a categorical variable

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Categorical data cannot be interpolated directly because they are outcomes of discrete random variables. Thus, types of categorical variables are transformed into indicator functions that can be handled by interpolation methods. Interpolated indicator values are then backtransformed to the original types of categorical variables. However, aspects such as variability and uncertainty of interpolated values of categorical data have never been considered. In this paper we show that the interpolation variance can be used to map an uncertainty zone around boundaries between types of categorical variables. Moreover, it is shown that the interpolation variance is a component of the total variance of the categorical variables, as measured by the coefficient of unalikeability. (C) 2011 Elsevier Ltd. All rights reserved.

Parameter recovery for a skew-normal IRT model under a Bayesian approach: hierarchical framework, prior and kernel sensitivity and sample size

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is an interest in studying latent variables (or latent traits). Usually such latent traits are assumed to be random variables and a convenient distribution is assigned to them. A very common choice for such a distribution has been the standard normal. Recently, Azevedo et al. [Bayesian inference for a skew-normal IRT model under the centred parameterization, Comput. Stat. Data Anal. 55 (2011), pp. 353-365] proposed a skew-normal distribution under the centred parameterization (SNCP) as had been studied in [R. B. Arellano-Valle and A. Azzalini, The centred parametrization for the multivariate skew-normal distribution, J. Multivariate Anal. 99(7) (2008), pp. 1362-1382], to model the latent trait distribution. This approach allows one to represent any asymmetric behaviour concerning the latent trait distribution. Also, they developed a Metropolis-Hastings within the Gibbs sampling (MHWGS) algorithm based on the density of the SNCP. They showed that the algorithm recovers all parameters properly. Their results indicated that, in the presence of asymmetry, the proposed model and the estimation algorithm perform better than the usual model and estimation methods. Our main goal in this paper is to propose another type of MHWGS algorithm based on a stochastic representation (hierarchical structure) of the SNCP studied in [N. Henze, A probabilistic representation of the skew-normal distribution, Scand. J. Statist. 13 (1986), pp. 271-275]. Our algorithm has only one Metropolis-Hastings step, in opposition to the algorithm developed by Azevedo et al., which has two such steps. This not only makes the implementation easier but also reduces the number of proposal densities to be used, which can be a problem in the implementation of MHWGS algorithms, as can be seen in [R.J. Patz and B.W. Junker, A straightforward approach to Markov Chain Monte Carlo methods for item response models, J. Educ. Behav. Stat. 24(2) (1999), pp. 146-178; R. J. Patz and B. W. Junker, The applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses, J. Educ. Behav. Stat. 24(4) (1999), pp. 342-366; A. Gelman, G.O. Roberts, and W.R. Gilks, Efficient Metropolis jumping rules, Bayesian Stat. 5 (1996), pp. 599-607]. Moreover, we consider a modified beta prior (which generalizes the one considered in [3]) and a Jeffreys prior for the asymmetry parameter. Furthermore, we study the sensitivity of such priors as well as the use of different kernel densities for this parameter. Finally, we assess the impact of the number of examinees, number of items and the asymmetry level on the parameter recovery. Results of the simulation study indicated that our approach performed equally as well as that in [3], in terms of parameter recovery, mainly using the Jeffreys prior. Also, they indicated that the asymmetry level has the highest impact on parameter recovery, even though it is relatively small. A real data analysis is considered jointly with the development of model fitting assessment tools. The results are compared with the ones obtained by Azevedo et al. The results indicate that using the hierarchical approach allows us to implement MCMC algorithms more easily, it facilitates diagnosis of the convergence and also it can be very useful to fit more complex skew IRT models.

Dynamics of coastal aquifers: data-driven forecasting and risk analysis

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This work is focused on the study of saltwater intrusion in coastal aquifers, and in particular on the realization of conceptual schemes to evaluate the risk associated with it. Saltwater intrusion depends on different natural and anthropic factors, both presenting a strong aleatory behaviour, that should be considered for an optimal management of the territory and water resources. Given the uncertainty of problem parameters, the risk associated with salinization needs to be cast in a probabilistic framework. On the basis of a widely adopted sharp interface formulation, key hydrogeological problem parameters are modeled as random variables, and global sensitivity analysis is used to determine their influence on the position of saltwater interface. The analyses presented in this work rely on an efficient model reduction technique, based on Polynomial Chaos Expansion, able to combine the best description of the model without great computational burden. When the assumptions of classical analytical models are not respected, and this occurs several times in the applications to real cases of study, as in the area analyzed in the present work, one can adopt data-driven techniques, based on the analysis of the data characterizing the system under study. It follows that a model can be defined on the basis of connections between the system state variables, with only a limited number of assumptions about the "physical" behaviour of the system.

Limiting theorems in statistical mechanics. The mean-field case.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The Curie-Weiss model is defined by ah Hamiltonian according to spins interact. For some particular values of the parameters, the sum of the spins normalized with square-root normalization converges or not toward Gaussian distribution. In the thesis we investigate some correlations between the behaviour of the sum and the central limit for interacting random variables.

Semiparametric Normal Transformation Models for Spatially Correlated Survival Data

Relevância:

60.00% 60.00%

Publicador:

Resumo:

There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right censored spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazard model with unspecified baseline hazard, and their joint distribution is obtained by transforming survival outcomes to normal random variables, whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of the class of semiparametric normal transformation models is that it provides a rich class of spatial survival models where regression coefficients have population average interpretation and the spatial dependence of survival times is conveniently modeled using the transformed variables by flexible normal random fields. We study the relationship of the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically prohibited due to the high dimensional intractable integration of the likelihood function and the infinite dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators, and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Ashma Study and its performance is evaluated using simulations.

A unified approach to modeling multivariate binary data using copulas over partitions

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Many seemingly disparate approaches for marginal modeling have been developed in recent years. We demonstrate that many current approaches for marginal modeling of correlated binary outcomes produce likelihoods that are equivalent to the proposed copula-based models herein. These general copula models of underlying latent threshold random variables yield likelihood based models for marginal fixed effects estimation and interpretation in the analysis of correlated binary data. Moreover, we propose a nomenclature and set of model relationships that substantially elucidates the complex area of marginalized models for binary data. A diverse collection of didactic mathematical and numerical examples are given to illustrate concepts.

Confidence Bands for Convex Median Curves using Sign-Tests

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Suppose that one observes pairs (x1,Y1), (x2,Y2), ..., (xn,Yn), where x1 < x2 < ... < xn are fixed numbers while Y1, Y2, ..., Yn are independent random variables with unknown distributions. The only assumption is that Median(Yi) = f(xi) for some unknown convex or concave function f. We present a confidence band for this regression function f using suitable multiscale sign tests. While the exact computation of this band seems to require O(n4) steps, good approximations can be obtained in O(n2) steps. In addition the confidence band is shown to have desirable asymptotic properties as the sample size n tends to infinity.

M-Functionals of Multivariate Scatter

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This survey provides a self-contained account of M-estimation of multivariate scatter. In particular, we present new proofs for existence of the underlying M-functionals and discuss their weak continuity and differentiability. This is done in a rather general framework with matrix-valued random variables. By doing so we reveal a connection between Tyler's (1987) M-functional of scatter and the estimation of proportional covariance matrices. Moreover, this general framework allows us to treat a new class of scatter estimators, based on symmetrizations of arbitrary order. Finally these results are applied to M-estimation of multivariate location and scatter via multivariate t-distributions.

Confidence Bands for Isotonic Median Curves using Sign-Tests

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Suppose that one observes independent random variables (X1, Y1), (X2, Y2), …, (Xn, Yn) in R2 with unknown distributions, except that Median(Yi | Xi = M(x) for some unknown isotonic function M. We describe an explicit algorithm for the computation of confidence bands for the median function M whose running time is of order O(n2). The bands rely on multiscale sign tests and are shown to have desirable asymptotic properties.

Bayesian generalized linear models for meta-analysis of diagnostic tests

Relevância:

60.00% 60.00%

Publicador:

Resumo:

With the recognition of the importance of evidence-based medicine, there is an emerging need for methods to systematically synthesize available data. Specifically, methods to provide accurate estimates of test characteristics for diagnostic tests are needed to help physicians make better clinical decisions. To provide more flexible approaches for meta-analysis of diagnostic tests, we developed three Bayesian generalized linear models. Two of these models, a bivariate normal and a binomial model, analyzed pairs of sensitivity and specificity values while incorporating the correlation between these two outcome variables. Noninformative independent uniform priors were used for the variance of sensitivity, specificity and correlation. We also applied an inverse Wishart prior to check the sensitivity of the results. The third model was a multinomial model where the test results were modeled as multinomial random variables. All three models can include specific imaging techniques as covariates in order to compare performance. Vague normal priors were assigned to the coefficients of the covariates. The computations were carried out using the 'Bayesian inference using Gibbs sampling' implementation of Markov chain Monte Carlo techniques. We investigated the properties of the three proposed models through extensive simulation studies. We also applied these models to a previously published meta-analysis dataset on cervical cancer as well as to an unpublished melanoma dataset. In general, our findings show that the point estimates of sensitivity and specificity were consistent among Bayesian and frequentist bivariate normal and binomial models. However, in the simulation studies, the estimates of the correlation coefficient from Bayesian bivariate models are not as good as those obtained from frequentist estimation regardless of which prior distribution was used for the covariance matrix. The Bayesian multinomial model consistently underestimated the sensitivity and specificity regardless of the sample size and correlation coefficient. In conclusion, the Bayesian bivariate binomial model provides the most flexible framework for future applications because of its following strengths: (1) it facilitates direct comparison between different tests; (2) it captures the variability in both sensitivity and specificity simultaneously as well as the intercorrelation between the two; and (3) it can be directly applied to sparse data without ad hoc correction. ^

A simulation study using the hybrid statistic and the meta-analysis t-test for data with matched and unmatched subjects in the interim analysis of clinical trials

Relevância:

60.00% 60.00%

Publicador:

Resumo:

An interim analysis is usually applied in later phase II or phase III trials to find convincing evidence of a significant treatment difference that may lead to trial termination at an earlier point than planned at the beginning. This can result in the saving of patient resources and shortening of drug development and approval time. In addition, ethics and economics are also the reasons to stop a trial earlier. In clinical trials of eyes, ears, knees, arms, kidneys, lungs, and other clustered treatments, data may include distribution-free random variables with matched and unmatched subjects in one study. It is important to properly include both subjects in the interim and the final analyses so that the maximum efficiency of statistical and clinical inferences can be obtained at different stages of the trials. So far, no publication has applied a statistical method for distribution-free data with matched and unmatched subjects in the interim analysis of clinical trials. In this simulation study, the hybrid statistic was used to estimate the empirical powers and the empirical type I errors among the simulated datasets with different sample sizes, different effect sizes, different correlation coefficients for matched pairs, and different data distributions, respectively, in the interim and final analysis with 4 different group sequential methods. Empirical powers and empirical type I errors were also compared to those estimated by using the meta-analysis t-test among the same simulated datasets. Results from this simulation study show that, compared to the meta-analysis t-test commonly used for data with normally distributed observations, the hybrid statistic has a greater power for data observed from normally, log-normally, and multinomially distributed random variables with matched and unmatched subjects and with outliers. Powers rose with the increase in sample size, effect size, and correlation coefficient for the matched pairs. In addition, lower type I errors were observed estimated by using the hybrid statistic, which indicates that this test is also conservative for data with outliers in the interim analysis of clinical trials.^

Estimadores bayesianos de la fiabilidad con muestreo censurado

Relevância:

60.00% 60.00%

Publicador:

Resumo:

El estudio de la fiabilidad de componentes y sistemas tiene gran importancia en diversos campos de la ingenieria, y muy concretamente en el de la informatica. Al analizar la duracion de los elementos de la muestra hay que tener en cuenta los elementos que no fallan en el tiempo que dure el experimento, o bien los que fallen por causas distintas a la que es objeto de estudio. Por ello surgen nuevos tipos de muestreo que contemplan estos casos. El mas general de ellos, el muestreo censurado, es el que consideramos en nuestro trabajo. En este muestreo tanto el tiempo hasta que falla el componente como el tiempo de censura son variables aleatorias. Con la hipotesis de que ambos tiempos se distribuyen exponencialmente, el profesor Hurt estudio el comportamiento asintotico del estimador de maxima verosimilitud de la funcion de fiabilidad. En principio parece interesante utilizar metodos Bayesianos en el estudio de la fiabilidad porque incorporan al analisis la informacion a priori de la que se dispone normalmente en problemas reales. Por ello hemos considerado dos estimadores Bayesianos de la fiabilidad de una distribucion exponencial que son la media y la moda de la distribucion a posteriori. Hemos calculado la expansion asint6tica de la media, varianza y error cuadratico medio de ambos estimadores cuando la distribuci6n de censura es exponencial. Hemos obtenido tambien la distribucion asintotica de los estimadores para el caso m3s general de que la distribucion de censura sea de Weibull. Dos tipos de intervalos de confianza para muestras grandes se han propuesto para cada estimador. Los resultados se han comparado con los del estimador de maxima verosimilitud, y con los de dos estimadores no parametricos: limite producto y Bayesiano, resultando un comportamiento superior por parte de uno de nuestros estimadores. Finalmente nemos comprobado mediante simulacion que nuestros estimadores son robustos frente a la supuesta distribuci6n de censura, y que uno de los intervalos de confianza propuestos es valido con muestras pequenas. Este estudio ha servido tambien para confirmar el mejor comportamiento de uno de nuestros estimadores. SETTING OUT AND SUMMARY OF THE THESIS When we study the lifetime of components it's necessary to take into account the elements that don't fail during the experiment, or those that fail by reasons which are desirable to exclude from consideration. The model of random censorship is very usefull for analysing these data. In this model the time to failure and the time censor are random variables. We obtain two Bayes estimators of the reliability function of an exponential distribution based on randomly censored data. We have calculated the asymptotic expansion of the mean, variance and mean square error of both estimators, when the censor's distribution is exponential. We have obtained also the asymptotic distribution of the estimators for the more general case of censor's Weibull distribution. Two large-sample confidence bands have been proposed for each estimator. The results have been compared with those of the maximum likelihood estimator, and with those of two non parametric estimators: Product-limit and Bayesian. One of our estimators has the best behaviour. Finally we have shown by simulation, that our estimators are robust against the assumed censor's distribution, and that one of our intervals does well in small sample situation.

Reliability of tunnel liners

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The possibility of application of structural reliability theory to the computation of the safety margins of excavated tunnels is presented. After a brief description of the existing procedures the limitations of the safety coefficients such as they usually defined, the proposed limit states are precised as well as the random variables and the applied methodology. Also presented are simple examples, some of them based in actual cases, and to end, some conclusions are established the most important one being the probability of using the method to solve the inverse problem of identification.

Level II reliability methods for tunnels

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In tunnel construction, as in every engineering work, it is usual the decision making, with incomplete data. Nevertheless, consciously or not, the builder weighs the risks (even if this is done subjectively) so that he can offer a cost. The objective of this paper is to recall the existence of a methodology to treat the uncertainties in the data so that it is possible to see their effect on the output of the computational model used and then to estimate the failure probability or the safety margin of a structure. In this scheme it is possible to include the subjective knowledge on the statistical properties of the random variables and, using a numerical model consistent with the degree of complexity appropiate to the problem at hand, to make rationally based decisions. As will be shown with the method it is possible to quantify the relative importance of the random variables and, in addition, it can be used, under certain conditions, to solve the inverse problem. It is then a method very well suited both to the project and to the control phases of tunnel construction.

Reliability analysis based on non-dimensional parameters

Relevância:

60.00% 60.00%

Publicador:

Resumo:

A reliability analysis method is proposed that starts with the identification of all variables involved. These are divided in three groups: (a) variables fixed by codes, as loads and strength project values, and their corresponding partial safety coefficients, (b) geometric variables defining the dimension of the main elements involved, (c) the cost variables, including the possible damages caused by failure, (d) the random variables as loads, strength, etc., and (e)the variables defining the statistical model, as the family of distribution and its corresponding parameters. Once the variables are known, the II-theorem is used to obtain a minimum equivalent set of non-dimensional variables, which is used to define the limit states. This allows a reduction in the number of variables involved and a better understanding of their coupling effects. Two minimum cost criteria are used for selecting the project dimensions. One is based on a bounded-probability of failure, and the other on a total cost, including the damages of the possible failure. Finally, the method is illustrated by means of an application.

«
1
2
...
5
6
7
8
9
10
11
...
66
67
»