24 resultados para Bayesian hierarchical model
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Resumo:
A Bayesian nonparametric model for Taguchi's on-line quality monitoring procedure for attributes is introduced. The proposed model may accommodate the original single shift setting to the more realistic situation of gradual quality deterioration and allows the incorporation of an expert's opinion on the production process. Based on the number of inspections to be carried out until a defective item is found, the Bayesian operation for the distribution function that represents the increasing sequence of defective fractions during a cycle considering a mixture of Dirichlet processes as prior distribution is performed. Bayes estimates for relevant quantities are also obtained. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
Objective: To assess the risk factors for delayed diagnosis of uterine cervical lesions. Materials and Methods: This is a case-control study that recruited 178 women at 2 Brazilian hospitals. The cases (n = 74) were composed of women with a late diagnosis of a lesion in the uterine cervix (invasive carcinoma in any stage). The controls (n = 104) were composed of women with cervical lesions diagnosed early on (low-or high-grade intraepithelial lesions). The analysis was performed by means of logistic regression model using a hierarchical model. The socioeconomic and demographic variables were included at level I (distal). Level II (intermediate) included the personal and family antecedents and knowledge about the Papanicolaou test and human papillomavirus. Level III (proximal) encompassed the variables relating to individuals' care for their own health, gynecologic symptoms, and variables relating to access to the health care system. Results: The risk factors for late diagnosis of uterine cervical lesions were age older than 40 years (odds ratio [OR] = 10.4; 95% confidence interval [CI], 2.3-48.4), not knowing the difference between the Papanicolaou test and gynecological pelvic examinations (OR, = 2.5; 95% CI, 1.3-4.9), not thinking that the Papanicolaou test was important (odds ratio [OR], 4.2; 95% CI, 1.3-13.4), and abnormal vaginal bleeding (OR, 15.0; 95% CI, 6.5-35.0). Previous treatment for sexually transmissible disease was a protective factor (OR, 0.3; 95% CI, 0.1-0.8) for delayed diagnosis. Conclusions: Deficiencies in cervical cancer prevention programs in developing countries are not simply a matter of better provision and coverage of Papanicolaou tests. The misconception about the Papanicolaou test is a serious educational problem, as demonstrated by the present study.
Resumo:
Introduction: The purpose of this ecological study was to evaluate the urban spatial and temporal distribution of tuberculosis (TB) in Ribeirao Preto, State of Sao Paulo, southeast Brazil, between 2006 and 2009 and to evaluate its relationship with factors of social vulnerability such as income and education level. Methods: We evaluated data from TBWeb, an electronic notification system for TB cases. Measures of social vulnerability were obtained from the SEADE Foundation, and information about the number of inhabitants, education and income of the households were obtained from Brazilian Institute of Geography and Statistics. Statistical analyses were conducted by a Bayesian regression model assuming a Poisson distribution for the observed new cases of TB in each area. A conditional autoregressive structure was used for the spatial covariance structure. Results: The Bayesian model confirmed the spatial heterogeneity of TB distribution in Ribeirao Preto, identifying areas with elevated risk and the effects of social vulnerability on the disease. We demonstrated that the rate of TB was correlated with the measures of income, education and social vulnerability. However, we observed areas with low vulnerability and high education and income, but with high estimated TB rates. Conclusions: The study identified areas with different risks for TB, given that the public health system deals with the characteristics of each region individually and prioritizes those that present a higher propensity to risk of TB. Complex relationships may exist between TB incidence and a wide range of environmental and intrinsic factors, which need to be studied in future research.
Resumo:
In this paper, we propose a random intercept Poisson model in which the random effect is assumed to follow a generalized log-gamma (GLG) distribution. This random effect accommodates (or captures) the overdispersion in the counts and induces within-cluster correlation. We derive the first two moments for the marginal distribution as well as the intraclass correlation. Even though numerical integration methods are, in general, required for deriving the marginal models, we obtain the multivariate negative binomial model from a particular parameter setting of the hierarchical model. An iterative process is derived for obtaining the maximum likelihood estimates for the parameters in the multivariate negative binomial model. Residual analysis is proposed and two applications with real data are given for illustration. (C) 2011 Elsevier B.V. All rights reserved.
Resumo:
INTRODUCTION: The purpose of this ecological study was to evaluate the urban spatial and temporal distribution of tuberculosis (TB) in Ribeirão Preto, State of São Paulo, southeast Brazil, between 2006 and 2009 and to evaluate its relationship with factors of social vulnerability such as income and education level. METHODS: We evaluated data from TBWeb, an electronic notification system for TB cases. Measures of social vulnerability were obtained from the SEADE Foundation, and information about the number of inhabitants, education and income of the households were obtained from Brazilian Institute of Geography and Statistics. Statistical analyses were conducted by a Bayesian regression model assuming a Poisson distribution for the observed new cases of TB in each area. A conditional autoregressive structure was used for the spatial covariance structure. RESULTS: The Bayesian model confirmed the spatial heterogeneity of TB distribution in Ribeirão Preto, identifying areas with elevated risk and the effects of social vulnerability on the disease. We demonstrated that the rate of TB was correlated with the measures of income, education and social vulnerability. However, we observed areas with low vulnerability and high education and income, but with high estimated TB rates. CONCLUSIONS: The study identified areas with different risks for TB, given that the public health system deals with the characteristics of each region individually and prioritizes those that present a higher propensity to risk of TB. Complex relationships may exist between TB incidence and a wide range of environmental and intrinsic factors, which need to be studied in future research.
Resumo:
Biogeography has been difficult to apply as a methodological approach because organismic biology is incomplete at levels where the process of formulating comparisons and analogies is complex. The study of insect biogeography became necessary because insects possess numerous evolutionary traits and play an important role as pollinators. Among insects, the euglossine bees, or orchid bees, attract interest because the study of their biology allows us to explain important steps in the evolution of social behavior and many other adaptive tradeoffs. We analyzed the distribution of morphological characteristics in Colombian orchid bees from an ecological perspective. The aim of this study was to observe the distribution of these attributes on a regional basis. Data corresponding to Colombian euglossine species were ordered with a correspondence analysis and with subsequent hierarchical clustering. Later, and based on community proprieties, we compared the resulting hierarchical model with the collection localities to seek to identify a biogeographic classification pattern. From this analysis, we derived a model that classifies the territory of Colombia into 11 biogeographic units or natural clusters. Ecological assumptions in concordance with the derived classification levels suggest that species characteristics associated with flight performance, nectar uptake, and social behavior are the factors that served to produce the current geographical structure.
Resumo:
Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is an interest in studying latent variables (or latent traits). Usually such latent traits are assumed to be random variables and a convenient distribution is assigned to them. A very common choice for such a distribution has been the standard normal. Recently, Azevedo et al. [Bayesian inference for a skew-normal IRT model under the centred parameterization, Comput. Stat. Data Anal. 55 (2011), pp. 353-365] proposed a skew-normal distribution under the centred parameterization (SNCP) as had been studied in [R. B. Arellano-Valle and A. Azzalini, The centred parametrization for the multivariate skew-normal distribution, J. Multivariate Anal. 99(7) (2008), pp. 1362-1382], to model the latent trait distribution. This approach allows one to represent any asymmetric behaviour concerning the latent trait distribution. Also, they developed a Metropolis-Hastings within the Gibbs sampling (MHWGS) algorithm based on the density of the SNCP. They showed that the algorithm recovers all parameters properly. Their results indicated that, in the presence of asymmetry, the proposed model and the estimation algorithm perform better than the usual model and estimation methods. Our main goal in this paper is to propose another type of MHWGS algorithm based on a stochastic representation (hierarchical structure) of the SNCP studied in [N. Henze, A probabilistic representation of the skew-normal distribution, Scand. J. Statist. 13 (1986), pp. 271-275]. Our algorithm has only one Metropolis-Hastings step, in opposition to the algorithm developed by Azevedo et al., which has two such steps. This not only makes the implementation easier but also reduces the number of proposal densities to be used, which can be a problem in the implementation of MHWGS algorithms, as can be seen in [R.J. Patz and B.W. Junker, A straightforward approach to Markov Chain Monte Carlo methods for item response models, J. Educ. Behav. Stat. 24(2) (1999), pp. 146-178; R. J. Patz and B. W. Junker, The applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses, J. Educ. Behav. Stat. 24(4) (1999), pp. 342-366; A. Gelman, G.O. Roberts, and W.R. Gilks, Efficient Metropolis jumping rules, Bayesian Stat. 5 (1996), pp. 599-607]. Moreover, we consider a modified beta prior (which generalizes the one considered in [3]) and a Jeffreys prior for the asymmetry parameter. Furthermore, we study the sensitivity of such priors as well as the use of different kernel densities for this parameter. Finally, we assess the impact of the number of examinees, number of items and the asymmetry level on the parameter recovery. Results of the simulation study indicated that our approach performed equally as well as that in [3], in terms of parameter recovery, mainly using the Jeffreys prior. Also, they indicated that the asymmetry level has the highest impact on parameter recovery, even though it is relatively small. A real data analysis is considered jointly with the development of model fitting assessment tools. The results are compared with the ones obtained by Azevedo et al. The results indicate that using the hierarchical approach allows us to implement MCMC algorithms more easily, it facilitates diagnosis of the convergence and also it can be very useful to fit more complex skew IRT models.
Resumo:
The purpose of this paper is to develop a Bayesian analysis for the right-censored survival data when immune or cured individuals may be present in the population from which the data is taken. In our approach the number of competing causes of the event of interest follows the Conway-Maxwell-Poisson distribution which generalizes the Poisson distribution. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the proposed model. Also, some discussions on the model selection and an illustration with a real data set are considered.
Resumo:
The objective of this paper is to model variations in test-day milk yields of first lactations of Holstein cows by RR using B-spline functions and Bayesian inference in order to fit adequate and parsimonious models for the estimation of genetic parameters. They used 152,145 test day milk yield records from 7317 first lactations of Holstein cows. The model established in this study was additive, permanent environmental and residual random effects. In addition, contemporary group and linear and quadratic effects of the age of cow at calving were included as fixed effects. Authors modeled the average lactation curve of the population with a fourth-order orthogonal Legendre polynomial. They concluded that a cubic B-spline with seven random regression coefficients for both the additive genetic and permanent environment effects was to be the best according to residual mean square and residual variance estimates. Moreover they urged a lower order model (quadratic B-spline with seven random regression coefficients for both random effects) could be adopted because it yielded practically the same genetic parameter estimates with parsimony. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
In this article, we propose a new Bayesian flexible cure rate survival model, which generalises the stochastic model of Klebanov et al. [Klebanov LB, Rachev ST and Yakovlev AY. A stochastic-model of radiation carcinogenesis - latent time distributions and their properties. Math Biosci 1993; 113: 51-75], and has much in common with the destructive model formulated by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)]. In our approach, the accumulated number of lesions or altered cells follows a compound weighted Poisson distribution. This model is more flexible than the promotion time cure model in terms of dispersion. Moreover, it possesses an interesting and realistic interpretation of the biological mechanism of the occurrence of the event of interest as it includes a destructive process of tumour cells after an initial treatment or the capacity of an individual exposed to irradiation to repair altered cells that results in cancer induction. In other words, what is recorded is only the damaged portion of the original number of altered cells not eliminated by the treatment or repaired by the repair system of an individual. Markov Chain Monte Carlo (MCMC) methods are then used to develop Bayesian inference for the proposed model. Also, some discussions on the model selection and an illustration with a cutaneous melanoma data set analysed by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)] are presented.
Resumo:
Abstract Background An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern" or Massively Parallel Signature Sequencing (MPSS), is to carry out statistical analyses that account for the within-class variability, i.e., variability due to the intrinsic biological differences among sampled individuals of the same class, and not only variability due to technical sampling error. Results We introduce a Bayesian model that accounts for the within-class variability by means of mixture distribution. We show that the previously available approaches of aggregation in pools ("pseudo-libraries") and the Beta-Binomial model, are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags regarded as differentially expressed with high significance if the within-class variability is ignored, but clearly not so significant if one accounts for it. Conclusion Using available information about biological replicates, one can transform a list of candidate transcripts showing differential expression to a more reliable one. Our method is freely available, under GPL/GNU copyleft, through a user friendly web-based on-line tool or as R language scripts at supplemental web-site.
Resumo:
Spin systems in the presence of disorder are described by two sets of degrees of freedom, associated with orientational (spin) and disorder variables, which may be characterized by two distinct relaxation times. Disordered spin models have been mostly investigated in the quenched regime, which is the usual situation in solid state physics, and in which the relaxation time of the disorder variables is much larger than the typical measurement times. In this quenched regime, disorder variables are fixed, and only the orientational variables are duly thermalized. Recent studies in the context of lattice statistical models for the phase diagrams of nematic liquid-crystalline systems have stimulated the interest of going beyond the quenched regime. The phase diagrams predicted by these calculations for a simple Maier-Saupe model turn out to be qualitative different from the quenched case if the two sets of degrees of freedom are allowed to reach thermal equilibrium during the experimental time, which is known as the fully annealed regime. In this work, we develop a transfer matrix formalism to investigate annealed disordered Ising models on two hierarchical structures, the diamond hierarchical lattice (DHL) and the Apollonian network (AN). The calculations follow the same steps used for the analysis of simple uniform systems, which amounts to deriving proper recurrence maps for the thermodynamic and magnetic variables in terms of the generations of the construction of the hierarchical structures. In this context, we may consider different kinds of disorder, and different types of ferromagnetic and anti-ferromagnetic interactions. In the present work, we analyze the effects of dilution, which are produced by the removal of some magnetic ions. The system is treated in a “grand canonical" ensemble. The introduction of two extra fields, related to the concentration of two different types of particles, leads to higher-rank transfer matrices as compared with the formalism for the usual uniform models. Preliminary calculations on a DHL indicate that there is a phase transition for a wide range of dilution concentrations. Ising spin systems on the AN are known to be ferromagnetically ordered at all temperatures; in the presence of dilution, however, there are indications of a disordered (paramagnetic) phase at low concentrations of magnetic ions.
Resumo:
In this paper we propose a hybrid hazard regression model with threshold stress which includes the proportional hazards and the accelerated failure time models as particular cases. To express the behavior of lifetimes the generalized-gamma distribution is assumed and an inverse power law model with a threshold stress is considered. For parameter estimation we develop a sampling-based posterior inference procedure based on Markov Chain Monte Carlo techniques. We assume proper but vague priors for the parameters of interest. A simulation study investigates the frequentist properties of the proposed estimators obtained under the assumption of vague priors. Further, some discussions on model selection criteria are given. The methodology is illustrated on simulated and real lifetime data set.
Resumo:
A data set of a commercial Nellore beef cattle selection program was used to compare breeding models that assumed or not markers effects to estimate the breeding values, when a reduced number of animals have phenotypic, genotypic and pedigree information available. This herd complete data set was composed of 83,404 animals measured for weaning weight (WW), post-weaning gain (PWG), scrotal circumference (SC) and muscle score (MS), corresponding to 116,652 animals in the relationship matrix. Single trait analyses were performed by MTDFREML software to estimate fixed and random effects solutions using this complete data. The additive effects estimated were assumed as the reference breeding values for those animals. The individual observed phenotype of each trait was adjusted for fixed and random effects solutions, except for direct additive effects. The adjusted phenotype composed of the additive and residual parts of observed phenotype was used as dependent variable for models' comparison. Among all measured animals of this herd, only 3160 animals were genotyped for 106 SNP markers. Three models were compared in terms of changes on animals' rank, global fit and predictive ability. Model 1 included only polygenic effects, model 2 included only markers effects and model 3 included both polygenic and markers effects. Bayesian inference via Markov chain Monte Carlo methods performed by TM software was used to analyze the data for model comparison. Two different priors were adopted for markers effects in models 2 and 3, the first prior assumed was a uniform distribution (U) and, as a second prior, was assumed that markers effects were distributed as normal (N). Higher rank correlation coefficients were observed for models 3_U and 3_N, indicating a greater similarity of these models animals' rank and the rank based on the reference breeding values. Model 3_N presented a better global fit, as demonstrated by its low DIC. The best models in terms of predictive ability were models 1 and 3_N. Differences due prior assumed to markers effects in models 2 and 3 could be attributed to the better ability of normal prior in handle with collinear effects. The models 2_U and 2_N presented the worst performance, indicating that this small set of markers should not be used to genetically evaluate animals with no data, since its predictive ability is restricted. In conclusion, model 3_N presented a slight superiority when a reduce number of animals have phenotypic, genotypic and pedigree information. It could be attributed to the variation retained by markers and polygenic effects assumed together and the normal prior assumed to markers effects, that deals better with the collinearity between markers. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
Many recent survival studies propose modeling data with a cure fraction, i.e., data in which part of the population is not susceptible to the event of interest. This event may occur more than once for the same individual (recurrent event). We then have a scenario of recurrent event data in the presence of a cure fraction, which may appear in various areas such as oncology, finance, industries, among others. This paper proposes a multiple time scale survival model to analyze recurrent events using a cure fraction. The objective is analyzing the efficiency of certain interventions so that the studied event will not happen again in terms of covariates and censoring. All estimates were obtained using a sampling-based approach, which allows information to be input beforehand with lower computational effort. Simulations were done based on a clinical scenario in order to observe some frequentist properties of the estimation procedure in the presence of small and moderate sample sizes. An application of a well-known set of real mammary tumor data is provided.