15 results for bayesian methods
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Background: The evaluation of associations between genotypes and diseases in a case-control framework plays an important role in genetic epidemiology. This paper focuses on the evaluation of the homogeneity of both genotypic and allelic frequencies. The traditional test that is used to check allelic homogeneity is known to be valid only under Hardy-Weinberg equilibrium, a property that may not hold in practice. Results: We first describe the flaws of the traditional (chi-squared) tests for both allelic and genotypic homogeneity. Besides the known problem of the allelic procedure, we show that whenever these tests are used, an incoherence may arise: sometimes the genotypic homogeneity hypothesis is not rejected, but the allelic hypothesis is. As we argue, this is logically impossible. Some recently proposed methods implicitly rely on the idea that this does not happen. In an attempt to correct this incoherence, we describe an alternative frequentist approach that is appropriate even when Hardy-Weinberg equilibrium does not hold. It is then shown that the problem remains and is intrinsic to frequentist procedures. Finally, we introduce the Full Bayesian Significance Test to test both hypotheses and prove that the incoherence cannot happen with these new tests. To illustrate this, all five tests are applied to real and simulated datasets. Using the celebrated power analysis, we show that the Bayesian method is comparable to the frequentist one and has the advantage of being coherent. Conclusions: Contrary to more traditional approaches, the Full Bayesian Significance Test for association studies provides a simple, coherent and powerful tool for detecting associations.
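The two traditional chi-squared tests mentioned in the abstract above can be sketched as follows. This is only an illustration, not the authors' code: the function names and the case-control counts are hypothetical, every expected cell count is assumed to be positive, and the closed-form p-values hold only for 1 and 2 degrees of freedom. The genotypic test uses the 2 x 3 table of genotype counts (AA, Aa, aa); the allelic test uses the 2 x 2 table obtained by counting alleles.

```python
import math

def chi2_stat(table):
    """Pearson chi-squared statistic for an r x c table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n  # assumed > 0
            stat += (observed - expected) ** 2 / expected
    return stat

def allele_table(genotype_table):
    """Turn rows of (AA, Aa, aa) genotype counts into (A, a) allele counts."""
    return [[2 * aa + ab, ab + 2 * bb] for aa, ab, bb in genotype_table]

def chi2_p_value(stat, df):
    """Upper-tail chi-squared probability; closed forms for df 1 and 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 is supported in this sketch")

# Hypothetical genotype counts: first row cases, second row controls.
genotypes = [[30, 50, 20], [45, 40, 15]]
genotypic = chi2_stat(genotypes)                # 2 degrees of freedom
allelic = chi2_stat(allele_table(genotypes))    # 1 degree of freedom
print("genotypic p-value:", chi2_p_value(genotypic, 2))
print("allelic p-value:", chi2_p_value(allelic, 1))
```

Comparing the two p-values against a common significance level is the setting in which the abstract's incoherence can appear: the allelic hypothesis may be rejected while the genotypic one is not.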
Abstract:
The circumscription of genera belonging to tribe Bignonieae (Bignoniaceae) has traditionally been complex, with only a few genera having stable circumscriptions in the various classification systems proposed for the tribe. The genus Lundia, for instance, is well characterized by a series of morphological synapomorphies and its circumscription has remained quite stable throughout its history. Despite the stable circumscription of Lundia, the circumscription of species within the genus has remained problematic. This study aims to reconstruct the phylogeny of Lundia in order to refine species circumscriptions, gain a better understanding of relationships between taxa, and identify potential morphological synapomorphies for species and major clades. We sampled 26 accessions representing 13 species of Lundia, and 5 outgroups, and reconstructed the phylogeny of the genus using a chloroplast (ndhF) and a nuclear marker (PepC). Data derived from sequences of the individual loci were analyzed using parsimony and Bayesian inference, and the combined molecular dataset was analyzed with Bayesian methods. The monophyly of Lundia nitidula, a species with a particularly complex circumscription, was tested using the Shimodaira-Hasegawa (SH) test and the approximately unbiased test for phylogenetic tree selection (AU test). In addition, 40 morphological characters were mapped onto the tree that resulted from the analysis of the combined molecular dataset in order to identify morphological synapomorphies of individual species and major clades. Lundia and most species currently recognized within the genus were strongly supported as monophyletic in all analyses. One species, Lundia nitidula, was not resolved as monophyletic, but the monophyly of this species was not rejected by the AU and SH tests. Lundia sect. Eriolundia is resolved as paraphyletic in all analyses, while Lundia sect. Eulundia is monophyletic and supported by the same morphological characters traditionally used to circumscribe this section. The phylogeny of Lundia contributed important information for a better circumscription of species and served as the basis for the taxonomic revision of the genus.
Abstract:
HIV-1 subtype C has spread efficiently in the southern states of Brazil (Rio Grande do Sul, Santa Catarina and Parana). Phylogeographic studies indicate that the subtype C epidemic in southern Brazil was initiated by the introduction of a single founder virus population at some time point between 1960 and 1980, but little is known about the spatial dynamics of viral spread. A total of 135 Brazilian HIV-1 subtype C pol sequences collected from 1992 to 2009 at the three southern state capitals (Porto Alegre, Florianopolis and Curitiba) were analyzed. Maximum-likelihood and Bayesian methods were used to explore the degree of phylogenetic mixing of subtype C sequences from different cities and to reconstruct the geographical pattern of viral spread in this region of the country. Phylogeographic analyses supported the monophyletic origin of the HIV-1 subtype C clade circulating in southern Brazil and placed the root of that clade in Curitiba (Parana state). This analysis further suggested that Florianopolis (Santa Catarina state) is an important staging post in the subtype C dissemination, displaying high viral migration rates from and to the other cities, while viral flux between Curitiba and Porto Alegre (Rio Grande do Sul state) is very low. We found a positive correlation (r² = 0.64) between routine travel and viral migration rates among localities. Despite the intense viral movement, phylogenetic intermixing of subtype C sequences from different Brazilian cities is lower than expected by chance. Notably, a high proportion (67%) of subtype C sequences from Porto Alegre branched within a single local monophyletic sub-cluster. These results suggest that the HIV-1 subtype C epidemic in southern Brazil has been shaped by both frequent viral migration among states and in situ dissemination of local clades.
Abstract:
Background: Human respiratory syncytial virus (HRSV) is one of the major etiologic agents of respiratory tract infections among children worldwide. Methodology/Principal Findings: Here, through a comprehensive analysis of the two major HRSV groups A and B (n = 1983), which comprise several genotypes, we present a complex pattern of population dynamics of HRSV over a time period of 50 years (1956-2006). The circulation pattern of HRSV revealed a series of expansions and fluctuations of co-circulating lineages with a predominance of HRSVA. Positively selected amino acid substitutions of the G glycoprotein occurred upon population growth of GB3 with a 60-nucleotide insertion (GB3 Insert), while other genotypes acquired substitutions upon both population growth and decrease, thus possibly reflecting a role for immune selected epitopes in linkage to the traced substitution sites that may have important relevance for vaccine design. The analysis evidenced the co-circulation and predominance of distinct HRSV genotypes in Brazil and suggested a year-round presence of the virus. In Brazil, GA2 and GA5 were the main culprits of HRSV outbreaks until recently, when the GB3 Insert became highly prevalent. Using Bayesian methods, we determined the dispersal patterns of genotypes through several inferred migratory routes. Conclusions/Significance: Genotypes spread across continents and between neighboring areas. Crucially, genotypes also remained in any given region for extended periods, independent of seasonal outbreaks, possibly maintained by re-infecting the general population.
Abstract:
The purpose of this paper is to develop a Bayesian analysis for right-censored survival data when immune or cured individuals may be present in the population from which the data are taken. In our approach, the number of competing causes of the event of interest follows the Conway-Maxwell-Poisson distribution, which generalizes the Poisson distribution. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the proposed model. A discussion of model selection and an illustration with a real data set are also presented.
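For readers unfamiliar with the Conway-Maxwell-Poisson distribution named above, the sketch below evaluates its probability mass function, P(M = m) = lambda^m / ((m!)^nu Z(lambda, nu)), where Z is the normalizing series. It is an illustration only: the function names and the truncation at 200 terms are assumptions, and the code is not taken from the paper. With nu = 1 the distribution reduces to the Poisson. Reading P(M = 0) = 1 / Z as the cured fraction is the usual interpretation in cure rate models of this kind and is offered here as an assumption about the model in the abstract.

```python
import math

def com_poisson_log_z(lam, nu, terms=200):
    """Log of the normalizing series sum_j lam**j / (j!)**nu, truncated.

    Assumes lam > 0 and that `terms` is large enough for the series to
    have converged (this can fail for nu close to 0 with lam >= 1).
    """
    logs = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    top = max(logs)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(v - top) for v in logs))

def com_poisson_pmf(m, lam, nu, terms=200):
    """P(M = m) under the Conway-Maxwell-Poisson distribution."""
    log_p = m * math.log(lam) - nu * math.lgamma(m + 1)
    return math.exp(log_p - com_poisson_log_z(lam, nu, terms))

# nu = 1 recovers the Poisson distribution with mean lam.
print(com_poisson_pmf(0, 2.0, 1.0), math.exp(-2.0))
# P(M = 0) for an under-dispersed case (nu > 1), hypothetical values.
print(com_poisson_pmf(0, 2.0, 1.5))
```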
Abstract:
Introduction: The purpose of this ecological study was to evaluate the urban spatial and temporal distribution of tuberculosis (TB) in Ribeirão Preto, State of São Paulo, southeast Brazil, between 2006 and 2009 and to evaluate its relationship with factors of social vulnerability such as income and education level. Methods: We evaluated data from TBWeb, an electronic notification system for TB cases. Measures of social vulnerability were obtained from the SEADE Foundation, and information about the number of inhabitants, education and income of the households was obtained from the Brazilian Institute of Geography and Statistics. Statistical analyses were conducted by a Bayesian regression model assuming a Poisson distribution for the observed new cases of TB in each area. A conditional autoregressive structure was used for the spatial covariance structure. Results: The Bayesian model confirmed the spatial heterogeneity of TB distribution in Ribeirão Preto, identifying areas with elevated risk and the effects of social vulnerability on the disease. We demonstrated that the rate of TB was correlated with the measures of income, education and social vulnerability. However, we observed areas with low vulnerability and high education and income, but with high estimated TB rates. Conclusions: The study identified areas with different risks for TB, given that the public health system deals with the characteristics of each region individually and prioritizes those that present a higher propensity to risk of TB. Complex relationships may exist between TB incidence and a wide range of environmental and intrinsic factors, which need to be studied in future research.
Abstract:
Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is an interest in studying latent variables (or latent traits). Usually such latent traits are assumed to be random variables and a convenient distribution is assigned to them. A very common choice for such a distribution has been the standard normal. Recently, Azevedo et al. [Bayesian inference for a skew-normal IRT model under the centred parameterization, Comput. Stat. Data Anal. 55 (2011), pp. 353-365] proposed a skew-normal distribution under the centred parameterization (SNCP) as had been studied in [R. B. Arellano-Valle and A. Azzalini, The centred parametrization for the multivariate skew-normal distribution, J. Multivariate Anal. 99(7) (2008), pp. 1362-1382], to model the latent trait distribution. This approach allows one to represent any asymmetric behaviour concerning the latent trait distribution. Also, they developed a Metropolis-Hastings within the Gibbs sampling (MHWGS) algorithm based on the density of the SNCP. They showed that the algorithm recovers all parameters properly. Their results indicated that, in the presence of asymmetry, the proposed model and the estimation algorithm perform better than the usual model and estimation methods. Our main goal in this paper is to propose another type of MHWGS algorithm based on a stochastic representation (hierarchical structure) of the SNCP studied in [N. Henze, A probabilistic representation of the skew-normal distribution, Scand. J. Statist. 13 (1986), pp. 271-275]. Our algorithm has only one Metropolis-Hastings step, in contrast to the algorithm developed by Azevedo et al., which has two such steps. This not only makes the implementation easier but also reduces the number of proposal densities to be used, which can be a problem in the implementation of MHWGS algorithms, as can be seen in [R.J. Patz and B.W. Junker, A straightforward approach to Markov Chain Monte Carlo methods for item response models, J. Educ. Behav. Stat. 24(2) (1999), pp. 146-178; R. J. Patz and B. W. Junker, The applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses, J. Educ. Behav. Stat. 24(4) (1999), pp. 342-366; A. Gelman, G.O. Roberts, and W.R. Gilks, Efficient Metropolis jumping rules, Bayesian Stat. 5 (1996), pp. 599-607]. Moreover, we consider a modified beta prior (which generalizes the one considered in [3]) and a Jeffreys prior for the asymmetry parameter. Furthermore, we study the sensitivity of such priors as well as the use of different kernel densities for this parameter. Finally, we assess the impact of the number of examinees, number of items and the asymmetry level on the parameter recovery. Results of the simulation study indicated that our approach performed as well as that in [3], in terms of parameter recovery, mainly using the Jeffreys prior. Also, they indicated that the asymmetry level has the highest impact on parameter recovery, even though it is relatively small. A real data analysis is considered jointly with the development of model fitting assessment tools. The results are compared with the ones obtained by Azevedo et al. The results indicate that using the hierarchical approach allows us to implement MCMC algorithms more easily, facilitates diagnosis of convergence, and can be very useful for fitting more complex skew IRT models.
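The stochastic representation of Henze cited in the abstract above can be illustrated with a short simulation. The sketch uses the direct parameterization of the skew-normal, not the centred parameterization (SNCP) used in the paper, and it is not the authors' MHWGS algorithm. If U0 and U1 are independent standard normal variables and delta lies in (-1, 1), then Z = delta |U0| + sqrt(1 - delta^2) U1 is skew-normal with mean delta sqrt(2 / pi). The function name, seed and sample size are hypothetical.

```python
import math
import random

def sample_skew_normal(delta, rng):
    """One draw from the skew-normal via Henze's representation.

    delta must lie in (-1, 1); the usual shape parameter is
    delta / sqrt(1 - delta**2).
    """
    half_normal = abs(rng.gauss(0.0, 1.0))   # |U0|, the latent variable
    normal = rng.gauss(0.0, 1.0)             # U1
    return delta * half_normal + math.sqrt(1.0 - delta ** 2) * normal

rng = random.Random(0)
delta = 0.8
draws = [sample_skew_normal(delta, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
print("sample mean:", sample_mean)
print("theoretical mean:", delta * math.sqrt(2.0 / math.pi))
```

Conditioning on the half-normal latent variable leaves a normal distribution, which is what makes a hierarchical (Gibbs-type) sampler natural for models built on this representation.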
Abstract:
To estimate causal relationships, time series econometricians must be aware of spurious correlation, a problem first mentioned by Yule (1926). To deal with this problem, one can work either with differenced series or multivariate models: VAR (VEC or VECM) models. These models usually include at least one cointegration relation. Although the Bayesian literature on VAR/VEC is quite advanced, Bauwens et al. (1999) highlighted that "the topic of selecting the cointegrating rank has not yet given very useful and convincing results". The present article applies the Full Bayesian Significance Test (FBST), especially designed to deal with sharp hypotheses, to cointegration rank selection tests in VECM time series models. It shows the FBST implementation using both simulated data sets and data sets available in the literature. As an illustration, standard noninformative priors are used.
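The idea behind the FBST can be shown on a toy problem that is much simpler than cointegration rank selection. The sketch below assumes a one-dimensional normal posterior with mean mu and standard deviation sigma, a flat reference density, and the sharp hypothesis theta = theta0; none of this comes from the article, and the function names are hypothetical. The e-value is one minus the posterior probability of the tangent set, that is, the set of parameter values whose posterior density exceeds the density at theta0.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma**2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fbst_evalue_mc(theta0, mu, sigma, n_draws, rng):
    """Monte Carlo e-value: 1 - P(posterior density > density at theta0)."""
    threshold = normal_pdf(theta0, mu, sigma)
    in_tangent_set = sum(
        normal_pdf(rng.gauss(mu, sigma), mu, sigma) > threshold
        for _ in range(n_draws)
    )
    return 1.0 - in_tangent_set / n_draws

def fbst_evalue_exact(theta0, mu, sigma):
    """Closed form for the normal toy case, used to check the simulation."""
    return math.erfc(abs(theta0 - mu) / (sigma * math.sqrt(2.0)))

rng = random.Random(1)
print("Monte Carlo e-value:", fbst_evalue_mc(1.96, 0.0, 1.0, 100_000, rng))
print("closed-form e-value:", fbst_evalue_exact(1.96, 0.0, 1.0))
```

In realistic models the posterior draws would come from an MCMC sampler, and the threshold would be the supremum of the posterior density over the whole hypothesis set rather than its value at a single point.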
Abstract:
Aims: Guided tissue regeneration (GTR) and enamel matrix derivatives (EMD) are two popular regenerative treatments for periodontal infrabony lesions. Both have been used in conjunction with other regenerative materials. We conducted a Bayesian network meta-analysis of randomized controlled trials on treatment effects of GTR, EMD and their combination therapies. Material and Methods: A systematic literature search was conducted using the Medline, EMBASE, LILACS and CENTRAL databases up to and including June 2011. Treatment outcomes were changes in probing pocket depth (PPD), clinical attachment level (CAL) and infrabony defect depth. Different types of bone grafts were treated as one group and so were barrier membranes. Results: A total of 53 studies were included in this review, and we found small differences between regenerative therapies which were non-significant statistically and clinically. GTR and GTR-related combination therapies achieved greater PPD reduction than EMD and EMD-related combination therapies. Combination therapies achieved slightly greater CAL gain than the use of EMD or GTR alone. GTR with BG achieved greatest defect fill. Conclusion: Combination therapies performed better than single therapies, but the additional benefits were small. Bayesian network meta-analysis is a promising technique to compare multiple treatments. Further analysis of methodological characteristics will be required prior to clinical recommendations.
Abstract:
A common interest in gene expression data analysis is to identify, from a large pool of candidate genes, the genes that present significant changes in expression levels between a treatment and a control biological condition. Usually, this is done using a test statistic and a cutoff value that separate differentially expressed genes from nondifferentially expressed ones. In this paper, we propose a Bayesian approach to identify differentially expressed genes by sequentially calculating credibility intervals from predictive densities, which are constructed using the sampled mean treatment effect from all genes under study, excluding the treatment effect of genes previously identified with statistical evidence for difference. We compare our Bayesian approach with the standard ones based on the use of the t-test and modified t-tests via a simulation study, using small sample sizes which are common in gene expression data analysis. The results provide evidence that the proposed approach performs better than standard ones, especially for cases with mean differences and increases in treatment variance in relation to control variance. We also apply the methodologies to a well-known publicly available data set on the Escherichia coli bacterium.
Abstract:
In this article, we propose a new Bayesian flexible cure rate survival model, which generalises the stochastic model of Klebanov et al. [Klebanov LB, Rachev ST and Yakovlev AY. A stochastic-model of radiation carcinogenesis - latent time distributions and their properties. Math Biosci 1993; 113: 51-75], and has much in common with the destructive model formulated by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)]. In our approach, the accumulated number of lesions or altered cells follows a compound weighted Poisson distribution. This model is more flexible than the promotion time cure model in terms of dispersion. Moreover, it possesses an interesting and realistic interpretation of the biological mechanism of the occurrence of the event of interest as it includes a destructive process of tumour cells after an initial treatment or the capacity of an individual exposed to irradiation to repair altered cells that results in cancer induction. In other words, what is recorded is only the damaged portion of the original number of altered cells not eliminated by the treatment or repaired by the repair system of an individual. Markov Chain Monte Carlo (MCMC) methods are then used to develop Bayesian inference for the proposed model. Also, some discussions on the model selection and an illustration with a cutaneous melanoma data set analysed by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)] are presented.
Abstract:
Background: An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern" or Massively Parallel Signature Sequencing (MPSS), is to carry out statistical analyses that account for the within-class variability, i.e., variability due to the intrinsic biological differences among sampled individuals of the same class, and not only variability due to technical sampling error. Results: We introduce a Bayesian model that accounts for the within-class variability by means of a mixture distribution. We show that the previously available approaches of aggregation in pools ("pseudo-libraries") and the Beta-Binomial model are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags regarded as differentially expressed with high significance if the within-class variability is ignored, but clearly not so significant if one accounts for it. Conclusion: Using available information about biological replicates, one can transform a list of candidate transcripts showing differential expression to a more reliable one. Our method is freely available, under GPL/GNU copyleft, through a user-friendly web-based on-line tool or as R language scripts at the supplemental website.
Abstract:
OBJECTIVE: To estimate the pretest probability of Cushing's syndrome (CS) diagnosis by a Bayesian approach using intuitive clinical judgment. MATERIALS AND METHODS: Physicians were requested, in seven endocrinology meetings, to answer three questions: "Based on your personal expertise, after obtaining clinical history and physical examination, without using laboratory tests, what is your probability of diagnosing Cushing's Syndrome?"; "For how long have you been practicing Endocrinology?"; and "Where do you work?". A Bayesian beta regression, using the WinBUGS software, was employed. RESULTS: We obtained 294 questionnaires. The mean pretest probability of CS diagnosis was 51.6% (95%CI: 48.7-54.3). The probability was directly related to experience in endocrinology, but not to the place of work. CONCLUSION: Pretest probability of CS diagnosis was estimated using a Bayesian methodology. Although pretest likelihood can be context-dependent, experience based on years of practice may help the practitioner to diagnose CS. Arq Bras Endocrinol Metab. 2012;56(9):633-7
Abstract:
INTRODUCTION: The purpose of this ecological study was to evaluate the urban spatial and temporal distribution of tuberculosis (TB) in Ribeirão Preto, State of São Paulo, southeast Brazil, between 2006 and 2009 and to evaluate its relationship with factors of social vulnerability such as income and education level. METHODS: We evaluated data from TBWeb, an electronic notification system for TB cases. Measures of social vulnerability were obtained from the SEADE Foundation, and information about the number of inhabitants, education and income of the households was obtained from the Brazilian Institute of Geography and Statistics. Statistical analyses were conducted by a Bayesian regression model assuming a Poisson distribution for the observed new cases of TB in each area. A conditional autoregressive structure was used for the spatial covariance structure. RESULTS: The Bayesian model confirmed the spatial heterogeneity of TB distribution in Ribeirão Preto, identifying areas with elevated risk and the effects of social vulnerability on the disease. We demonstrated that the rate of TB was correlated with the measures of income, education and social vulnerability. However, we observed areas with low vulnerability and high education and income, but with high estimated TB rates. CONCLUSIONS: The study identified areas with different risks for TB, given that the public health system deals with the characteristics of each region individually and prioritizes those that present a higher propensity to risk of TB. Complex relationships may exist between TB incidence and a wide range of environmental and intrinsic factors, which need to be studied in future research.
Abstract:
In this work we compared the estimates of the parameters of ARCH models using a complete Bayesian method and an empirical Bayesian method, in which we adopted a non-informative prior distribution and an informative prior distribution, respectively. We also considered a reparameterization of those models in order to map the parameter space into the real space. This procedure permits choosing normal prior distributions for the transformed parameters. The posterior summaries were obtained using Markov chain Monte Carlo (MCMC) methods. The methodology was evaluated by considering the Telebras series from the Brazilian financial market. The results show that the two methods are able to fit ARCH models with different numbers of parameters. The empirical Bayesian method provided a more parsimonious model for the data and a better fit than the complete Bayesian method.
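The reparameterization step described in the abstract above can be sketched for an ARCH(1) model, where the conditional variance is alpha0 + alpha1 * y[t-1]^2. The abstract does not say which transformation the authors used, so the log transform for alpha0 > 0 and the logit transform for alpha1 (restricted here to the interval (0, 1)) are assumptions, as are the function names. Once the parameters live on the real line, normal priors can be placed on them directly.

```python
import math

def to_real(alpha0, alpha1):
    """Map (alpha0 > 0, 0 < alpha1 < 1) to unconstrained real values."""
    return math.log(alpha0), math.log(alpha1 / (1.0 - alpha1))

def to_constrained(theta0, theta1):
    """Inverse map from the real line back to the ARCH(1) parameters."""
    return math.exp(theta0), 1.0 / (1.0 + math.exp(-theta1))

def arch1_loglik(theta0, theta1, y):
    """Gaussian ARCH(1) log-likelihood, conditional on the first value."""
    alpha0, alpha1 = to_constrained(theta0, theta1)
    loglik = 0.0
    for t in range(1, len(y)):
        variance = alpha0 + alpha1 * y[t - 1] ** 2
        loglik -= 0.5 * (math.log(2.0 * math.pi) + math.log(variance)
                         + y[t] ** 2 / variance)
    return loglik

# Hypothetical returns series, for illustration only.
returns = [0.1, -0.4, 0.3, 0.9, -1.2, 0.2]
theta = to_real(0.5, 0.3)
print("log-likelihood:", arch1_loglik(theta[0], theta[1], returns))
```

An MCMC sampler would add the log of the normal prior densities of theta0 and theta1 to this log-likelihood to obtain the log-posterior up to a constant.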