84 results for hierarchical Bayesian analysis
Abstract:
Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes for which the variability can be explained using factors and/or covariates. When such factors operate, the usual normal regression models, which inherently exhibit constant variance, under-represent the variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect, and the use of binomial and Poisson models underestimates the variability and, consequently, incorrectly indicates significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The joint posterior density function was sampled using Markov chain Monte Carlo algorithms, allowing inferences over the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible, even when limited prior information is available, thereby generating valuable insight for the researcher about the experimental results.
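A minimal numeric sketch of the overdispersion effect described above, assuming a beta-binomial random effect (one common choice for illustration; the manuscript's own random-effect structure may differ). The binomial variance is inflated by a factor that grows with the intra-cluster correlation, so a plain binomial model understates the true variability:

```python
def binomial_var(n, p):
    """Variance of a Binomial(n, p) count: n * p * (1 - p)."""
    return n * p * (1 - p)

def beta_binomial_var(n, p, rho):
    """Variance of a beta-binomial count with mean n * p and
    intra-cluster correlation rho: the binomial variance is
    inflated by the overdispersion factor 1 + (n - 1) * rho."""
    return n * p * (1 - p) * (1 + (n - 1) * rho)

n, p, rho = 20, 0.3, 0.1
print(round(binomial_var(n, p), 2))            # 4.2
print(round(beta_binomial_var(n, p, rho), 2))  # 12.18
```

Even a modest correlation (rho = 0.1) here nearly triples the variance, which is why standard errors from a binomial fit can be badly optimistic.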
Abstract:
A chemotaxonomic analysis is described of a database containing various types of compounds from the Heliantheae tribe (Asteraceae) using Self-Organizing Maps (SOM). The numbers of occurrences of 9 chemical classes in different taxa of the tribe were used as variables. The study shows that SOM applied to chemical data can contribute to differentiating genera, subtribes, and groups of subtribes (subtribe branches), as well as to tribal and subtribal classifications of Heliantheae, exhibiting a high hit percentage comparable to that of an expert, and in agreement with the previous tribe classification proposed by Stuessy.
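A minimal 1-D SOM sketch, illustrative only (the study's grid shape, data, and training schedule are not specified here); in this toy version, occurrence profiles of chemical classes would play the role of the input vectors:

```python
import math

def train_som(data, grid_size, iters=300):
    """Minimal 1-D Self-Organizing Map: a line of `grid_size` units,
    each holding a weight vector. For every sample, the best-matching
    unit (BMU) and its grid neighbours are pulled toward the sample,
    with learning rate and neighbourhood radius shrinking over time."""
    dim = len(data[0])
    # Deterministic initialisation along the grid diagonal
    units = [[u / (grid_size - 1)] * dim for u in range(grid_size)]
    for t in range(iters):
        lr = 0.5 * (1 - t / iters)
        radius = max(1.0, (grid_size / 2) * (1 - t / iters))
        x = data[t % len(data)]          # cycle through the samples
        bmu = bmu_index(units, x)
        for u in range(grid_size):
            h = math.exp(-((u - bmu) ** 2) / (2 * radius ** 2))
            for d in range(dim):
                units[u][d] += lr * h * (x[d] - units[u][d])
    return units

def bmu_index(units, x):
    """Index of the unit whose weights are closest to sample x."""
    return min(range(len(units)),
               key=lambda u: sum((w - xi) ** 2 for w, xi in zip(units[u], x)))

# Two well-separated groups of hypothetical "occurrence profiles"
low = [[0.0, 0.1], [0.1, 0.0]]
high = [[0.9, 1.0], [1.0, 0.9]]
units = train_som(low + high, grid_size=6)
print(bmu_index(units, low[0]) != bmu_index(units, high[0]))  # True
```

After training, distinct groups land on distinct map units, which is the property the chemotaxonomic study exploits to separate taxa.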
Abstract:
We analyze the influence of time-, firm-, industry- and country-level determinants of capital structure. First, we apply hierarchical linear modeling in order to assess the relative importance of those levels. We find that the time and firm levels explain 78% of firm leverage. Second, we include random intercepts and random coefficients in order to analyze the direct and indirect influences of firm, industry and country characteristics on firm leverage. We document several important indirect influences of variables at the industry and country levels on firm determinants of leverage, as well as several structural differences in the financial behavior of firms in developed and emerging countries. (C) 2010 Elsevier B.V. All rights reserved.
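The "relative importance of levels" assessment above amounts to a variance decomposition across the levels of the hierarchical model. A minimal sketch, with hypothetical variance components chosen only to echo the 78% figure reported in the abstract:

```python
def variance_shares(components):
    """Share of total variance attributable to each level of a
    hierarchical (multilevel) model, given its estimated variance
    components (an intraclass-correlation-style decomposition)."""
    total = sum(components.values())
    return {level: v / total for level, v in components.items()}

# Hypothetical variance components for firm leverage, chosen so that
# the time and firm levels jointly account for 78% of the variance
comps = {"time": 0.10, "firm": 0.68, "industry": 0.12, "country": 0.10}
shares = variance_shares(comps)
print(round(shares["time"] + shares["firm"], 2))  # 0.78
```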
Abstract:
Molecular epidemiological data concerning the hepatitis B virus (HBV) in Chile are incomplete. Since HBV genotype F is the most prevalent in the country, the goal of this study was to obtain full HBV genome sequences from chronically infected patients in order to determine their subgenotypes and the occurrence of resistance-associated mutations. Twenty-one serum samples from antiviral drug-naive patients with chronic hepatitis B were subjected to full-length PCR amplification, and both strands of the whole genomes were fully sequenced. Phylogenetic analyses were performed along with reference sequences available from GenBank (n = 290). The sequences were aligned using Clustal X and edited in the SE-AL software. Bayesian phylogenetic analyses were conducted by Markov chain Monte Carlo (MCMC) simulations run for 10 million generations in order to obtain the substitution tree using BEAST. The sequences were also analyzed for the presence of primary drug resistance mutations using CodonCode Aligner software. The phylogenetic analyses indicated that all sequences belonged to HBV subgenotype F1b, clustered into four different groups, suggesting that diverse lineages of this subgenotype may be circulating within this population of Chilean patients. J. Med. Virol. 83: 1530-1536, 2011. (C) 2011 Wiley-Liss, Inc.
Abstract:
Understanding the mating patterns of populations of tree species is a key component of ex situ genetic conservation. In this study, we analysed the genetic diversity, spatial genetic structure (SGS) and mating system at the hierarchical levels of fruits and individuals, as well as pollen dispersal patterns, in a continuous population of Theobroma cacao in Para State, Brazil. A total of 156 individuals in a 0.56 ha plot were mapped and genotyped for nine microsatellite loci. For the mating system analyses, 50 seeds were collected from nine seed trees by sampling five fruits per tree (10 seeds per fruit). Among the 156 individuals, 127 had unique multilocus genotypes, and the remaining were clones. The population was spatially aggregated; it demonstrated a significant SGS up to 15 m that could be attributed primarily to the presence of clones. However, the short seed dispersal distance also contributed to this pattern. Population matings occurred mainly via outcrossing, but selfing was observed in some seed trees, which indicated the presence of individual variation for self-incompatibility. The matings were also correlated, especially within fruits (r̂_p(m) = 0.607) rather than among them (r̂_p(m) = 0.099), which suggested that a small number of pollen donors fertilised each fruit. The paternity analysis suggested a high proportion of pollen migration (61.3%), although within the plot, most of the pollen dispersal encompassed short distances (28 m). The determination of these novel parameters provides the fundamental information required to establish long-term ex situ conservation strategies for this important tropical species. Heredity (2011) 106, 973-985; doi:10.1038/hdy.2010.145; published online 8 December 2010
Abstract:
Evolutionary novelties in the skeleton are usually expressed as changes in the timing of growth of features intrinsically integrated at different hierarchical levels of development [1]. As a consequence, most of the shape traits observed across species vary quantitatively rather than qualitatively [2], in a multivariate space [3] and in a modularized way [4,5]. Because most phylogenetic analyses normally use discrete, hypothetically independent characters [6], previous attempts have disregarded the phylogenetic signals potentially enclosed in the shape of morphological structures. When analysing low taxonomic levels, where most variation is quantitative in nature, solving basic requirements such as the choice of characters and the capacity to use continuous, integrated traits is of crucial importance in recovering wider phylogenetic information. This is particularly relevant when analysing extinct lineages, where the available data are limited to fossilized structures. Here we show that when continuous, multivariate and modularized characters are treated as such, cladistic analysis successfully resolves the relationships among the main Homo taxa. Our approach is based on a combination of cladistics, evolutionary-development-derived selection of characters, and geometric morphometric methods. In contrast with previous cladistic analyses of hominid phylogeny, our method accounts for the quantitative nature of the traits and respects their morphological integration patterns. Because complex phenotypes are observable across different taxonomic groups and are potentially informative about phylogenetic relationships, future analyses should strongly consider incorporating these types of traits.
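A core step of the geometric morphometric methods mentioned above is Procrustes superimposition: removing position, scale, and orientation so that only shape differences remain. A minimal 2-D sketch (ordinary two-configuration Procrustes, not the paper's full generalized pipeline):

```python
import math

def procrustes_align(ref, shape):
    """Ordinary Procrustes superimposition of two 2-D landmark
    configurations: translate to a common centroid, scale to unit
    centroid size, and rotate `shape` onto `ref`. Returns the
    residual (Procrustes) distance; ~0 means identical shape."""
    def center_scale(pts):
        n = len(pts)
        cx = sum(x for x, _ in pts) / n
        cy = sum(y for _, y in pts) / n
        pts = [(x - cx, y - cy) for x, y in pts]
        size = math.sqrt(sum(x * x + y * y for x, y in pts))
        return [(x / size, y / size) for x, y in pts]

    a, b = center_scale(ref), center_scale(shape)
    # Optimal rotation angle maximising the alignment of b with a
    num = sum(ya * xb - xa * yb for (xa, ya), (xb, yb) in zip(a, b))
    den = sum(xa * xb + ya * yb for (xa, ya), (xb, yb) in zip(a, b))
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    b = [(c * x - s * y, s * x + c * y) for x, y in b]
    return math.sqrt(sum((xa - xb) ** 2 + (ya - yb) ** 2
                         for (xa, ya), (xb, yb) in zip(a, b)))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
rotated = [(-y * 2, x * 2) for x, y in square]  # rotated and scaled copy
print(round(procrustes_align(square, rotated), 6))  # 0.0: same shape
```

The resulting aligned coordinates are the continuous, multivariate characters that can then enter a cladistic analysis.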
Abstract:
This work proposes and discusses an approach for inducing Bayesian classifiers aimed at balancing the tradeoff between the precise probability estimates produced by time-consuming unrestricted Bayesian networks and the computational efficiency of Naive Bayes (NB) classifiers. The proposed approach is based on the fundamental principles of heuristic-search Bayesian network learning. The Markov blanket concept, as well as a proposed "approximate Markov blanket", are used to reduce the number of nodes that form the Bayesian network to be induced from data. Consequently, the usually high computational cost of heuristic-search learning algorithms can be lessened, while Bayesian network structures better than NB can be achieved. The resulting algorithms, called DMBC (Dynamic Markov Blanket Classifier) and A-DMBC (Approximate DMBC), are empirically assessed in twelve domains that illustrate scenarios of particular interest. The obtained results are compared with NB and Tree Augmented Network (TAN) classifiers, and confirm that both proposed algorithms can provide good classification accuracies and better probability estimates than NB and TAN, while being more computationally efficient than the widely used K2 algorithm.
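The exact Markov blanket of a node is its parents, its children, and its children's other parents; everything outside it is conditionally irrelevant to the node. A minimal sketch of that definition (network and node names hypothetical, and this is the standard blanket, not the paper's "approximate" variant):

```python
def markov_blanket(node, parents):
    """Markov blanket of `node` in a Bayesian network given as a
    dict mapping each node to the set of its parents: the node's
    parents, its children, and its children's other parents."""
    children = {v for v, ps in parents.items() if node in ps}
    spouses = set()
    for child in children:
        spouses |= parents[child] - {node}
    return parents[node] | children | spouses

# Hypothetical network: A -> C <- B, C -> D
net = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}
print(sorted(markov_blanket("C", net)))  # ['A', 'B', 'D']
print(sorted(markov_blanket("A", net)))  # ['B', 'C']
```

Restricting structure search to (approximate) blankets is what lets DMBC-style learners prune nodes before the expensive heuristic search begins.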
Abstract:
It is known that patients may cease participating in a longitudinal study and become lost to follow-up. The objective of this article is to present a Bayesian model to estimate the malaria transition probabilities considering individuals lost to follow-up. We consider a homogeneous population, and it is assumed that the considered period of time is small enough to avoid two or more transitions from one state of health to another. The proposed model is based on a Gibbs sampling algorithm that uses information on losses to follow-up at the end of the longitudinal study. To simulate the unknown numbers of individuals with positive and negative malaria states at the end of the study who were lost to follow-up, two latent variables were introduced into the model. We used a real data set and a simulated data set to illustrate the application of the methodology. The proposed model showed a good fit to these data sets, and the algorithm did not show problems of convergence or lack of identifiability. We conclude that the proposed model is a good alternative for estimating the probabilities of transition from one state of health to another in studies with low adherence to follow-up.
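The data-augmentation idea behind such a Gibbs sampler can be shown in miniature. This is not the malaria transition model itself, just a toy single-proportion version of the same mechanism: a latent count of positives among those lost to follow-up, alternated with a conjugate Beta(1, 1) update for the positive-state probability:

```python
import random

def gibbs_missing_proportion(pos, neg, lost, iters=5000, seed=1):
    """Toy Gibbs sampler with data augmentation: `pos`/`neg` subjects
    with observed final status, `lost` subjects lost to follow-up.
    A latent variable z counts the unobserved positives among the
    lost; p gets a Beta(1, 1) prior. Returns the posterior mean of p
    from the second half of the chain."""
    rng = random.Random(seed)
    p, draws = 0.5, []
    for _ in range(iters):
        # Sample the latent number of positives among those lost
        z = sum(rng.random() < p for _ in range(lost))
        # Conjugate Beta update for p given the augmented data
        p = rng.betavariate(1 + pos + z, 1 + neg + lost - z)
        draws.append(p)
    return sum(draws[iters // 2:]) / (iters - iters // 2)

est = gibbs_missing_proportion(pos=30, neg=70, lost=20)
print(round(est, 2))  # close to 0.3, the observed positive rate
```

The article's model does the same kind of alternation, but with two latent counts and transition probabilities rather than a single proportion.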
Abstract:
Point placement strategies aim at mapping data points represented in higher dimensions to bi-dimensional spaces and are frequently used to visualize relationships amongst data instances. They have been valuable tools for the analysis and exploration of data sets of various kinds. Many conventional techniques, however, do not behave well when the number of dimensions is high, as in the case of document collections. Later approaches handle that shortcoming, but may cause too much clutter to allow flexible exploration to take place. In this work we present a novel hierarchical point placement technique that is capable of dealing with these problems. While good grouping and separation of data with high similarity is maintained without increasing the computational cost, its hierarchical structure lends itself both to exploration at various levels of detail and to handling data in subsets, improving analysis capability and also allowing the manipulation of larger data sets.
Abstract:
In this work we introduce a new hierarchical surface decomposition method for multiscale analysis of surface meshes. In contrast to other multiresolution methods, our approach relies on spectral properties of the surface to build a binary hierarchical decomposition. Namely, we utilize the first nontrivial eigenfunction of the Laplace-Beltrami operator to recursively decompose the surface. For this reason we coin our surface decomposition the Fiedler tree. Using the Fiedler tree ensures a number of attractive properties, including: mesh-independent decomposition, well-formed and nearly equi-areal surface patches, and noise robustness. We show how the evenly distributed patches can be exploited for generating multiresolution high quality uniform meshes. Additionally, our decomposition permits a natural means for carrying out wavelet methods, resulting in an intuitive method for producing feature-sensitive meshes at multiple scales. Published by Elsevier Ltd.
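The first nontrivial Laplacian eigenfunction used above is, in the graph setting, the Fiedler vector; its sign change gives the balanced bisection that drives the recursive decomposition. A minimal graph-based sketch (the paper works with the Laplace-Beltrami operator on meshes; this toy uses a plain graph Laplacian and power iteration):

```python
def fiedler_vector(adj, iters=2000):
    """Approximate the Fiedler vector (eigenvector of the second-
    smallest Laplacian eigenvalue) of a small graph by power
    iteration on c*I - L, keeping iterates orthogonal to the
    constant vector. `adj` maps node index -> set of neighbours."""
    n = len(adj)
    deg = [len(adj[i]) for i in range(n)]
    c = 2 * max(deg) + 1.0  # shift so c*I - L is positive definite
    v = [float(i) for i in range(n)]  # deterministic start vector
    for _ in range(iters):
        # w = (c*I - L) v, where (L v)_i = deg_i*v_i - sum over neighbours
        w = [c * v[i] - (deg[i] * v[i] - sum(v[j] for j in adj[i]))
             for i in range(n)]
        mean = sum(w) / n            # project out the constant vector
        w = [x - mean for x in w]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Path graph 0-1-2-3: the Fiedler vector changes sign at the midpoint,
# splitting the path into two equal halves
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
v = fiedler_vector(path)
print([x < 0 for x in v])  # one half negative, the other positive
```

Recursively re-running this on each sign-defined half is exactly the binary decomposition pattern the Fiedler tree formalizes on surfaces.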
Abstract:
A continuous version of the hierarchical spherical model at dimension d = 4 is investigated. Two limit distributions of the block spin variable X_γ, normalized with exponents γ = d + 2 and γ = d at and above the critical temperature, are established. These results are proven by solving certain evolution equations corresponding to the renormalization group (RG) transformation of the O(N) hierarchical spin model of block size L^d in the limit L ↓ 1 and N → ∞. Starting far away from the stationary Gaussian fixed point, the trajectories of this dynamical system pass through two different regimes with distinguishable crossover behavior. An interpretation of these trajectories is given by the geometric theory of functions, which describes precisely the motion of the Lee-Yang zeroes. The large-N limit of the RG transformation with L^d fixed equal to 2, at criticality, has recently been investigated in both the weak and strong coupling regimes by Watanabe (J. Stat. Phys. 115:1669-1713, 2004). Although our analysis deals only with the N = ∞ case, it complements various aspects of that work.
Abstract:
Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is interest in studying latent variables. These latent variables are directly considered in Item Response Models (IRM) and are usually called latent traits. A usual assumption for parameter estimation of the IRM, considering one group of examinees, is that the latent traits are random variables which follow a standard normal distribution. However, many works suggest that this assumption does not hold in many cases. Furthermore, when this assumption is violated, the parameter estimates tend to be biased and misleading inferences can be obtained. Therefore, it is important to model the distribution of the latent traits properly. In this paper we present an alternative latent trait modeling based on the so-called skew-normal distribution; see Genton (2004). We used the centred parameterization, which was proposed by Azzalini (1985). This approach ensures model identifiability, as pointed out by Azevedo et al. (2009b). Also, a Metropolis-Hastings within Gibbs sampling (MHWGS) algorithm was built for parameter estimation by using an augmented data approach. A simulation study was performed in order to assess parameter recovery under the proposed model and estimation method, and the effect of the asymmetry level of the latent trait distribution on the parameter estimation. A comparison of our approach with other estimation methods (which assume symmetric normality for the latent trait distribution) was also considered. The results indicated that our proposed algorithm properly recovers all parameters. Specifically, the greater the asymmetry level, the better the performance of our approach compared with the others, mainly in the presence of small sample sizes (numbers of examinees).
Furthermore, we analyzed a real data set which presents indications of asymmetry in the latent trait distribution. The results obtained using our approach confirmed the presence of strong negative asymmetry in the latent trait distribution. (C) 2010 Elsevier B.V. All rights reserved.
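The skew-normal family underlying the approach above has the standard density f(x) = 2·φ(x)·Φ(αx), where φ and Φ are the standard normal density and CDF and α controls the asymmetry (α = 0 recovers N(0, 1)). A quick numerical check, illustrative only (the paper works with the centred parameterization, which rescales this direct form):

```python
import math

def skew_normal_pdf(x, alpha):
    """Density of the standard skew-normal distribution,
    f(x) = 2 * phi(x) * Phi(alpha * x); alpha = 0 gives N(0, 1)."""
    phi = math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
    Phi = 0.5 * (1 + math.erf(alpha * x / math.sqrt(2)))
    return 2 * phi * Phi

# The density integrates to 1 for any skewness parameter alpha
alpha, h = 3.0, 0.01
total = sum(skew_normal_pdf(-8 + i * h, alpha) * h for i in range(1600))
print(round(total, 2))  # 1.0
```

Negative α yields the left-skewed (negatively asymmetric) shapes found for the latent traits in the real data set.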
Abstract:
In this article, we introduce a semi-parametric Bayesian approach based on Dirichlet process priors for the discrete calibration problem in binomial regression models. An interesting topic is the dosimetry problem related to the dose-response model. A hierarchical formulation is provided so that a Markov chain Monte Carlo approach is developed. The methodology is applied to simulated and real data.
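Dirichlet process priors like the one above are often represented by Sethuraman's stick-breaking construction, which the hierarchical MCMC formulation can exploit. A minimal truncated sketch (concentration value and truncation level chosen arbitrarily for illustration):

```python
import random

def stick_breaking(concentration, n_sticks, seed=7):
    """Stick-breaking construction of Dirichlet process weights:
    w_k = v_k * prod_{j<k} (1 - v_j) with v_k ~ Beta(1, concentration),
    truncated at `n_sticks` components."""
    rng = random.Random(seed)
    weights, remaining = [], 1.0
    for _ in range(n_sticks):
        v = rng.betavariate(1, concentration)
        weights.append(v * remaining)   # break off a piece of the stick
        remaining *= 1 - v              # what is left to break later
    return weights

w = stick_breaking(concentration=2.0, n_sticks=200)
print(round(sum(w), 3))  # ≈ 1.0: the truncation captures almost all mass
```

Smaller concentration values put more mass on the first few sticks, i.e. fewer effective mixture components.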
Abstract:
We present a Bayesian approach for modeling heterogeneous data and estimating multimodal densities using mixtures of Skew Student-t-Normal distributions [Gomez, H.W., Venegas, O., Bolfarine, H., 2007. Skew-symmetric distributions generated by the distribution function of the normal distribution. Environmetrics 18, 395-407]. A stochastic representation that is useful for implementing an MCMC-type algorithm, as well as results on the existence of posterior moments, are obtained. Marginal likelihood approximations are obtained in order to compare mixture models with different numbers of component densities. Data sets concerning Gross Domestic Product per capita (Human Development Report) and body mass index (National Health and Nutrition Examination Survey), previously studied in the related literature, are analyzed. (c) 2008 Elsevier B.V. All rights reserved.
Abstract:
In this paper, we present a Bayesian approach for estimation in the skew-normal calibration model, as well as the conditional posterior distributions which are useful for implementing the Gibbs sampler. Data transformation is thus avoided by using the proposed methodology. Model fitting is assessed by proposing the asymmetric deviance information criterion, ADIC, a modification of the ordinary DIC. We also report an application of the model studied by using a real data set, related to the relationship between the resistance and the elasticity of a sample of concrete beams. Copyright (C) 2008 John Wiley & Sons, Ltd.