84 results for hierarchical Bayesian analysis
Abstract:
Patients with chronic pancreatitis may have abnormal gastrointestinal transit, but the factors underlying these abnormalities are poorly understood. Gastrointestinal transit was assessed, in 40 male outpatients with alcohol-related chronic pancreatitis and 18 controls, by scintigraphy after a liquid meal labeled with (99m)technetium-phytate. Blood and urinary glucose, fecal fat excretion, nutritional status, and cardiovascular autonomic function were determined in all patients. The influence of diabetes mellitus, malabsorption, malnutrition, and autonomic neuropathy on abnormal gastrointestinal transit was assessed by univariate analysis and Bayesian multiple regression analysis. Accelerated gastrointestinal transit was found in 11 patients who showed abnormally rapid arrival of the meal marker to the cecum. Univariate and Bayesian analysis showed that diabetes mellitus and autonomic neuropathy had significant influences on rapid transit, which was not associated with either malabsorption or malnutrition. In conclusion, rapid gastrointestinal transit in patients with alcohol-related chronic pancreatitis is related to diabetes mellitus and autonomic neuropathy.
Abstract:
In this article, we generalize the Bayesian methodology introduced by Cepeda and Gamerman (2001) for modeling variance heterogeneity in normal regression models, which assumes orthogonality between the mean and variance parameters, to the general case covering both linear and highly nonlinear regression models. Under the Bayesian paradigm, we use MCMC methods to simulate samples from the joint posterior distribution. We illustrate the algorithm with a simulated data set and with a real data set on school attendance rates for children in Colombia. Finally, we present some extensions of the proposed MCMC algorithm.
Abstract:
In interval-censored survival data, the event of interest is not observed exactly but is only known to occur within some time interval. Such data appear very frequently. In this paper, we are concerned only with parametric forms, and so a location-scale regression model based on the exponentiated Weibull distribution is proposed for modeling interval-censored data. We show that the proposed log-exponentiated Weibull regression model for interval-censored data represents a parametric family of models that includes other regression models that are broadly used in lifetime data analysis. Assuming the use of interval-censored data, we employ a frequentist analysis, a jackknife estimator, a parametric bootstrap and a Bayesian analysis for the parameters of the proposed model. We derive the appropriate matrices for assessing local influences on the parameter estimates under different perturbation schemes and present some ways to assess global influences. Furthermore, for different parameter settings, sample sizes and censoring percentages, various simulations are performed; in addition, the empirical distribution of some modified residuals is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended to a modified deviance residual in log-exponentiated Weibull regression models for interval-censored data. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
In survival analysis applications, the failure rate function may frequently present a unimodal shape. In such cases, the log-normal or log-logistic distributions are used. In this paper, we shall be concerned only with parametric forms, so a location-scale regression model based on the Burr XII distribution is proposed for modeling data with a unimodal failure rate function as an alternative to the log-logistic regression model. Assuming censored data, we consider a classic analysis, a Bayesian analysis and a jackknife estimator for the parameters of the proposed model. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and compared to the performance of the log-logistic and log-Burr XII regression models. In addition, we use sensitivity analysis to detect influential or outlying observations, and residual analysis is used to check the assumptions in the model. Finally, we analyze a real data set under log-Burr XII regression models. (C) 2008 Published by Elsevier B.V.
Abstract:
This article presents important properties of standard discrete distributions and their conjugate densities. The Bernoulli and Poisson processes are described as generators of such discrete models. A characterization of distributions by mixtures is also introduced. This article adopts a novel singular notation and representation. Singular representations are unusual in statistical texts. Nevertheless, the singular notation makes it simpler to extend and generalize theoretical results and greatly facilitates numerical and computational implementation.
Abstract:
Coconut water is a natural isotonic, nutritive, and low-calorie drink. A preservation process is necessary to increase its shelf life outside the fruit and to improve commercialization. However, the influence of the conservation processes, antioxidant addition, maturation time, and the soil where the coconut is cultivated on the chemical composition of coconut water has received little study. For these reasons, an evaluation of coconut waters (unprocessed and processed) was carried out using Ca, Cu, Fe, K, Mg, Mn, Na, Zn, chloride, sulfate, phosphate, malate, and ascorbate concentrations and chemometric tools. The quantitative determinations were performed by electrothermal atomic absorption spectrometry, inductively coupled plasma optical emission spectrometry, and capillary electrophoresis. The results showed that Ca, K, and Zn concentrations did not differ significantly between the samples. The ranges of Cu, Fe, Mg, Mn, PO₄³⁻, and SO₄²⁻ concentrations were as follows: Cu (3.1-120 µg L⁻¹), Fe (60-330 µg L⁻¹), Mg (48-123 mg L⁻¹), Mn (0.4-4.0 mg L⁻¹), PO₄³⁻ (55-212 mg L⁻¹), and SO₄²⁻ (19-136 mg L⁻¹). Principal component analysis (PCA) and hierarchical cluster analysis (HCA) were applied to differentiate unprocessed and processed samples. The multivariate analyses (PCA and HCA) were compared through one-way analysis of variance with the Tukey-Kramer multiple comparisons test, and p values less than 0.05 were considered to be significant.
Abstract:
This work shows the application of the analytic hierarchy process (AHP) to full cost accounting (FCA) within the integrated resource planning (IRP) process. For this purpose, a pioneering case study was developed and different energy solutions of supply and demand for a metropolitan airport (Congonhas) were considered [Moreira, E.M., 2005. Modelamento energetico para o desenvolvimento limpo de aeroporto metropolitano baseado na filosofia do PIR-O caso da metropole de Sao Paulo. Dissertacao de mestrado, GEPEA/USP]. These solutions were compared and analyzed using the software solution "Decision Lens", which implements the AHP. The final part of this work presents a classification of resources that can be considered to be the initial target as energy resources, thus facilitating the restraints of the IRP of the airport and setting parameters aiming at sustainable development. (C) 2007 Elsevier Ltd. All rights reserved.
Abstract:
GB virus C/hepatitis G (GBV-C) is an RNA virus of the family Flaviviridae. Despite replicating with an RNA-dependent RNA polymerase, some previous estimates of rates of evolutionary change in GBV-C suggest that it fixes mutations at the anomalously low rate of ~10⁻⁷ nucleotide substitutions per site per year. However, these estimates were largely based on the assumption that GBV-C and its close relative GBV-A (New World monkey GB viruses) codiverged with their primate hosts over millions of years. Herein, we estimated the substitution rate of GBV-C using the largest set of dated GBV-C isolates compiled to date and a Bayesian coalescent approach that utilizes the year of sampling and so is independent of the assumption of codivergence. This revealed a rate of evolutionary change approximately four orders of magnitude higher than that estimated previously, in the range of 10⁻² to 10⁻³ sub/site/year, and hence in line with those previously determined for RNA viruses in general and the Flaviviridae in particular. In addition, we tested the assumption of host-virus codivergence in GBV-A by performing a reconciliation analysis of host and virus phylogenies. Strikingly, we found no statistical evidence for host-virus codivergence in GBV-A, indicating that substitution rates in the GB viruses should not be estimated from host divergence times.
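As a rough check on the "four orders of magnitude" statement, and reading the earlier estimate as about $10^{-7}$ substitutions per site per year (an assumed reading of the figure quoted above), the ratio of the new range to the old rate works out as:

```latex
\[
\frac{10^{-3}}{10^{-7}} = 10^{4},
\qquad
\frac{10^{-2}}{10^{-7}} = 10^{5}
\]
```

Under that reading, the lower end of the new range is four orders of magnitude above the earlier estimate and the upper end is five.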
Abstract:
In this paper, we compare the performance of two statistical approaches for the analysis of data obtained from the social research area. In the first approach, we use normal models with joint regression modelling for the mean and for the variance heterogeneity. In the second approach, we use hierarchical models. In the first case, individual and social variables are included as explanatory variables in the regression modelling for the mean and for the variance, while in the second case, the variance at level 1 of the hierarchical model depends on the individuals (age of the individuals), and at level 2 of the hierarchical model, the variance is assumed to change according to socioeconomic stratum. Applying these methodologies, we analyze a Colombian height data set to find differences that can be explained by socioeconomic conditions. We also present some theoretical and empirical results concerning the two models. From this comparative study, we conclude that it is better to jointly model the mean and variance heterogeneity in all cases. We also observe that the Gibbs sampling chain used in the Markov Chain Monte Carlo method for jointly modeling the mean and variance heterogeneity converges quickly.
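To make the idea of joint regression modelling for the mean and the variance concrete, here is a minimal sketch of the log-likelihood for one common formulation. It assumes a linear predictor for the mean and a log link for the variance; the log link, the function name `joint_log_likelihood`, and the parameter names `beta` and `gamma` are illustrative assumptions, not details given in the abstract.

```python
import math

def joint_log_likelihood(y, X, Z, beta, gamma):
    """Normal log-likelihood with regression on both mean and variance.

    Assumed model (illustrative only):
        y_i ~ Normal(mean_i, var_i)
        mean_i      = x_i . beta      (mean regression)
        log(var_i)  = z_i . gamma     (variance regression, log link)
    """
    total = 0.0
    for yi, xi, zi in zip(y, X, Z):
        mean = sum(b * x for b, x in zip(beta, xi))
        log_var = sum(g * z for g, z in zip(gamma, zi))
        # Log-density of a normal distribution with the given mean and variance.
        total += -0.5 * (
            math.log(2 * math.pi) + log_var + (yi - mean) ** 2 / math.exp(log_var)
        )
    return total
```

A sampler such as the Gibbs or Metropolis-Hastings schemes mentioned in these abstracts would combine a likelihood of this kind with priors on `beta` and `gamma`; that step is not shown here.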
Abstract:
This work presents a Bayesian semiparametric approach for dealing with regression models where the covariate is measured with error. Given that (1) the error normality assumption is very restrictive, and (2) assuming a specific elliptical distribution for errors (Student-t, for example) may be somewhat presumptuous, there is a need for more flexible methods that assume only symmetry of errors (admitting unknown kurtosis). In this sense, the main advantage of this extended Bayesian approach is the possibility of considering generalizations of the elliptical family of models by using Dirichlet process priors in dependent and independent situations. Conditional posterior distributions are implemented, allowing the use of Markov Chain Monte Carlo (MCMC) to generate the posterior distributions. An interesting result shown is that the Dirichlet process prior is not updated in the case of the dependent elliptical model. Furthermore, an analysis of a real data set is reported to illustrate the usefulness of our approach in dealing with outliers. Finally, the proposed semiparametric models and the parametric normal model are compared graphically using the posterior densities of the coefficients. (C) 2009 Elsevier Inc. All rights reserved.
Abstract:
Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests if in fact it should be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs no model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.
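The sequential procedure described above can be sketched as a short loop. The sketch covers only the control flow: the significance test itself (the full Bayesian significance test in the abstract) is represented by a hypothetical callback `reject_reduction`, and the `max_components` cap is an added safeguard rather than part of the described method.

```python
from typing import Callable

def select_num_components(
    reject_reduction: Callable[[int], bool],
    max_components: int = 20,
) -> int:
    """Estimate the number of mixture components by sequential testing.

    reject_reduction(m) is a hypothetical callback that returns True when the
    test rejects the hypothesis that an m-component mixture should in fact
    have m - 1 components.
    """
    m = 2  # the first step considers a two-component mixture
    while m <= max_components:
        if not reject_reduction(m):
            # Hypothesis "m - 1 components suffice" is accepted: stop here.
            return m - 1
        m += 1  # hypothesis rejected: try a larger mixture
    # Safeguard (assumption): every test up to the cap rejected, so return the cap.
    return max_components
```

For instance, with a stand-in test that rejects for every `m` up to 4, `select_num_components(lambda m: m <= 4)` stops at `m = 5` and reports 4 components.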
Abstract:
Background: The post-genomic era has brought new challenges regarding the understanding of the organization and function of the human genome. Many of these challenges are centered on the meaning of differential gene regulation under distinct biological conditions and can be addressed by analyzing the Multiple Differential Expression (MDE) of genes associated with normal and abnormal biological processes. Currently, MDE analyses are limited to the usual methods of differential expression initially designed for paired analysis. Results: We proposed a web platform named ProbFAST for MDE analysis which uses Bayesian inference to identify key genes that are intuitively prioritized by means of probabilities. A simulation study revealed that our method performs better than other approaches, and when it was applied to public expression data, we demonstrated its flexibility to obtain relevant genes biologically associated with normal and abnormal biological processes. Conclusions: ProbFAST is a freely accessible web-based application that enables MDE analysis on a global scale. It offers an efficient methodological approach for MDE analysis of a set of genes that are turned on and off related to functional information during the evolution of a tumor or tissue differentiation. ProbFAST server can be accessed at http://gdm.fmrp.usp.br/probfast.
Abstract:
Online music databases have increased significantly as a consequence of the rapid growth of the Internet and digital audio, requiring the development of faster and more efficient tools for music content analysis. Musical genres are widely used to organize music collections. In this paper, the problem of automatic single and multi-label music genre classification is addressed by exploring rhythm-based features obtained from a respective complex network representation. A Markov model is built in order to analyse the temporal sequence of rhythmic notation events. Feature analysis is performed by using two multi-variate statistical approaches: principal components analysis (unsupervised) and linear discriminant analysis (supervised). Similarly, two classifiers are applied in order to identify the category of rhythms: parametric Bayesian classifier under the Gaussian hypothesis (supervised) and agglomerative hierarchical clustering (unsupervised). Qualitative results obtained by using the kappa coefficient and the obtained clusters corroborated the effectiveness of the proposed method.
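As an illustration of a Markov model over a sequence of rhythmic events, the following sketch estimates first-order transition probabilities from a list of event labels. The first-order assumption, the function name `transition_probabilities`, and the example labels are all hypothetical; the abstract does not specify the model order or the event encoding.

```python
from collections import Counter, defaultdict

def transition_probabilities(events):
    """Estimate first-order Markov transition probabilities from a sequence.

    Returns a nested dict: probs[current][next] = P(next | current),
    estimated from the counts of consecutive pairs in `events`.
    """
    counts = defaultdict(Counter)
    for current, nxt in zip(events, events[1:]):
        counts[current][nxt] += 1

    probs = {}
    for state, followers in counts.items():
        total = sum(followers.values())
        probs[state] = {nxt: n / total for nxt, n in followers.items()}
    return probs

# Hypothetical rhythm encoded as note durations.
rhythm = ["quarter", "eighth", "eighth", "quarter", "eighth", "quarter"]
probs = transition_probabilities(rhythm)
```

The resulting table can also be read as a weighted directed graph whose nodes are event types, which is one plausible way such a model connects to a network representation.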
Abstract:
Chagas disease is still a major public health problem in Latin America. Its causative agent, Trypanosoma cruzi, can be typed into three major groups, T. cruzi I, T. cruzi II and hybrids. These groups each have specific genetic characteristics and epidemiological distributions. Several highly virulent strains are found in the hybrid group; their origin is still a matter of debate. The null hypothesis is that the hybrids are of polyphyletic origin, evolving independently from various hybridization events. The alternative hypothesis is that all extant hybrid strains originated from a single hybridization event. We sequenced both alleles of genes encoding EF-1 alpha, actin and SSU rDNA of 26 T. cruzi strains and DHFR-TS and TR of 12 strains. This information was used for network genealogy analysis and Bayesian phylogenies. We found T. cruzi I and T. cruzi II to be monophyletic and that all hybrids had different combinations of T. cruzi I and T. cruzi II haplotypes plus hybrid-specific haplotypes. Bootstrap values (networks) and posterior probabilities (Bayesian phylogenies) of clades supporting the monophyly of hybrids were far below the 95% confidence interval, indicating that the hybrid group is polyphyletic. We hypothesize that T. cruzi I and T. cruzi II are two different species and that the hybrids are extant representatives of independent events of genome hybridization, which sporadically have sufficient fitness to impact on the epidemiology of Chagas disease.
Abstract:
A simultaneous optimization strategy based on a neuro-genetic approach is proposed for selecting laser induced breakdown spectroscopy operational conditions for the simultaneous determination of macronutrients (Ca, Mg and P), micronutrients (B, Cu, Fe, Mn and Zn), Al and Si in plant samples. A laser induced breakdown spectroscopy system equipped with a 10 Hz Q-switched Nd:YAG laser (12 ns, 532 nm, 140 mJ) and an Echelle spectrometer with an intensified charge-coupled device was used. Integration time gate, delay time, amplification gain and number of pulses were optimized. Pellets of spinach leaves (NIST 1570a) were employed as laboratory samples. In order to find a model that could correlate laser induced breakdown spectroscopy operational conditions with a compromise of high peak areas for all elements simultaneously, a Bayesian Regularized Artificial Neural Network approach was employed. Subsequently, a genetic algorithm was applied to find optimal conditions for the neural network model, in an approach called neuro-genetic. A single laser induced breakdown spectroscopy working condition that maximizes the peak areas of all elements simultaneously was obtained with the following optimized parameters: 9.0 µs integration time gate, 1.1 µs delay time, 225 (a.u.) amplification gain and 30 accumulated laser pulses. The proposed approach is a useful and suitable tool for optimizing such a complex analytical problem. (C) 2009 Elsevier B.V. All rights reserved.