97 results for Geospatial Data Model
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
In this paper, we present an algorithm for cluster analysis that integrates aspects of cluster ensembles and multi-objective clustering. The algorithm is based on a Pareto-based multi-objective genetic algorithm with a special crossover operator, which uses clustering validation measures as objective functions. The proposed algorithm can deal with data sets presenting different types of clusters, without requiring expertise in cluster analysis. Its result is a concise set of partitions representing alternative trade-offs among the objective functions. We compare the results obtained with our algorithm, in the context of gene expression data sets, to those achieved with Multi-Objective Clustering with automatic K-determination (MOCK), the algorithm most closely related to ours.
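To make the multi-objective selection step concrete, the sketch below scores a pool of candidate partitions with two generic validity indices and keeps only the Pareto-optimal trade-offs. It is only an illustration under assumptions: scikit-learn k-means runs stand in for the paper's genetic operators, and silhouette/Davies-Bouldin stand in for its validation measures.

```python
# Minimal sketch (not the authors' algorithm): score candidate partitions with
# two clustering validation measures and keep the Pareto-optimal trade-offs.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score, davies_bouldin_score

X, _ = make_blobs(n_samples=300, centers=4, random_state=0)

# Candidate partitions (here: k-means with different k; a GA would evolve these).
candidates = [KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
              for k in range(2, 9)]

# Two objectives: maximize silhouette, minimize Davies-Bouldin.
scores = [(silhouette_score(X, lab), davies_bouldin_score(X, lab))
          for lab in candidates]

def dominated(a, b):
    """True if the partition scored b dominates the partition scored a."""
    return (b[0] >= a[0] and b[1] <= a[1]) and (b[0] > a[0] or b[1] < a[1])

pareto = [i for i, s in enumerate(scores)
          if not any(dominated(s, t) for j, t in enumerate(scores) if j != i)]
for i in pareto:
    print(f"k={len(set(candidates[i]))}: silhouette={scores[i][0]:.3f}, "
          f"DB={scores[i][1]:.3f}")
```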
Abstract:
This work aims to compare different nonlinear functions for describing the growth curves of Nelore females. The growth curve parameters, their (co)variance components, and environmental and genetic effects were estimated jointly through a Bayesian hierarchical model. In the first stage of the hierarchy, 4 nonlinear functions were compared: Brody, Von Bertalanffy, Gompertz, and logistic. The analyses were carried out using 3 different data sets to check goodness of fit in the presence of animals with few records. Three different assumptions about the SD of the fitting errors were considered: constancy throughout the trajectory, linear increase until 3 yr of age and constancy thereafter, and variation following the nonlinear function applied in the first stage of the hierarchy. Comparisons of overall goodness of fit were based on the Akaike information criterion, the Bayesian information criterion, and the deviance information criterion. Goodness of fit at different points of the growth curve was compared by applying Gelfand's check function. The posterior means of adult BW ranged from 531.78 to 586.89 kg. Greater estimates of adult BW were observed when the fitting error variance was considered constant along the trajectory. The models were not suitable for describing the SD of the fitting errors at the beginning of the growth curve. All functions provided less accurate predictions at the beginning of growth, and predictions were more accurate after 48 mo of age. The prediction of adult BW using nonlinear functions can be accurate when growth curve parameters and their (co)variance components are estimated jointly. The hierarchical model used in the present study can be applied to the prediction of mature BW in herds in which a portion of the animals are culled before adult age. The Gompertz, Von Bertalanffy, and Brody functions were adequate to establish mean growth patterns and to predict the adult BW of Nelore females. The Brody model was more accurate in predicting the birth weight of these animals and presented the best overall goodness of fit.
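For reference, the four growth functions named above can be written in one common parameterization (A = adult weight, b = integration constant, k = maturity rate; the paper's exact parameterization may differ). The sketch below fits them by simple least squares to synthetic weight-age data; it only illustrates the curve shapes and stands in for the paper's joint Bayesian estimation.

```python
# Common forms of the Brody, Von Bertalanffy, Gompertz, and logistic curves,
# fitted to synthetic data (not the Nelore records used in the paper).
import numpy as np
from scipy.optimize import curve_fit

def brody(t, A, b, k):
    return A * (1 - b * np.exp(-k * t))

def von_bertalanffy(t, A, b, k):
    return A * (1 - b * np.exp(-k * t)) ** 3

def gompertz(t, A, b, k):
    return A * np.exp(-b * np.exp(-k * t))

def logistic(t, A, b, k):
    return A / (1 + b * np.exp(-k * t))

age = np.linspace(0, 72, 25)                     # months
rng = np.random.default_rng(0)
weight = gompertz(age, 550, 2.5, 0.06) + rng.normal(0, 15, age.size)  # kg

for name, f, p0 in [("Brody", brody, [550, 0.9, 0.05]),
                    ("Von Bertalanffy", von_bertalanffy, [550, 0.6, 0.05]),
                    ("Gompertz", gompertz, [550, 2.5, 0.05]),
                    ("Logistic", logistic, [550, 8.0, 0.08])]:
    params, _ = curve_fit(f, age, weight, p0=p0, maxfev=10000)
    print(f"{name}: adult BW estimate = {params[0]:.1f} kg")
```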
Abstract:
Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes for which the variability can be explained by factors and/or covariates. When such factors operate, the usual normal regression models, which inherently assume constant variance, under-represent the variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect, and the use of binomial and Poisson models underestimates the variability and, consequently, incorrectly indicates significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The joint posterior density function was sampled using Markov chain Monte Carlo algorithms, allowing inferences on the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible, even when limited prior information is available, thereby generating valuable insight for the researcher about the experimental results.
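As a rough illustration of the kind of model described, and not the paper's exact DGLM, the sketch below fits a binomial logit model with an observation-level random effect that absorbs the overdispersion, using PyMC and synthetic data in place of the apple tissue-culture proportions.

```python
# Sketch of Bayesian overdispersed proportion data: binomial logit model with
# an observation-level normal random effect (synthetic data, generic names).
import numpy as np
import pymc as pm

rng = np.random.default_rng(1)
n_obs = 40
n_trials = np.full(n_obs, 30)                          # trials per unit
x = rng.normal(size=n_obs)                             # a single noise covariate
true_eta = -0.3 + 0.8 * x + rng.normal(0, 0.7, n_obs)  # extra-binomial variation
y = rng.binomial(n_trials, 1 / (1 + np.exp(-true_eta)))

with pm.Model() as dglm_sketch:
    beta0 = pm.Normal("beta0", 0.0, 10.0)
    beta1 = pm.Normal("beta1", 0.0, 10.0)
    sigma = pm.HalfNormal("sigma", 2.0)                # dispersion scale
    u = pm.Normal("u", 0.0, sigma, shape=n_obs)        # random effect per unit
    p = pm.Deterministic("p", pm.math.invlogit(beta0 + beta1 * x + u))
    pm.Binomial("y", n=n_trials, p=p, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=1)

print(idata.posterior["sigma"].mean())                 # posterior mean dispersion
```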
Abstract:
Historically, the cure rate model has been used for modeling time-to-event data within which a significant proportion of patients are assumed to be cured of illnesses, including breast cancer, non-Hodgkin lymphoma, leukemia, prostate cancer, melanoma, and head and neck cancer. Perhaps the most popular type of cure rate model is the mixture model introduced by Berkson and Gage [1]. In this model, it is assumed that a certain proportion of the patients are cured, in the sense that they do not present the event of interest during a long period of time and can found to be immune to the cause of failure under study. In this paper, we propose a general hazard model which accommodates comprehensive families of cure rate models as particular cases, including the model proposed by Berkson and Gage. The maximum-likelihood-estimation procedure is discussed. A simulation study analyzes the coverage probabilities of the asymptotic confidence intervals for the parameters. A real data set on children exposed to HIV by vertical transmission illustrates the methodology.
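For context, the Berkson and Gage mixture formulation mentioned above writes the population survival function as a mixture of cured and susceptible patients (generic notation, not necessarily the paper's):

```latex
S_{\mathrm{pop}}(t) = \pi + (1 - \pi)\, S_u(t), \qquad
\lim_{t \to \infty} S_{\mathrm{pop}}(t) = \pi ,
```

where \pi is the cured proportion and S_u(t) is the survival function of the susceptible (non-cured) patients.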
Abstract:
The deterpenation of bergamot essential oil can be performed by liquid-liquid extraction using hydrous ethanol as the solvent. A ternary mixture composed of 1-methyl-4-prop-1-en-2-yl-cyclohexene (limonene), 3,7-dimethylocta-1,6-dien-3-yl acetate (linalyl acetate), and 3,7-dimethylocta-1,6-dien-3-ol (linalool), three major compounds commonly found in bergamot oil, was used to simulate this essential oil. Liquid-liquid equilibrium data were experimentally determined for systems containing the essential oil compounds, ethanol, and water at 298.2 K and are reported in this paper. The experimental data were correlated using the NRTL and UNIQUAC models, and the mean deviations between calculated and experimental data were lower than 0.0062 in all systems, indicating the good descriptive quality of the molecular models. To verify the effect of the water mass fraction in the solvent and the linalool mass fraction in the terpene phase on the distribution coefficients of the essential oil compounds, nonlinear regression analyses were performed, yielding mathematical models with correlation coefficient values higher than 0.99. The results show that as the water content in the solvent phase increased, the distribution coefficient (k) decreased, regardless of the type of compound studied. Conversely, as the linalool content increased, the distribution coefficients of the hydrocarbon terpene and the ester also increased. However, the linalool distribution coefficient values were negatively affected when the terpene alcohol content increased in the terpene phase.
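To illustrate the quantities discussed above, the snippet below computes distribution coefficients and the resulting solvent selectivity from phase mass fractions. The numerical values are made up for illustration, not the paper's equilibrium data.

```python
# Distribution coefficient k_i = w_i(solvent phase) / w_i(terpene phase) and
# solvent selectivity, the usual yardsticks for deterpenation performance.
def distribution_coefficient(w_solvent_phase: float, w_terpene_phase: float) -> float:
    return w_solvent_phase / w_terpene_phase

def selectivity(k_solute: float, k_diluent: float) -> float:
    """Selectivity of the solvent for the solute (e.g. linalool) over the
    diluent terpene (e.g. limonene)."""
    return k_solute / k_diluent

k_linalool = distribution_coefficient(0.045, 0.060)   # hypothetical mass fractions
k_limonene = distribution_coefficient(0.010, 0.850)
print(f"k_linalool = {k_linalool:.3f}, k_limonene = {k_limonene:.3f}, "
      f"selectivity = {selectivity(k_linalool, k_limonene):.1f}")
```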
Abstract:
Phylogenetic analyses of chloroplast DNA sequences, morphology, and combined data have provided consistent support for many of the major branches within the angiosperm clade Dipsacales. Here we use sequences from three mitochondrial loci to test the existing broad-scale phylogeny and to attempt to resolve several relationships that have remained uncertain. Parsimony, maximum likelihood, and Bayesian analyses of a combined mitochondrial data set recover trees broadly consistent with previous studies, although resolution and support are lower than in the largest chloroplast analyses. Combining chloroplast and mitochondrial data results in a generally well-resolved and very strongly supported topology, but the previously recognized problem areas remain. To investigate why these relationships have been difficult to resolve, we conducted a series of experiments using different data partitions and heterogeneous substitution models. More complex modeling schemes are usually favored regardless of the partitions recognized, but model choice had little effect on topology or support values. In contrast, there are consistent but weakly supported differences between the topologies recovered from the coding and non-coding matrices. These conflicts correspond directly to the relationships that were poorly resolved in analyses of the full combined chloroplast-mitochondrial data set. We suggest that incongruent signal has contributed to our inability to confidently resolve these problem areas.
Abstract:
In this paper we introduce a parametric model for handling lifetime data in which an early lifetime can be related to infant-mortality failure or to wear processes, but we do not know which risk is responsible for the failure. The maximum likelihood approach and the sampling-based approach are used to obtain the inferences of interest. Some special cases of the proposed model are studied via Monte Carlo methods for the size and power of hypothesis tests. To illustrate the proposed methodology, we present an example based on a real data set.
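One way to picture the situation described, though not necessarily the authors' exact formulation, is a series system of two latent independent risks: an infant-mortality risk with decreasing hazard and a wear-out risk with increasing hazard, where only the earlier failure time is observed and the cause is unknown.

```python
# Hedged illustration: overall hazard of two independent latent Weibull risks
# (shape < 1 for infant mortality, shape > 1 for wear), yielding a bathtub shape.
import numpy as np

def weibull_hazard(t, shape, scale):
    return (shape / scale) * (t / scale) ** (shape - 1)

def series_hazard(t, infant=(0.5, 2.0), wear=(3.0, 10.0)):
    """Overall hazard when either risk can cause the failure."""
    return weibull_hazard(t, *infant) + weibull_hazard(t, *wear)

t = np.linspace(0.05, 15, 6)
print(np.round(series_hazard(t), 3))   # high early, low in the middle, high late
```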
Abstract:
In this paper we deal with a Bayesian analysis for right-censored survival data suitable for populations with a cure rate. We consider a cure rate model based on the negative binomial distribution, encompassing as a special case the promotion time cure model. Bayesian analysis is based on Markov chain Monte Carlo (MCMC) methods. We also present some discussion on model selection and an illustration with a real dataset.
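In one common parameterization of the negative binomial cure rate model (our notation, not necessarily the paper's), the population survival function and cure fraction are

```latex
S_{\mathrm{pop}}(t) = \bigl[1 + \eta\,\theta\,F(t)\bigr]^{-1/\eta}, \qquad
p_0 = \lim_{t\to\infty} S_{\mathrm{pop}}(t) = (1 + \eta\,\theta)^{-1/\eta},
```

where F(t) is the distribution function of the time to event for a single latent cause; letting \eta \to 0 recovers the promotion time cure model, S_{\mathrm{pop}}(t) = \exp\{-\theta F(t)\}.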
A bivariate regression model for matched paired survival data: local influence and residual analysis
Abstract:
The use of bivariate distributions plays a fundamental role in survival and reliability studies. In this paper, we consider a location-scale model for bivariate survival times based on a proposed copula to model the dependence of the bivariate survival data. For the proposed model, we consider inferential procedures based on maximum likelihood. Gains in efficiency from bivariate models are also examined in the censored data setting. Various simulation studies are performed for different parameter settings, sample sizes, and censoring percentages, and compared to the performance of the bivariate regression model for matched paired survival data. Sensitivity analysis methods such as local and total influence are presented and derived under three perturbation schemes. The martingale marginal and deviance marginal residual measures are used to check the adequacy of the model. Furthermore, we propose a new measure, which we call the modified deviance component residual. The methodology in the paper is illustrated on a lifetime data set for kidney patients.
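As an example of the copula construction referred to above (the specific copula used in the paper is not stated in this abstract), a Clayton survival copula ties the two marginal survival functions together as

```latex
S(t_1, t_2) = \bigl[S_1(t_1)^{-\alpha} + S_2(t_2)^{-\alpha} - 1\bigr]^{-1/\alpha},
\qquad \alpha > 0,
```

where larger \alpha corresponds to stronger positive dependence between the paired survival times and \alpha \to 0 recovers independence, S(t_1, t_2) = S_1(t_1)\, S_2(t_2).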
Abstract:
In interval-censored survival data, the event of interest is not observed exactly but is only known to occur within some time interval. Such data appear very frequently. In this paper, we are concerned only with parametric forms, and so a location-scale regression model based on the exponentiated Weibull distribution is proposed for modeling interval-censored data. We show that the proposed log-exponentiated Weibull regression model for interval-censored data represents a parametric family of models that includes other regression models broadly used in lifetime data analysis. Assuming interval-censored data, we employ a frequentist analysis, a jackknife estimator, a parametric bootstrap, and a Bayesian analysis for the parameters of the proposed model. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and present some ways to assess global influence. Furthermore, various simulations are performed for different parameter settings, sample sizes, and censoring percentages; in addition, the empirical distribution of some modified residuals is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended to a modified deviance residual in log-exponentiated Weibull regression models for interval-censored data.
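A minimal sketch of the likelihood construction for interval-censored data under the exponentiated Weibull distribution is given below; the parameter names and the synthetic inspection scheme are ours, not the paper's regression setup.

```python
# Each interval-censored observation (L, R] contributes F(R) - F(L) to the
# likelihood; here F is the exponentiated Weibull CDF and synthetic intervals
# come from inspections at integer times.
import numpy as np
from scipy.optimize import minimize

def ew_cdf(t, gamma, theta, sigma):
    """Exponentiated Weibull CDF: [1 - exp(-(t/sigma)^gamma)]^theta."""
    return (1.0 - np.exp(-(t / sigma) ** gamma)) ** theta

def neg_loglik(params, left, right):
    gamma, theta, sigma = np.exp(params)        # keep parameters positive
    return -np.sum(np.log(ew_cdf(right, gamma, theta, sigma)
                          - ew_cdf(left, gamma, theta, sigma) + 1e-12))

rng = np.random.default_rng(2)
t_true = rng.weibull(1.5, 200) * 10.0           # unobserved event times
left = np.floor(t_true)                         # interval known to contain event
right = left + 1.0

fit = minimize(neg_loglik, x0=np.log([1.5, 1.0, 10.0]), args=(left, right))
print(np.exp(fit.x))                            # gamma, theta, sigma estimates
```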
A robust Bayesian approach to null intercept measurement error model with application to dental data
Abstract:
Measurement error models often arise in epidemiological and clinical research. Usually, in this setup it is assumed that the latent variable has a normal distribution. However, the normality assumption may not always be correct. The skew-normal/independent distribution is a class of asymmetric, thick-tailed distributions that includes the skew-normal distribution as a special case. In this paper, we explore the use of the skew-normal/independent distribution as a robust alternative for the null intercept measurement error model under a Bayesian paradigm. We assume that the random errors and the unobserved value of the covariate (latent variable) jointly follow a skew-normal/independent distribution, providing an appealing robust alternative to the routine use of the symmetric normal distribution in this type of model. Specific distributions examined include univariate and multivariate versions of the skew-normal distribution, the skew-t distribution, the skew-slash distribution, and the skew contaminated normal distribution. The methods developed are illustrated using a real data set from a dental clinical trial.
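Schematically, and in our notation rather than necessarily the paper's, a null intercept measurement error model relates repeated responses to an unobserved covariate through a regression with no intercept,

```latex
Y_{ij} = \beta_j\, x_i + e_{ij}, \qquad X_i = x_i + u_i,
```

where x_i is the latent true covariate value, X_i is its error-prone measurement, and the robust proposal replaces the usual joint normal assumption on (e_{ij}, u_i, x_i) with a skew-normal/independent distribution.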
Abstract:
The quantification of the available energy in the environment is important because it determines photosynthesis, evapotranspiration and, therefore, the final yield of crops. Instruments for measuring the energy balance are costly, and indirect estimation alternatives are desirable. This study assessed the performance of Deardorff's model during a cycle of a sugarcane crop in Piracicaba, State of São Paulo, Brazil, in comparison to the aerodynamic method. This mechanistic model simulates the energy fluxes (sensible heat, latent heat, and net radiation) at three levels (atmosphere, canopy, and soil) using only air temperature, relative humidity, and wind speed measured at a reference level above the canopy, the crop leaf area index, and some pre-calibrated parameters (canopy albedo, soil emissivity, atmospheric transmissivity, and hydrological characteristics of the soil). The analysis was made for different time scales, insolation conditions, and seasons (spring, summer, and autumn). Analyzing all data at 15-minute intervals, the model showed good performance for net radiation simulation under different insolation conditions and seasons. The latent heat flux and the sensible heat flux in the atmosphere did not differ from the data obtained with the aerodynamic method during the autumn. The sensible heat flux in the soil was poorly simulated by the model due to the poor performance of the soil water balance method. In general, Deardorff's model improved the flux simulations in comparison to the aerodynamic method when more insolation was available in the environment.
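For reference, the fluxes simulated by the model are linked by the surface energy balance,

```latex
R_n = H + LE + G,
```

where R_n is the net radiation, H the sensible heat flux, LE the latent heat flux, and G the soil heat flux.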
Abstract:
The General Ocean Turbulence Model (GOTM) is applied to the diagnostic turbulence field of the mixing layer (ML) over the equatorial region of the Atlantic Ocean. Two situations were investigated: the rainy and dry seasons, defined, respectively, by the presence of the intertropical convergence zone and by its northward displacement. Simulations were carried out using data from a PIRATA buoy located on the equator at 23° W to compute the surface turbulent fluxes and from the NASA/GEWEX Surface Radiation Budget Project to close the surface radiation balance. A data assimilation scheme was used as a surrogate for the physical effects not present in the one-dimensional model. In the rainy season, the results show that the ML is shallower due to the weaker surface stress and stronger stable stratification; the maximum ML depth reached during this season is around 15 m, with an average diurnal variation of 7 m. In the dry season, the stronger surface stress and the enhanced surface heat balance components enable higher mechanical production of turbulent kinetic energy and, at night, buoyancy also acts to enhance turbulence in the first meters of depth, characterizing a deeper ML that reaches around 60 m and presents an average diurnal variation of 30 m.
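Schematically, the competition between mechanical production and buoyancy described above enters through the turbulent kinetic energy budget that one-dimensional closure models such as GOTM integrate,

```latex
\frac{\partial k}{\partial t} = P + B + T - \varepsilon,
```

where P is the shear (mechanical) production, B the buoyancy production or destruction, T the turbulent transport, and \varepsilon the dissipation rate; the closure terms actually used in GOTM are more detailed than this sketch.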
Abstract:
Sepsis is a systemic inflammatory response that can lead to tissue damage and death. In order to increase our understanding of sepsis, experimental models are needed that produce relevant immune and inflammatory responses during a septic event. We describe a lipopolysaccharide tolerance mouse model to characterize the cellular and molecular alterations of immune cells during sepsis. The model presents a typical lipopolysaccharide tolerance pattern in which tolerance is related to decreased production and secretion of cytokines after a subsequent exposure to a lethal dose of lipopolysaccharide. The initial lipopolysaccharide exposure also altered the expression patterns of cytokines and was followed by 8- and 1.5-fold increases in the T helper 1 and 2 cell subpopulations, respectively. Behavioral data indicate a decrease in spontaneous activity and an increase in body temperature following exposure to lipopolysaccharide. In contrast, tolerant animals maintained production of reactive oxygen species and nitric oxide when terminally challenged by cecal ligation and puncture (CLP). A survival study after CLP showed protection in tolerant animals compared to naive animals. Spleen mass increased in tolerant animals, accompanied by increases in B lymphocytes and the Th1 cell subpopulation. An increase in the number of stem cells was found in the spleen and bone marrow. We also showed that administration of spleen or bone marrow cells from tolerant to naive animals transfers the acquired resistance status. In conclusion, lipopolysaccharide tolerance is a natural reprogramming of the immune system that increases the number of immune cells, particularly T helper 1 cells, and does not reduce oxidative stress.
Abstract:
In this work we study the problem of model identification for a population employing a discrete dynamic model based on the Richards growth model. The population is subjected to interventions due to consumption, such as hunting or the farming of animals. The model identification allows us to estimate the probability of, or the average time for, a population to reach a certain level. Parameter inference for these models is obtained with the likelihood profile technique developed in this paper. The identification method developed here can be applied to evaluate the productivity of animal husbandry or the risk of extinction of autochthonous populations. It is applied to data on the Brazilian beef cattle herd population, and the time for the population to reach a certain goal level is investigated.
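The sketch below simulates the kind of discrete Richards-type dynamics with a consumption term described above; the parameter values and the lognormal noise are hypothetical, not the authors' fitted model or the Brazilian herd data.

```python
# Discrete Richards-type growth with a harvest/consumption term and
# multiplicative environmental noise (illustrative parameters only).
import numpy as np

def richards_step(n, r, K, q, harvest):
    """One period of Richards growth minus the animals removed by consumption."""
    return max(n + r * n * (1.0 - (n / K) ** q) - harvest, 0.0)

def simulate(n0, r, K, q, harvest, years, noise_sd=0.0, rng=None):
    rng = rng or np.random.default_rng(0)
    path = [n0]
    for _ in range(years):
        n = richards_step(path[-1], r, K, q, harvest)
        path.append(n * np.exp(rng.normal(0.0, noise_sd)))  # environmental noise
    return np.array(path)

# Hypothetical numbers (millions of head), not the Brazilian herd data.
path = simulate(n0=150.0, r=0.25, K=260.0, q=1.2, harvest=20.0,
                years=30, noise_sd=0.02)
print(np.round(path[::5], 1))
```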