23 results for STATISTICAL MODELS
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Biological membranes are composed of lipid bilayers and proteins. Investigation of protein-membrane interactions, which are essential for the biological function of cells, must rest upon solid knowledge of lipid bilayer behavior. Thus, extensive studies of an experimental model for membranes, lipid bilayers in aqueous solution, have been undertaken in recent decades. These systems present structural, thermal and electrical properties which depend on temperature, ionic strength or concentration. In this talk, we shall discuss statistical models for lipid bilayers, as well as the relation between the properties of these models and the experimental results for lipid dispersions investigated by the laboratories supervised by Teresa Lamy (IF-USP) and Amando Ito (FFCL-USP).
Abstract:
The objectives of the present study were to determine if variance components of calving intervals varied with age at calving and if considering calving intervals as a longitudinal trait would be a useful approach for fertility analysis of Zebu dairy herds. With these purposes, calving records from females born from 1940 to 2006 in a Guzerat dairy subpopulation in Brazil were analyzed. The fixed effects of contemporary groups, formed by year and farm at birth or at calving, and the regressions of age at calving, equivalent inbreeding coefficient and day of the year on the studied traits were considered in the statistical models. In one approach, calving intervals (CI) were analyzed as a single trait, by fitting a statistical model in which both animal and permanent environment effects were adjusted for the effect of age at calving by random regression. In a second approach, a four-trait analysis was conducted, including age at first calving (AFC) and three different female categories for the calving intervals: first calving females; young females (less than 80 months old, but not first calving); or mature females (80 months old or more). Finally, a two-trait analysis was performed, also including AFC and CI, but calving intervals were regarded as a single trait in a repeatability model. Additionally, the ranking of sires was compared among approaches. Calving intervals decreased with age until females were about 80 months old, remaining nearly constant after that age. A quasi-linear increase of 11.5 days in the calving intervals was observed for each 10% increase in the female's equivalent inbreeding coefficient. The heritability of AFC was 0.37. For CI, the genetic-phenotypic variance ratios ranged from 0.064 to 0.141, depending on the approach and on ages at calving. Differences among genetic variance components for calving intervals were observed along the animal's lifetime. Those differences confirmed the longitudinal aspect of that trait, indicating the importance of such consideration when assessing fertility of Zebu dairy females, especially in situations where the available information relies on their calving intervals. Spearman rank correlations among approaches ranged from 0.90 to 0.95, and changes observed in the ranking of sires suggested that the genetic progress of the population could be affected by the approach chosen for the analysis of calving intervals. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Item response theory (IRT) comprises a set of statistical models which are useful in many fields, especially when there is an interest in studying latent variables (or latent traits). Usually such latent traits are assumed to be random variables and a convenient distribution is assigned to them. A very common choice for such a distribution has been the standard normal. Recently, Azevedo et al. [Bayesian inference for a skew-normal IRT model under the centred parameterization, Comput. Stat. Data Anal. 55 (2011), pp. 353-365] proposed a skew-normal distribution under the centred parameterization (SNCP), as had been studied in [R. B. Arellano-Valle and A. Azzalini, The centred parametrization for the multivariate skew-normal distribution, J. Multivariate Anal. 99(7) (2008), pp. 1362-1382], to model the latent trait distribution. This approach allows one to represent any asymmetric behaviour concerning the latent trait distribution. Also, they developed a Metropolis-Hastings within Gibbs sampling (MHWGS) algorithm based on the density of the SNCP. They showed that the algorithm recovers all parameters properly. Their results indicated that, in the presence of asymmetry, the proposed model and the estimation algorithm perform better than the usual model and estimation methods. Our main goal in this paper is to propose another type of MHWGS algorithm based on a stochastic representation (hierarchical structure) of the SNCP studied in [N. Henze, A probabilistic representation of the skew-normal distribution, Scand. J. Statist. 13 (1986), pp. 271-275]. Our algorithm has only one Metropolis-Hastings step, in contrast to the algorithm developed by Azevedo et al., which has two such steps. This not only makes the implementation easier but also reduces the number of proposal densities to be used, which can be a problem in the implementation of MHWGS algorithms, as can be seen in [R. J. Patz and B. W. Junker, A straightforward approach to Markov Chain Monte Carlo methods for item response models, J. Educ. Behav. Stat. 24(2) (1999), pp. 146-178; R. J. Patz and B. W. Junker, The applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses, J. Educ. Behav. Stat. 24(4) (1999), pp. 342-366; A. Gelman, G. O. Roberts, and W. R. Gilks, Efficient Metropolis jumping rules, Bayesian Stat. 5 (1996), pp. 599-607]. Moreover, we consider a modified beta prior (which generalizes the one considered in [3]) and a Jeffreys prior for the asymmetry parameter. Furthermore, we study the sensitivity of such priors as well as the use of different kernel densities for this parameter. Finally, we assess the impact of the number of examinees, number of items and the asymmetry level on the parameter recovery. Results of the simulation study indicated that our approach performed as well as that in [3], in terms of parameter recovery, mainly using the Jeffreys prior. Also, they indicated that the asymmetry level has the highest impact on parameter recovery, even though it is relatively small. A real data analysis is considered jointly with the development of model fitting assessment tools. The results are compared with the ones obtained by Azevedo et al. The results indicate that the hierarchical approach makes MCMC algorithms easier to implement, facilitates convergence diagnosis, and can be very useful for fitting more complex skew IRT models.
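The stochastic representation attributed to Henze in the abstract above can be illustrated with a short sampling sketch. It is not the authors' MHWGS algorithm: it only draws latent traits from a skew-normal distribution and standardizes them to mean 0 and variance 1, as in a centred parameterization. The function names, the value delta = 0.9 and the sample size are assumptions made for illustration.

```python
import numpy as np

def sample_centred_skew_normal(delta, size, rng):
    """Draw standardized skew-normal variates via Henze's representation.

    Z = delta*|U0| + sqrt(1 - delta^2)*U1, with U0, U1 independent N(0, 1),
    is skew-normal; subtracting its mean and dividing by its standard
    deviation gives a variate with mean 0 and variance 1.
    """
    u0 = np.abs(rng.standard_normal(size))      # half-normal component
    u1 = rng.standard_normal(size)              # independent normal component
    z = delta * u0 + np.sqrt(1.0 - delta**2) * u1
    mu_z = delta * np.sqrt(2.0 / np.pi)         # E[Z]
    sigma_z = np.sqrt(1.0 - mu_z**2)            # sd of Z
    return (z - mu_z) / sigma_z

def theoretical_skewness(delta):
    """Skewness of the skew-normal distribution as a function of delta."""
    mu_z = delta * np.sqrt(2.0 / np.pi)
    return (4.0 - np.pi) / 2.0 * mu_z**3 / (1.0 - mu_z**2) ** 1.5

rng = np.random.default_rng(0)                  # hypothetical seed
theta = sample_centred_skew_normal(0.9, 200_000, rng)
sample_skew = np.mean(theta**3)                 # valid because mean 0, variance 1
```

Because the representation is hierarchical (a half-normal variable plus a normal variable), conditioning on the half-normal component leaves a normal distribution, which is the kind of structure that makes Gibbs-type updates convenient.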
Abstract:
The sera of a retrospective cohort (n = 41) composed of children with well-characterized cow's milk allergy, collected from multiple visits, were analyzed using a protein microarray system measuring four classes of immunoglobulins. The frequency of the visits and the age and gender distribution reflected the real situation faced by clinicians at a pediatric reference center for food allergy in São Paulo, Brazil. The profiling array results have shown that total IgG and IgA share similar specificity whilst IgM and in particular IgE are distantly related. The correlation of specificity of IgE and IgA is variable amongst the patients and this relationship cannot be used to predict atopy or the onset of tolerance to milk. The array profiling technique has corroborated the clinical selection criteria for this cohort, although it clearly suggested that 4 out of the 41 patients might have allergies of other than milk origin. There was also a good correlation between the array data and ImmunoCAP results, for casein in particular. By using qualitative and quantitative multivariate analysis routines it was possible to produce validated statistical models to predict with reasonable accuracy the onset of tolerance to milk proteins. If expanded to larger study groups, array profiling in combination with the multivariate techniques shows potential to improve the prognosis of milk allergic patients. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Background: Several models have been designed to predict survival of patients with heart failure. These, while available and widely used for both stratifying and deciding upon different treatment options on the individual level, have several limitations. Specifically, some clinical variables that may influence prognosis may have an influence that changes over time. Statistical models that include such characteristics may help in evaluating prognosis. The aim of the present study was to analyze and quantify the impact of modeling heart failure survival allowing for covariates with time-varying effects known to be independent predictors of overall mortality in this clinical setting. Methodology: Survival data from an inception cohort of five hundred patients diagnosed with heart failure functional class III and IV between 2002 and 2004 and followed up to 2006 were analyzed by using the Cox proportional hazards model, variations of the Cox model, and variations of Aalen's additive model. Principal Findings: One hundred and eighty-eight (188) patients died during follow-up. For patients under study, age, serum sodium, hemoglobin, serum creatinine, and left ventricular ejection fraction were significantly associated with mortality. Evidence of time-varying effect was suggested for the last three. Both high hemoglobin and high LV ejection fraction were associated with a reduced risk of dying, with a stronger initial effect. High creatinine, associated with an increased risk of dying, also presented an initial stronger effect. The impact of age and sodium was constant over time. Conclusions: The current study points to the importance of evaluating covariates with time-varying effects in heart failure models. The analysis performed suggests that variations of Cox and Aalen models constitute a valuable tool for identifying these variables. The implementation of covariates with time-varying effects into heart failure prognostication models may reduce bias and increase the specificity of such models.
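For orientation, the model families named in this abstract are commonly written as below. These are the standard textbook forms, not necessarily the exact variants fitted in the study; the symbols $h_0$, $\beta$, $\beta(t)$ and $x$ are notation assumed here.

```latex
\begin{align}
  % Cox proportional hazards model: covariate effects constant in time
  h(t \mid x) &= h_0(t)\,\exp\!\left(\beta^{\top} x\right) \\
  % Cox-type model with time-varying effects: coefficients depend on t
  h(t \mid x) &= h_0(t)\,\exp\!\left(\beta(t)^{\top} x\right) \\
  % Aalen's additive model: covariates act additively on the hazard
  h(t \mid x) &= \beta_0(t) + \beta(t)^{\top} x
\end{align}
```

In this notation, a covariate with "a stronger initial effect" corresponds to a coefficient function $\beta_j(t)$ whose magnitude is largest near $t = 0$ and shrinks as follow-up proceeds.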
Abstract:
Background: Smallpox is a lethal disease that was endemic in many parts of the world until eradicated by massive immunization. Due to its lethality, there are serious concerns about its use as a bioweapon. Here we analyze publicly available microarray data to further understand survival of smallpox infected macaques, using systems biology approaches. Our goal is to improve the knowledge about the progression of this disease. Results: We used KEGG pathways annotations to define groups of genes (or modules), and subsequently compared them to macaque survival times. This technique provided additional insights about the host response to this disease, such as increased expression of the cytokines and ECM receptors in the individuals with higher survival times. These results could indicate that these gene groups could influence an effective response from the host to smallpox. Conclusion: Macaques with higher survival times clearly express some specific pathways previously unidentified using regular gene-by-gene approaches. Our work also shows how third party analysis of public datasets can be important to support new hypotheses to relevant biological problems.
Abstract:
Spin systems in the presence of disorder are described by two sets of degrees of freedom, associated with orientational (spin) and disorder variables, which may be characterized by two distinct relaxation times. Disordered spin models have been mostly investigated in the quenched regime, which is the usual situation in solid state physics, and in which the relaxation time of the disorder variables is much larger than the typical measurement times. In this quenched regime, disorder variables are fixed, and only the orientational variables are duly thermalized. Recent studies in the context of lattice statistical models for the phase diagrams of nematic liquid-crystalline systems have stimulated interest in going beyond the quenched regime. The phase diagrams predicted by these calculations for a simple Maier-Saupe model turn out to be qualitatively different from the quenched case if the two sets of degrees of freedom are allowed to reach thermal equilibrium during the experimental time, which is known as the fully annealed regime. In this work, we develop a transfer matrix formalism to investigate annealed disordered Ising models on two hierarchical structures, the diamond hierarchical lattice (DHL) and the Apollonian network (AN). The calculations follow the same steps used for the analysis of simple uniform systems, which amounts to deriving proper recurrence maps for the thermodynamic and magnetic variables in terms of the generations of the construction of the hierarchical structures. In this context, we may consider different kinds of disorder, and different types of ferromagnetic and anti-ferromagnetic interactions. In the present work, we analyze the effects of dilution, which are produced by the removal of some magnetic ions. The system is treated in a "grand canonical" ensemble. The introduction of two extra fields, related to the concentration of two different types of particles, leads to higher-rank transfer matrices as compared with the formalism for the usual uniform models. Preliminary calculations on a DHL indicate that there is a phase transition for a wide range of dilution concentrations. Ising spin systems on the AN are known to be ferromagnetically ordered at all temperatures; in the presence of dilution, however, there are indications of a disordered (paramagnetic) phase at low concentrations of magnetic ions.
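The "recurrence maps" mentioned above can be made concrete for the simplest case, the uniform (undiluted) zero-field Ising model on a diamond hierarchical lattice with two branches of two bonds each. This is a sketch of the uniform case only, not of the annealed diluted model treated in the work; the branch and bond counts and the function names are assumptions chosen for illustration.

```python
import math

def dhl_map(k):
    """One decimation step for the reduced coupling K = J/(k_B T).

    Two bonds in series combine as tanh(K_series) = tanh(K)**2, and two
    such branches in parallel add, so K' = 2*atanh(tanh(K)**2), which
    simplifies to ln(cosh(2K)).
    """
    return math.log(math.cosh(2.0 * k))

def flow(k, steps):
    """Iterate the recurrence map a given number of generations."""
    for _ in range(steps):
        k = dhl_map(k)
    return k

def critical_coupling(lo=0.1, hi=2.0, tol=1e-12):
    """Bisection for the nontrivial fixed point K* = dhl_map(K*)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dhl_map(mid) < mid:   # coupling shrinks: disordered side
            lo = mid
        else:                    # coupling grows: ordered side
            hi = mid
    return 0.5 * (lo + hi)

k_c = critical_coupling()
```

Couplings below `k_c` flow to zero (paramagnetic phase) and couplings above it grow without bound (ferromagnetic phase), which is how a phase transition shows up in such a map.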
Abstract:
The purpose of this study was to analyze the influence of lactation and dry period on the constituents of lipid and glucose metabolism of buffaloes. One hundred forty-seven samples of serum and plasma were collected between November 2009 and July 2010, from farms raising Murrah, Mediterranean and crossbred buffaloes, located in the State of São Paulo, Brazil. Biochemical analysis consisted of determining the contents of serum cholesterol, triglycerides, beta-hydroxybutyrate (β-HBO), non-esterified fatty acids (NEFA) and plasma glucose. Values for arithmetic mean and standard error of the mean were calculated using the SAS procedure, version 9.2. Tests for normality of residuals and homogeneity of variances were performed using the SAS Guide Data Analysis. Data were analyzed by ANOVA using the SAS procedure Glimmix. The group information (Lactation), Farm and Age were used in the statistical models. Means of groups were compared using Least Square Means (LSMeans) of SAS, where significant difference was observed at P ≤ 0.05. It was possible to conclude that buffaloes during peak lactation need to metabolize body reserves to supplement the lower amounts of bloodstream lipids, when they remain in negative energy balance. In the dry period, there were significant changes in the lipid profile, characterized by decrease of nutritional requirements, with consequent improvement in the general conditions of the animals.
Abstract:
In accelerating dark energy models, the estimates of the Hubble constant, $H_0$, from the Sunyaev-Zel'dovich effect (SZE) and X-ray surface brightness of galaxy clusters may depend on the matter content ($\Omega_M$), the curvature ($\Omega_K$) and the equation of state parameter $\omega$. In this article, by using a sample of 25 angular diameter distances of galaxy clusters described by the elliptical $\beta$ model obtained through the SZE/X-ray technique, we constrain $H_0$ in the framework of a general $\Lambda$CDM model (arbitrary curvature) and a flat XCDM model with a constant equation of state parameter $\omega = p_x/\rho_x$. In order to avoid the use of priors in the cosmological parameters, we apply a joint analysis involving the baryon acoustic oscillations (BAO) and the CMB shift parameter signature. By taking into account the statistical and systematic errors of the SZE/X-ray technique we obtain for the nonflat $\Lambda$CDM model $H_0 = 74^{+5.0}_{-4.0}$ km s$^{-1}$ Mpc$^{-1}$ ($1\sigma$), whereas for a flat universe with constant equation of state parameter we find $H_0 = 72^{+5.5}_{-4.0}$ km s$^{-1}$ Mpc$^{-1}$ ($1\sigma$). By assuming that galaxy clusters are described by a spherical $\beta$ model these results change to $H_0 = 6^{+8.0}_{-7.0}$ and $H_0 = 59^{+9.0}_{-6.0}$ km s$^{-1}$ Mpc$^{-1}$ ($1\sigma$), respectively. The results from the elliptical description are in good agreement with independent studies from the Hubble Space Telescope key project and recent estimates based on the Wilkinson Microwave Anisotropy Probe, thereby suggesting that the combination of these three independent phenomena provides an interesting method to constrain the Hubble constant. As an extra bonus, the adoption of the elliptical description is revealed to be a quite realistic assumption. Finally, by comparing these results with a recent determination for a flat $\Lambda$CDM model using only the SZE/X-ray technique and BAO, we see that the geometry has a very weak influence on $H_0$ estimates for this combination of data.
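To see how the Hubble constant enters an angular diameter distance, the following sketch evaluates the distance for a flat XCDM model with constant equation of state parameter. It is a minimal illustration under stated assumptions, not the authors' analysis pipeline: the matter density 0.3, the redshift 0.2 and the function name are hypothetical, and radiation is neglected.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def angular_diameter_distance(z, h0, omega_m, w=-1.0, n=20001):
    """Angular diameter distance in Mpc for a flat XCDM model.

    E(z)^2 = omega_m*(1+z)^3 + (1-omega_m)*(1+z)^(3*(1+w)); the comoving
    distance is (c/H0) * integral_0^z dz'/E(z'), and D_A divides it by (1+z).
    """
    zs = np.linspace(0.0, z, n)
    e = np.sqrt(omega_m * (1 + zs) ** 3
                + (1 - omega_m) * (1 + zs) ** (3 * (1 + w)))
    integrand = 1.0 / e
    dz = zs[1] - zs[0]
    # Composite trapezoid rule written out explicitly
    integral = dz * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return C_KM_S / h0 * integral / (1 + z)

# Hypothetical cluster at z = 0.2, evaluated at two values of H0 (km/s/Mpc)
d_a_74 = angular_diameter_distance(0.2, 74.0, 0.3)
d_a_72 = angular_diameter_distance(0.2, 72.0, 0.3)
```

Since the distance scales as 1/$H_0$ for fixed density and equation of state parameters, a measured distance constrains $H_0$ once those other parameters are pinned down, which is the role the abstract assigns to the complementary probes.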
Abstract:
An extension of some standard likelihood based procedures to heteroscedastic nonlinear regression models under scale mixtures of skew-normal (SMSN) distributions is developed. This novel class of models provides a useful generalization of the heteroscedastic symmetrical nonlinear regression models (Cysneiros et al., 2010), since the random term distributions cover both symmetric as well as asymmetric and heavy-tailed distributions such as skew-t, skew-slash, skew-contaminated normal, among others. A simple EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters is presented and the observed information matrix is derived analytically. In order to examine the performance of the proposed methods, some simulation studies are presented to show the robust aspect of this flexible class against outlying and influential observations and that the maximum likelihood estimates based on the EM-type algorithm do provide good asymptotic properties. Furthermore, local influence measures and the one-step approximations of the estimates in the case-deletion model are obtained. Finally, an illustration of the methodology is given considering a data set previously analyzed under the homoscedastic skew-t nonlinear regression model. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Background: Lynch syndrome (LS) is the most common form of inherited predisposition to colorectal cancer (CRC), accounting for 2-5% of all CRC. LS is an autosomal dominant disease characterized by mutations in the mismatch repair genes mutL homolog 1 (MLH1), mutS homolog 2 (MSH2), post-meiotic segregation increased 1 (PMS1), post-meiotic segregation increased 2 (PMS2) and mutS homolog 6 (MSH6). Mutation risk prediction models can be incorporated into clinical practice, facilitating the decision-making process and identifying individuals for molecular investigation. This is extremely important in countries with limited economic resources. This study aims to evaluate sensitivity and specificity of five predictive models for germline mutations in repair genes in a sample of individuals with suspected Lynch syndrome. Methods: Blood samples from 88 patients were analyzed through sequencing MLH1, MSH2 and MSH6 genes. The probability of detecting a mutation was calculated using the PREMM, Barnetson, MMRpro, Wijnen and Myriad models. To evaluate the sensitivity and specificity of the models, receiver operating characteristic curves were constructed. Results: Among the 88 patients included in this analysis, 31 mutations were identified: 16 were found in the MSH2 gene, 15 in the MLH1 gene and no pathogenic mutations were identified in the MSH6 gene. The AUCs for the PREMM (0.846), Barnetson (0.850), MMRpro (0.821) and Wijnen (0.807) models did not differ significantly. The Myriad model presented lower AUC (0.704) than the four other models evaluated. Considering thresholds of >= 5%, the sensitivity of the models varied between 1 (Myriad) and 0.87 (Wijnen) and specificity ranged from 0 (Myriad) to 0.38 (Barnetson). Conclusions: The Barnetson, PREMM, MMRpro and Wijnen models present similar AUC. The AUC of the Myriad model is statistically inferior to the four other models.
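The quantities reported in this abstract (sensitivity and specificity at a probability threshold, and the area under the ROC curve) can be computed as in the sketch below. The scores and labels are made-up illustrative values, not data from the study, and the function names are assumptions.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def auc(scores, labels):
    """AUC as the fraction of (carrier, non-carrier) pairs ranked correctly.

    Ties count one half; this equals the area under the empirical ROC curve.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted mutation probabilities and true carrier status
scores = [0.02, 0.10, 0.40, 0.07, 0.80, 0.30, 0.04, 0.60]
labels = [0, 0, 1, 1, 1, 0, 0, 1]

sens, spec = sensitivity_specificity(scores, labels, 0.05)
area = auc(scores, labels)
```

A low threshold such as 5% favours sensitivity at the expense of specificity, which matches the pattern of high sensitivity and low specificity reported at that cut-off.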
Abstract:
We show that the Kronecker sum of d >= 2 copies of a random one-dimensional sparse model displays a spectral transition of the type predicted by Anderson, from absolutely continuous around the center of the band to pure point around the boundaries. Possible applications to physics and open problems are discussed briefly.
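For readers unfamiliar with the construction, the sketch below builds the Kronecker sum of d copies of a small one-dimensional operator and checks that its eigenvalues are sums of the one-dimensional eigenvalues. The finite matrix is a stand-in chosen for illustration (unit hopping terms plus random diagonal entries on a few hypothetical sites); it is not the infinite random sparse model of the paper and says nothing about its spectral type.

```python
import numpy as np

def kronecker_sum(a, d):
    """Kronecker sum of d copies of a square matrix a.

    For d = 2 this is kron(a, I) + kron(I, a); in general the k-th term
    places a in tensor slot k and identities in every other slot.
    """
    n = a.shape[0]
    total = np.zeros((n**d, n**d))
    for k in range(d):
        term = np.eye(1)
        for j in range(d):
            term = np.kron(term, a if j == k else np.eye(n))
        total += term
    return total

rng = np.random.default_rng(1)           # hypothetical seed
n = 8
h1 = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
for site in (1, 2, 4):                   # hypothetical sparse set of sites
    h1[site, site] = rng.uniform(-1.0, 1.0)

h2 = kronecker_sum(h1, 2)
eig1 = np.linalg.eigvalsh(h1)
eig2 = np.linalg.eigvalsh(h2)            # ascending order
pair_sums = np.sort(np.add.outer(eig1, eig1).ravel())
```

The spectrum of the Kronecker sum is the set of all sums of one-dimensional eigenvalues, so the center of the band collects many such sums while the band edges are reached only when every summand is near an edge.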
Abstract:
In this paper we obtain asymptotic expansions, up to order $n^{-1/2}$ and under a sequence of Pitman alternatives, for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of symmetric linear regression models. This is a wide class of models which encompasses the t model and several other symmetric distributions with longer-than-normal tails. The asymptotic distributions of all four statistics are obtained for testing a subset of regression parameters. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes. (C) 2011 Elsevier B.V. All rights reserved.
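The four test statistics named above can be compared on a toy problem. The sketch below uses a one-parameter exponential sample rather than the symmetric linear regression models of the paper, purely because the formulas are short; the model, the data and the function name are assumptions made for illustration.

```python
import math

def four_statistics(x, lam0):
    """LR, Wald, score and gradient statistics for H0: rate = lam0.

    Model: x_i independent Exponential(rate lam), with log-likelihood
    l(lam) = n*log(lam) - lam*sum(x), score U(lam) = n/lam - sum(x),
    Fisher information n/lam^2 and maximum likelihood estimate 1/mean(x).
    """
    n = len(x)
    total = sum(x)
    lam_hat = n / total

    def loglik(lam):
        return n * math.log(lam) - lam * total

    u0 = n / lam0 - total                            # score at the null value
    lr = 2.0 * (loglik(lam_hat) - loglik(lam0))
    wald = (lam_hat - lam0) ** 2 * n / lam_hat**2    # information at the MLE
    score = u0**2 * lam0**2 / n                      # information at the null
    gradient = u0 * (lam_hat - lam0)                 # no information matrix needed
    return {"lr": lr, "wald": wald, "score": score, "gradient": gradient}

stats = four_statistics([0.2, 0.4, 0.6, 0.8], lam0=1.0)
```

All four statistics share the same chi-squared limit under the null hypothesis, yet they differ in finite samples, which is why expansions of their nonnull distributions and simulation comparisons are informative.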
Abstract:
Background: In the analysis of the effects of cell treatments such as drug dosing, identifying changes in gene network structures between normal and treated cells is a key task. A possible way of identifying the changes is to compare structures of networks estimated from data on normal and treated cells separately. However, this approach usually fails to estimate accurate gene networks due to the limited length of time series data and measurement noise. Thus, approaches that identify changes in regulation by using time series data on both conditions in an efficient manner are needed. Methods: We propose a new statistical approach that is based on the state space representation of the vector autoregressive model and estimates gene networks on two different conditions in order to identify changes in regulation between the conditions. In the mathematical model of our approach, hidden binary variables are newly introduced to indicate the presence of regulations on each condition. The use of the hidden binary variables enables efficient data usage; data on both conditions are used for commonly existing regulations, while for condition-specific regulations only the corresponding data are applied. Also, the similarity of networks on two conditions is automatically considered through the design of the potential function for the hidden binary variables. For the estimation of the hidden binary variables, we derive a new variational annealing method that searches for the configuration of the binary variables maximizing the marginal likelihood. Results: For the performance evaluation, we use time series data from two topologically similar synthetic networks, and confirm that our proposed approach estimates commonly existing regulations as well as changes in regulation with higher coverage and precision than other existing approaches in almost all the experimental settings. For a real data application, our proposed approach is applied to time series data from normal human lung cells and human lung cells treated by stimulating EGF-receptors and dosing an anticancer drug termed Gefitinib. In the treated lung cells, a cancer cell condition is simulated by the stimulation of EGF-receptors, but the effect would be counteracted due to the selective inhibition of EGF-receptors by Gefitinib. However, gene expression profiles are actually different between the conditions, and the genes related to the identified changes are considered as possible off-targets of Gefitinib. Conclusions: From the synthetically generated time series data, our proposed approach can identify changes in regulation more accurately than existing methods. By applying the proposed approach to the time series data on normal and treated human lung cells, candidate off-target genes of Gefitinib are found. According to the published clinical information, one of the genes can be related to a factor of interstitial pneumonia, which is known as a side effect of Gefitinib.
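The baseline strategy that the abstract contrasts itself with, estimating a network separately on each condition and comparing the results, can be sketched with a first-order vector autoregressive model fitted by ordinary least squares. This is a deliberately simple stand-in, not the proposed state space model or its variational annealing estimator; the coefficient matrices, series length, noise level and threshold are hypothetical.

```python
import numpy as np

def simulate_var1(a, n_steps, noise_sd, rng):
    """Simulate x[t] = a @ x[t-1] + Gaussian noise, starting from zero."""
    p = a.shape[0]
    x = np.zeros((n_steps, p))
    for t in range(1, n_steps):
        x[t] = a @ x[t - 1] + noise_sd * rng.standard_normal(p)
    return x

def fit_var1(x):
    """Least-squares estimate of the VAR(1) coefficient matrix."""
    past, future = x[:-1], x[1:]
    # Solves past @ B = future in the least-squares sense; B is a transposed
    coef, *_ = np.linalg.lstsq(past, future, rcond=None)
    return coef.T

rng = np.random.default_rng(0)                       # hypothetical seed
a_normal = np.array([[0.5, 0.0, 0.0],
                     [0.4, 0.5, 0.0],
                     [0.0, 0.4, 0.5]])               # gene 0 -> 1 -> 2
a_treated = a_normal.copy()
a_treated[2, 1] = 0.0                                # regulation 1 -> 2 removed

est_normal = fit_var1(simulate_var1(a_normal, 5000, 1.0, rng))
est_treated = fit_var1(simulate_var1(a_treated, 5000, 1.0, rng))
changed = np.abs(est_normal - est_treated) > 0.2     # hypothetical threshold
```

With thousands of time points this separate-estimation baseline works, but real expression time series are far shorter and noisier, which is the limitation the abstract cites as motivation for sharing data across conditions.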
Abstract:
Lemonte and Cordeiro [Birnbaum-Saunders nonlinear regression models, Comput. Stat. Data Anal. 53 (2009), pp. 4441-4452] introduced a class of Birnbaum-Saunders (BS) nonlinear regression models potentially useful in lifetime data analysis. We give a general matrix Bartlett correction formula to improve the likelihood ratio (LR) tests in these models. The formula is simple enough to be used analytically to obtain several closed-form expressions in special cases. Our results generalize those in Lemonte et al. [Improved likelihood inference in Birnbaum-Saunders regressions, Comput. Stat. Data Anal. 54 (2010), pp. 1307-1316], which hold only for the BS linear regression models. We consider Monte Carlo simulations to show that the corrected tests work better than the usual LR tests.