19 results for Model selection criteria
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
In this paper we propose a hybrid hazard regression model with threshold stress that includes the proportional hazards and the accelerated failure time models as particular cases. To describe the behavior of lifetimes, the generalized gamma distribution is assumed and an inverse power law model with a threshold stress is considered. For parameter estimation we develop a sampling-based posterior inference procedure based on Markov chain Monte Carlo techniques. We assume proper but vague priors for the parameters of interest. A simulation study investigates the frequentist properties of the proposed estimators obtained under the assumption of vague priors. Further, model selection criteria are discussed. The methodology is illustrated on simulated and real lifetime data sets.
Abstract:
Background: Organ-sharing criteria have produced a system that prioritizes liver transplantation (LT) for patients with hepatocellular carcinoma (HCC) who have the highest risk of wait-list mortality. In some countries this model allows only patients within the Milan Criteria (MC, defined by the presence of a single nodule up to 5 cm or up to three nodules, none larger than 3 cm, with no evidence of extrahepatic spread or macrovascular invasion) to be evaluated for liver transplantation. This policy implies that some patients with HCC slightly more advanced than allowed by the current strict selection criteria will be excluded, even though LT for these patients might be associated with acceptable long-term outcomes. Methods: We propose a mathematical approach to study the consequences of relaxing the MC for patients with HCC who do not comply with the current rules for inclusion in the transplantation candidate list. We consider overall 5-year survival rates compatible with those reported in the literature. We calculate the strategy that minimizes the total mortality of the affected population, that is, the total number of people in both groups of HCC patients who have died 5 years after the strategy is implemented, either from post-transplantation death or from the underlying HCC. We illustrate the analysis with a simulation of a theoretical population of 1,500 HCC patients with exponentially distributed tumor size. The parameter λ obtained from the literature was equal to 0.3. Since the real samples comprised 327 patients in total, this implies an average size of 3.3 cm with a 95% confidence interval of [2.9; 3.7]. The total number of available livers to be grafted was assumed to be 500.
Results: With 1,500 patients on the waiting list and 500 grafts available, we simulated the total number of deaths among both transplanted and non-transplanted HCC patients after 5 years as a function of the tumor size of transplanted patients. The total number of deaths decreases monotonically with tumor size, reaching a minimum at a size of 7 cm and increasing thereafter. At a tumor size of 10 cm, the total mortality equals that at the 5 cm threshold of the Milan Criteria. Conclusion: We conclude that it is possible to include patients with tumor size up to 10 cm without increasing the total mortality of this population.
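The tumor-size figures quoted in this abstract can be checked with a short calculation. The sketch below is only an illustration, not the authors' code: it assumes the exponential distribution is parameterized by its rate λ (so the mean is 1/λ) and that the confidence interval is a normal approximation for the sample mean, neither of which the abstract states. The function names are hypothetical.

```python
import math
import random

RATE = 0.3       # exponential rate parameter λ quoted in the abstract (per cm)
N_SAMPLE = 327   # size of the literature sample quoted in the abstract


def exponential_mean_ci(rate, n, z=1.96):
    """Mean of an exponential distribution and a normal-approximation
    confidence interval for the mean of n observations (assumed method)."""
    mean = 1.0 / rate
    # The standard deviation of an exponential distribution equals its mean.
    standard_error = mean / math.sqrt(n)
    return mean, (mean - z * standard_error, mean + z * standard_error)


def simulate_tumor_sizes(n_patients, rate, seed=0):
    """Draw tumor sizes (cm) for a theoretical population of patients."""
    rng = random.Random(seed)
    return [rng.expovariate(rate) for _ in range(n_patients)]


mean, (lower, upper) = exponential_mean_ci(RATE, N_SAMPLE)
sizes = simulate_tumor_sizes(1500, RATE)
print(f"mean = {mean:.2f} cm, approx. 95% CI = [{lower:.2f}; {upper:.2f}]")
print(f"simulated population mean = {sum(sizes) / len(sizes):.2f} cm")
```

Under these assumptions the mean is 1/0.3 ≈ 3.3 cm and the interval comes out close to the reported [2.9; 3.7]. The survival functions needed to reproduce the mortality curve are not given in the abstract, so they are deliberately left out.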
Abstract:
We present a photometric catalogue of compact groups of galaxies (p2MCGs) automatically extracted from the Two-Micron All Sky Survey (2MASS) extended source catalogue. A total of 262 p2MCGs are identified, following the criteria defined by Hickson, of which 230 survive visual inspection (given occasional galaxy fragmentation and blends in the 2MASS parent catalogue). Only one quarter of these 230 groups were previously known compact groups (CGs). Among the 144 p2MCGs that have all their galaxies with known redshifts, 85 (59 per cent) have four or more accordant galaxies. This v2MCG sample of velocity-filtered p2MCGs constitutes the largest sample of CGs (with N ≥ 4) catalogued to date, with both well-defined selection criteria and velocity filtering, and is the first CG sample selected by stellar mass. It is fairly complete up to K_group ~ 9 and radial velocity of ~6000 km s⁻¹. We compared the properties of the 78 v2MCGs with median velocities greater than 3000 km s⁻¹ with the properties of other CG samples, as well as those (mvCGs) extracted from the semi-analytical model (SAM) of Guo et al. run on the high-resolution Millennium-II simulation. This mvCG sample is similar (i.e. with 2/3 of physically dense CGs) to those we had previously extracted on three other SAMs run on the Millennium simulation with 125 times worse spatial and mass resolutions. The space density of v2MCGs within 6000 km s⁻¹ is 8.0 × 10⁻⁵ h³ Mpc⁻³, i.e. four times that of the Hickson sample [Hickson Compact Group (HCG)] up to the same distance and with the same criteria used in this work, but still 40 per cent less than that of mvCGs. The v2MCG constitutes the first group catalogue to show a statistically large first-second ranked galaxy magnitude gap according to Tremaine-Richstone statistics, as expected if the first-ranked group members tend to be the products of galaxy mergers, and as confirmed in the mvCGs.
The v2MCG is also the first observed sample to show that first-ranked galaxies tend to be centrally located, again consistent with the predictions obtained from mvCGs. We found no significant correlation of group apparent elongation and velocity dispersion in the quartets among the v2MCGs, and the velocity dispersions of apparently round quartets are not significantly larger than those of chain-like ones, in contrast to what has been previously reported in HCGs. By virtue of its automatic selection with the popular Hickson criteria, its size, its selection on stellar mass, and its statistical signs of mergers and centrally located brightest galaxies, the v2MCG catalogue appears to be the laboratory of choice to study physically dense groups of four or more galaxies of comparable luminosity.
Abstract:
The purpose of this paper is to develop a Bayesian analysis for the right-censored survival data when immune or cured individuals may be present in the population from which the data is taken. In our approach the number of competing causes of the event of interest follows the Conway-Maxwell-Poisson distribution which generalizes the Poisson distribution. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the proposed model. Also, some discussions on the model selection and an illustration with a real data set are considered.
Abstract:
The starting point of this article is the question "How to retrieve fingerprints of rhythm in written texts?" We address this problem in the case of Brazilian and European Portuguese. These two dialects of Modern Portuguese share the same lexicon and most of the sentences they produce are superficially identical. Yet they are conjectured, on linguistic grounds, to implement different rhythms. We show that this linguistic question can be formulated as a problem of model selection in the class of variable length Markov chains. To carry out this approach, we compare texts from European and Brazilian Portuguese. These texts are first encoded according to some basic rhythmic features of the sentences which can be automatically retrieved. This is an entirely new approach from the linguistic point of view. Our statistical contribution is the introduction of the smallest maximizer criterion, which is a constant-free procedure for model selection. As a by-product, this provides a solution to the problem of optimal choice of the penalty constant when using the BIC to select a variable length Markov chain. Besides proving the consistency of the smallest maximizer criterion when the sample size diverges, we also carry out a simulation study comparing our approach with both the standard BIC selection and the Peres-Shields order estimation. Applied to the linguistic sample constituted for our case study, the smallest maximizer criterion assigns different context-tree models to the two dialects of Portuguese. The features of the selected models are compatible with current conjectures discussed in the linguistic literature.
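To make the role of the BIC penalty constant concrete, here is a minimal sketch of BIC-based order selection. It is a deliberately simplified stand-in: it selects the order of a fixed-order Markov chain rather than a variable length chain (context tree), and it does not implement the smallest maximizer criterion. The function name `bic_markov_order` and the parameter `penalty_constant` are assumptions made for this illustration.

```python
import math
from collections import Counter


def bic_markov_order(seq, max_order, penalty_constant=0.5):
    """Pick the order k in 0..max_order that maximizes
    log-likelihood - penalty_constant * (#free parameters) * log(n)."""
    alphabet_size = len(set(seq))
    n_effective = len(seq) - max_order  # same conditioning for every order
    scores = {}
    for k in range(max_order + 1):
        context_counts = Counter()
        transition_counts = Counter()
        for i in range(max_order, len(seq)):
            context = tuple(seq[i - k:i])  # empty tuple when k == 0
            context_counts[context] += 1
            transition_counts[(context, seq[i])] += 1
        # Maximized log-likelihood uses empirical transition frequencies.
        log_likelihood = sum(
            count * math.log(count / context_counts[context])
            for (context, _), count in transition_counts.items()
        )
        n_params = (alphabet_size - 1) * alphabet_size ** k
        scores[k] = log_likelihood - penalty_constant * n_params * math.log(n_effective)
    best_order = max(scores, key=scores.get)
    return best_order, scores


# A strictly alternating sequence is fully described by an order-1 chain.
best_order, scores = bic_markov_order("ab" * 200, max_order=3)
print(best_order)
```

Changing `penalty_constant` shifts the balance between fit and model size, which is the tuning problem the abstract says the smallest maximizer criterion avoids.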
Abstract:
Estimates of phenotypic, genetic and residual variances for reproductive traits in 5903 Nellore bulls were obtained. The experimental model used was multiple trait derivative-free restricted maximum likelihood. The values obtained for heritability were 0.24 ± 0.05 for scrotal circumference at 450 days of age and 0.37 ± 0.05 at 21 months for age at the time of the breeding soundness evaluation; 0.24 ± 0.05 and 0.26 ± 0.05 for left and right testicle length; 0.29 ± 0.05 and 0.31 ± 0.05 for left and right testicle width; 0.12 ± 0.04 for testicle format; 0.33 ± 0.06 for testicle volume; 0.11 ± 0.03 for gross motility; 0.08 ± 0.03 for individual motility and 0.05 ± 0.02 for spermatic vigor; 0.20 ± 0.04, 0.03 ± 0.02 and 0.19 ± 0.04 for larger defects, smaller defects and total defects, respectively. The heritability values for testicular biometric characteristics were moderate to high, while the seminal characteristics presented low values. Genetic correlations between scrotal circumference and all the reproductive traits were favorable, suggesting scrotal circumference as a feature of choice in the selection of bulls.
Abstract:
The objective of this study was to evaluate the genetic relationship between postweaning weight gain (PWG), heifer pregnancy (HP), scrotal circumference (SC) at 18 months of age, stayability at 6 years of age (STAY) and finishing visual score at 18 months of age (PREC), and to determine the potential of these traits as selection criteria for the genetic improvement of growth and reproduction in Nellore cattle. The HP was defined as the observation that a heifer conceived and remained pregnant, which was assessed by rectal palpation at 60 days. The STAY was defined as whether or not a cow calved every year up to the age of 6 years, given that she was provided the opportunity to breed. The Bayesian linear-threshold analysis via the Gibbs sampler was used to estimate the variance and covariance components applying a multitrait model. Posterior mean estimates of direct heritability were 0.15 +/- 0.00, 0.42 +/- 0.02, 0.49 +/- 0.01, 0.11 +/- 0.01 and 0.19 +/- 0.00 for PWG, HP, SC, STAY and PREC, respectively. The genetic correlations between traits ranged from 0.17 to 0.62. The traits studied generally have potential for use as selection criteria in genetic breeding programs. The genetic correlations between all traits show that selection for one of these traits does not imply the loss of the others.
Abstract:
In this article, we propose a new Bayesian flexible cure rate survival model, which generalises the stochastic model of Klebanov et al. [Klebanov LB, Rachev ST and Yakovlev AY. A stochastic-model of radiation carcinogenesis - latent time distributions and their properties. Math Biosci 1993; 113: 51-75], and has much in common with the destructive model formulated by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)]. In our approach, the accumulated number of lesions or altered cells follows a compound weighted Poisson distribution. This model is more flexible than the promotion time cure model in terms of dispersion. Moreover, it possesses an interesting and realistic interpretation of the biological mechanism of the occurrence of the event of interest as it includes a destructive process of tumour cells after an initial treatment or the capacity of an individual exposed to irradiation to repair altered cells that results in cancer induction. In other words, what is recorded is only the damaged portion of the original number of altered cells not eliminated by the treatment or repaired by the repair system of an individual. Markov Chain Monte Carlo (MCMC) methods are then used to develop Bayesian inference for the proposed model. Also, some discussions on the model selection and an illustration with a cutaneous melanoma data set analysed by Rodrigues et al. [Rodrigues J, de Castro M, Balakrishnan N and Cancho VG. Destructive weighted Poisson cure rate models. Technical Report, Universidade Federal de Sao Carlos, Sao Carlos-SP. Brazil, 2009 (accepted in Lifetime Data Analysis)] are presented.
Abstract:
The allometric growth of two groups of Nassarius vibex on beds of the bivalve Mytella charruana on the northern coast of the State of Sao Paulo, was evaluated between September 2006 and February 2007 in the bed on Camaroeiro Beach, and from March 2007 to June 2007 at Cidade Beach. The shells from Camaroeiro were longer and wider and had a smaller shell aperture than those from Cidade; a principal components analysis also confirmed different morphometric patterns between the areas. The allometric growth of the two groups showed great variation in the development of individuals. The increase of shell width and height in relation to shell length did not differ between the two areas. Shell aperture showed a contrasting growth pattern, with individuals from Camaroeiro having smaller apertures. The methodology based on Kullback-Leibler information theory and the multi-model inference showed, for N. vibex, that the classic linear allometric growth was not the most suitable explanation for the observed morphometric relationships. The patterns of relative growth observed in the two groups of N. vibex may be a consequence of different growth and variation rates, which modifies the development of the individuals. Other factors such as food resource availability and environmental parameters, which might also differ between the two areas, should also be considered.
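Multi-model inference grounded in Kullback-Leibler information is commonly carried out with Akaike weights, which turn AIC differences into relative support for each candidate model. The abstract does not say which criterion or which candidate models were used, so the sketch below is an assumption throughout: the model names and AIC values are made up purely to show the arithmetic.

```python
import math


def akaike_weights(aic_values):
    """Convert a mapping {model name: AIC} into Akaike weights that sum to 1."""
    best_aic = min(aic_values.values())
    # Relative likelihood of each model given its AIC difference from the best.
    relative = {
        name: math.exp(-0.5 * (aic - best_aic))
        for name, aic in aic_values.items()
    }
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


# Hypothetical AIC values for three unnamed candidate growth models.
example_aic = {"model_a": 210.4, "model_b": 204.1, "model_c": 205.0}
weights = akaike_weights(example_aic)
for name, weight in sorted(weights.items(), key=lambda item: -item[1]):
    print(f"{name}: {weight:.3f}")
```

A model whose weight is far from 1 is not clearly the best approximation, which is the kind of evidence the abstract reports against the classic linear allometric model.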
Discriminating Different Classes of Biological Networks by Analyzing the Graphs Spectra Distribution
Abstract:
The brain's structural and functional systems, protein-protein interaction, and gene networks are examples of biological systems that share some features of complex networks, such as highly connected nodes, modularity, and small-world topology. Recent studies indicate that some pathologies present topological network alterations relative to norms seen in the general population. Therefore, methods to discriminate the processes that generate the different classes of networks (e.g., normal and disease) might be crucial for the diagnosis, prognosis, and treatment of the disease. It is known that several topological properties of a network (graph) can be described by the distribution of the spectrum of its adjacency matrix. Moreover, large networks generated by the same random process have the same spectrum distribution, allowing us to use it as a "fingerprint". Based on this relationship, we introduce and propose the entropy of a graph spectrum to measure the "uncertainty" of a random graph and the Kullback-Leibler and Jensen-Shannon divergences between graph spectra to compare networks. We also introduce general methods for model selection and network model parameter estimation, as well as a statistical procedure to test the nullity of divergence between two classes of complex networks. Finally, we demonstrate the usefulness of the proposed methods by applying them to (1) protein-protein interaction networks of different species and (2) networks derived from children diagnosed with Attention Deficit Hyperactivity Disorder (ADHD) and typically developing children. We conclude that scale-free networks best describe all the protein-protein interactions. Also, we show that our proposed measures succeeded in the identification of topological changes in the network while other commonly used measures (number of edges, clustering coefficient, average path length) failed.
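The idea of comparing networks through the distribution of their adjacency-matrix eigenvalues can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' procedure: the spectral density is approximated with a plain histogram on a shared grid (the abstract does not say how the density is estimated), the test graphs are simple random graphs, and all function names are hypothetical. It relies on NumPy.

```python
import numpy as np


def random_graph_adjacency(n_nodes, edge_prob, rng):
    """Symmetric 0/1 adjacency matrix of an undirected random graph."""
    upper = np.triu(rng.random((n_nodes, n_nodes)) < edge_prob, k=1)
    return (upper | upper.T).astype(float)


def kl_divergence(p, q):
    """Kullback-Leibler divergence between two discrete distributions."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def js_divergence(p, q):
    """Jensen-Shannon divergence; the mixture m is positive wherever p or q is."""
    m = 0.5 * (p + q)
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def spectral_js_divergence(adj_a, adj_b, bins=20):
    """JS divergence between histogram estimates of two graph spectra."""
    eig_a = np.linalg.eigvalsh(adj_a)
    eig_b = np.linalg.eigvalsh(adj_b)
    value_range = (min(eig_a.min(), eig_b.min()), max(eig_a.max(), eig_b.max()))
    hist_a, _ = np.histogram(eig_a, bins=bins, range=value_range)
    hist_b, _ = np.histogram(eig_b, bins=bins, range=value_range)
    return js_divergence(hist_a / hist_a.sum(), hist_b / hist_b.sum())


rng = np.random.default_rng(0)
sparse_graph = random_graph_adjacency(60, 0.1, rng)
dense_graph = random_graph_adjacency(60, 0.5, rng)
same = spectral_js_divergence(sparse_graph, sparse_graph)
different = spectral_js_divergence(sparse_graph, dense_graph)
print(same, different)
```

A graph compared with itself gives zero divergence, while graphs generated with different edge probabilities give a positive value bounded by log 2.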
Abstract:
The present work aimed to estimate heritability and genetic correlations of reproductive features of Nellore bulls, offspring of mothers classified as superprecocious (M1), precocious (M2) and normal (M3). Twenty one thousand hundred and eighty-six animals with an average age of 21.29 months were used, evaluated through the breeding soundness evaluation from 1999 to 2008. The breeding soundness features included physical semen evaluation (progressive sperm motility and sperm vigour), semen morphology (major, minor and total sperm defects), scrotal circumference (SC), testicular volume (TV) and SC at 18 months of age (SC18). The components of variance, heritability and genetic correlations for and between the features were estimated simultaneously by restricted maximum likelihood, using the VCE software system, version 6. The heritability estimates were high for SC18, SC and TV (0.43, 0.63 and 0.54; 0.45, 0.45 and 0.44; 0.42, 0.45 and 0.41, respectively, for the categories of mothers M1, M2 and M3) and low for physical and morphological semen aspects. The genetic correlations between SC18 and SC were high, as were those between these variables and TV. High and positive genetic correlations were recorded between SC18, SC and TV and the physical aspects of the semen, although no favourable association was verified with the morphological aspects, for the three categories of mothers. It can be concluded that the mothers' sexual precocity did not affect the heritability of their offspring's reproductive features.
Abstract:
The sera of a retrospective cohort (n = 41) composed of children with well-characterized cow's milk allergy, collected from multiple visits, were analyzed using a protein microarray system measuring four classes of immunoglobulins. The frequency of the visits and the age and gender distribution reflected the real situation faced by clinicians at a pediatric reference center for food allergy in São Paulo, Brazil. The profiling array results have shown that total IgG and IgA share similar specificity whilst IgM and in particular IgE are distantly related. The correlation of specificity of IgE and IgA is variable amongst the patients and this relationship cannot be used to predict atopy or the onset of tolerance to milk. The array profiling technique has corroborated the clinical selection criteria for this cohort, although it clearly suggested that 4 out of the 41 patients might have allergies of other than milk origin. There was also a good correlation between the array data and ImmunoCAP results, for casein in particular. By using qualitative and quantitative multivariate analysis routines it was possible to produce validated statistical models to predict with reasonable accuracy the onset of tolerance to milk proteins. If expanded to larger study groups, array profiling in combination with the multivariate techniques shows potential to improve the prognosis of milk-allergic patients. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Brood desertion is a life history strategy that allows parents to minimize costs related to parental care and increase their future fecundity. The harvestman Neosadocus maximus is an interesting model organism to study costs and benefits of temporary brood desertion because females abandon their clutches periodically and keep adding eggs to their clutches for some weeks. In this study, we tested if temporary brood desertion (a) imposes a cost to caring females by increasing the risk of egg predation and (b) offers a benefit to caring females by increasing fecundity as a result of increased foraging opportunities. With intensive field observations followed by a model selection approach, we showed that the proportion of consumed eggs was very low during the day and it was not influenced by the frequency of brood desertion. The proportion of consumed eggs was higher at night and it was negatively related to the frequency of brood desertion. However, frequent brood desertion did not result in higher fecundity, measured both as the number of eggs added to the current clutch and the probability of laying a second clutch over the course of the reproductive season. Considering that harvestmen are sensitive to dehydration, brood desertion during the day may attenuate the physiological stress of remaining exposed on the vegetation. Moreover, since brood desertion is higher during the day, when egg predation pressure is lower, caring females could be adjusting their maternal effort to the temporal variation in predation risk, which is regarded as the main cost of brood desertion in ectotherms.
Abstract:
We examined the effects of soil mesofauna and the litter decomposition environment (above and belowground) on leaf decomposition rates in three forest types in southeastern Brazil. To estimate decomposition experimentally, we used litterbags with a standard substrate in a full-factorial experimental design. We used model selection to compare three decomposition models and also to infer the importance of forest type, decomposition environment, mesofauna, and their interactions on the decomposition process. Rather than the frequently used simple and double-exponential models, the best model to describe our dataset was the exponential deceleration model, which assumed a single organic compartment with an exponential decrease of the decomposition rate. Decomposition was higher in the wet than in the seasonal forest, and the differences between forest types were stronger aboveground. Regarding litter decomposition environment, decomposition was predominantly higher below than aboveground, but the magnitude of this effect was higher in the seasonal than in wet forests. Mesofauna exclusion treatments had slower decomposition, except aboveground in the Semi-deciduous Forest, where the mesofauna presence did not affect decomposition. Furthermore, the effect of mesofauna was stronger in the wet forests and belowground. Overall, our results suggest that, at a regional scale, both decomposer activity and the positive effect of soil mesofauna on decomposition are constrained by abiotic factors, such as moisture conditions.
Abstract:
This paper proposes a general class of regression models for continuous proportions when the data contain zeros or ones. The proposed class of models assumes that the response variable has a mixed continuous-discrete distribution with probability mass at zero or one. The beta distribution is used to describe the continuous component of the model, since its density has a wide range of different shapes depending on the values of the two parameters that index the distribution. We use a suitable parameterization of the beta law in terms of its mean and a precision parameter. The parameters of the mixture distribution are modeled as functions of regression parameters. We provide inference, diagnostic, and model selection tools for this class of models. A practical application that employs real data is presented. (C) 2011 Elsevier B.V. All rights reserved.
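The mixed continuous-discrete response described in this abstract can be written down as a log-density. The sketch below is an illustrative assumption rather than the paper's own notation: `alpha` is the probability mass at the boundary point, `mu` the mean and `phi` the precision of the beta component, so that the usual shape parameters are `mu * phi` and `(1 - mu) * phi`. The regression structure on the parameters is omitted.

```python
import math


def beta_logpdf_mean_precision(y, mu, phi):
    """Log-density of a beta law with mean mu and precision phi, for 0 < y < 1."""
    a = mu * phi
    b = (1.0 - mu) * phi
    return (
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + (a - 1.0) * math.log(y)
        + (b - 1.0) * math.log(1.0 - y)
    )


def inflated_beta_logdensity(y, alpha, mu, phi, point=0.0):
    """Log-density of a mixture with mass alpha at `point` (0 or 1)
    and a beta(mu, phi) law on the open interval (0, 1)."""
    if y == point:
        return math.log(alpha)
    if not 0.0 < y < 1.0:
        return float("-inf")  # outside the support of this mixture
    return math.log(1.0 - alpha) + beta_logpdf_mean_precision(y, mu, phi)


# With mu = 0.5 and phi = 2 the beta component is uniform on (0, 1).
at_zero = inflated_beta_logdensity(0.0, alpha=0.2, mu=0.5, phi=2.0)
interior = inflated_beta_logdensity(0.3, alpha=0.2, mu=0.5, phi=2.0)
print(at_zero, interior)
```

Summing this log-density over observations gives a log-likelihood that could feed likelihood-based model selection criteria of the kind the abstract mentions.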