241 results for model selection
in the Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
Phylogenetic analyses of chloroplast DNA sequences, morphology, and combined data have provided consistent support for many of the major branches within the angiosperm clade Dipsacales. Here we use sequences from three mitochondrial loci to test the existing broad-scale phylogeny and to attempt to resolve several relationships that have remained uncertain. Parsimony, maximum likelihood, and Bayesian analyses of a combined mitochondrial data set recover trees broadly consistent with previous studies, although resolution and support are lower than in the largest chloroplast analyses. Combining chloroplast and mitochondrial data results in a generally well-resolved and very strongly supported topology, but the previously recognized problem areas remain. To investigate why these relationships have been difficult to resolve, we conducted a series of experiments using different data partitions and heterogeneous substitution models. More complex modeling schemes are usually favored regardless of the partitions recognized, but model choice had little effect on topology or support values. In contrast, there are consistent but weakly supported differences in the topologies recovered from coding and non-coding matrices. These conflicts directly correspond to relationships that were poorly resolved in analyses of the full combined chloroplast-mitochondrial data set. We suggest that incongruent signal has contributed to our inability to confidently resolve these problem areas. (C) 2007 Elsevier Inc. All rights reserved.
Abstract:
Context tree models were introduced by Rissanen in [25] as a parsimonious generalization of Markov models. Since then, they have been widely used in applied probability and statistics. The present paper investigates non-asymptotic properties of two popular procedures of context tree estimation: Rissanen's algorithm Context and penalized maximum likelihood. After first showing how they are related, we prove finite horizon bounds for the probability of over- and under-estimation. Concerning over-estimation, no boundedness or loss-of-memory conditions are required: the proof relies on new deviation inequalities for empirical probabilities of independent interest. The under-estimation properties rely on classical hypotheses for processes of infinite memory. These results improve on and generalize the bounds obtained in Duarte et al. (2006) [12], Galves et al. (2008) [18], Galves and Leonardi (2008) [17], Leonardi (2010) [22], refining asymptotic results of Bühlmann and Wyner (1999) [4] and Csiszár and Talata (2006) [9]. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
Clustering is a difficult task: there is no single cluster definition and the data can have more than one underlying structure. Pareto-based multi-objective genetic algorithms (e.g., MOCK, Multi-Objective Clustering with automatic K-determination, and MOCLE, Multi-Objective Clustering Ensemble) were proposed to tackle these problems. However, the output of such algorithms can often contain a high number of partitions, making it difficult for an expert to analyze all of them manually. To deal with this problem, we present two selection strategies, both based on the corrected Rand index, to choose a subset of solutions. To test them, they are applied to the set of solutions produced by MOCK and MOCLE in the context of several datasets. The study was also extended to select a reduced set of partitions from the initial population of MOCLE. These analyses show that both versions of the proposed selection strategy are very effective. They can significantly reduce the number of solutions and, at the same time, keep the quality and the diversity of the partitions in the original set of solutions. (C) 2010 Elsevier B.V. All rights reserved.
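The abstract above does not describe the two selection strategies themselves, only that they rest on the corrected Rand index. As a minimal sketch of that underlying measure (not of the authors' strategies), the index between two partitions of the same items can be computed as follows. The function name `corrected_rand` and the convention of returning 1.0 for degenerate inputs are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def corrected_rand(labels_a, labels_b):
    """Corrected (adjusted) Rand index between two partitions of the same items.

    Hypothetical helper for illustration; labels are arbitrary hashable
    cluster identifiers, one per item, in the same item order.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    n = len(labels_a)
    if n < 2:
        return 1.0  # assumption: no pairs to disagree on

    total_pairs = comb(n, 2)
    # Pairs placed together in both partitions (from the contingency table).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each partition separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # assumption: both partitions are trivial and identical
    return (together_both - expected) / (maximum - expected)


# Same grouping under different label names: perfect agreement.
print(corrected_rand([0, 0, 1, 1], ["b", "b", "a", "a"]))
# Groupings that cut across each other: agreement below chance level.
print(corrected_rand([0, 0, 1, 1], [0, 1, 0, 1]))
```

Values near 1 indicate nearly identical partitions and values near 0 indicate chance-level agreement, which is what makes the index usable for judging how redundant two candidate solutions are.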
Abstract:
In this paper we deal with a Bayesian analysis for right-censored survival data suitable for populations with a cure rate. We consider a cure rate model based on the negative binomial distribution, encompassing as a special case the promotion time cure model. Bayesian analysis is based on Markov chain Monte Carlo (MCMC) methods. We also present some discussion on model selection and an illustration with a real dataset.
Abstract:
Background: Neotropical freshwater stingrays (Batoidea: Potamotrygonidae) host a diverse parasite fauna, including cestodes. Both cestodes and their stingray hosts are marine-derived, but the taxonomy of this host/parasite system is poorly understood. Methodology: Morphological and molecular (Cytochrome oxidase I) data were used to investigate diversity in freshwater lineages of the cestode genus Rhinebothrium Linton, 1890. Results were based on a phylogenetic hypothesis for 74 COI sequences and morphological analysis of over 400 specimens. Cestodes studied were obtained from 888 individual potamotrygonids, representing 14 recognized and 18 potentially undescribed species from most river systems of South America. Results: Morphological species boundaries were based mainly on microthrix characters observed with scanning electron microscopy, and were supported by COI data. Four species were recognized, including two redescribed (Rhinebothrium copianullum and R. paratrygoni), and two newly described (R. brooksi n. sp. and R. fulbrighti n. sp.). Rhinebothrium paranaensis Menoret & Ivanov, 2009 is considered a junior synonym of R. paratrygoni because the morphological features of the two species overlap substantially. The diagnosis of Rhinebothrium Linton, 1890 is emended to accommodate the presence of marginal longitudinal septa observed in R. copianullum and R. brooksi n. sp. Patterns of host specificity and distribution ranged from use of few host species in few river basins, to use of as many as eight host species in multiple river basins. Significance: The level of intra-specific morphological variation observed in features such as total length and number of proglottids is unparalleled among other elasmobranch cestodes. This is attributed to the large representation of host and biogeographical samples. It is unclear whether the intra-specific morphological variation observed is unique to this freshwater system. 
Nonetheless, caution is urged when using morphological discontinuities to delimit elasmobranch cestode species because the amount of variation encountered is highly dependent on sample size and/or biogeographical representation.
Abstract:
We consider the problem of interaction neighborhood estimation from the partial observation of a finite number of realizations of a random field. We introduce a model selection rule to choose estimators of conditional probabilities among natural candidates. Our main result is an oracle inequality satisfied by the resulting estimator. We then use this selection rule in a two-step procedure to evaluate the interacting neighborhoods. The selection rule selects a small prior set of possible interacting points, and a cutting step removes the irrelevant points from this prior set. We also prove that the Ising models satisfy the assumptions of the main theorems, without restrictions on the temperature, on the structure of the interacting graph, or on the range of the interactions. They therefore provide a large class of applications for our results. We give a computationally efficient procedure for these models. Finally, we show the practical efficiency of our approach in a simulation study.
Abstract:
This paper applies Hierarchical Bayesian Models to price farm-level yield insurance contracts. This methodology considers the temporal effect, the spatial dependence and spatio-temporal models. One of the major advantages of this framework is that an estimate of the premium rate is obtained directly from the posterior distribution. These methods were applied to a farm-level data set of soybean in the State of Paraná (Brazil), for the period between 1994 and 2003. The model selection was based on a posterior predictive criterion. This study considerably improves the estimation of the fair premium rates, given the small number of observations.
Abstract:
We formulated a general unrestricted model of the Brazilian Emerging Markets Bond Index Plus (EMBI+) spreads, a proxy for the country's default risk. Employing algorithms that perform automated model selection, we found that macroeconomic fundamentals, such as the ratio of current account deficit to gross domestic product, the ratio of public deficit to gross domestic product, and imports over foreign exchange reserves, can explain a great part of the variation in EMBI+ spreads. There is also robust evidence of systematic contagion from Argentina and Mexico, and evidence that the variance of the spread affects its mean.
Abstract:
Fogo selvagem (FS) is mediated by pathogenic, predominantly IgG4, anti-desmoglein 1 (Dsg1) autoantibodies and is endemic in Limão Verde, Brazil. IgG and IgG subclass autoantibodies were tested in a sample of 214 FS patients and 261 healthy controls by Dsg1 ELISA. For model selection, the sample was randomly divided into training (50%), validation (25%), and test (25%) sets. Using the training and validation sets, IgG4 was chosen as the best predictor of FS, with index values above 6.43 classified as FS. Using the test set, IgG4 had a sensitivity of 92% (95% confidence interval (95% CI): 82-95%), a specificity of 97% (95% CI: 89-100%), and an area under the curve of 0.97 (95% CI: 0.94-1.00). The IgG4 positive predictive value (PPV) in Limão Verde (3% FS prevalence) was 49%. The sensitivity, specificity, and PPV of IgG anti-Dsg1 were 87, 91, and 23%, respectively. The IgG4-based classifier was validated by testing 11 FS patients before and after clinical disease and 60 Japanese pemphigus foliaceus patients. It classified 21 of 96 normal individuals from a Limão Verde cohort as having FS serology. On the basis of its PPV, half of the 21 individuals may currently have preclinical FS and could develop clinical disease in the future. Identifying individuals during preclinical FS will enhance our ability to identify the etiological agent(s) triggering FS.
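The positive predictive values quoted in this abstract follow from the reported sensitivity, specificity, and 3% prevalence through Bayes' rule. The short sketch below reproduces that arithmetic; the function name `positive_predictive_value` is an assumption for illustration, and the abstract does not state how the authors actually computed the PPV.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' rule: P(disease | positive test).

    Hypothetical helper; all three arguments are proportions in [0, 1].
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Figures taken from the abstract: 3% FS prevalence in Limão Verde.
ppv_igg4 = positive_predictive_value(sensitivity=0.92, specificity=0.97, prevalence=0.03)
ppv_igg = positive_predictive_value(sensitivity=0.87, specificity=0.91, prevalence=0.03)

print(f"IgG4 PPV: {ppv_igg4:.0%}")  # prints "IgG4 PPV: 49%"
print(f"IgG PPV: {ppv_igg:.0%}")    # prints "IgG PPV: 23%"
```

At a 3% prevalence even a 97% specific test yields roughly as many false positives as true positives, which is why about half of the 21 seropositive individuals are expected to have preclinical FS.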
Abstract:
Emotional lability and mood dysregulation characterize bipolar disorder (BD), yet no study has examined effective connectivity between parahippocampal gyrus and prefrontal cortical regions in ventromedial and dorsal/lateral neural systems subserving mood regulation in BD. Participants comprised 46 individuals (age range: 18-56 years): 21 with a DSM-IV diagnosis of BD, type I, currently remitted; and 25 age- and gender-matched healthy controls (HC). Participants performed an event-related functional magnetic resonance imaging paradigm, viewing mild and intense happy and neutral faces. We employed dynamic causal modeling (DCM) to identify significant alterations in effective connectivity between BD and HC. Bayes model selection was used to determine the best model. The right parahippocampal gyrus (PHG) and right subgenual cingulate gyrus (sgCG) were included as representative regions of the ventromedial neural system. The right dorsolateral prefrontal cortex (DLPFC) region was included as representative of the dorsal/lateral neural system. Right PHG-sgCG effective connectivity was significantly greater in BD than HC, reflecting more rapid, forward PHG-sgCG signaling in BD than HC. There was no between-group difference in sgCG-DLPFC effective connectivity. In BD, abnormally increased right PHG-sgCG effective connectivity and reduced right PHG activity to emotional stimuli suggest a dysfunctional ventromedial neural system implicated in early stimulus appraisal, encoding and automatic regulation of emotion that may represent a pathophysiological functional neural mechanism for mood dysregulation in BD. (C) 2009 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Objective: Several limitations of published bioelectrical impedance analysis (BIA) equations have been reported. The aims were to develop a new prediction equation in a multiethnic, elderly population and to cross-validate it, along with some published BIA equations, for estimating fat-free mass using deuterium oxide dilution as the reference method. Design and setting: Cross-sectional study of elderly subjects from five developing countries. Methods: Total body water (TBW) measured by deuterium dilution was used to determine fat-free mass (FFM) in 383 subjects. Anthropometric and BIA variables were also measured. Only 377 subjects were included in the analysis; they were randomly divided into development and cross-validation groups after stratification by gender. Stepwise model selection was used to generate the model and Bland-Altman analysis was used to test agreement. Results: FFM = 2.95 - 3.89 (Gender) + 0.514 (Ht²/Z) + 0.090 (Waist) + 0.156 (Body weight). The model fit parameters R², total F-ratio, and SEE were 0.88, 314.3, and 3.3, respectively. None of the published BIA equations met the criteria for agreement. The new BIA equation underestimated FFM by just 0.3 kg in the cross-validation sample. FFM estimated by TBW and by the new BIA equation did not differ significantly; 95% of the differences were between the limits of agreement of -6.3 to 6.9 kg of FFM. There was no significant association between the differences and their averages (r = 0.008 and p = 0.2). Conclusions: This new BIA equation offers a valid option compared with some of the current published BIA equations to estimate FFM in elderly subjects from five developing countries.
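The prediction equation reported in this abstract can be written as a small function, as sketched below. The abstract does not state the gender coding or the measurement units, so the caller must supply the gender code and the other predictors exactly as defined in the full paper; the function name, parameter names, and the example inputs are hypothetical and chosen only to show the arithmetic.

```python
def predict_ffm(gender_code, ht2_over_z, waist, body_weight):
    """Fat-free mass from the BIA equation quoted in the abstract.

    Assumptions (not stated in the abstract): gender_code is the numeric
    gender coding used by the authors, ht2_over_z is height squared divided
    by impedance, and all inputs use the units of the original study.
    """
    return (
        2.95
        - 3.89 * gender_code
        + 0.514 * ht2_over_z
        + 0.090 * waist
        + 0.156 * body_weight
    )


# Hypothetical inputs, purely illustrative; not data from the study.
example_ffm = predict_ffm(gender_code=1, ht2_over_z=50.0, waist=90.0, body_weight=70.0)
print(f"Predicted FFM: {example_ffm:.2f}")
```

With coefficients of this size the impedance index Ht²/Z dominates the prediction, while the gender term acts as a fixed offset between the two groups.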
Abstract:
Cannabis sativa, the most widely used illicit drug, has profound effects on levels of anxiety in animals and humans. Although recent studies have helped provide a better understanding of the neurofunctional correlates of these effects, indicating the involvement of the amygdala and cingulate cortex, their reciprocal influence is still mostly unknown. In this study, dynamic causal modelling (DCM) and Bayesian model selection (BMS) were used to explore the effects of pure compounds of C. sativa [600 mg of cannabidiol (CBD) and 10 mg of Δ9-tetrahydrocannabinol (Δ9-THC)] on prefrontal-subcortical effective connectivity in 15 healthy subjects who underwent a double-blind, randomized, placebo-controlled fMRI paradigm while viewing faces that elicited different levels of anxiety. In the placebo condition, BMS identified a model with driving inputs entering via the anterior cingulate and forward intrinsic connectivity between the amygdala and the anterior cingulate as the best fit. CBD but not Δ9-THC disrupted forward connectivity between these regions during the neural response to fearful faces. This is the first study to show that the disruption of prefrontal-subcortical connectivity by CBD may represent neurophysiological correlates of its anxiolytic properties.
Abstract:
The phylogenetic placement of Kuhlmanniodendron Fiaschi & Groppo (Achariaceae) within Malpighiales was investigated with rbcL sequence data. This genus was recently created to accommodate Carpotroche apterocarpa Kuhlm., a poorly known species from the rainforests of Espirito Santo, Brazil. One rbcL sequence was obtained from Kuhlmanniodendron and analyzed with 73 additional sequences from Malpighiales, and 8 from two closely related orders, Oxalidales and Celastrales, all of which were available at GenBank. Phylogenetic analyses were carried out with maximum parsimony and Bayesian inference; bootstrap analyses were used in maximum parsimony to evaluate branch support. The results confirmed the placement of Kuhlmanniodendron together with Camptostylus, Lindackeria, Xylotheca, and Caloncoba in a strongly supported clade (posterior probability = 0.99) that corresponds with the tribe Lindackerieae of Achariaceae (Malpighiales). Kuhlmanniodendron also does not appear to be closely related to Oncoba (Salicaceae), an African genus with similar floral and fruit morphology that has been traditionally placed among cyanogenic Flacourtiaceae (now Achariaceae). A picrosodic paper test was performed on dried herbarium leaves, and the presence of cyanogenic glycosides, a class of compounds usually found in Achariaceae, was detected. Pollen morphology and wood anatomy of Kuhlmanniodendron were also investigated, but both pollen (3-colporate and microreticulate) and wood, with solitary to multiple vessels, scalariform perforation plates and other features, do not seem to be useful to distinguish this genus from other members of the Achariaceae and are rather common among the eudicotyledons as a whole. However, the perforated ray cells with scalariform plates present in Kuhlmanniodendron, an uncommon wood character, are similar to those found in Kiggelaria africana (Pangieae, Achariaceae); the occurrence of such cells has not been mapped across the angiosperms, and it is not clear how homoplastic this character could be.
Abstract:
It is known that large fragment sizes and high connectivity levels are key components for maintaining species in fragments; however, their relative effects are poorly understood, especially in tropical areas. To test these effects, we built models to explain understory bird occurrence in a fragmented Atlantic Rain Forest landscape with intermediate habitat cover (3%). Data from over 9000 mist-net hours from 17 fragments differing in size (2-175 ha) and connectivity (considering corridor linkages and distance to nearby fragments) were ranked under a model selection approach. A total of 1293 individuals of 62 species were recorded. Species richness, abundance and compositional variation were mainly affected by connectivity indices that consider the capacity of species to use corridors and/or to cross short distances of up to 30 m through the matrix. Bird functional groups were differently affected by area and connectivity: while terrestrial insectivores, omnivores and frugivores were affected by both area and connectivity, the other groups (understory insectivores, nectarivores, and others) were affected only by connectivity. In the studied landscape, well-connected fragments can sustain an elevated number of species and individuals. Connectivity gives individuals the opportunity to use multiple fragments, reducing the influence of fragment size. While preserving large fragments is a conservation target worldwide and should continue to be, our results indicate that connectivity between fragments can enhance the functionally connected area and is beneficial to all functional groups, and therefore should be a conservation priority. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
Time-lagged responses of biological variables to landscape modifications are widely recognized, but rarely considered in ecological studies. To test for the existence of time-lags in the response of trees, small mammals, birds and frogs to changes in fragment area and connectivity, we studied a fragmented and highly dynamic landscape in the Atlantic forest region. We also investigated the biological correlates associated with differential responses among taxonomic groups. Species richness and abundance for four taxonomic groups were measured in 21 secondary forest fragments during the same period (2000-2002), following a standardized protocol. Data analyses were based on power regressions and model selection procedures. The model inputs included present (2000) and past (1962, 1981) fragment areas and connectivity, as well as observed changes in these parameters. Although past landscape structure was particularly relevant for trees, all taxonomic groups (except small mammals) were affected by landscape dynamics, exhibiting a time-lagged response. Furthermore, fragment area was more important for species groups with lower dispersal capacity, while species with higher dispersal ability had stronger responses to connectivity measures. Although these secondary forest fragments still maintain a large fraction of their original biodiversity, the delay in biological response combined with high rates of deforestation and fast forest regeneration implies a reduction in the average age of the forest. This also indicates that future species losses are likely, especially of species that are more strictly forest dwellers. Conservation actions should be implemented to reduce species extinction, to maintain old-growth forests and to favour the regeneration process. Our results demonstrate that landscape history can strongly affect the present distribution pattern of species in fragmented landscapes, and should be considered in conservation planning. (C) 2009 Elsevier Ltd. All rights reserved.