17 results for likelihood-based inference
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
This paper considers likelihood-based inference for the family of power distributions. Widely applicable results are presented which can be used to conduct inference for all three parameters of the general location-scale extension of the family. More specific results are given for the special case of the power normal model. The analysis of a large data set, formed from density measurements for a certain type of pollen, illustrates the application of the family and the results for likelihood-based inference. Throughout, comparisons are made with analogous results for the direct parametrisation of the skew-normal distribution.
Abstract:
An extension of some standard likelihood based procedures to heteroscedastic nonlinear regression models under scale mixtures of skew-normal (SMSN) distributions is developed. This novel class of models provides a useful generalization of the heteroscedastic symmetrical nonlinear regression models (Cysneiros et al., 2010), since the random term distributions cover both symmetric as well as asymmetric and heavy-tailed distributions such as skew-t, skew-slash, skew-contaminated normal, among others. A simple EM-type algorithm for iteratively computing maximum likelihood estimates of the parameters is presented and the observed information matrix is derived analytically. In order to examine the performance of the proposed methods, some simulation studies are presented to show the robust aspect of this flexible class against outlying and influential observations and that the maximum likelihood estimates based on the EM-type algorithm do provide good asymptotic properties. Furthermore, local influence measures and the one-step approximations of the estimates in the case-deletion model are obtained. Finally, an illustration of the methodology is given considering a data set previously analyzed under the homoscedastic skew-t nonlinear regression model. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
We extend the random permutation model to obtain the best linear unbiased estimator of a finite population mean accounting for auxiliary variables under simple random sampling without replacement (SRS) or stratified SRS. The proposed method provides a systematic design-based justification for well-known results involving common estimators derived under minimal assumptions that do not require specification of a functional relationship between the response and the auxiliary variables.
Abstract:
Background: The temporal and geographical diversification of Neotropical insects remains poorly understood because of the complex changes in geological and climatic conditions that occurred during the Cenozoic. To better understand extant patterns in Neotropical biodiversity, we investigated the evolutionary history of three Neotropical swallowtail Troidini genera (Papilionidae). First, DNA-based species delimitation analyses were conducted to assess species boundaries within Neotropical Troidini using an enlarged fragment of the standard barcode gene. Molecularly delineated species were then used to infer a time-calibrated species-level phylogeny based on a three-gene dataset and Bayesian dating analyses. The corresponding chronogram was used to explore their temporal and geographical diversification through distinct likelihood-based methods. Results: The phylogeny for Neotropical Troidini was well resolved and strongly supported. Molecular dating and biogeographic analyses indicate that the extant lineages of Neotropical Troidini have a late Eocene (33-42 Ma) origin in North America. Two independent lineages (Battus and Euryades + Parides) reached South America via the GAARlandia temporary connection, and later became extinct in North America. They only began substantive diversification during the early Miocene in Amazonia. Macroevolutionary analysis supports the "museum model" of diversification, rather than Pleistocene refugia, as the best explanation for the diversification of these lineages. Conclusions: This study demonstrates that: (i) current Neotropical biodiversity may have originated ex situ; (ii) the GAARlandia bridge was important in facilitating invasions of South America; (iii) colonization of Amazonia initiated the crown diversification of these swallowtails; and (iv) Amazonia is not only a species-rich region but also acted as a sanctuary for the dynamics of this diversity. 
In particular, Amazonia probably allowed the persistence of old lineages and contributed to the steady accumulation of diversity over time with constant net diversification rates, a result that contrasts with previous studies on other South American butterflies.
Abstract:
In this paper we introduce an extension of the Lindley distribution which offers a more flexible model for lifetime data. Several statistical properties of the distribution are explored, such as the density, (reversed) failure rate, (reversed) mean residual lifetime, moments, order statistics, Bonferroni and Lorenz curves. Estimation using the maximum likelihood and inference of a random sample from the distribution are investigated. A real data application illustrates the performance of the distribution. (C) 2011 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.
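As a point of reference for the maximum likelihood step, here is a minimal sketch for the standard one-parameter Lindley distribution only. It does not implement the extension introduced in the paper, whose form is not given in the abstract. It assumes the usual density f(x) = θ²/(1+θ) · (1+x) · exp(−θx) for x > 0; the function names are hypothetical.

```python
import math

def lindley_loglik(theta, sample):
    """Log-likelihood of the standard one-parameter Lindley distribution
    (assumed base model, not the paper's extension)."""
    n = len(sample)
    return (2 * n * math.log(theta)
            - n * math.log(1 + theta)
            + sum(math.log(1 + x) for x in sample)
            - theta * sum(sample))

def lindley_mle(sample):
    """Closed-form MLE of theta: the positive root of
    xbar*theta^2 + (xbar - 1)*theta - 2 = 0, where xbar is the sample mean."""
    xbar = sum(sample) / len(sample)
    return (-(xbar - 1.0) + math.sqrt((xbar - 1.0) ** 2 + 8.0 * xbar)) / (2.0 * xbar)
```

The closed form follows from setting the derivative of the log-likelihood to zero; an extended model with additional parameters would generally need numerical optimisation instead.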
Abstract:
In this paper we use Markov chain Monte Carlo (MCMC) methods in order to estimate and compare GARCH models from a Bayesian perspective. We allow for possibly heavy tailed and asymmetric distributions in the error term. We use a general method proposed in the literature to introduce skewness into a continuous unimodal and symmetric distribution. For each model we compute an approximation to the marginal likelihood, based on the MCMC output. From these approximations we compute Bayes factors and posterior model probabilities. (C) 2012 IMACS. Published by Elsevier B.V. All rights reserved.
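The last step described above, going from approximate marginal likelihoods to Bayes factors and posterior model probabilities, can be sketched as follows. This is an illustration only: it assumes the log marginal likelihoods have already been approximated from the MCMC output (the abstract does not say which approximation is used), and the function names and equal default prior model probabilities are assumptions.

```python
import math

def bayes_factor(log_m1, log_m2):
    """Bayes factor of model 1 against model 2 from log marginal likelihoods."""
    return math.exp(log_m1 - log_m2)

def posterior_model_probs(log_marginals, priors=None):
    """Posterior model probabilities from log marginal likelihoods.

    priors: prior model probabilities; equal priors are assumed if omitted.
    """
    k = len(log_marginals)
    if priors is None:
        priors = [1.0 / k] * k
    # Work in log space and subtract the maximum for numerical stability.
    log_w = [lm + math.log(p) for lm, p in zip(log_marginals, priors)]
    m = max(log_w)
    w = [math.exp(v - m) for v in log_w]
    total = sum(w)
    return [v / total for v in w]
```

Subtracting the maximum before exponentiating matters in practice, because log marginal likelihoods for time series models are often large negative numbers that would underflow if exponentiated directly.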
Abstract:
We explore the meaning of information about quantities of interest. Our approach is divided into two scenarios: the analysis of observations and the planning of an experiment. First, we review the Sufficiency, Conditionality and Likelihood principles and how they relate to trivial experiments. Next, we review Blackwell Sufficiency and show that sampling without replacement is Blackwell Sufficient for sampling with replacement. Finally, we unify the two scenarios by presenting an extension of the relationship between Blackwell Equivalence and the Likelihood Principle.
Abstract:
Documenting the Neotropical amphibian diversity has become a major challenge facing the threat of global climate change and the pace of environmental alteration. Recent molecular phylogenetic studies have revealed that the actual number of species in South American tropical forests is largely underestimated, but also that many lineages are millions of years old. The genera Phyzelaphryne (1 sp.) and Adelophryne (6 spp.), which compose the subfamily Phyzelaphryninae, include poorly documented, secretive, and minute frogs with an unusual distribution pattern that encompasses the biotic disjunction between Amazonia and the Atlantic forest. We generated >5.8 kb sequence data from six markers for all seven nominal species of the subfamily as well as for newly discovered populations in order to (1) test the monophyly of Phyzelaphryninae, Adelophryne and Phyzelaphryne, (2) estimate species diversity within the subfamily, and (3) investigate their historical biogeography and diversification. Phylogenetic reconstruction confirmed the monophyly of each group and revealed deep subdivisions within Adelophryne and Phyzelaphryne, with three major clades in Adelophryne located in northern Amazonia, northern Atlantic forest and southern Atlantic forest. Our results suggest that the actual number of species in Phyzelaphryninae is, at least, twice the currently recognized species diversity, with almost every geographically isolated population representing an anciently divergent candidate species. Such results highlight the challenges for conservation, especially in the northern Atlantic forest where it is still degraded at a fast pace. Molecular dating revealed that Phyzelaphryninae originated in Amazonia and dispersed during early Miocene to the Atlantic forest. The two Atlantic forest clades of Adelophryne started to diversify some 7 Ma minimum, while the northern Amazonian Adelophryne diversified much earlier, some 13 Ma minimum. 
This striking biogeographic pattern coincides with major events that have shaped the face of the South American continent, as we know it today. (C) 2012 Elsevier Inc. All rights reserved.
Abstract:
Background: Arboviral diseases are major global public health threats. Yet, our understanding of infection risk factors is, with a few exceptions, considerably limited. A crucial shortcoming is the widespread use of analytical methods generally not suited for observational data, particularly null hypothesis-testing (NHT) and step-wise regression (SWR). Using Mayaro virus (MAYV) as a case study, here we compare information theory-based multimodel inference (MMI) with conventional analyses for arboviral infection risk factor assessment. Methodology/Principal Findings: A cross-sectional survey of anti-MAYV antibodies revealed 44% prevalence (n = 270 subjects) in a central Amazon rural settlement. NHT suggested that residents of village-like household clusters and those using closed toilet/latrines were at higher risk, while living in non-village-like areas, using bednets, and owning fowl, pigs or dogs were protective. The "minimum adequate" SWR model retained only residence area and bednet use. Using MMI, we identified relevant covariates, quantified their relative importance, and estimated effect-sizes (β ± SE) on which to base inference. Residence area (β(Village) = 2.93 ± 0.41; β(Upland) = -0.56 ± 0.33; β(Riverbanks) = -2.37 ± 0.55) and bednet use (β = -0.95 ± 0.28) were the most important factors, followed by crop-plot ownership (β = 0.39 ± 0.22) and regular use of a closed toilet/latrine (β = 0.19 ± 0.13); domestic animals had insignificant protective effects and were relatively unimportant. The SWR model ranked fifth among the 128 models in the final MMI set. Conclusions/Significance: Our analyses illustrate how MMI can enhance inference on infection risk factors when compared with NHT or SWR. 
MMI indicates that forest crop-plot workers are likely exposed to typical MAYV cycles maintained by diurnal, forest dwelling vectors; however, MAYV might also be circulating in nocturnal, domestic-peridomestic cycles in village-like areas. This suggests either a vector shift (synanthropic mosquitoes vectoring MAYV) or a habitat/habits shift (classical MAYV vectors adapting to densely populated landscapes and nocturnal biting); any such ecological/adaptive novelty could increase the likelihood of MAYV emergence in Amazonia.
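To make the multimodel inference idea concrete, the sketch below shows how a candidate model set might be ranked and how the relative importance of a covariate might be quantified. It assumes Akaike weights computed from AIC values, a common choice in information-theoretic MMI; the abstract does not state which criterion the authors used, and the function names are hypothetical.

```python
import math

def akaike_weights(aic_values):
    """Akaike weights: relative support for each model in the candidate set."""
    best = min(aic_values)
    # Delta-AIC relative to the best (lowest-AIC) model.
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]

def variable_importance(weights, includes_variable):
    """Relative importance of a covariate: the summed weight of all
    models that include it (includes_variable is a list of booleans)."""
    return sum(w for w, inc in zip(weights, includes_variable) if inc)
```

Under this scheme a stepwise-selected model is just one member of the set, which is how it can end up ranked below other candidates, as reported for the SWR model above.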
Abstract:
Bovine coronavirus has been associated with diarrhoea in newborn calves, winter dysentery in adult cattle and respiratory tract infections in calves and feedlot cattle. In Cuba, the presence of BCoV was first reported in 2006. Since then, sporadic outbreaks have continued to occur. This study was aimed at deepening the knowledge of the evolution, molecular markers of virulence and epidemiology of BCoV in Cuba. A total of 30 samples collected between 2009 and 2011 were used for PCR amplification and direct sequencing of partial or full S gene. Sequence comparison and phylogenetic studies were conducted using partial or complete S gene sequences as phylogenetic markers. All Cuban bovine coronavirus sequences were located in a single cluster supported by 100% bootstrap and 1.00 posterior probability values. The Cuban bovine coronavirus sequences were also clustered with the USA BCoV strains corresponding to the GenBank accession numbers EF424621 and EF424623, suggesting a common origin for these viruses. This phylogenetic cluster was also the only group of sequences in which no recombination events were detected. Of the 45 amino acid changes found in the Cuban strains, four were unique. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Background: A current challenge in gene annotation is to define gene function in the context of the network of relationships instead of using single genes. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how the components of this network interact with each other and keep their functions stable. However, in general there are insufficient data to accurately recover GNs from their expression levels, leading to the curse of dimensionality, in which the number of variables exceeds the number of samples. One way to mitigate this problem is to integrate biological data instead of using only the expression profiles in the inference process. The use of several types of biological information in inference methods has increased significantly in recent years, in order to better recover the connections between genes and reduce false positives. What makes this strategy so interesting is the possibility of confirming known connections through the included biological data, and of discovering new relationships between genes when the expression data are observed. Although several works in data integration have increased the performance of network inference methods, the real contribution of each type of biological information to the obtained improvement is not clear. Methods: We propose a methodology to include biological information in an inference algorithm in order to assess the prediction gain from using biological information and expression profiles together. We also evaluated and compared the gain of adding four types of biological information: (a) protein-protein interaction, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. Results and conclusions: This work presents a first comparison of the gain from using prior biological information in the inference of GNs, considering a eukaryotic organism (P. falciparum). 
Our results indicate that information based on direct interaction can produce a greater improvement in the gain than data about a less specific relationship, such as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for improving the inference. We also compared the gain in the inference of the global network and of the hubs only. The results indicate that the use of biological information can improve the identification of the most connected proteins.
Abstract:
The circumscription of genera belonging to tribe Bignonieae (Bignoniaceae) has traditionally been complex, with only a few genera having stable circumscriptions in the various classification systems proposed for the tribe. The genus Lundia, for instance, is well characterized by a series of morphological synapomorphies and its circumscription has remained quite stable throughout its history. Despite the stable circumscription of Lundia, the circumscription of species within the genus has remained problematic. This study aims to reconstruct the phylogeny of Lundia in order to refine species circumscriptions, gain a better understanding of relationships between taxa, and identify potential morphological synapomorphies for species and major clades. We sampled 26 accessions representing 13 species of Lundia, and 5 outgroups, and reconstructed the phylogeny of the genus using a chloroplast (ndhF) and a nuclear marker (PepC). Data derived from sequences of the individual loci were analyzed using parsimony and Bayesian inference, and the combined molecular dataset was analyzed with Bayesian methods. The monophyly of Lundia nitidula, a species with a particularly complex circumscription, was tested using the Shimodaira-Hasegawa (SH) test and the approximately unbiased test for phylogenetic tree selection (AU test). In addition, 40 morphological characters were mapped onto the tree that resulted from the analysis of the combined molecular dataset in order to identify morphological synapomorphies of individual species and major clades. Lundia and most species currently recognized within the genus were strongly supported as monophyletic in all analyses. One species, Lundia nitidula, was not resolved as monophyletic, but the monophyly of this species was not rejected by the AU and SH tests. Lundia sect. Eriolundia is resolved as paraphyletic in all analyses, while Lundia sect. Eulundia is monophyletic and supported by the same morphological characters traditionally used to circumscribe this section. The phylogeny of Lundia contributed important information for a better circumscription of species and served as the basis for the taxonomic revision of the genus.
Abstract:
Background: The development of sugarcane as a sustainable crop has unlimited applications. The crop is one of the most economically viable for renewable energy production, and CO2 balance. Linkage maps are valuable tools for understanding genetic and genomic organization, particularly in sugarcane due to its complex polyploid genome of multispecific origins. The overall objective of our study was to construct a novel sugarcane linkage map, compiling AFLP and EST-SSR markers, and to generate data on the distribution of markers anchored to sequences of scIvana_1, a complete sugarcane transposable element, and member of the Copia superfamily. Results: The mapping population parents ('IAC66-6' and 'TUC71-7') contributed equally to polymorphisms, independent of marker type, and generated markers that were distributed into nearly the same number of co-segregation groups (or CGs). Bi-parentally inherited alleles provided the integration of 19 CGs. The marker number per CG ranged from two to 39. The total map length was 4,843.19 cM, with a marker density of 8.87 cM. Markers were assembled into 92 CGs that ranged in length from 1.14 to 404.72 cM, with an estimated average length of 52.64 cM. The greatest distance between two adjacent markers was 48.25 cM. The scIvana_1-based markers (56) were positioned on 21 CGs, but were not regularly distributed. Interestingly, the distance between adjacent scIvana_1-based markers was less than 5 cM, and was observed on five CGs, suggesting a clustered organization. Conclusions: Results indicated the use of a NBS-profiling technique was efficient to develop retrotransposon-based markers in sugarcane. The simultaneous maximum-likelihood estimates of linkage and linkage phase based strategies confirmed the suitability of its approach to estimate linkage, and construct the linkage map. 
Interestingly, using our genetic data it was possible to calculate the number of copies (~60) of the retrotransposon scIvana_1 in the sugarcane genome, confirming previously reported molecular results. In addition, this research may have indirect implications for crop economics, e.g., productivity enhancement via QTL studies, as the mapping population parents differ in response to an important fungal disease.
Abstract:
Traditional abduction imposes as a precondition the restriction that the background information may not derive the goal data. In first-order logic such a precondition is, in general, undecidable. To avoid this problem, we present a first-order cut-based abduction method, which has KE-tableaux as its underlying inference system. This inference system allows for the automation of non-analytic proofs in a tableau setting, which permits a generalization of traditional abduction that avoids the undecidable precondition problem. After demonstrating the correctness of the method, we show how this method can be dynamically iterated in a process that leads to the construction of non-analytic first-order proofs and, in some terminating cases, to refutations as well.
Abstract:
We propose a new general Bayesian latent class model, based on a computationally intensive approach, for evaluating the performance of multiple diagnostic tests in situations in which no gold standard test exists. The modeling represents an interesting and suitable alternative to models with complex structures that involve the general case of several conditionally independent diagnostic tests, covariates, and strata with different disease prevalences. The technique of stratifying the population according to different disease prevalence rates does not add further marked complexity to the modeling, but it makes the model more flexible and interpretable. To illustrate the general model proposed, we evaluate the performance of six diagnostic screening tests for Chagas disease considering some epidemiological variables. Serology at the time of donation (negative, positive, inconclusive) was considered as a factor of stratification in the model. The general model with stratification of the population performed better than its competitors without stratification. The group formed by the testing laboratory Biomanguinhos FIOCRUZ-kit (c-ELISA and rec-ELISA) is the best option in the confirmation process, presenting a false-negative rate of 0.0002% under the serial scheme. We are 100% sure that the donor is healthy when these two tests have negative results and that the donor is chagasic when they have positive results.