916 results for Maximum Likelihood method
Abstract:
The Lincoln–Petersen estimator is one of the most popular estimators used in capture–recapture studies. It was developed for a sampling situation in which two sources independently identify members of a target population. For each of the two sources, it is determined if a unit of the target population is identified or not. This leads to a 2 × 2 table with frequencies f11, f10, f01, f00 indicating the number of units identified by both sources, by the first but not the second source, by the second but not the first source and not identified by any of the two sources, respectively. However, f00 is unobserved so that the 2 × 2 table is incomplete and the Lincoln–Petersen estimator provides an estimate for f00. In this paper, we consider a generalization of this situation for which one source provides not only a binary identification outcome but also a count outcome of how many times a unit has been identified. Using a truncated Poisson count model, truncating multiple identifications larger than two, we propose a maximum likelihood estimator of the Poisson parameter and, ultimately, of the population size. This estimator shows benefits, in comparison with Lincoln–Petersen’s, in terms of bias and efficiency. It is possible to test the homogeneity assumption that is not testable in the Lincoln–Petersen framework. The approach is applied to surveillance data on syphilis from Izmir, Turkey.
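For orientation, the classical Lincoln–Petersen baseline mentioned in this abstract can be sketched in a few lines. This is only the standard two-source estimator, not the truncated Poisson maximum likelihood estimator the paper proposes; the function name and the example counts are hypothetical.

```python
def lincoln_petersen(f11: int, f10: int, f01: int) -> tuple[float, float]:
    """Return (estimated f00, estimated population size) from the observed cells.

    f11: units identified by both sources
    f10: units identified by the first source only
    f01: units identified by the second source only
    """
    if f11 <= 0:
        raise ValueError("f11 must be positive; the estimator is undefined otherwise")
    # Under independence of the two sources, f00 is estimated by f10 * f01 / f11.
    f00_hat = f10 * f01 / f11
    # The population size is the sum of the observed cells plus the estimated cell.
    n_hat = f11 + f10 + f01 + f00_hat
    return f00_hat, n_hat


# Made-up counts, for illustration only.
f00_hat, n_hat = lincoln_petersen(f11=20, f10=30, f01=40)
print(f00_hat, n_hat)  # 60.0 150.0
```

The population estimate equals the familiar form (f11 + f10)(f11 + f01) / f11, here 50 × 60 / 20 = 150.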
Abstract:
Background: Polygalacturonase-inhibiting proteins (PGIPs) are leucine-rich repeat (LRR) plant cell wall glycoproteins involved in plant immunity. They are typically encoded by gene families with a small number of gene copies whose evolutionary origin has been poorly investigated. Here we report the complete characterization of the full complement of the pgip family in soybean (Glycine max [L.] Merr.) and the characterization of the genomic region surrounding the pgip family in four legume species. Results: BAC clone and genome sequence analyses showed that the soybean genome contains two pgip loci. Each locus is composed of three clustered genes that are induced following infection with the fungal pathogen Sclerotinia sclerotiorum (Lib.) de Bary, and remnant sequences of pgip genes. The analyzed homeologous soybean genomic regions (about 126 Kb) that include the pgip loci are strongly conserved and this conservation extends also to the genomes of the legume species Phaseolus vulgaris L., Medicago truncatula Gaertn. and Cicer arietinum L., each containing a single pgip locus. Maximum likelihood-based gene trees suggest that the genes within the pgip clusters have independently undergone tandem duplication in each species. Conclusions: The paleopolyploid soybean genome contains two pgip loci located in large and highly conserved duplicated regions, which are also conserved in bean, M. truncatula and C. arietinum. The genomic features of these legume pgip families suggest that the forces driving the evolution of pgip genes follow the birth-and-death model, similar to that proposed for the evolution of resistance (R) genes of NBS-LRR-type.
Abstract:
The weak-constraint inverse for nonlinear dynamical models is discussed and derived in terms of a probabilistic formulation. The well-known result that for Gaussian error statistics the minimum of the weak-constraint inverse is equal to the maximum-likelihood estimate is rederived. Then several methods based on ensemble statistics that can be used to find the smoother (as opposed to the filter) solution are introduced and compared to traditional methods. A strong point of the new methods is that they avoid the integration of adjoint equations, which is a complex task for real oceanographic or atmospheric applications. They also avoid iterative searches in a Hilbert space, and error estimates can be obtained without much additional computational effort. The feasibility of the new methods is illustrated in a two-layer quasigeostrophic model.
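The equivalence cited in this abstract can be written compactly. The notation below is an assumption made for illustration and is not taken from the paper: ψ is the model state, q the residual of the dynamical model, d the data, H a linear measurement operator, and C_q and C_ε the model-error and measurement-error covariances.

```latex
% Weak-constraint cost function (assumed notation): model residual term plus data misfit term
J[\psi] = q^{\mathsf T} C_q^{-1}\, q
        + (d - H\psi)^{\mathsf T} C_\epsilon^{-1} (d - H\psi)

% With Gaussian error statistics the density of the state given the data has the form
p(\psi \mid d) \propto \exp\!\left(-\tfrac{1}{2} J[\psi]\right)

% so the most probable state is the minimizer of J
\arg\max_{\psi}\, p(\psi \mid d) = \arg\min_{\psi}\, J[\psi]
```

Because the exponential is monotone, maximizing the density and minimizing the quadratic cost select the same state; this holds only under the Gaussian assumption.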
Abstract:
The origin of tropical forest diversity has been hotly debated for decades. Although specific mechanisms vary, many such explanations propose some vicariance in the distribution of species during glacial cycles and several have been supported by genetic evidence in Neotropical taxa. However, no consensus exists with regard to the extent or time frame of the vicariance events. Here, we analyse the cytochrome oxidase II mitochondrial gene of 250 Sabethes albiprivus B mosquitoes sampled from western São Paulo in Brazil. There was very low population structuring among collection sites (ΦST = 0.03, P = 0.04). Historic demographic analyses and the contemporary geographic distribution of genetic diversity suggest that the populations sampled are not at demographic equilibrium. Three distinct mitochondrial clades were observed in the samples, one of which differed significantly in its geographic distribution relative to the other two within a small sampling area (~70 × 35 km). This fact, supported by the inability of maximum likelihood analyses to achieve adequate fits to simple models for the population demography of the species, suggests a more complex history, possibly involving disjunct forest refugia. This hypothesis is supported by a genetic signal of recent population growth, which is expected if population sizes of this forest-obligate insect increased during the forest expansions that followed glacial periods. Although a time frame cannot be reliably inferred for the vicariance event leading to the three genetic clades, molecular clock estimates place this at ~1 Myr before present.
Abstract:
Copia is a retrotransposon that appears to be distributed widely among the Drosophilidae subfamily. Evolutionary analyses of regulatory regions have indicated that the Copia retrotransposon evolved through both positive and purifying selection, and that horizontal transfer (HT) could also explain its patchy distribution among the subfamilies of the melanogaster subgroup. Additionally, Copia elements could have been transferred between the melanogaster subgroup and other species of Drosophilidae, namely D. willistoni and Z. tuberculatus. In this study, we surveyed seven species of the Zaprionus genus by sequencing the LTR-ULR and reverse transcriptase regions, and by using RT-PCR, in order to understand the distribution and evolutionary history of Copia in the Zaprionus genus. The Copia element was detected, and was transcriptionally active, in all species investigated. Structural and selection analysis revealed Zaprionus elements to be closely related to the most ancient subfamily of the melanogaster subgroup, and they seem to be evolving mainly under relaxed purifying selection. Taken together, these results allowed us to classify the Zaprionus sequences as a new subfamily, ZapCopia, a member of the Copia retrotransposon family of the melanogaster subgroup. These findings indicate that the Copia retrotransposon is an ancient component of the genomes of the Zaprionus species and broaden our understanding of the diversity of retrotransposons in the Zaprionus genus.
Abstract:
Epidendrum L. is the largest genus of Orchidaceae in the Neotropical region; it has an impressive morphological diversification, which imposes difficulties in delimitation of both infrageneric and interspecific boundaries. In this study, we review infrageneric boundaries within the subgenus Amphiglottium and try to contribute to the understanding of morphological diversification and taxa delimitation within this group. We tested the monophyly of the subgenus Amphiglottium sect. Amphiglottium, expanding previous phylogenetic investigations, and reevaluated previously proposed infrageneric classifications. Sequence data from the trnL-trnF region were analyzed with both parsimony and maximum likelihood criteria. AFLP markers were also obtained and analyzed with phylogenetic and principal coordinate analyses. Additionally, we obtained chromosome numbers for representative species within the group. The results strengthen the monophyly of the subgenus Amphiglottium but do not support the current classification system proposed by previous authors. Only section Tuberculata comprises a well-supported monophyletic group, with sections Carinata and Integra not supported. Instead of morphology, biogeographical and ecological patterns are reflected in the phylogenetic signal in this group. This study also confirms the large variability of chromosome numbers for the subgenus Amphiglottium (numbers ranging from 2n = 24 to 2n = 240), suggesting that polyploidy and hybridization are probably important mechanisms of speciation within the group.
Abstract:
Hepatitis C virus (HCV) infection frequently persists despite substantial virus-specific immune responses and the combination of pegylated interferon (IFN)-alpha and ribavirin therapy. Major histocompatibility complex class I restricted CD8+ T cells are responsible for the control of viraemia in HCV infection, and several studies suggest protection against viral infection associated with specific HLAs. The reason for low rates of sustained viral response (SVR) in HCV patients remains unknown. Escape mutations in response to cytotoxic T lymphocytes are widely described; however, their influence on treatment outcome is poorly understood. Here, we investigate the differences in CD8 epitope frequencies from the Los Alamos database between groups of patients that showed distinct responses to pegylated alpha-IFN with ribavirin therapy, and test for evidence of natural selection on the virus in those who failed treatment, using five maximum likelihood evolutionary models from the PAML package. The group of sustained virological responders showed three epitopes with frequencies higher than in the non-responder group, all with statistical support, and we observed evidence of selection pressure in the latter group. No escape mutation was observed. Interestingly, the epitope VLSDFKTWL was 100% conserved in the SVR group. These results suggest that the response to treatment can be explained by the increase in immune pressure induced by interferon therapy, and that the presence of those epitopes may represent an important factor in determining the outcome of therapy.
The genus Coleodactylus (Sphaerodactylinae, Gekkota) revisited: A molecular phylogenetic perspective
Abstract:
Nucleotide sequence data from a mitochondrial gene (16S) and two nuclear genes (c-mos, RAG-1) were used to evaluate the monophyly of the genus Coleodactylus, to provide the first phylogenetic hypothesis of relationships among its species in a cladistic framework, and to estimate the relative timing of species divergences. Maximum Parsimony, Maximum Likelihood and Bayesian analyses of the combined data sets retrieved Coleodactylus as a monophyletic genus, although weakly supported. Species were recovered as two genetically and morphologically distinct clades, with C. amazonicus populations forming the sister taxon to the meridionalis group (C. brachystoma, C. meridionalis, C. natalensis, and C. septentrionalis). Within this group, C. septentrionalis was placed as the sister taxon to a clade comprising the rest of the species, C. meridionalis was recovered as the sister species to C. brachystoma, and C. natalensis was found nested within C. meridionalis. Divergence time estimates based on penalized likelihood and Bayesian dating methods do not support the previous hypothesis based on the Quaternary rain forest fragmentation model proposed to explain the diversification of the genus. The basal cladogenic event between major lineages of Coleodactylus was estimated to have occurred in the late Cretaceous (72.6 ± 1.77 Mya), at approximately the same time as the other genera of Sphaerodactylinae diverged from each other. Within the meridionalis group, the split between C. septentrionalis and C. brachystoma + C. meridionalis was placed in the Eocene (46.4 ± 4.22 Mya), and the divergence between C. brachystoma and C. meridionalis was estimated to have occurred in the Oligocene (29.3 ± 4.33 Mya). Most intraspecific cladogenesis occurred from the Miocene to the Pliocene, and only for two conspecific samples and for C. natalensis could a Quaternary differentiation be assumed (1.9 ± 1.3 Mya). © 2008 Elsevier Inc. All rights reserved.
Abstract:
Glutaredoxins (Grxs) are small (9-12 kDa) heat-stable proteins that are ubiquitously distributed. In Saccharomyces cerevisiae, seven Grx enzymes have been identified. Two of them (yGrx1 and yGrx2) are dithiolic, possessing a conserved Cys-Pro-Tyr-Cys motif. Here, we show that yGrx2 has a specific activity 15 times higher than that of yGrx1, although these two oxidoreductases share 64% identity and 85% similarity with respect to their amino acid sequences. Further characterization of the enzymatic activities through two-substrate kinetics analysis revealed that yGrx2 possesses a lower Km for glutathione and a higher turnover than yGrx1. To better comprehend these biochemical differences, the pKa values of the N-terminal active-site cysteines (Cys27) of these two proteins and of the yGrx2-C30S mutant were determined. Since the pKa values of the yGrx1 and yGrx2 Cys27 residues are very similar, these parameters cannot account for the difference observed between their specific activities. Therefore, crystal structures of yGrx2 in the oxidized form and with a glutathionyl mixed disulfide were determined at resolutions of 2.05 and 1.91 Å, respectively. Comparisons of yGrx2 structures with the recently determined structures of yGrx1 provided insights into their remarkable functional divergence. We hypothesize that the substitutions of Ser23 and Gln52 in yGrx1 by Ala23 and Glu52 in yGrx2 modify the capability of the active-site C-terminal cysteine to attack the mixed disulfide between the N-terminal active-site cysteine and the glutathione molecule. Mutagenesis studies supported this hypothesis. The observed structural and functional differences between yGrx1 and yGrx2 may reflect variations in substrate specificity. © 2008 Elsevier Ltd. All rights reserved.
Abstract:
Phylogenetic analyses of chloroplast DNA sequences, morphology, and combined data have provided consistent support for many of the major branches within the angiosperm clade Dipsacales. Here we use sequences from three mitochondrial loci to test the existing broad-scale phylogeny and to attempt to resolve several relationships that have remained uncertain. Parsimony, maximum likelihood, and Bayesian analyses of a combined mitochondrial data set recover trees broadly consistent with previous studies, although resolution and support are lower than in the largest chloroplast analyses. Combining chloroplast and mitochondrial data results in a generally well-resolved and very strongly supported topology, but the previously recognized problem areas remain. To investigate why these relationships have been difficult to resolve, we conducted a series of experiments using different data partitions and heterogeneous substitution models. Usually more complex modeling schemes are favored regardless of the partitions recognized, but model choice had little effect on topology or support values. In contrast, there are consistent but weakly supported differences in the topologies recovered from coding and non-coding matrices. These conflicts directly correspond to relationships that were poorly resolved in analyses of the full combined chloroplast-mitochondrial data set. We suggest that incongruent signal has contributed to our inability to confidently resolve these problem areas. © 2007 Elsevier Inc. All rights reserved.
Abstract:
The open vegetation corridor of South America is a region dominated by savanna biomes. It contains forests (i.e. riverine forests) that may act as corridors for rainforest specialists between the open vegetation corridor and its neighbouring biomes (i.e. the Amazonian and Atlantic forests). A prediction for this scenario is that populations of rainforest specialists in the open vegetation corridor and in the forested biomes show no significant genetic divergence. We addressed this hypothesis by studying plumage and genetic variation of the Planalto woodcreeper Dendrocolaptes platyrostris Spix (1824) (Aves: Furnariidae), a forest specialist that occurs in both open habitat and in the Atlantic forest. The study questions were: (1) is there any evidence of genetic continuity between populations of the open habitat and the Atlantic forest and (2) is plumage variation congruent with patterns of neutral genetic structure or with ecological factors related to habitat type? We used cytochrome b and mitochondrial DNA control region sequences to show that D. platyrostris is monophyletic and presents substantial intraspecific differentiation. We found two areas of plumage stability: one associated with Cerrado and the other associated with southern Atlantic Forest. Multiple Mantel tests showed that most of the plumage variation followed the transition of habitats but not phylogeographical gaps, suggesting that selection may be related to the evolution of the plumage of the species. The results were not compatible with the idea that forest specialists in the open vegetation corridor and in the Atlantic forest are linked at the population level because birds from each region were not part of the same genetic unit. Divergence in the presence of gene flow across the ecotone between both regions might explain our results. Also, our findings indicate that the southern Atlantic forest may have been significantly affected by Pleistocene climatic alteration, although such events did not cause local extinction of most taxa, as occurred in other regions of the globe where forests were significantly affected by global glaciations. Finally, our results support neither plumage stability areas nor subspecies as full species. © 2011 The Linnean Society of London, Biological Journal of the Linnean Society, 2011, 103, 801-820.
Abstract:
The toucan genus Ramphastos (Piciformes: Ramphastidae) has been a model in the formulation of Neotropical paleobiogeographic hypotheses. Weckstein (2005) reported on the phylogenetic history of this genus based on three mitochondrial genes, but some relationships were weakly supported and one of the subspecies of R. vitellinus (citreolaemus) was unsampled. This study expands on Weckstein (2005) by adding more DNA sequence data (including a nuclear marker) and more samples, including R. v. citreolaemus. Maximum parsimony, maximum likelihood, and Bayesian methods recovered similar trees, with nodes showing high support. A monophyletic R. vitellinus complex was strongly supported as the sister-group to R. brevis. The results also confirmed that the southeastern and northern populations of R. vitellinus ariel are paraphyletic. R. v. citreolaemus is sister to the Amazonian subspecies of the vitellinus complex. Using three protein-coding genes (COI, cytochrome-b and ND2) and interval-calibrated nodes under a Bayesian relaxed-clock framework, we infer that ramphastid genera originated in the middle Miocene to early Pliocene, Ramphastos species originated between late Miocene and early Pleistocene, and intra-specific divergences took place throughout the Pleistocene. Parsimony-based reconstruction of ancestral areas indicated that evolution of the four trans-Andean Ramphastos taxa (R. v. citreolaemus, R. a. swainsonii, R. brevis and R. sulfuratus) was associated with four independent dispersals from the cis-Andean region. The last pulse of Andean uplift may have been important for the evolution of R. sulfuratus, whereas the origin of the other trans-Andean Ramphastos taxa is consistent with vicariance due to drying events in the lowland forests north of the Andes. Estimated rates of molecular evolution were higher than the "standard" bird rate of 2% substitutions/site/million years for two of the three genes analyzed (cytochrome-b and ND2). © 2009 Elsevier Inc. All rights reserved.
Abstract:
Analysis of the phylogenetic relationships among trypanosomes from vertebrates and invertebrates disclosed a new lineage of trypanosomes circulating among anurans and sand flies that share the same ecotopes in Brazilian Amazonia. This assemblage of closely related trypanosomes was determined by comparing whole SSU rDNA sequences of anuran trypanosomes from the Brazilian biomes of Amazonia, the Pantanal, and the Atlantic Forest and from Europe, North America, and Africa, and from trypanosomes of sand flies from Amazonia. Phylogenetic trees based on maximum likelihood and parsimony corroborated the positioning of all new anuran trypanosomes in the aquatic clade but did not support the monophyly of anuran trypanosomes. However, all analyses supported four major clades (An01-04) of anuran trypanosomes. Clade An04 is composed of trypanosomes from exotic anurans. Isolates in clades An01 and An02 were from Brazilian frogs and toads captured in the three biomes studied: Amazonia, the Pantanal and the Atlantic Forest. Clade An01 contains mostly isolates from Hylidae, whereas clade An02 comprises mostly isolates from Bufonidae, and clade An03 contains trypanosomes from sand flies and anurans of Bufonidae, Leptodactylidae, and Leiuperidae exclusively from Amazonia. To our knowledge, this is the first study describing morphological and growth features, and molecular phylogenetic affiliation of trypanosomes from anurans and phlebotomines, incriminating these flies as invertebrate hosts and probably also as important vectors of Amazonian terrestrial anuran trypanosomes.
Abstract:
In this paper we introduce a parametric model for handling lifetime data where an early lifetime can be related to the infant-mortality failure or to the wear processes but we do not know which risk is responsible for the failure. The maximum likelihood approach and the sampling-based approach are used to get the inferences of interest. Some special cases of the proposed model are studied via Monte Carlo methods for size and power of hypothesis tests. To illustrate the proposed methodology, we introduce an example consisting of a real data set.
Abstract:
The substitution of missing values, also called imputation, is an important data preparation task for many domains. Ideally, the substitution of missing values should not insert biases into the dataset. This aspect has been usually assessed by some measures of the prediction capability of imputation methods. Such measures assume the simulation of missing entries for some attributes whose values are actually known. These artificially missing values are imputed and then compared with the original values. Although this evaluation is useful, it does not allow the influence of imputed values in the ultimate modelling task (e.g. in classification) to be inferred. We argue that imputation cannot be properly evaluated apart from the modelling task. Thus, alternative approaches are needed. This article elaborates on the influence of imputed values in classification. In particular, a practical procedure for estimating the inserted bias is described. As an additional contribution, we have used such a procedure to empirically illustrate the performance of three imputation methods (majority, naive Bayes and Bayesian networks) in three datasets. Three classifiers (decision tree, naive Bayes and nearest neighbours) have been used as modelling tools in our experiments. The achieved results illustrate a variety of situations that can take place in the data preparation practice.
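The conventional check described in this abstract (hide values that are actually known, impute them, compare against the originals) can be sketched as follows. This is a minimal illustration using majority imputation on one categorical attribute; the function names and the toy data are hypothetical, and it shows only the prediction-capability measure that the authors regard as useful but insufficient, not their procedure for estimating classification bias.

```python
from collections import Counter


def majority_impute(values):
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def prediction_accuracy(original, hidden_idx):
    """Hide the known values at hidden_idx, impute them, and score the recovery."""
    # Simulate missingness on entries whose true values are known.
    masked = [None if i in hidden_idx else v for i, v in enumerate(original)]
    imputed = majority_impute(masked)
    # Count how many artificially hidden entries were restored correctly.
    hits = sum(imputed[i] == original[i] for i in hidden_idx)
    return hits / len(hidden_idx)


# Toy attribute, for illustration only.
colours = ["red", "red", "blue", "red", "green", "red", "blue", "red"]
acc = prediction_accuracy(colours, hidden_idx={1, 2, 4, 5})
print(acc)  # 0.5
```

A score like this says how well hidden values are recovered, but not how the imputed values affect a classifier trained on the completed data, which is the gap the abstract points to.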