28 results for maximum likelihood analysis
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Context. Convergent point (CP) search methods are important tools for studying the kinematic properties of open clusters and young associations whose members share the same spatial motion. Aims. We present a new CP search strategy based on proper motion data. We test the new algorithm on synthetic data and compare it with previous versions of the CP search method. As an illustration and validation of the new method we also present an application to the Hyades open cluster and a comparison with independent results. Methods. The new algorithm rests on the idea of representing the stellar proper motions by great circles over the celestial sphere and visualizing their intersections as the CP of the moving group. The new strategy combines a maximum-likelihood analysis for simultaneously determining the CP and selecting the most likely group members and a minimization procedure that returns a refined CP position and its uncertainties. The method allows one to correct for internal motions within the group and takes into account that the stars in the group lie at different distances. Results. Based on Monte Carlo simulations, we find that the new CP search method in many cases returns a more precise solution than its previous versions. The new method is able to find and eliminate more field stars in the sample and is not biased towards distant stars. The CP solution for the Hyades open cluster is in excellent agreement with previous determinations.
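The great-circle idea in the abstract above can be illustrated with a short sketch. It is not the authors' maximum-likelihood algorithm: it swaps in a plain least-squares estimate, with no membership selection, no internal motions, and no uncertainties. The function names, the synthetic velocity vector, and the star positions are all hypothetical. Each proper motion defines a great circle whose pole is perpendicular to both the star's position and its motion; a common convergent point must be perpendicular to every pole.

```python
import numpy as np

def tangent_basis(ra, dec):
    """Unit position vectors and local east/north unit vectors (angles in radians)."""
    ra, dec = np.asarray(ra, float), np.asarray(dec, float)
    r = np.stack([np.cos(dec) * np.cos(ra), np.cos(dec) * np.sin(ra), np.sin(dec)], axis=1)
    east = np.stack([-np.sin(ra), np.cos(ra), np.zeros_like(ra)], axis=1)
    north = np.stack([-np.sin(dec) * np.cos(ra), -np.sin(dec) * np.sin(ra), np.cos(dec)], axis=1)
    return r, east, north

def convergent_point(ra, dec, pm_ra_cosdec, pm_dec):
    """Least-squares convergent point as a unit vector (illustrative sketch only)."""
    r, east, north = tangent_basis(ra, dec)
    # Proper-motion vector of each star in the tangent plane.
    mu = (np.asarray(pm_ra_cosdec, float)[:, None] * east
          + np.asarray(pm_dec, float)[:, None] * north)
    # Pole of the great circle defined by each star's position and motion.
    poles = np.cross(r, mu)
    poles /= np.linalg.norm(poles, axis=1, keepdims=True)
    # The CP is orthogonal to every pole: take the eigenvector of sum(p p^T)
    # with the smallest eigenvalue (eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(poles.T @ poles)
    cp = vecs[:, 0]
    # Resolve the sign ambiguity: proper motions should point toward the CP.
    if np.sum(mu @ cp) < 0:
        cp = -cp
    return cp

# Synthetic, noise-free moving group (all values are made up for illustration).
rng = np.random.default_rng(0)
n_stars = 50
ra = rng.uniform(1.0, 1.3, n_stars)
dec = rng.uniform(0.2, 0.4, n_stars)
dist = rng.uniform(40.0, 50.0, n_stars)   # stars lie at different distances
v = np.array([-6.0, 45.0, 5.0])           # hypothetical common space velocity
v_hat = v / np.linalg.norm(v)

_, east, north = tangent_basis(ra, dec)
pm_ra_cosdec = (east @ v) / dist          # tangential motion scaled by distance
pm_dec = (north @ v) / dist

cp = convergent_point(ra, dec, pm_ra_cosdec, pm_dec)
ra_cp = np.arctan2(cp[1], cp[0]) % (2 * np.pi)
dec_cp = np.arcsin(cp[2])
print(np.degrees(ra_cp), np.degrees(dec_cp))
```

Because the poles are normalized, a star's distance does not affect its great circle in this noise-free sketch. With real data, measurement errors and field stars would require the weighting and member selection that the abstract describes.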
Abstract:
The major cause of athlete's foot is Trichophyton rubrum, a dermatophyte or fungal pathogen of human skin. To facilitate molecular analyses of the dermatophytes, we sequenced T. rubrum and four related species, Trichophyton tonsurans, Trichophyton equinum, Microsporum canis, and Microsporum gypseum. These species differ in host range, mating, and disease progression. The dermatophyte genomes are highly colinear yet contain gene family expansions not found in other human-associated fungi. Dermatophyte genomes are enriched for gene families containing the LysM domain, which binds chitin and potentially related carbohydrates. These LysM domains differ in sequence from those in other species in regions of the peptide that could affect substrate binding. The dermatophytes also encode novel sets of fungus-specific kinases with unknown specificity, including nonfunctional pseudokinases, which may inhibit phosphorylation by competing for kinase sites within substrates, acting as allosteric effectors, or acting as scaffolds for signaling. The dermatophytes are also enriched for a large number of enzymes that synthesize secondary metabolites, including dermatophyte-specific genes that could synthesize novel compounds. Finally, dermatophytes are enriched in several classes of proteases that are necessary for fungal growth and nutrient acquisition on keratinized tissues. Despite differences in mating ability, genes involved in mating and meiosis are conserved across species, suggesting the possibility of cryptic mating in species where it has not been previously detected. These genome analyses identify gene families that are important to our understanding of how dermatophytes cause chronic infections, how they interact with epithelial cells, and how they respond to the host immune response. IMPORTANCE Athlete's foot, jock itch, ringworm, and nail infections are common fungal infections, all caused by fungi known as dermatophytes (fungi that infect skin). 
This report presents the genome sequences of Trichophyton rubrum, the most frequent cause of athlete's foot, as well as four other common dermatophytes. Dermatophyte genomes are enriched for four gene classes that may contribute to the ability of these fungi to cause disease. These include (i) proteases secreted to degrade skin; (ii) kinases, including pseudokinases, that are involved in signaling necessary for adapting to skin; (iii) secondary metabolites, compounds that act as toxins or signals in the interactions between fungus and host; and (iv) a class of proteins (LysM) that appear to bind and mask cell wall components and carbohydrates, thus avoiding the host's immune response to the fungi. These genome sequences provide a strong foundation for future work in understanding how dermatophytes cause disease.
Abstract:
Bovine coronavirus has been associated with diarrhoea in newborn calves, winter dysentery in adult cattle and respiratory tract infections in calves and feedlot cattle. In Cuba, the presence of BCoV was first reported in 2006. Since then, sporadic outbreaks have continued to occur. This study was aimed at deepening the knowledge of the evolution, molecular markers of virulence and epidemiology of BCoV in Cuba. A total of 30 samples collected between 2009 and 2011 were used for PCR amplification and direct sequencing of partial or full S gene. Sequence comparison and phylogenetic studies were conducted using partial or complete S gene sequences as phylogenetic markers. All Cuban bovine coronavirus sequences were located in a single cluster supported by 100% bootstrap and 1.00 posterior probability values. The Cuban bovine coronavirus sequences were also clustered with the USA BCoV strains corresponding to the GenBank accession numbers EF424621 and EF424623, suggesting a common origin for these viruses. This phylogenetic cluster was also the only group of sequences in which no recombination events were detected. Of the 45 amino acid changes found in the Cuban strains, four were unique. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
The HIV-1 subtype C has spread efficiently in the southern states of Brazil (Rio Grande do Sul, Santa Catarina and Parana). Phylogeographic studies indicate that the subtype C epidemic in southern Brazil was initiated by the introduction of a single founder virus population at some time point between 1960 and 1980, but little is known about the spatial dynamics of viral spread. A total of 135 Brazilian HIV-1 subtype C pol sequences collected from 1992 to 2009 at the three southern state capitals (Porto Alegre, Florianopolis and Curitiba) were analyzed. Maximum-likelihood and Bayesian methods were used to explore the degree of phylogenetic mixing of subtype C sequences from different cities and to reconstruct the geographical pattern of viral spread in this region of the country. Phylogeographic analyses supported the monophyletic origin of the HIV-1 subtype C clade circulating in southern Brazil and placed the root of that clade in Curitiba (Parana state). This analysis further suggested that Florianopolis (Santa Catarina state) is an important staging post in the subtype C dissemination displaying high viral migration rates from and to the other cities, while viral flux between Curitiba and Porto Alegre (Rio Grande do Sul state) is very low. We found a positive correlation (r² = 0.64) between routine travel and viral migration rates among localities. Despite the intense viral movement, phylogenetic intermixing of subtype C sequences from different Brazilian cities is lower than expected by chance. Notably, a high proportion (67%) of subtype C sequences from Porto Alegre branched within a single local monophyletic sub-cluster. These results suggest that the HIV-1 subtype C epidemic in southern Brazil has been shaped by both frequent viral migration among states and in situ dissemination of local clades.
Abstract:
In this article we introduce a three-parameter extension of the bivariate exponential-geometric (BEG) law (Kozubowski and Panorska, 2005) [4]. We refer to this new distribution as the bivariate gamma-geometric (BGG) law. A bivariate random vector (X, N) follows the BGG law if N has geometric distribution and X may be represented (in law) as a sum of N independent and identically distributed gamma variables, where these variables are independent of N. Statistical properties such as moment generating and characteristic functions, moments and a variance-covariance matrix are provided. The marginal and conditional laws are also studied. We show that the BGG distribution is infinitely divisible, just as the BEG model is. Further, we provide alternative representations for the BGG distribution and show that it enjoys a geometric stability property. Maximum likelihood estimation and inference are discussed and a reparametrization is proposed in order to obtain orthogonality of the parameters. We present an application to a real data set where our model provides a better fit than the BEG model. Our bivariate distribution induces a bivariate Levy process with correlated gamma and negative binomial processes, which extends the bivariate Levy motion proposed by Kozubowski et al. (2008) [6]. The marginals of our Levy motion are a mixture of gamma and negative binomial processes and we named it BMixGNB motion. Basic properties such as stochastic self-similarity and the covariance matrix of the process are presented. The bivariate distribution at fixed time of our BMixGNB process is also studied and some results are derived, including a discussion about maximum likelihood estimation and inference. (C) 2012 Elsevier Inc. All rights reserved.
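The stochastic representation of (X, N) given in the abstract above can be simulated directly. The sketch below assumes a particular parametrization that the abstract does not specify: N is geometric on {1, 2, ...} with success probability p, and the gamma summands have shape alpha and rate beta. The function name and parameter values are hypothetical. Under these assumptions, X given N is Gamma(N * alpha, rate beta), the maximum likelihood estimate of p is 1 / mean(N), and for a known alpha the likelihood equation for beta has the closed form alpha * sum(N) / sum(X).

```python
import numpy as np

def sample_bgg(n, alpha, beta, p, rng):
    """Draw n pairs (X, N) under the assumed parametrization (shape alpha, rate beta)."""
    N = rng.geometric(p, size=n)                      # support {1, 2, ...}
    # A sum of N iid Gamma(alpha, rate beta) variables is Gamma(N * alpha, rate beta).
    X = rng.gamma(shape=N * alpha, scale=1.0 / beta)
    return X, N

rng = np.random.default_rng(1)
alpha, beta, p = 2.0, 1.5, 0.3                        # illustrative values only
X, N = sample_bgg(200_000, alpha, beta, p, rng)

p_hat = 1.0 / N.mean()                                # MLE of p from the geometric margin
beta_hat_given_alpha = alpha * N.sum() / X.sum()      # MLE of beta when alpha is known
mean_X_theory = alpha / (beta * p)                    # E[X] = E[N] * alpha / beta

print(p_hat, beta_hat_given_alpha, X.mean(), mean_X_theory)
```

Estimating alpha jointly requires a numerical search, because the log-likelihood involves log-gamma terms in N * alpha; that step is omitted here.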
Abstract:
Background: The development of sugarcane as a sustainable crop has unlimited applications. The crop is one of the most economically viable for renewable energy production, and CO2 balance. Linkage maps are valuable tools for understanding genetic and genomic organization, particularly in sugarcane due to its complex polyploid genome of multispecific origins. The overall objective of our study was to construct a novel sugarcane linkage map, compiling AFLP and EST-SSR markers, and to generate data on the distribution of markers anchored to sequences of scIvana_1, a complete sugarcane transposable element, and member of the Copia superfamily. Results: The mapping population parents ('IAC66-6' and 'TUC71-7') contributed equally to polymorphisms, independent of marker type, and generated markers that were distributed into nearly the same number of co-segregation groups (or CGs). Bi-parentally inherited alleles provided the integration of 19 CGs. The marker number per CG ranged from two to 39. The total map length was 4,843.19 cM, with a marker density of 8.87 cM. Markers were assembled into 92 CGs that ranged in length from 1.14 to 404.72 cM, with an estimated average length of 52.64 cM. The greatest distance between two adjacent markers was 48.25 cM. The scIvana_1-based markers (56) were positioned on 21 CGs, but were not regularly distributed. Interestingly, the distance between adjacent scIvana_1-based markers was less than 5 cM, and was observed on five CGs, suggesting a clustered organization. Conclusions: Results indicated the use of a NBS-profiling technique was efficient to develop retrotransposon-based markers in sugarcane. The simultaneous maximum-likelihood estimates of linkage and linkage phase based strategies confirmed the suitability of its approach to estimate linkage, and construct the linkage map. 
Interestingly, using our genetic data it was possible to calculate the number of copies of the retrotransposon scIvana_1 (approximately 60) in the sugarcane genome, confirming previously reported molecular results. In addition, this research may have indirect implications for crop economics, e.g., productivity enhancement via QTL studies, as the mapping population parents differ in response to an important fungal disease.
Abstract:
The log-Burr XII regression model for grouped survival data is evaluated in the presence of many ties. The methodology for grouped survival data is based on life tables, where the times are grouped in k intervals, and we fit discrete lifetime regression models to the data. The model parameters are estimated by maximum likelihood and jackknife methods. To detect influential observations in the proposed model, diagnostic measures based on case deletion, so-called global influence, and influence measures based on small perturbations in the data or in the model, referred to as local influence, are used. In addition to these measures, the total local influence and influential estimates are also used. We conduct Monte Carlo simulation studies to assess the finite sample behavior of the maximum likelihood estimators of the proposed model for grouped survival. A real data set is analyzed using a regression model for grouped data.
Abstract:
In this paper, an alternative skew Student-t family of distributions is studied. It is obtained as an extension of the generalized Student-t (GS-t) family introduced by McDonald and Newey [10]. The extension that is obtained can be seen as a reparametrization of the skewed GS-t distribution considered by Theodossiou [14]. A key element in the construction of such an extension is that it can be stochastically represented as a mixture of an epsilon-skew-power-exponential distribution [1] and a generalized-gamma distribution. From this representation, we can readily derive theoretical properties and easy-to-implement simulation schemes. Furthermore, we study some of its main properties including stochastic representation, moments and asymmetry and kurtosis coefficients. We also derive the Fisher information matrix, which is shown to be nonsingular for some special cases such as when the asymmetry parameter is null, that is, at the vicinity of symmetry, and discuss maximum-likelihood estimation. Simulation studies for some particular cases and real data analysis are also reported, illustrating the usefulness of the extension considered.
Abstract:
In this paper, a new family of survival distributions is presented. It is derived by considering that the latent number of failure causes follows a Poisson distribution and the time for these causes to be activated follows an exponential distribution. Three different activation schemes are also considered. Moreover, we propose the inclusion of covariates in the model formulation in order to study their effect on the expected value of the number of causes and on the failure rate function. Inferential procedure based on the maximum likelihood method is discussed and evaluated via simulation. The developed methodology is illustrated on a real data set on ovarian cancer.
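The construction described in the abstract above can be sketched for one activation scheme. The abstract does not define its three schemes, so the code assumes a "first activation" reading in which a unit fails as soon as the earliest of its latent causes is activated; this reading, the function names, and the parameter values are all assumptions. With N ~ Poisson(theta) causes and Exp(lam) activation times, the population survival function is then S(t) = exp(-theta * (1 - exp(-lam * t))), and units with N = 0 never fail.

```python
import numpy as np

def sample_first_activation(n, theta, lam, rng):
    """Failure times when the earliest of N ~ Poisson(theta) causes triggers failure."""
    N = rng.poisson(theta, size=n)
    T = np.full(n, np.inf)            # N == 0: no cause present, the unit never fails
    at_risk = N > 0
    # The minimum of N iid Exp(rate lam) times is Exp(rate N * lam).
    T[at_risk] = rng.exponential(scale=1.0 / (N[at_risk] * lam))
    return T

def survival_first_activation(t, theta, lam):
    """Population survival function under the assumed first-activation scheme."""
    return np.exp(-theta * (1.0 - np.exp(-lam * np.asarray(t, float))))

rng = np.random.default_rng(2)
theta, lam = 1.2, 0.8                 # illustrative values only
T = sample_first_activation(200_000, theta, lam, rng)

t_grid = np.array([0.5, 1.0, 2.0, 5.0])
empirical = np.array([(T > t).mean() for t in t_grid])
theoretical = survival_first_activation(t_grid, theta, lam)
never_fail = np.isinf(T).mean()       # should be close to exp(-theta)

print(empirical, theoretical, never_fail)
```

Covariates could enter through theta, for example theta = exp(x @ b), which would match the abstract's remark about modeling the expected number of causes; that link function is again an assumption.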
Abstract:
Genes involved in host-pathogen interactions are often strongly affected by positive natural selection. The Duffy antigen, coded by the Duffy antigen receptor for chemokines (DARC) gene, serves as a receptor for Plasmodium vivax in humans and for Plasmodium knowlesi in some nonhuman primates. In the majority of sub-Saharan Africans, a nucleic acid variant in GATA-1 of the gene promoter is responsible for the nonexpression of the Duffy antigen on red blood cells and consequently resistance to invasion by P. vivax. The Duffy antigen also acts as a receptor for chemokines and is expressed in red blood cells and many other tissues of the body. Because of this dual role, we sequenced a 3,000-bp region encompassing the entire DARC gene as well as part of its 5' and 3' flanking regions in a phylogenetic sample of primates and used statistical methods to evaluate the nature of selection pressures acting on the gene during its evolution. We analyzed both coding and regulatory regions of the DARC gene. The regulatory analysis showed accelerated rates of substitution at several sites near known motifs. Our tests of positive selection in the coding region using maximum likelihood by branch sites and maximum likelihood by codon sites did not yield statistically significant evidence for the action of positive selection. However, the maximum likelihood test in which the gene was subdivided into different structural regions showed that the known binding region for P. vivax/P. knowlesi is under very different selective pressures than the remainder of the gene. In fact, most of the gene appears to be under strong purifying selection, but this is not evident in the binding region. We suggest that the binding region is under the influence of two opposing selective pressures, positive selection possibly exerted by the parasite and purifying selection exerted by chemokines.
Abstract:
This article introduces generalized beta-generated (GBG) distributions. Sub-models include all classical beta-generated, Kumaraswamy-generated and exponentiated distributions. They are maximum entropy distributions under three intuitive conditions, which show that the classical beta generator skewness parameters only control tail entropy and an additional shape parameter is needed to add entropy to the centre of the parent distribution. This parameter controls skewness without necessarily differentiating tail weights. The GBG class also has tractable properties: we present various expansions for moments, generating function and quantiles. The model parameters are estimated by maximum likelihood and the usefulness of the new class is illustrated by means of some real data sets. (c) 2011 Elsevier B.V. All rights reserved.
Abstract:
Lemonte and Cordeiro [Birnbaum-Saunders nonlinear regression models, Comput. Stat. Data Anal. 53 (2009), pp. 4441-4452] introduced a class of Birnbaum-Saunders (BS) nonlinear regression models potentially useful in lifetime data analysis. We give a general matrix Bartlett correction formula to improve the likelihood ratio (LR) tests in these models. The formula is simple enough to be used analytically to obtain several closed-form expressions in special cases. Our results generalize those in Lemonte et al. [Improved likelihood inference in Birnbaum-Saunders regressions, Comput. Stat. Data Anal. 54 (2010), pp. 1307-1316], which hold only for the BS linear regression models. We consider Monte Carlo simulations to show that the corrected tests work better than the usual LR tests.
Abstract:
Among trypanosomatids, the genus Phytomonas is the only one specifically adapted to infect plants. These hosts provide a particular habitat with a plentiful supply of carbohydrates. Phytomonas sp. lacks a cytochrome-mediated respiratory chain and Krebs cycle, and ATP production relies predominantly on glycolysis. We have characterised the complete gene encoding a putative pyruvate/indolepyruvate decarboxylase (PDC/IPDC) (548 amino acids) of P. serpens, that displays high amino acid sequence similarity with phytobacteria and Leishmania enzymes. No orthologous PDC/IPDC genes were found in Trypanosoma cruzi or T. brucei. Conservation of the PDC/IPDC gene sequence was verified in 14 Phytomonas isolates. A phylogenetic analysis shows that Phytomonas protein is robustly monophyletic with Leishmania spp. and C. fasciculata enzymes. In the trees this clade appears as a sister group of indolepyruvate decarboxylases of gamma-proteobacteria. This supports the proposition that a horizontal gene transfer event from a donor phytobacteria to a recipient ancestral trypanosome has occurred prior to the separation between Phytomonas, Leishmania and Crithidia. We have measured the PDC activity in P. serpens cell extracts. The enzyme has a Km value for pyruvate of 1.4 mM. The acquisition of a PDC, a key enzyme in alcoholic fermentation, explains earlier observations that ethanol is one of the major end-products of glucose catabolism under aerobic and anaerobic conditions. This represents an alternative and necessary route to reoxidise part of the NADH produced in the highly demanding glycolytic pathway and highlights the importance of this type of event in metabolic adaptation. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
Long-term survival models have historically been considered for analyzing time-to-event data with long-term survivors fraction. However, situations in which a fraction (1 - p) of systems is subject to failure from independent competing causes of failure, while the remaining proportion p is cured or has not presented the event of interest during the time period of the study, have not been fully considered in the literature. In order to accommodate such situations, we present in this paper a new long-term survival model. Maximum likelihood estimation procedure is discussed as well as interval estimation and hypothesis tests. A real dataset illustrates the methodology.
Abstract:
For the first time, we introduce a generalized form of the exponentiated generalized gamma distribution [Cordeiro et al. The exponentiated generalized gamma distribution with application to lifetime data, J. Statist. Comput. Simul. 81 (2011), pp. 827-842.] that is the baseline for the log-exponentiated generalized gamma regression model. The new distribution can accommodate increasing, decreasing, bathtub- and unimodal-shaped hazard functions. A second advantage is that it includes classical distributions reported in the lifetime literature as special cases. We obtain explicit expressions for the moments of the baseline distribution of the new regression model. The proposed model can be applied to censored data since it includes as sub-models several widely known regression models. It therefore can be used more effectively in the analysis of survival data. We obtain maximum likelihood estimates for the model parameters by considering censored data. We show that our extended regression model is very useful by means of two applications to real data.