16 results for log-series distribution
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
In this paper, we propose a random intercept Poisson model in which the random effect is assumed to follow a generalized log-gamma (GLG) distribution. This random effect accommodates (or captures) the overdispersion in the counts and induces within-cluster correlation. We derive the first two moments for the marginal distribution as well as the intraclass correlation. Even though numerical integration methods are, in general, required for deriving the marginal models, we obtain the multivariate negative binomial model from a particular parameter setting of the hierarchical model. An iterative process is derived for obtaining the maximum likelihood estimates for the parameters in the multivariate negative binomial model. Residual analysis is proposed and two applications with real data are given for illustration. (C) 2011 Elsevier B.V. All rights reserved.
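The negative binomial special case mentioned above can be illustrated with the standard gamma-Poisson construction. The sketch below assumes that the multiplicative random effect u = exp(b) is gamma-distributed with unit mean and shape phi, and that all observations in a cluster share the same mean mu. The names mu, phi and the function names are illustrative and not taken from the paper, and the paper's exact GLG parametrization is not reproduced here.

```python
import math


def neg_binomial_pmf(y, mu, phi):
    """P(Y = y) when Y | u ~ Poisson(mu * u) and u ~ Gamma(shape=phi, rate=phi)."""
    log_p = (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (phi + mu))
        + y * math.log(mu / (phi + mu))
    )
    return math.exp(log_p)


def closed_form_moments(mu, phi):
    """Marginal mean, variance and within-cluster correlation (equal means assumed)."""
    variance = mu + mu ** 2 / phi   # exceeds the Poisson variance mu
    correlation = mu / (mu + phi)   # induced by the shared random intercept
    return mu, variance, correlation


def numeric_moments(mu, phi, upper=500):
    """Mean and variance obtained by summing the pmf over 0..upper-1."""
    probs = [neg_binomial_pmf(y, mu, phi) for y in range(upper)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var
```

Comparing numeric_moments(mu, phi) with the first two entries of closed_form_moments(mu, phi) is a quick way to see the overdispersion term mu**2 / phi appear in the marginal variance.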
Abstract:
In this paper we have quantified the consistency of word usage in written texts represented by complex networks, where words were taken as nodes, by measuring the degree of preservation of the node neighborhood. Words were considered highly consistent if the authors used them with the same neighborhood. When ranked according to the consistency of use, the words obeyed a log-normal distribution, in contrast to Zipf's law that applies to the frequency of use. Consistency correlated positively with the familiarity and frequency of use, and negatively with ambiguity and age of acquisition. An inspection of some highly consistent words confirmed that they are used in very limited semantic contexts. A comparison of consistency indices for eight authors indicated that these indices may be employed for author recognition. Indeed, as expected, authors of novels could be distinguished from those who wrote scientific texts. Our analysis demonstrated the suitability of the consistency indices, which can now be applied in other tasks, such as emotion recognition.
Abstract:
The beta-Birnbaum-Saunders (Cordeiro and Lemonte, 2011) and Birnbaum-Saunders (Birnbaum and Saunders, 1969a) distributions have been used quite effectively to model failure times for materials subject to fatigue and lifetime data. We define the log-beta-Birnbaum-Saunders distribution by the logarithm of the beta-Birnbaum-Saunders distribution. Explicit expressions for its generating function and moments are derived. We propose a new log-beta-Birnbaum-Saunders regression model that can be applied to censored data and be used more effectively in survival analysis. We obtain the maximum likelihood estimates of the model parameters for censored data and investigate influence diagnostics. The new location-scale regression model is modified for the possibility that long-term survivors may be present in the data. Its usefulness is illustrated by means of two real data sets. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
The log-Burr XII regression model for grouped survival data is evaluated in the presence of many ties. The methodology for grouped survival data is based on life tables, where the times are grouped in k intervals, and we fit discrete lifetime regression models to the data. The model parameters are estimated by maximum likelihood and jackknife methods. To detect influential observations in the proposed model, diagnostic measures based on case deletion, so-called global influence, and influence measures based on small perturbations in the data or in the model, referred to as local influence, are used. In addition to these measures, the total local influence and influential estimates are also used. We conduct Monte Carlo simulation studies to assess the finite sample behavior of the maximum likelihood estimators of the proposed model for grouped survival. A real data set is analyzed using a regression model for grouped data.
Abstract:
The impact of biogeographical ancestry, self-reported 'race/color' and geographical origin on the frequency distribution of 10 CYP2C functional polymorphisms (CYP2C8*2, *3, *4, CYP2C9*2, *3, *5, *11, CYP2C19*2, *3 and *17) and their haplotypes was assessed in a representative cohort of the Brazilian population (n = 1034). TaqMan assays were used for allele discrimination at each CYP2C locus investigated. Individual proportions of European, African and Amerindian biogeographical ancestry were estimated using a panel of insertion-deletion polymorphisms. Multinomial log-linear models were applied to infer the statistical association between the CYP2C alleles and haplotypes (response variables), and biogeographical ancestry, self-reported Color and geographical origin (explanatory variables). The results showed that CYP2C19*3, CYP2C9*5 and CYP2C9*11 were rare alleles (<1%); the frequency of the other variants ranged from 3.4% (CYP2C8*4) to 17.3% (CYP2C19*17). Two distinct haplotype blocks were identified: block 1 consists of three single nucleotide polymorphisms (SNPs) (CYP2C19*17, CYP2C19*2 and CYP2C9*2) and block 2 of six SNPs (CYP2C9*11, CYP2C9*3, CYP2C9*5, CYP2C8*2, CYP2C8*4 and CYP2C8*3). Diplotype analysis generated 41 haplotypes, of which eight had frequencies greater than 1% and together accounted for 96.4% of the overall genetic diversity. The distribution of CYP2C8 and CYP2C9 (but not CYP2C19) alleles, and of CYP2C haplotypes, was significantly associated with self-reported Color and with the individual proportions of European and African genetic ancestry, irrespective of Color self-identification. The individual odds of having alleles CYP2C8*2, CYP2C8*3, CYP2C9*2 and CYP2C9*3, and haplotypes including these alleles, varied continuously as the proportion of European ancestry increased.
Collectively, these data strongly suggest that the intrinsic heterogeneity of the Brazilian population must be acknowledged in the design and interpretation of pharmacogenomic studies of the CYP2C cluster in order to avoid spurious conclusions based on improper matching of study cohorts. This conclusion extends to other polymorphic pharmacogenes among Brazilians, and most likely to other admixed populations of the Americas. The Pharmacogenomics Journal (2012) 12, 267-276; doi: 10.1038/tpj.2010.89; published online 21 December 2010
Abstract:
The theoretical E-curve for the laminar flow of non-Newtonian fluids in circular tubes may not be accurate for real tubular systems with diffusion, mechanical vibration, wall roughness, pipe fittings, curves, coils, or corrugated walls. Deviations from the idealized laminar flow reactor (LFR) cannot be well represented using the axial dispersion or the tanks-in-series models of residence time distribution (RTD). In this work, four RTD models derived from non-ideal velocity profiles in segregated tube flow are proposed. They were used to represent the RTD of three tubular systems working with Newtonian and pseudoplastic fluids. Other RTD models were considered for comparison. The proposed models provided good fits, and it was possible to determine the active volumes. It is expected that these models can be useful for the analysis of LFR or for the evaluation of continuous thermal processing of viscous foods.
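For orientation, the idealized baseline that such models depart from can be written down for the simplest case. The sketch below assumes a Newtonian fluid with a fully developed parabolic velocity profile and no diffusion, for which E(t) = tau^2 / (2 t^3) for t >= tau / 2 and zero before that, with tau the mean residence time. It is not one of the four models proposed in the paper, and the function names are illustrative.

```python
def e_curve(t, tau):
    """Residence time density E(t) for ideal Newtonian laminar tube flow."""
    if t < tau / 2:
        return 0.0  # the fastest (centerline) fluid needs tau / 2 to exit
    return tau ** 2 / (2 * t ** 3)


def f_curve(t, tau):
    """Cumulative distribution F(t), the integral of E from 0 to t."""
    if t < tau / 2:
        return 0.0
    return 1 - tau ** 2 / (4 * t ** 2)
```

Under this assumption F(tau) = 0.75, so three quarters of the fluid has left the tube by the mean residence time; measured curves from real systems can be compared against this reference.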
Abstract:
This paper introduces a skewed log-Birnbaum-Saunders regression model based on the skewed sinh-normal distribution proposed by Leiva et al. [A skewed sinh-normal distribution and its properties and application to air pollution, Comm. Statist. Theory Methods 39 (2010), pp. 426-443]. Some influence methods, such as the local influence and generalized leverage, are presented. Additionally, we derived the normal curvatures of local influence under some perturbation schemes. An empirical application to a real data set is presented in order to illustrate the usefulness of the proposed model.
Abstract:
For the first time, we introduce a generalized form of the exponentiated generalized gamma distribution [Cordeiro et al. The exponentiated generalized gamma distribution with application to lifetime data, J. Statist. Comput. Simul. 81 (2011), pp. 827-842.] that is the baseline for the log-exponentiated generalized gamma regression model. The new distribution can accommodate increasing, decreasing, bathtub- and unimodal-shaped hazard functions. A second advantage is that it includes classical distributions reported in the lifetime literature as special cases. We obtain explicit expressions for the moments of the baseline distribution of the new regression model. The proposed model can be applied to censored data since it includes as sub-models several widely known regression models. It therefore can be used more effectively in the analysis of survival data. We obtain maximum likelihood estimates for the model parameters by considering censored data. We show that our extended regression model is very useful by means of two applications to real data.
Abstract:
In this paper, we carry out robust modeling and influence diagnostics in Birnbaum-Saunders (BS) regression models. Specifically, we present some aspects related to BS and log-BS distributions and their generalizations from the Student-t distribution, and develop BS-t regression models, including maximum likelihood estimation based on the EM algorithm and diagnostic tools. In addition, we apply the obtained results to real data from insurance, which shows the uses of the proposed model. Copyright (c) 2011 John Wiley & Sons, Ltd.
Abstract:
In this paper, we present approximate distributions for the ratio of the cumulative wavelet periodograms considering stationary and non-stationary time series generated from independent Gaussian processes. We also adapt an existing procedure to use this statistic and its approximate distribution in order to test if two regularly or irregularly spaced time series are realizations of the same generating process. Simulation studies show good size and power properties for the test statistic. An application with financial microdata illustrates the usefulness of the test. We conclude by advocating the use of these approximate distributions instead of the ones obtained through randomizations, mainly in the case of irregular time series. (C) 2012 Elsevier B.V. All rights reserved.
Abstract:
A thorough search for large-scale anisotropies in the distribution of arrival directions of cosmic rays detected above 10^18 eV at the Pierre Auger Observatory is presented. This search is performed as a function of both declination and right ascension in several energy ranges above 10^18 eV, and reported in terms of dipolar and quadrupolar coefficients. Within the systematic uncertainties, no significant deviation from isotropy is revealed. Assuming that any cosmic-ray anisotropy is dominated by dipole and quadrupole moments in this energy range, upper limits on their amplitudes are derived. These upper limits allow us to test the origin of cosmic rays above 10^18 eV from stationary Galactic sources densely distributed in the Galactic disk and predominantly emitting light particles in all directions.
Abstract:
Background: A popular model for gene regulatory networks is the Boolean network model. In this paper, we propose an algorithm to perform an analysis of gene regulatory interactions using the Boolean network model and time-series data. The Boolean network is restricted in the sense that only a subset of all possible Boolean functions is considered. We explore some mathematical properties of the restricted Boolean networks in order to avoid the full search approach. The problem is modeled as a Constraint Satisfaction Problem (CSP) and CSP techniques are used to solve it. Results: We applied the proposed algorithm to two data sets. First, we used an artificial dataset obtained from a model for the budding yeast cell cycle. The second data set is derived from experiments performed using HeLa cells. The results show that some interactions can be fully or, at least, partially determined under the Boolean model considered. Conclusions: The proposed algorithm can be used as a first step for detection of gene/protein interactions. It is able to infer gene relationships from time-series data of gene expression, and this inference process can be aided by a priori knowledge available.
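The idea of inferring interactions as a constraint problem can be sketched on a toy scale. The code below is a brute-force enumeration, not the CSP techniques of the paper, and it assumes one particular restricted family of Boolean functions: a threshold rule in which each regulator has weight -1 (repressor), 0 (no interaction) or +1 (activator), and a gene keeps its state when the weighted input sums to zero. The rule and all names are assumptions made for illustration.

```python
from itertools import product


def next_state(weights, state, current):
    """Threshold update: 1 if net input is positive, 0 if negative, else unchanged."""
    total = sum(w * s for w, s in zip(weights, state))
    if total > 0:
        return 1
    if total < 0:
        return 0
    return current


def consistent_weights(series, target):
    """All weight vectors for gene `target` that reproduce every observed transition."""
    n_genes = len(series[0])
    solutions = []
    for weights in product((-1, 0, 1), repeat=n_genes):
        if all(
            next_state(weights, state, state[target]) == following[target]
            for state, following in zip(series, series[1:])
        ):
            solutions.append(weights)
    return solutions
```

An interaction whose weight is the same in every returned vector is fully determined by the data; one that takes only two of the three values is partially determined. The enumeration grows as 3^n per gene, which is exactly the full search that a CSP formulation tries to avoid.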
Abstract:
This work is supported by the Brazilian agencies Fapesp, CAPES and CNPq.
Abstract:
Background: The discovery and development of anti-malarial compounds of plant origin and semisynthetic derivatives thereof, such as quinine (QN) and chloroquine (CQ), has highlighted the importance of these compounds in the treatment of malaria. Ursolic acid analogues bearing an acetyl group at C-3 have demonstrated significant anti-malarial activity. With this in mind, two new series of betulinic acid (BA) and ursolic acid (UA) derivatives with ester groups at C-3 were synthesized in an attempt to improve anti-malarial activity, reduce cytotoxicity, and search for new targets. In vitro activity against CQ-sensitive Plasmodium falciparum 3D7 and an evaluation of cytotoxicity in a mammalian cell line (HEK293T) are reported. Furthermore, two possible mechanisms of action of anti-malarial compounds have been evaluated: effects on mitochondrial membrane potential (ΔΨm) and inhibition of β-haematin formation. Results: Among the 18 derivatives synthesized, those having shorter side chains were most effective against CQ-sensitive P. falciparum 3D7, and were non-cytotoxic. These derivatives were three to five times more active than BA and UA. A DiOC6(3) ΔΨm assay showed that mitochondria are not involved in their mechanism of action. Inhibition of β-haematin formation by the active derivatives was weaker than with CQ. Compounds of the BA series were generally more active against P. falciparum 3D7 than those of the UA series. Conclusions: Three new anti-malarial prototypes were obtained from natural sources through an easy and relatively inexpensive synthesis. They represent an alternative for new lead compounds for anti-malarial chemotherapy.
Abstract:
Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them.
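The pairwise building block of such an analysis can be sketched as a nested regression comparison. The code below is a minimal one-lag, two-series version and does not reproduce the set-based method of the paper: series x is said to Granger-cause series y if adding x at time t-1 to a regression of y at time t on y at time t-1 reduces the residual sum of squares by more than chance would explain. The function names and the simulated data are illustrative assumptions.

```python
import random


def ols_rss(rows, targets):
    """Residual sum of squares of the least-squares fit of targets on rows."""
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * w for v, w in zip(a[row], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2
        for r, t in zip(rows, targets)
    )


def granger_f(cause, effect):
    """F statistic for `cause` Granger-causing `effect`, using a single lag."""
    y = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    full = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    rss_restricted = ols_rss(restricted, y)
    rss_full = ols_rss(full, y)
    dof = len(y) - 3  # observations minus parameters of the full model
    return (rss_restricted - rss_full) / (rss_full / dof)


# Simulated example: y follows x with a delay of one time step.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]
f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
```

In this simulated setting f_x_to_y should be far larger than f_y_to_x, reflecting the asymmetry in time that the definition relies on. A matrix of such statistics between genes is one possible input for a causality-based clustering.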