32 results for Real data
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
The log-Burr XII regression model for grouped survival data is evaluated in the presence of many ties. The methodology for grouped survival data is based on life tables, where the times are grouped in k intervals, and we fit discrete lifetime regression models to the data. The model parameters are estimated by maximum likelihood and jackknife methods. To detect influential observations in the proposed model, diagnostic measures based on case deletion, so-called global influence, and influence measures based on small perturbations in the data or in the model, referred to as local influence, are used. In addition to these measures, the total local influence and influential estimates are also used. We conduct Monte Carlo simulation studies to assess the finite sample behavior of the maximum likelihood estimators of the proposed model for grouped survival. A real data set is analyzed using a regression model for grouped data.
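The life-table grouping described in this abstract can be illustrated with a minimal sketch. It assumes right-censored data, half-open intervals, and the simplest convention for censoring (a subject counts as at risk in every interval that starts on or before its recorded time). The function name `life_table` and its arguments are hypothetical, and the sketch computes only empirical discrete hazards, not the log-Burr XII regression fit itself.

```python
def life_table(times, events, cuts):
    """Group survival times into intervals [cuts[j], cuts[j+1]) and return
    one row per interval: (start, end, at_risk, deaths, hazard, survival)."""
    rows = []
    survival = 1.0
    for start, end in zip(cuts[:-1], cuts[1:]):
        # Subjects still under observation at the start of the interval.
        at_risk = sum(1 for t in times if t >= start)
        # Observed (uncensored) failures that fall inside the interval.
        deaths = sum(
            1 for t, e in zip(times, events) if e == 1 and start <= t < end
        )
        # Discrete hazard: conditional probability of failing in the interval.
        hazard = deaths / at_risk if at_risk else 0.0
        survival *= 1.0 - hazard
        rows.append((start, end, at_risk, deaths, hazard, survival))
    return rows


# Hypothetical data: events[i] == 1 is an observed failure, 0 is censored.
times = [1, 2, 2, 3, 5, 5, 6, 8]
events = [1, 1, 0, 1, 1, 1, 0, 1]
for row in life_table(times, events, cuts=[0, 3, 6, 9]):
    print(row)
```

With many ties, several subjects share the same interval, which is why the discrete hazard per interval is the natural modelling target here.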
Abstract:
The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspects weighting was proposed in the literature. However, SCAD may fail and halt given certain conditions. To fix this problem, its steps are modified and then reordered to reduce the number of parameters required to be set by the user. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost-function with respect to the argument being optimized. The asymptotic analysis of ASCAD leads to a time complexity which is the same as that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that considers aspect weights in order to estimate the number of clusters are also described. The proposed method is assessed over several artificial and real data sets.
Abstract:
For the first time, we introduce a generalized form of the exponentiated generalized gamma distribution [Cordeiro et al. The exponentiated generalized gamma distribution with application to lifetime data, J. Statist. Comput. Simul. 81 (2011), pp. 827-842.] that is the baseline for the log-exponentiated generalized gamma regression model. The new distribution can accommodate increasing, decreasing, bathtub- and unimodal-shaped hazard functions. A second advantage is that it includes classical distributions reported in the lifetime literature as special cases. We obtain explicit expressions for the moments of the baseline distribution of the new regression model. The proposed model can be applied to censored data since it includes as sub-models several widely known regression models. It therefore can be used more effectively in the analysis of survival data. We obtain maximum likelihood estimates for the model parameters by considering censored data. We show that our extended regression model is very useful by means of two applications to real data.
Abstract:
The beta-Birnbaum-Saunders (Cordeiro and Lemonte, 2011) and Birnbaum-Saunders (Birnbaum and Saunders, 1969a) distributions have been used quite effectively to model failure times for materials subject to fatigue and lifetime data. We define the log-beta-Birnbaum-Saunders distribution by the logarithm of the beta-Birnbaum-Saunders distribution. Explicit expressions for its generating function and moments are derived. We propose a new log-beta-Birnbaum-Saunders regression model that can be applied to censored data and be used more effectively in survival analysis. We obtain the maximum likelihood estimates of the model parameters for censored data and investigate influence diagnostics. The new location-scale regression model is modified for the possibility that long-term survivors may be present in the data. Its usefulness is illustrated by means of two real data sets. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
The autoregressive (AR) estimator, a non-parametric method, is used to analyze functional magnetic resonance imaging (fMRI) data. The same method has been used, with success, in several other time series data analyses. It uses exclusively the available experimental data points to estimate the most plausible power spectra compatible with the experimental data, and there is no need to make any assumption about non-measured points. The time series, obtained from fMRI block paradigm data, is analyzed by the AR method to determine the active brain regions involved in the processing of a given stimulus. This method is considerably more reliable than the fast Fourier transform or the parametric methods. The time series corresponding to each image pixel is analyzed using the AR estimator and the corresponding poles are obtained. The pole distribution gives the shape of the power spectra, and the pixels with poles at the stimulation frequency are considered as the active regions. The method was applied to simulated and real data; its superiority is shown by the receiver operating characteristic curves, which were obtained using the simulated data.
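The pole-based detection idea can be sketched for a single pixel time series. The abstract does not state the model order or the fitting procedure, so the order 2 and the Yule-Walker equations used below are assumptions chosen for brevity, and every name (`autocov`, `ar2_poles`, `pole_frequency`, `f_stim`) is hypothetical. The synthetic series stands in for one pixel of a block-paradigm experiment.

```python
import cmath
import math
import random


def autocov(x, lag):
    """Biased sample autocovariance of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def ar2_poles(x):
    """Fit x[t] = a1*x[t-1] + a2*x[t-2] + noise by Yule-Walker and return
    the two poles, i.e. the roots of z**2 - a1*z - a2 = 0."""
    r0, r1, r2 = autocov(x, 0), autocov(x, 1), autocov(x, 2)
    det = r0 * r0 - r1 * r1
    a1 = r1 * (r0 - r2) / det
    a2 = (r0 * r2 - r1 * r1) / det
    root = cmath.sqrt(a1 * a1 + 4.0 * a2)
    return (a1 + root) / 2.0, (a1 - root) / 2.0


def pole_frequency(pole):
    """Angular position of the pole converted to cycles per sample."""
    return abs(cmath.phase(pole)) / (2.0 * math.pi)


rng = random.Random(0)
f_stim = 0.1  # assumed stimulation frequency, in cycles per sample
series = [
    math.sin(2.0 * math.pi * f_stim * t) + rng.gauss(0.0, 0.1)
    for t in range(2000)
]
poles = ar2_poles(series)
print([round(pole_frequency(p), 3) for p in poles])
```

A pixel would be flagged as active when a pole frequency lies close to `f_stim`; the tolerance for "close" is a design choice the abstract leaves open.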
Abstract:
In this paper we introduce an extension of the Lindley distribution which offers a more flexible model for lifetime data. Several statistical properties of the distribution are explored, such as the density, (reversed) failure rate, (reversed) mean residual lifetime, moments, order statistics, Bonferroni and Lorenz curves. Estimation using the maximum likelihood and inference of a random sample from the distribution are investigated. A real data application illustrates the performance of the distribution. (C) 2011 The Korean Statistical Society. Published by Elsevier B.V. All rights reserved.
Abstract:
In this article, we introduce an asymmetric extension to the univariate slash-elliptical family of distributions studied in Gomez et al. (2007a). This new family results from a scale mixture between the epsilon-skew-symmetric family of distributions and the uniform distribution. A general expression is presented for the density with special cases such as the normal, Cauchy, Student-t, and Pearson type II distributions. Some special properties and moments are also investigated. Results of applications to two real data sets are also reported, illustrating that the family introduced can be useful in practice.
Abstract:
An experimental platform that allows application of internal faults on the armature windings of a specially modified synchronous generator in a controlled environment is described. It allows recording and studying current and voltage waveforms of internal fault conditions that may occur in a synchronous generator. Thus, traditional and new protection functions can be tested by using real data, and the transient response of the machine due to internal faults can be analyzed more closely. The hardware-software platform is described in detail, as well as all its control functions. The results can contribute significantly to new protection developments, as well as to educational purposes.
Abstract:
The purpose of this paper is to develop a Bayesian analysis for the right-censored survival data when immune or cured individuals may be present in the population from which the data is taken. In our approach the number of competing causes of the event of interest follows the Conway-Maxwell-Poisson distribution which generalizes the Poisson distribution. Markov chain Monte Carlo (MCMC) methods are used to develop a Bayesian procedure for the proposed model. Also, some discussions on the model selection and an illustration with a real data set are considered.
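To illustrate how the Conway-Maxwell-Poisson distribution generalizes the Poisson, the sketch below evaluates its probability mass function, which is proportional to `lam**m / (m!)**nu`. Truncating the normalizing series at `max_terms` is an assumption made for simplicity, and the function name and parameter names are hypothetical. It is not the Bayesian MCMC procedure of the paper.

```python
import math


def com_poisson_pmf(m, lam, nu, max_terms=200):
    """P(M = m) proportional to lam**m / (m!)**nu, normalised by a series
    truncated at max_terms (for nu == 0 the series needs lam < 1)."""

    def log_weight(j):
        # Work on the log scale to avoid overflow in j! for large j.
        return j * math.log(lam) - nu * math.lgamma(j + 1)

    normaliser = sum(math.exp(log_weight(j)) for j in range(max_terms))
    return math.exp(log_weight(m)) / normaliser


# nu = 1 recovers the Poisson distribution with mean lam.
poisson_value = math.exp(-2.5) * 2.5 ** 3 / math.factorial(3)
print(com_poisson_pmf(3, 2.5, 1.0), poisson_value)
```

Values of `nu` below 1 give heavier tails than the Poisson (over-dispersion), and values above 1 give lighter tails (under-dispersion).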
Abstract:
In this article we introduce a three-parameter extension of the bivariate exponential-geometric (BEG) law (Kozubowski and Panorska, 2005) [4]. We refer to this new distribution as the bivariate gamma-geometric (BGG) law. A bivariate random vector (X, N) follows the BGG law if N has geometric distribution and X may be represented (in law) as a sum of N independent and identically distributed gamma variables, where these variables are independent of N. Statistical properties such as moment generating and characteristic functions, moments and a variance-covariance matrix are provided. The marginal and conditional laws are also studied. We show that the BGG distribution is infinitely divisible, just as the BEG model is. Further, we provide alternative representations for the BGG distribution and show that it enjoys a geometric stability property. Maximum likelihood estimation and inference are discussed and a reparametrization is proposed in order to obtain orthogonality of the parameters. We present an application to a real data set where our model provides a better fit than the BEG model. Our bivariate distribution induces a bivariate Levy process with correlated gamma and negative binomial processes, which extends the bivariate Levy motion proposed by Kozubowski et al. (2008) [6]. The marginals of our Levy motion are a mixture of gamma and negative binomial processes and we named it BMixGNB motion. Basic properties such as stochastic self-similarity and the covariance matrix of the process are presented. The bivariate distribution at fixed time of our BMixGNB process is also studied and some results are derived, including a discussion about maximum likelihood estimation and inference. (C) 2012 Elsevier Inc. All rights reserved.
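The stochastic representation of the BGG law translates directly into a sampler. The sketch below assumes that N is geometric on {1, 2, ...} with success probability `p` and that the gamma variables use a shape/scale parameterization; the abstract fixes neither convention, so both are assumptions, and `sample_bgg` is a hypothetical name.

```python
import math
import random


def sample_bgg(alpha, scale, p, rng):
    """Draw one (X, N): N is geometric on {1, 2, ...} with success
    probability p, and X is the sum of N i.i.d. gamma(alpha, scale)
    variables drawn independently of N."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    n = int(math.log(u) / math.log(1.0 - p)) + 1  # inverse-CDF geometric draw
    x = sum(rng.gammavariate(alpha, scale) for _ in range(n))
    return x, n


rng = random.Random(1)
draws = [sample_bgg(alpha=2.0, scale=1.5, p=0.3, rng=rng) for _ in range(20000)]
mean_x = sum(x for x, _ in draws) / len(draws)
mean_n = sum(n for _, n in draws) / len(draws)
print(mean_x, mean_n)
```

Under these assumptions the sample means should be near E[N] = 1/p and E[X] = alpha * scale / p, which gives a quick sanity check on the sampler.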
Abstract:
In this paper we obtain asymptotic expansions, up to order $n^{-1/2}$ and under a sequence of Pitman alternatives, for the nonnull distribution functions of the likelihood ratio, Wald, score and gradient test statistics in the class of symmetric linear regression models. This is a wide class of models which encompasses the t model and several other symmetric distributions with longer-than-normal tails. The asymptotic distributions of all four statistics are obtained for testing a subset of regression parameters. Furthermore, in order to compare the finite-sample performance of these tests in this class of models, Monte Carlo simulations are presented. An empirical application to a real data set is considered for illustrative purposes. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
In this paper we introduce a new distribution, namely the slashed half-normal distribution, which can be seen as an extension of the half-normal distribution. It is shown that the resulting distribution has more kurtosis than the ordinary half-normal distribution. Moments and some properties are derived for the new distribution. Moment estimators and maximum likelihood estimators can be computed using numerical procedures. Results of two real data applications are reported, where model fitting is implemented by using maximum likelihood estimation. The applications illustrate the better performance of the new distribution.
Abstract:
In this paper, an alternative skew Student-t family of distributions is studied. It is obtained as an extension of the generalized Student-t (GS-t) family introduced by McDonald and Newey [10]. The extension that is obtained can be seen as a reparametrization of the skewed GS-t distribution considered by Theodossiou [14]. A key element in the construction of such an extension is that it can be stochastically represented as a mixture of an epsilon-skew-power-exponential distribution [1] and a generalized-gamma distribution. From this representation, we can readily derive theoretical properties and easy-to-implement simulation schemes. Furthermore, we study some of its main properties including stochastic representation, moments and asymmetry and kurtosis coefficients. We also derive the Fisher information matrix, which is shown to be nonsingular for some special cases such as when the asymmetry parameter is null, that is, at the vicinity of symmetry, and discuss maximum-likelihood estimation. Simulation studies for some particular cases and real data analysis are also reported, illustrating the usefulness of the extension considered.
Abstract:
Background: In the analysis of effects by cell treatment such as drug dosing, identifying changes on gene network structures between normal and treated cells is a key task. A possible way for identifying the changes is to compare structures of networks estimated from data on normal and treated cells separately. However, this approach usually fails to estimate accurate gene networks due to the limited length of time series data and measurement noise. Thus, approaches that identify changes on regulations by using time series data on both conditions in an efficient manner are demanded. Methods: We propose a new statistical approach that is based on the state space representation of the vector autoregressive model and estimates gene networks on two different conditions in order to identify changes on regulations between the conditions. In the mathematical model of our approach, hidden binary variables are newly introduced to indicate the presence of regulations on each condition. The use of the hidden binary variables enables an efficient data usage; data on both conditions are used for commonly existing regulations, while for condition specific regulations corresponding data are only applied. Also, the similarity of networks on two conditions is automatically considered from the design of the potential function for the hidden binary variables. For the estimation of the hidden binary variables, we derive a new variational annealing method that searches the configuration of the binary variables maximizing the marginal likelihood. Results: For the performance evaluation, we use time series data from two topologically similar synthetic networks, and confirm that our proposed approach estimates commonly existing regulations as well as changes on regulations with higher coverage and precision than other existing approaches in almost all the experimental settings. 
For a real data application, our proposed approach is applied to time series data from normal human lung cells and human lung cells treated by stimulating EGF-receptors and dosing an anticancer drug termed Gefitinib. In the treated lung cells, a cancer cell condition is simulated by the stimulation of EGF-receptors, but the effect would be counteracted due to the selective inhibition of EGF-receptors by Gefitinib. However, gene expression profiles are actually different between the conditions, and the genes related to the identified changes are considered as possible off-targets of Gefitinib. Conclusions: From the synthetically generated time series data, our proposed approach can identify changes on regulations more accurately than existing methods. By applying the proposed approach to the time series data on normal and treated human lung cells, candidates of off-target genes of Gefitinib are found. According to the published clinical information, one of the genes can be related to a factor of interstitial pneumonia, which is known as a side effect of Gefitinib.
Abstract:
Ng and Kotz (1995) introduced a distribution that provides greater flexibility to extremes. We define and study a new class of distributions called the Kummer beta generalized family to extend the normal, Weibull, gamma and Gumbel distributions, among several other well-known distributions. Some special models are discussed. The ordinary moments of any distribution in the new family can be expressed as linear functions of probability weighted moments of the baseline distribution. We examine the asymptotic distributions of the extreme values. We derive the density function of the order statistics, mean absolute deviations and entropies. We use maximum likelihood estimation to fit the distributions in the new class and illustrate its potentiality with an application to a real data set.