111 results for survival data analysis


Relevance: 90.00%

Abstract:

Alternative splicing of gene transcripts greatly expands the functional capacity of the genome, and certain splice isoforms may indicate specific disease states such as cancer. Splice junction microarrays interrogate thousands of splice junctions, but data analysis is difficult and error-prone because of the increased complexity compared to differential gene expression analysis. We present Rank Change Detection (RCD) as a method to identify differential splicing events based upon a straightforward probabilistic model comparing the over- or underrepresentation of two or more competing isoforms. RCD has advantages over commonly used methods because it is robust to false positive errors due to nonlinear trends in microarray measurements. Further, RCD does not depend on prior knowledge of splice isoforms, yet it takes advantage of the inherent structure of mutually exclusive junctions, and it is conceptually generalizable to other types of splicing arrays or RNA-Seq. RCD specifically identifies the biologically important cases when a splice junction becomes more or less prevalent compared to other mutually exclusive junctions. The example data come from different cell lines of glioblastoma tumors assayed with Agilent microarrays.

Relevance: 90.00%

Abstract:

This paper aims to find relations between the socioeconomic characteristics, activity participation, land use patterns and travel behavior of the residents in the Sao Paulo Metropolitan Area (SPMA) by using Exploratory Multivariate Data Analysis (EMDA) techniques. The variables influencing travel pattern choices are investigated using: (a) Cluster Analysis (CA), grouping and characterizing the Traffic Zones (TZ) and proposing an independent variable called Origin Cluster; and (b) Decision Tree (DT), to find a priori unknown relations among socioeconomic characteristics, land use attributes of the origin TZ and destination choices. The analysis was based on the origin-destination home-interview survey carried out in SPMA in 1997. The DT application revealed the variables of greatest influence on the travel pattern choice. The most important independent variable considered by DT is car ownership, followed by the use of transportation "credits" for transit tariff and, finally, activity participation variables and Origin Cluster. With these results, it was possible to analyze the influence of family income, car ownership, position of the individual in the family, use of transportation "credits" for transit tariff (mainly for travel mode sequence choice), activity participation (activity sequence choice) and Origin Cluster (destination/travel distance choice). (c) 2010 Elsevier Ltd. All rights reserved.

Relevance: 90.00%

Abstract:

In this paper we propose a new two-parameter lifetime distribution with increasing failure rate. The new distribution arises on the basis of a latent complementary risk problem. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulae for its reliability and failure rate functions, quantiles and moments, including the mean and variance. A simple EM-type algorithm for iteratively computing maximum likelihood estimates is presented. The Fisher information matrix is derived analytically in order to obtain the asymptotic covariance matrix. The methodology is illustrated on a real data set. (C) 2010 Elsevier B.V. All rights reserved.

Relevance: 90.00%

Abstract:

The inverse Weibull distribution has the ability to model failure rates which are quite common in reliability and biological studies. A three-parameter generalized inverse Weibull distribution with decreasing and unimodal failure rate is introduced and studied. We provide a comprehensive treatment of the mathematical properties of the new distribution including expressions for the moment generating function and the rth generalized moment. The mixture model of two generalized inverse Weibull distributions is investigated. The identifiability property of the mixture model is demonstrated. For the first time, we propose a location-scale regression model based on the log-generalized inverse Weibull distribution for modeling lifetime data. In addition, we develop some diagnostic tools for sensitivity analysis. Two applications to real data are given to illustrate the potential of the proposed regression model.
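
As an illustration of the unimodal failure rate mentioned above, here is a minimal sketch for the plain two-parameter inverse Weibull distribution, not the three-parameter generalization the abstract introduces. It assumes the parameterization F(t) = exp(-(scale / t) ** shape); the function name and the parameter values are hypothetical.

```python
import math

def inverse_weibull_hazard(t, shape, scale=1.0):
    """Hazard h(t) = f(t) / S(t) for F(t) = exp(-(scale / t) ** shape), t > 0."""
    z = (scale / t) ** shape
    # Density: f(t) = (shape / t) * z * exp(-z), obtained by differentiating F.
    density = (shape / t) * z * math.exp(-z)
    # Survival: S(t) = 1 - exp(-z); expm1 keeps precision when z is small.
    survival = -math.expm1(-z)
    return density / survival

# Evaluate the hazard on a coarse, arbitrary grid of times.
for t in (0.1, 0.5, 1.0, 2.0, 10.0):
    print(t, inverse_weibull_hazard(t, shape=2.0))
```

With shape 2 the printed hazard rises from near zero, peaks around t = 1, and then decays, which is the unimodal pattern.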

Relevance: 90.00%

Abstract:

Survival models involving frailties are commonly applied in studies where correlated event time data arise due to natural or artificial clustering. In this paper we present an application of such models in the animal breeding field. Specifically, a mixed survival model with a multivariate correlated frailty term is proposed for the analysis of data from over 3611 Brazilian Nellore cattle. The primary aim is to evaluate parental genetic effects on the trait length in days that their progeny need to gain a commercially specified standard weight gain. This trait is not measured directly but can be estimated from growth data. Results point to the importance of genetic effects and suggest that these models constitute a valuable data analysis tool for beef cattle breeding.

Relevance: 90.00%

Abstract:

A five-parameter distribution, called the beta modified Weibull distribution, is defined and studied. The new distribution contains, as special submodels, several important distributions discussed in the literature, such as the generalized modified Weibull, beta Weibull, exponentiated Weibull, beta exponential, modified Weibull and Weibull distributions, among others. The new distribution can be used effectively in the analysis of survival data since it accommodates monotone, unimodal and bathtub-shaped hazard functions. We derive the moments and examine the order statistics and their moments. We propose the method of maximum likelihood for estimating the model parameters and obtain the observed information matrix. A real data set is used to illustrate the importance and flexibility of the new distribution.

Relevance: 90.00%

Abstract:

A bathtub-shaped failure rate function is very useful in survival analysis and reliability studies. The well-known lifetime distributions do not have this property. For the first time, we propose a location-scale regression model based on the logarithm of an extended Weibull distribution which has the ability to deal with bathtub-shaped failure rate functions. We use the method of maximum likelihood to estimate the model parameters and some inferential procedures are presented. We reanalyze a real data set under the new model and the log-modified Weibull regression model. We perform a model check based on martingale-type residuals and generated envelopes and the statistics AIC and BIC to select appropriate models. (C) 2009 Elsevier B.V. All rights reserved.
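
The model selection step named above uses the AIC and BIC statistics. A minimal sketch of the standard definitions follows; the log-likelihood values, parameter counts and sample size are made up for illustration and do not come from the paper.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits: (maximized log-likelihood, number of parameters).
candidates = {"model_a": (-100.0, 3), "model_b": (-99.5, 4)}
n_obs = 50  # hypothetical sample size

for name, (loglik, k) in candidates.items():
    print(name, aic(loglik, k), bic(loglik, k, n_obs))
```

Lower values are preferred under both criteria; in this made-up comparison the extra parameter of `model_b` does not buy enough likelihood to offset its penalty.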

Relevance: 90.00%

Abstract:

A four-parameter extension of the generalized gamma distribution capable of modelling a bathtub-shaped hazard rate function is defined and studied. The beauty and importance of this distribution lies in its ability to model monotone and non-monotone failure rate functions, which are quite common in lifetime data analysis and reliability. The new distribution has a number of well-known lifetime special sub-models, such as the exponentiated Weibull, exponentiated generalized half-normal, exponentiated gamma and generalized Rayleigh, among others. We derive two infinite sum representations for its moments. We calculate the density of the order statistics and two expansions for their moments. The method of maximum likelihood is used for estimating the model parameters and the observed information matrix is obtained. Finally, a real data set from the medical area is analysed.

Relevance: 90.00%

Abstract:

Joint generalized linear models and double generalized linear models (DGLMs) were designed to model outcomes for which the variability can be explained using factors and/or covariates. When such factors operate, the usual normal regression models, which inherently exhibit constant variance, will under-represent variation in the data and hence may lead to erroneous inferences. For count and proportion data, such noise factors can generate a so-called overdispersion effect, and the use of binomial and Poisson models underestimates the variability and, consequently, incorrectly indicates significant effects. In this manuscript, we propose a DGLM from a Bayesian perspective, focusing on the case of proportion data, where the overdispersion can be modeled using a random effect that depends on some noise factors. The posterior joint density function was sampled using Markov chain Monte Carlo algorithms, allowing inferences over the model parameters. An application to a data set on apple tissue culture is presented, for which it is shown that the Bayesian approach is quite feasible, even when limited prior information is available, thereby generating valuable insight for the researcher about their experimental results.
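
To make the overdispersion effect concrete, the sketch below computes a simple Pearson dispersion statistic for proportion data under a single pooled binomial probability. This is a common diagnostic, not the Bayesian DGLM proposed in the abstract, and the counts are invented for illustration.

```python
def pearson_dispersion(successes, trials):
    """Pearson chi-square divided by its degrees of freedom.

    Assumes one common success probability estimated from the pooled data;
    values well above 1 suggest more variation than a binomial model allows.
    """
    p = sum(successes) / sum(trials)
    chi2 = sum((y - n * p) ** 2 / (n * p * (1 - p))
               for y, n in zip(successes, trials))
    return chi2 / (len(trials) - 1)  # one parameter (p) was estimated

# Hypothetical data: successes out of 20 trials in six experimental units.
successes = [2, 18, 5, 15, 10, 10]
trials = [20] * 6

print(pearson_dispersion(successes, trials))
```

Here the statistic is far above 1, so a plain binomial model would understate the variability, which is the situation the abstract describes.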

Relevance: 90.00%

Abstract:

We introduce the log-beta Weibull regression model based on the beta Weibull distribution (Famoye et al., 2005; Lee et al., 2007). We derive expansions for the moment generating function which do not depend on complicated functions. The new regression model represents a parametric family of models that includes as sub-models several widely known regression models that can be applied to censored survival data. We employ a frequentist analysis, a jackknife estimator, and a parametric bootstrap for the parameters of the proposed model. We derive the appropriate matrices for assessing local influences on the parameter estimates under different perturbation schemes and present some ways to assess global influences. Further, for different parameter settings, sample sizes, and censoring percentages, several simulations are performed. In addition, the empirical distribution of some modified residuals is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be extended to a modified deviance residual in the proposed regression model applied to censored data. We define martingale and deviance residuals to evaluate the model assumptions. The extended regression model is very useful for the analysis of real data and could give more realistic fits than other special regression models.

Relevance: 90.00%

Abstract:

The stock market suffers uncertain relations throughout the entire negotiation process, with different variables exerting direct and indirect influence on stock prices. This study focuses on the analysis of certain aspects that may influence these values offered by the capital market, based on the Brazil Index of the Sao Paulo Stock Exchange (Bovespa), which selects 100 stocks among the most traded on Bovespa in terms of number of trades and financial volume. The selected variables are characterized by the companies' activity area and the business volume in the month of data collection, i.e. April 2007. This article proposes an analysis that joins the accounting view of the stock price variables that can be influenced with the use of multivariate qualitative data analysis. Data were explored through Correspondence Analysis (Anacor) and Homogeneity Analysis (Homals). According to the research, the selected variables are associated with the values presented by the stocks, which become an internal control instrument and a decision-making tool when it comes to choosing investments.

Relevance: 90.00%

Abstract:

Background/Purpose: The median survival for patients with metastatic colorectal cancer (mCRC) has progressively increased over the past decades. Since the introduction of 5-fluorouracil (5-FU)-based chemotherapy, followed by hepatic resection of metastases, and more recently the adoption of newer chemotherapeutic regimens associated with targeted therapy, the gains are getting more substantial. Despite the recognition of the potential for long-term survival after surgical resection of metastatic disease, long-term survival data to determine the potential curative role of chemotherapy alone is lacking. Methods: We performed a retrospective review of 2751 patients who presented with mCRC at The MD Anderson Cancer Center from 1990 through 2003. Patients alive at 5 years who achieved complete response with chemotherapy and were not submitted to any surgical or interventional procedures directed to the metastatic sites were included in the analysis. Results: The 5-year overall survival rate for all patients with mCRC during this period was 10.8%. Among these long-term survivors, 2.2% achieved a sustained complete response after chemotherapy (all 6 with fluoropyrimidines and 2 with irinotecan) as the only treatment modality and were without evidence of disease until the last follow-up visit (median of 10.3 years). This number corresponds to 0.24% (6 of 2541) of all patients with mCRC included in this review. Conclusion: Cure with chemotherapy alone is possible for a very small number of patients with metastatic colorectal cancer. Improved therapies are increasing complete response rates, but the impact of modern chemotherapy on durable complete responses will require additional follow up.

Relevance: 90.00%

Abstract:

Astronomy has evolved almost exclusively by the use of spectroscopic and imaging techniques, operated separately. With the development of modern technologies, it is possible to obtain data cubes in which one combines both techniques simultaneously, producing images with spectral resolution. Extracting information from them can be quite complex, and hence the development of new methods of data analysis is desirable. We present a method of analysis of data cubes (data from single field observations, containing two spatial dimensions and one spectral dimension) that uses Principal Component Analysis (PCA) to express the data in a form of reduced dimensionality, facilitating efficient information extraction from very large data sets. PCA transforms the system of correlated coordinates into a system of uncorrelated coordinates ordered by principal components of decreasing variance. The new coordinates are referred to as eigenvectors, and the projections of the data onto these coordinates produce images we will call tomograms. The association of the tomograms (images) to eigenvectors (spectra) is important for the interpretation of both. The eigenvectors are mutually orthogonal, and this information is fundamental for their handling and interpretation. When the data cube shows objects that present uncorrelated physical phenomena, the eigenvectors' orthogonality may be instrumental in separating and identifying them. By handling eigenvectors and tomograms, one can enhance features, extract noise, compress data, extract spectra, etc. We applied the method, for illustration purposes only, to the central region of the low ionization nuclear emission region (LINER) galaxy NGC 4736, and demonstrate that it has a type 1 active nucleus, not known before. Furthermore, we show that it is displaced from the centre of its stellar bulge.
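
The eigenvector and tomogram construction described above can be sketched for a toy cube as follows. This is only an illustration under stated assumptions: the cube is a tiny made-up nested list, only the first principal component is computed, and power iteration stands in for the full eigendecomposition a real analysis would use. The function name is hypothetical.

```python
import math

def leading_component(cube, iterations=200):
    """Return (eigenvector, tomogram) for the first principal component.

    cube[y][x] is a spectrum (list of floats); all spectra share one length.
    """
    ny, nx = len(cube), len(cube[0])
    spectra = [cube[y][x] for y in range(ny) for x in range(nx)]
    n_pix, n_lam = len(spectra), len(spectra[0])

    # Subtract the mean spectrum so the covariance is taken about the mean.
    mean = [sum(s[k] for s in spectra) / n_pix for k in range(n_lam)]
    centered = [[s[k] - mean[k] for k in range(n_lam)] for s in spectra]

    # Covariance between spectral channels (n_lam x n_lam).
    cov = [[sum(c[i] * c[j] for c in centered) / (n_pix - 1)
            for j in range(n_lam)] for i in range(n_lam)]

    # Power iteration: converges to the direction of largest variance,
    # provided the start vector is not orthogonal to it.
    v = [1.0] * n_lam
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(n_lam)) for i in range(n_lam)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Tomogram: projection of every pixel's spectrum onto the eigenvector.
    scores = [sum(c[k] * v[k] for k in range(n_lam)) for c in centered]
    tomogram = [scores[y * nx:(y + 1) * nx] for y in range(ny)]
    return v, tomogram

# Hypothetical 2 x 2 pixel cube with 3 spectral channels: every spectrum is
# the same shape [1, 2, 3] scaled by a different brightness.
cube = [[[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]],
        [[3.0, 6.0, 9.0], [4.0, 8.0, 12.0]]]
eigenvector, tomogram = leading_component(cube)
print(eigenvector)
print(tomogram)
```

In this toy case the eigenvector recovers the common spectral shape and the tomogram shows how strongly each pixel expresses it, which mirrors the pairing of spectra and images in the abstract.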

Relevance: 90.00%

Abstract:

A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it presents a clustering algorithm to solve this formulation that uses GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other known algorithms. The proposed algorithm presented the best clustering results, confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.

Relevance: 90.00%

Abstract:

In this paper, we formulate a flexible density function from the selection mechanism viewpoint (see, for example, Bayarri and DeGroot (1992) and Arellano-Valle et al. (2006)) which possesses nice biological and physical interpretations. The new density function contains as special cases many models that have been proposed recently in the literature. In constructing this model, we assume that the number of competing causes of the event of interest has a general discrete distribution characterized by its probability generating function. This function has an important role in the selection procedure as well as in computing the conditional personal cure rate. Finally, we illustrate how various models can be deduced as special cases of the proposed model. (C) 2011 Elsevier B.V. All rights reserved.
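
The role of the probability generating function in cure rate models can be illustrated with the standard construction in which the event occurs at the first of N latent causes, so that the population survival is A(S(t)), with A the generating function of N, and the cure fraction is A(0) = P(N = 0). The sketch below assumes a Poisson number of causes and an exponential baseline; both are illustrative choices and may differ from the general formulation in the paper.

```python
import math

def poisson_pgf(s, theta):
    """Probability generating function of a Poisson(theta) count."""
    return math.exp(-theta * (1.0 - s))

def population_survival(t, theta, rate):
    """Survival of the whole population, cured individuals included."""
    baseline = math.exp(-rate * t)  # exponential baseline S(t), an assumption
    return poisson_pgf(baseline, theta)

theta, rate = 1.5, 0.2  # hypothetical parameter values
cure_fraction = poisson_pgf(0.0, theta)  # A(0) = P(N = 0) = exp(-theta)

print(cure_fraction)
for t in (0.0, 5.0, 50.0):
    print(t, population_survival(t, theta, rate))
```

The population survival starts at 1 and levels off at the cure fraction instead of falling to zero, which is the defining feature of a cure rate model.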