995 results for CENSORED DATA


Relevance:

70.00%

Publisher:

Abstract:

A compositional multivariate approach is used to analyse regional scale soil geochemical data obtained as part of the Tellus Project generated by the Geological Survey of Northern Ireland (GSNI). The multi-element total concentration data presented comprise XRF analyses of 6862 rural soil samples collected at 20 cm depths on a non-aligned grid at one site per 2 km². Censored data were imputed using published detection limits. Using these imputed values for 46 elements (including LOI), each soil sample site was assigned to the regional geology map provided by GSNI, initially using the dominant lithology for the map polygon. Northern Ireland includes a diversity of geology representing a stratigraphic record from the Mesoproterozoic up to and including the Palaeogene. However, the advance of ice sheets and their meltwaters over the last 100,000 years has left at least 80% of the bedrock covered by superficial deposits, including glacial till and post-glacial alluvium and peat. The question is to what extent the soil geochemistry reflects the underlying geology or the superficial deposits. To address this, the geochemical data were transformed using centered log ratios (clr) to observe the requirements of compositional data analysis and avoid closure issues. Following this, compositional multivariate techniques, including compositional Principal Component Analysis (PCA) and minimum/maximum autocorrelation factor (MAF) analysis, were used to determine the influence of underlying geology on the soil geochemistry signature. PCA showed that 72% of the variation was determined by the first four principal components (PCs), implying “significant” structure in the data. Analysis of variance showed that only 10 PCs were necessary to classify the soil geochemical data. To consider an improvement over PCA that uses the spatial relationships of the data, a classification based on MAF analysis was undertaken using the first 6 dominant factors.
Understanding the relationship between soil geochemistry and superficial deposits is important for environmental monitoring of fragile ecosystems such as peat. To explore whether peat cover could be predicted from the classification, the lithology designation was adapted to include the presence of peat, based on GSNI superficial deposit polygons, and linear discriminant analysis (LDA) was undertaken. Prediction accuracy for LDA classification improved from 60.98% based on PCA using 10 principal components to 64.73% using MAF based on the 6 most dominant factors. The misclassification of peat may reflect degradation of peat-covered areas since the creation of the superficial deposit classification. Further work will examine the influence of underlying lithologies on elemental concentrations in peat composition and the effect of this in classification analysis.
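The centered log-ratio (clr) transform mentioned in this abstract can be illustrated with a minimal sketch. The function name `clr` and the three-part sample below are assumptions made for illustration; they are not taken from the Tellus data or from the authors' code.

```python
import math

def clr(composition):
    """Centered log-ratio transform of a strictly positive composition.

    Each part is replaced by its log minus the mean of the logs, i.e. the
    log of the part divided by the geometric mean of all parts.
    """
    if any(x <= 0 for x in composition):
        # Zeros and nondetects must be imputed before taking logs.
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical three-part composition (proportions summing to 1).
sample = [0.60, 0.30, 0.10]
coords = clr(sample)

# The same composition expressed in other units gives the same coordinates,
# which is how the transform sidesteps the closure problem.
rescaled = clr([6.0, 3.0, 1.0])
```

The clr coordinates always sum to zero, which is why censored values have to be imputed with positive numbers before the transform is applied.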

Relevance:

70.00%

Publisher:

Abstract:

Survival models are widely applied in engineering to model time-to-event data, where censored data are a common issue. Whether parametric or not, such models may not always fit heterogeneous data well. The present study relies on survival data for critical pumps, where traditional parametric regression might be improved upon. Taking censoring into account, we used an empirical method to split the data into two subgroups so that separate models could be fitted, and then combined two distinct distributions following a mixture-model approach. We conclude that this is a good method for fitting data that do not follow a usual parametric distribution and for obtaining reliable parameters. A constant cumulative hazard rate policy was also used to determine optimum inspection times from the fitted mixture model; these can be compared with current maintenance policies to check whether changes should be introduced.

Relevance:

60.00%

Publisher:

Abstract:

The inverse Weibull distribution has the ability to model failure rates which are quite common in reliability and biological studies. A three-parameter generalized inverse Weibull distribution with decreasing and unimodal failure rate is introduced and studied. We provide a comprehensive treatment of the mathematical properties of the new distribution including expressions for the moment generating function and the rth generalized moment. The mixture model of two generalized inverse Weibull distributions is investigated. The identifiability property of the mixture model is demonstrated. For the first time, we propose a location-scale regression model based on the log-generalized inverse Weibull distribution for modeling lifetime data. In addition, we develop some diagnostic tools for sensitivity analysis. Two applications of real data are given to illustrate the potentiality of the proposed regression model.

Relevance:

60.00%

Publisher:

Abstract:

In this paper, we compare three residuals to assess departures from the error assumptions as well as to detect outlying observations in log-Burr XII regression models with censored observations. These residuals can also be used for the log-logistic regression model, which is a special case of the log-Burr XII regression model. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and the empirical distribution of each residual is displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended to the modified martingale-type residual in log-Burr XII regression models with censored data.

Relevance:

60.00%

Publisher:

Abstract:

A bathtub-shaped failure rate function is very useful in survival analysis and reliability studies. The well-known lifetime distributions do not have this property. For the first time, we propose a location-scale regression model based on the logarithm of an extended Weibull distribution which has the ability to deal with bathtub-shaped failure rate functions. We use the method of maximum likelihood to estimate the model parameters and some inferential procedures are presented. We reanalyze a real data set under the new model and the log-modified Weibull regression model. We perform a model check based on martingale-type residuals and generated envelopes and the statistics AIC and BIC to select appropriate models. (C) 2009 Elsevier B.V. All rights reserved.

Relevance:

60.00%

Publisher:

Abstract:

In a sample of censored survival times, the presence of an immune proportion of individuals who are not subject to death, failure or relapse, may be indicated by a relatively high number of individuals with large censored survival times. In this paper the generalized log-gamma model is modified for the possibility that long-term survivors may be present in the data. The model attempts to separately estimate the effects of covariates on the surviving fraction, that is, the proportion of the population for which the event never occurs. The logistic function is used for the regression model of the surviving fraction. Inference for the model parameters is considered via maximum likelihood. Some influence methods, such as the local influence and total local influence of an individual are derived, analyzed and discussed. Finally, a data set from the medical area is analyzed under the log-gamma generalized mixture model. A residual analysis is performed in order to select an appropriate model.

Relevance:

60.00%

Publisher:

Abstract:

The zero-inflated negative binomial model is used to account for overdispersion detected in data that are initially analyzed under the zero-inflated Poisson model. A frequentist analysis, a jackknife estimator and a non-parametric bootstrap for parameter estimation of zero-inflated negative binomial regression models are considered. In addition, an EM-type algorithm is developed for performing maximum likelihood estimation. Then, the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and some ways to perform global influence analysis are derived. In order to study departures from the error assumption as well as the presence of outliers, residual analysis based on the standardized Pearson residuals is discussed. The relevance of the approach is illustrated with a real data set, where it is shown that zero-inflated negative binomial regression models seem to fit the data better than the Poisson counterpart. (C) 2010 Elsevier B.V. All rights reserved.

Relevance:

60.00%

Publisher:

Abstract:

We introduce the log-beta Weibull regression model based on the beta Weibull distribution (Famoye et al., 2005; Lee et al., 2007). We derive expansions for the moment generating function which do not depend on complicated functions. The new regression model represents a parametric family of models that includes as sub-models several widely known regression models that can be applied to censored survival data. We employ a frequentist analysis, a jackknife estimator, and a parametric bootstrap for the parameters of the proposed model. We derive the appropriate matrices for assessing local influences on the parameter estimates under different perturbation schemes and present some ways to assess global influences. Further, for different parameter settings, sample sizes, and censoring percentages, several simulations are performed. In addition, the empirical distributions of some modified residuals are displayed and compared with the standard normal distribution. These studies suggest that the residual analysis usually performed in normal linear regression models can be extended to a modified deviance residual in the proposed regression model applied to censored data. We define martingale and deviance residuals to evaluate the model assumptions. The extended regression model is very useful for the analysis of real data and could give more realistic fits than other special regression models.

Relevance:

60.00%

Publisher:

Abstract:

Methods. We studied participants with acute and/or early HIV infection and TDR in 2 cohorts (San Francisco, California, and Sao Paulo, Brazil). We followed baseline mutations longitudinally and compared replacement rates between mutation classes with use of a parametric proportional hazards model. Results. Among 75 individuals with 195 TDR mutations, M184V/I became undetectable markedly faster than did nonnucleoside reverse-transcriptase inhibitor (NNRTI) mutations (hazard ratio, 77.5; 95% confidence interval [CI], 14.7-408.2; P < .0001), while protease inhibitor and NNRTI replacement rates were similar. Higher plasma HIV-1 RNA level predicted faster mutation replacement, but this was not statistically significant (hazard ratio, 1.71 log(10) copies/mL; 95% CI, .90-3.25 log(10) copies/mL; P = .11). We found substantial person-to-person variability in mutation replacement rates not accounted for by viral load or mutation class (P < .0001). Conclusions. The rapid replacement of M184V/I mutations is consistent with known fitness costs. The long-term persistence of NNRTI and protease inhibitor mutations suggests a risk for person-to-person propagation. Host and/or viral factors not accounted for by viral load or mutation class are likely influencing mutation replacement and warrant further study.

Relevance:

60.00%

Publisher:

Abstract:

The receiver-operating characteristic (ROC) curve is the most widely used measure for evaluating the performance of a diagnostic biomarker when predicting a binary disease outcome. The ROC curve displays the true positive rate (or sensitivity) and the false positive rate (or 1-specificity) for different cut-off values used to classify an individual as healthy or diseased. In time-to-event studies, however, the disease status (e.g. dead or alive) of an individual is not a fixed characteristic, and it varies over the course of the study. In such cases, when evaluating the performance of the biomarker, several issues should be taken into account: first, the time-dependent nature of the disease status; and second, the presence of incomplete data (e.g. censored data typically present in survival studies). Accordingly, to assess the discrimination power of continuous biomarkers for time-dependent disease outcomes, time-dependent extensions of true positive rate, false positive rate, and ROC curve have been recently proposed. In this work, we present new nonparametric estimators of the cumulative/dynamic time-dependent ROC curve that allow accounting for the possible modifying effect of current or past covariate measures on the discriminatory power of the biomarker. The proposed estimators can accommodate right-censored data, as well as covariate-dependent censoring. The behavior of the estimators proposed in this study will be explored through simulations and illustrated using data from a cohort of patients who suffered from acute coronary syndrome.
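As a point of reference for the classical (binary-outcome) ROC curve described at the start of this abstract, here is a minimal sketch with made-up marker values. It is not the time-dependent estimator proposed in the paper and it ignores censoring; the function names and data are assumptions for illustration only.

```python
def roc_points(marker, diseased):
    """Empirical ROC points (FPR, TPR), classifying as diseased when marker >= cutoff."""
    n_pos = sum(1 for d in diseased if d)
    n_neg = len(diseased) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one diseased and one healthy individual")
    points = [(0.0, 0.0)]  # cutoff above every marker value
    for cutoff in sorted(set(marker), reverse=True):
        true_pos = sum(1 for m, d in zip(marker, diseased) if d and m >= cutoff)
        false_pos = sum(1 for m, d in zip(marker, diseased) if not d and m >= cutoff)
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical biomarker values and disease status for four individuals.
marker = [0.9, 0.8, 0.3, 0.2]
diseased = [True, True, False, False]
curve = roc_points(marker, diseased)
area = auc(curve)
```

In the cumulative/dynamic setting, the fixed `diseased` labels would be replaced by the status at a horizon t (event time at or before t versus after t), and censored individuals would require the kind of adjustment the abstract refers to.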

Relevance:

60.00%

Publisher:

Abstract:

Low concentrations of elements in geochemical analyses have the peculiarity of being compositional data and, for a given level of significance, are likely to be beyond the capabilities of laboratories to distinguish between minute concentrations and complete absence, thus preventing laboratories from reporting extremely low concentrations of the analyte. Instead, what is reported is the detection limit, which is the minimum concentration that conclusively differentiates between presence and absence of the element. A spatially distributed exhaustive sample is employed in this study to generate unbiased sub-samples, which are further censored to observe the effect that different detection limits and sample sizes have on the inference of population distributions starting from geochemical analyses having specimens below detection limit (nondetects). The isometric logratio transformation is used to convert the compositional data in the simplex to samples in real space, thus allowing the practitioner to properly borrow from the large source of statistical techniques valid only in real space. The bootstrap method is used to numerically investigate the reliability of inferring several distributional parameters employing different forms of imputation for the censored data. The case study illustrates that, in general, best results are obtained when imputations are made using the distribution best fitting the readings above detection limit and exposes the problems of other more widely used practices. When the sample is spatially correlated, it is necessary to combine the bootstrap with stochastic simulation.

Relevance:

60.00%

Publisher:

Abstract:

Introduction: Imatinib, a first-line drug for chronic myeloid leukaemia (CML), has been increasingly proposed for therapeutic drug monitoring (TDM), as trough concentrations >=1000 ng/ml (Cmin) have been associated with improved molecular and complete cytogenetic response (CCyR). The pharmacological monitoring project of EUTOS (European Treatment and Outcome Study) was launched to validate retrospectively the correlation between Cmin and response in a large population of patients followed by central TDM in Bordeaux.
Methods: 1898 CML patients with first TDM 0-9 years after imatinib initiation, providing cytogenetic data along with demographic and comedication (37%) information, were included. Individual Cmin, estimated by non-linear regression (NONMEM), was adjusted to initial standard dose (400 mg/day) and stratified at 1000 ng/ml. Kaplan-Meier estimates of overall cumulative CCyR rates (stratified by sex, age, comedication and Cmin) were compared using asymptotic logrank k-sample test for interval-censored data. Differences in Cmin were assessed by Wilcoxon test.
Results: There were no significant differences in overall cumulative CCyR rates between Cmin strata, sex and comedication with P-glycoprotein inhibitors/inducers or CYP3A4 inhibitors (p >0.05). Lower rates were observed in 113 young patients <30 years (p = 0.037; 1-year rates: 43% vs 60% in older patients), as well as in 29 patients with CYP3A4 inducers (p = 0.001, 1-year rates: 40% vs 66% without). Higher rates were observed in 108 patients on organic-cation-transporter-1 (hOCT-1) inhibitors (p = 0.034, 1-year rates: 83% vs 56% without). Considering 1-year CCyR rates, a trend towards better response for Cmin above 1000 ng/ml was observed: 64% (95%CI: 60-69%) vs 59% (95%CI: 56-61%). Median Cmin (400 mg/day) was significantly reduced in male patients (732 vs 899 ng/ml, p <0.001), young patients <30 years (734 vs 802 ng/ml, p = 0.037) and under CYP3A4 inducers (758 vs 859 ng/ml, p = 0.022). Under hOCT-1 inhibitors, Cmin was increased (939 vs 827 ng/ml, p = 0.038).
Conclusion: Based on observational TDM data, the impact of imatinib Cmin >1000 ng/ml on CCyR was not salient. Young CML patients (<30 years) and patients taking CYP3A4 inducers probably need close monitoring and possibly higher imatinib doses, due to lower Cmin along with lower CCyR rates. Patients taking hOCT-1 inhibitors seem in contrast to have improved CCyR response rates. The precise role for imatinib TDM remains to be established prospectively.
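The Kaplan-Meier estimates mentioned in the Methods can be illustrated with a minimal product-limit sketch. It assumes plain right-censoring, which is simpler than the interval-censored setting of the study, and the follow-up times below are invented; the cumulative response rate at time t is then 1 minus the estimated curve.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, S(time)) at each distinct event time.

    events[i] is True when the event was observed at times[i] and False
    when the observation was right-censored at times[i].
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0   # events at time t
        leaving = 0    # events plus censorings at time t
        while i < len(data) and data[i][0] == t:
            observed += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up times (years) and response indicators.
times = [1, 2, 2, 3, 4]
responded = [True, True, False, True, False]
km = kaplan_meier(times, responded)
cumulative_rate = [(t, 1 - s) for t, s in km]
```

Censored individuals contribute to the number at risk up to their censoring time but never trigger a step in the curve.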

Relevance:

60.00%

Publisher:

Abstract:

Robust estimators for accelerated failure time models with asymmetric (or symmetric) error distribution and censored observations are proposed. It is assumed that the error model belongs to a log-location-scale family of distributions and that the mean response is the parameter of interest. Since scale is a main component of the mean, scale is not treated as a nuisance parameter. A three-step procedure is proposed. In the first step, an initial high breakdown point S estimate is computed. In the second step, observations that are unlikely under the estimated model are rejected or downweighted. Finally, a weighted maximum likelihood estimate is computed. To define the estimates, functions of censored residuals are replaced by their estimated conditional expectation given that the response is larger than the observed censored value. The rejection rule in the second step is based on an adaptive cut-off that, asymptotically, does not reject any observation when the data are generated according to the model. Therefore, the final estimate attains full efficiency at the model, with respect to the maximum likelihood estimate, while maintaining the breakdown point of the initial estimator. Asymptotic results are provided. The new procedure is evaluated with the help of Monte Carlo simulations. Two examples with real data are discussed.

Relevance:

60.00%

Publisher:

Abstract:

Due to the inherent limitations of analytical measurement methods, environmental exposure data often include observations reported only as below a certain detection limit, known as left-censored data. Censored data directly affect almost all types of statistical analyses, including descriptive parameters, hypothesis testing, confidence intervals, correlations and regressions. In this work, we investigated the performance of the main classes of methods from major publications available in the literature, considering their advantages and limitations. Some criteria for selecting the best method of dealing with censored data are presented.
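One commonly used class of methods for left-censored data is simple substitution, where each nondetect is replaced by a fixed fraction of the detection limit. The sketch below is only an illustration of that idea under assumed inputs (nondetects encoded as `None`, a single detection limit of 1.0); the abstract does not say which methods or factors the authors compared.

```python
import math

def substitute_nondetects(readings, detection_limit, factor=0.5):
    """Replace nondetects (encoded as None) by factor * detection_limit."""
    if detection_limit <= 0 or not 0 < factor <= 1:
        raise ValueError("detection_limit must be positive and 0 < factor <= 1")
    fill = factor * detection_limit
    return [fill if r is None else r for r in readings]

# Hypothetical exposure readings with two values below the detection limit.
readings = [None, 2.0, None, 4.0]
half_dl = substitute_nondetects(readings, 1.0)                            # DL/2
root2_dl = substitute_nondetects(readings, 1.0, factor=1 / math.sqrt(2))  # DL/sqrt(2)

mean_half = sum(half_dl) / len(half_dl)
mean_root2 = sum(root2_dl) / len(root2_dl)
```

The two means differ only because of the substitution factor, which shows how the choice of method feeds directly into descriptive statistics.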

Relevance:

60.00%

Publisher:

Abstract:

The average availability of a repairable system is the expected proportion of time that the system is operating in the interval [0, t]. The present article discusses the nonparametric estimation of the average availability when (i) the data on 'n' complete cycles of system operation are available, (ii) the data are subject to right censorship, and (iii) the process is observed up to a specified time 'T'. In each case, a nonparametric confidence interval for the average availability is also constructed. Simulations are conducted to assess the performance of the estimators.
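The quantity being estimated can be made concrete with a small sketch that computes the observed proportion of uptime in [0, T] for one system. This is only the empirical fraction for a single record, not the estimators or confidence intervals of the article; the alternating up/down encoding and the numbers are assumptions for illustration.

```python
def observed_availability(durations, horizon):
    """Fraction of [0, horizon] spent operating.

    durations alternate operating and repair periods, starting with an
    operating period: [up_1, down_1, up_2, down_2, ...].
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    elapsed = 0.0
    uptime = 0.0
    for index, length in enumerate(durations):
        if elapsed >= horizon:
            break
        span = min(length, horizon - elapsed)  # truncate the period at the horizon
        if index % 2 == 0:                     # even positions are operating periods
            uptime += span
        elapsed += span
    if elapsed < horizon:
        raise ValueError("record ends before the horizon")
    return uptime / horizon

# Hypothetical record: up 5, down 1, up 3, down 2, up 4 (hours), observed to T = 10.
availability = observed_availability([5, 1, 3, 2, 4], 10)
```

Averaging this fraction over several independent systems observed to the same T would give a simple empirical estimate of the average availability.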