934 resultados para logistic regression analysis
Resumo:
Objective: To identify potential prognostic factors for pulmonary thromboembolism (PTE), establishing a mathematical model to predict the risk for fatal PTE and nonfatal PTE.Method: the reports on 4,813 consecutive autopsies performed from 1979 to 1998 in a Brazilian tertiary referral medical school were reviewed for a retrospective study. From the medical records and autopsy reports of the 512 patients found with macroscopically and/or microscopically,documented PTE, data on demographics, underlying diseases, and probable PTE site of origin were gathered and studied by multiple logistic regression. Thereafter, the jackknife method, a statistical cross-validation technique that uses the original study patients to validate a clinical prediction rule, was performed.Results: the autopsy rate was 50.2%, and PTE prevalence was 10.6%. In 212 cases, PTE was the main cause of death (fatal PTE). The independent variables selected by the regression significance criteria that were more likely to be associated with fatal PTE were age (odds ratio [OR], 1.02; 95% confidence interval [CI], 1.00 to 1.03), trauma (OR, 8.5; 95% CI, 2.20 to 32.81), right-sided cardiac thrombi (OR, 1.96; 95% CI, 1.02 to 3.77), pelvic vein thrombi (OR, 3.46; 95% CI, 1.19 to 10.05); those most likely to be associated with nonfatal PTE were systemic arterial hypertension (OR, 0.51; 95% CI, 0.33 to 0.80), pneumonia (OR, 0.46; 95% CI, 0.30 to 0.71), and sepsis (OR, 0.16; 95% CI, 0.06 to 0.40). The results obtained from the application of the equation in the 512 cases studied using logistic regression analysis suggest the range in which logit p > 0.336 favors the occurrence of fatal PTE, logit p < - 1.142 favors nonfatal PTE, and logit P with intermediate values is not conclusive. The cross-validation prediction misclassification rate was 25.6%, meaning that the prediction equation correctly classified the majority of the cases (74.4%).Conclusions: Although the usefulness of this method in everyday medical practice needs to be confirmed by a prospective study, for the time being our results suggest that concerning prevention, diagnosis, and treatment of PTE, strict attention should be given to those patients presenting the variables that are significant in the logistic regression model.
Resumo:
The history of the logistic function since its introduction in 1838 is reviewed, and the logistic model for a polychotomous response variable is presented with a discussion of the assumptions involved in its derivation and use. Following this, the maximum likelihood estimators for the model parameters are derived along with a Newton-Raphson iterative procedure for evaluation. A rigorous mathematical derivation of the limiting distribution of the maximum likelihood estimators is then presented using a characteristic function approach. An appendix with theorems on the asymptotic normality of sample sums when the observations are not identically distributed, with proofs, supports the presentation on asymptotic properties of the maximum likelihood estimators. Finally, two applications of the model are presented using data from the Hypertension Detection and Follow-up Program, a prospective, population-based, randomized trial of treatment for hypertension. The first application compares the risk of five-year mortality from cardiovascular causes with that from noncardiovascular causes; the second application compares risk factors for fatal or nonfatal coronary heart disease with those for fatal or nonfatal stroke. ^
Resumo:
Peer-reviewed
Resumo:
OBJECTIVES: This paper is concerned with checking goodness-of-fit of binary logistic regression models. For the practitioners of data analysis, the broad classes of procedures for checking goodness-of-fit available in the literature are described. The challenges of model checking in the context of binary logistic regression are reviewed. As a viable solution, a simple graphical procedure for checking goodness-of-fit is proposed. METHODS: The graphical procedure proposed relies on pieces of information available from any logistic analysis; the focus is on combining and presenting these in an informative way. RESULTS: The information gained using this approach is presented with three examples. In the discussion, the proposed method is put into context and compared with other graphical procedures for checking goodness-of-fit of binary logistic models available in the literature. CONCLUSION: A simple graphical method can significantly improve the understanding of any logistic regression analysis and help to prevent faulty conclusions.
Resumo:
A statistical technique for fault analysis in industrial printing is reported. The method specifically deals with binary data, for which the results of the production process fall into two categories, rejected or accepted. The method is referred to as logistic regression, and is capable of predicting future fault occurrences by the analysis of current measurements from machine parts sensors. Individual analysis of each type of fault can determine which parts of the plant have a significant influence on the occurrence of such faults; it is also possible to infer which measurable process parameters have no significant influence on the generation of these faults. Information derived from the analysis can be helpful in the operator's interpretation of the current state of the plant. Appropriate actions may then be taken to prevent potential faults from occurring. The algorithm is being implemented as part of an applied self-learning expert system.
Resumo:
We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley Sons, Ltd.
Resumo:
This article focuses on business risk management in the insurance industry. A methodology for estimating the profit loss caused by each customer in the portfolio due to policy cancellation is proposed. Using data from a European insurance company, customer behaviour over time is analyzed in order to estimate the probability of policy cancelation and the resulting potential profit loss due to cancellation. Customers may have up to two different lines of business contracts: motor insurance and other diverse insurance (such as, home contents, life or accident insurance). Implications for understanding customer cancellation behaviour as the core of business risk management are outlined.
Resumo:
A wide range of numerical models and tools have been developed over the last decades to support the decision making process in environmental applications, ranging from physical models to a variety of statistically-based methods. In this study, a landslide susceptibility map of a part of Three Gorges Reservoir region of China was produced, employing binary logistic regression analyses. The available information includes the digital elevation model of the region, geological map and different GIS layers including land cover data obtained from satellite imagery. The landslides were observed and documented during the field studies. The validation analysis is exploited to investigate the quality of mapping.
Resumo:
Logistic regression is included into the analysis techniques which are valid for observationalmethodology. However, its presence at the heart of thismethodology, and more specifically in physical activity and sports studies, is scarce. With a view to highlighting the possibilities this technique offers within the scope of observational methodology applied to physical activity and sports, an application of the logistic regression model is presented. The model is applied in the context of an observational design which aims to determine, from the analysis of use of the playing area, which football discipline (7 a side football, 9 a side football or 11 a side football) is best adapted to the child"s possibilities. A multiple logistic regression model can provide an effective prognosis regarding the probability of a move being successful (reaching the opposing goal area) depending on the sector in which the move commenced and the football discipline which is being played.
Resumo:
The increasing demand of consumer markets for the welfare of birds in poultry house has motivated many scientific researches to monitor and classify the welfare according to the production environment. Given the complexity between the birds and the environment of the aviary, the correct interpretation of the conduct becomes an important way to estimate the welfare of these birds. This study obtained multiple logistic regression models with capacity of estimating the welfare of broiler breeders in relation to the environment of the aviaries and behaviors expressed by the birds. In the experiment, were observed several behaviors expressed by breeders housed in a climatic chamber under controlled temperatures and three different ammonia concentrations from the air monitored daily. From the analysis of the data it was obtained two logistic regression models, of which the first model uses a value of ammonia concentration measured by unit and the second model uses a binary value to classify the ammonia concentration that is assigned by a person through his olfactory perception. The analysis showed that both models classified the broiler breeder's welfare successfully.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
In epidemiological work, outcomes are frequently non-normal, sample sizes may be large, and effects are often small. To relate health outcomes to geographic risk factors, fast and powerful methods for fitting spatial models, particularly for non-normal data, are required. We focus on binary outcomes, with the risk surface a smooth function of space. We compare penalized likelihood models, including the penalized quasi-likelihood (PQL) approach, and Bayesian models based on fit, speed, and ease of implementation. A Bayesian model using a spectral basis representation of the spatial surface provides the best tradeoff of sensitivity and specificity in simulations, detecting real spatial features while limiting overfitting and being more efficient computationally than other Bayesian approaches. One of the contributions of this work is further development of this underused representation. The spectral basis model outperforms the penalized likelihood methods, which are prone to overfitting, but is slower to fit and not as easily implemented. Conclusions based on a real dataset of cancer cases in Taiwan are similar albeit less conclusive with respect to comparing the approaches. The success of the spectral basis with binary data and similar results with count data suggest that it may be generally useful in spatial models and more complicated hierarchical models.
Resumo:
The ordinal logistic regression models are used to analyze the dependant variable with multiple outcomes that can be ranked, but have been underutilized. In this study, we describe four logistic regression models for analyzing the ordinal response variable. ^ In this methodological study, the four regression models are proposed. The first model uses the multinomial logistic model. The second is adjacent-category logit model. The third is the proportional odds model and the fourth model is the continuation-ratio model. We illustrate and compare the fit of these models using data from the survey designed by the University of Texas, School of Public Health research project PCCaSO (Promoting Colon Cancer Screening in people 50 and Over), to study the patient’s confidence in the completion colorectal cancer screening (CRCS). ^ The purpose of this study is two fold: first, to provide a synthesized review of models for analyzing data with ordinal response, and second, to evaluate their usefulness in epidemiological research, with particular emphasis on model formulation, interpretation of model coefficients, and their implications. Four ordinal logistic models that are used in this study include (1) Multinomial logistic model, (2) Adjacent-category logistic model [9], (3) Continuation-ratio logistic model [10], (4) Proportional logistic model [11]. We recommend that the analyst performs (1) goodness-of-fit tests, (2) sensitivity analysis by fitting and comparing different models.^
Resumo:
Ordinal outcomes are frequently employed in diagnosis and clinical trials. Clinical trials of Alzheimer's disease (AD) treatments are a case in point using the status of mild, moderate or severe disease as outcome measures. As in many other outcome oriented studies, the disease status may be misclassified. This study estimates the extent of misclassification in an ordinal outcome such as disease status. Also, this study estimates the extent of misclassification of a predictor variable such as genotype status. An ordinal logistic regression model is commonly used to model the relationship between disease status, the effect of treatment, and other predictive factors. A simulation study was done. First, data based on a set of hypothetical parameters and hypothetical rates of misclassification was created. Next, the maximum likelihood method was employed to generate likelihood equations accounting for misclassification. The Nelder-Mead Simplex method was used to solve for the misclassification and model parameters. Finally, this method was applied to an AD dataset to detect the amount of misclassification present. The estimates of the ordinal regression model parameters were close to the hypothetical parameters. β1 was hypothesized at 0.50 and the mean estimate was 0.488, β2 was hypothesized at 0.04 and the mean of the estimates was 0.04. Although the estimates for the rates of misclassification of X1 were not as close as β1 and β2, they validate this method. X 1 0-1 misclassification was hypothesized as 2.98% and the mean of the simulated estimates was 1.54% and, in the best case, the misclassification of k from high to medium was hypothesized at 4.87% and had a sample mean of 3.62%. In the AD dataset, the estimate for the odds ratio of X 1 of having both copies of the APOE 4 allele changed from an estimate of 1.377 to an estimate 1.418, demonstrating that the estimates of the odds ratio changed when the analysis includes adjustment for misclassification. ^