915 resultados para Logistic regression model


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In 2011, there will be an estimated 1,596,670 new cancer cases and 571,950 cancer-related deaths in the US. With the ever-increasing applications of cancer genetics in epidemiology, there is great potential to identify genetic risk factors that would help identify individuals with increased genetic susceptibility to cancer, which could be used to develop interventions or targeted therapies that could hopefully reduce cancer risk and mortality. In this dissertation, I propose to develop a new statistical method to evaluate the role of haplotypes in cancer susceptibility and development. This model will be flexible enough to handle not only haplotypes of any size, but also a variety of covariates. I will then apply this method to three cancer-related data sets (Hodgkin Disease, Glioma, and Lung Cancer). I hypothesize that there is substantial improvement in the estimation of association between haplotypes and disease, with the use of a Bayesian mathematical method to infer haplotypes that uses prior information from known genetics sources. Analysis based on haplotypes using information from publically available genetic sources generally show increased odds ratios and smaller p-values in both the Hodgkin, Glioma, and Lung data sets. For instance, the Bayesian Joint Logistic Model (BJLM) inferred haplotype TC had a substantially higher estimated effect size (OR=12.16, 95% CI = 2.47-90.1 vs. 9.24, 95% CI = 1.81-47.2) and more significant p-value (0.00044 vs. 0.008) for Hodgkin Disease compared to a traditional logistic regression approach. Also, the effect sizes of haplotypes modeled with recessive genetic effects were higher (and had more significant p-values) when analyzed with the BJLM. Full genetic models with haplotype information developed with the BJLM resulted in significantly higher discriminatory power and a significantly higher Net Reclassification Index compared to those developed with haplo.stats for lung cancer. Future analysis for this work could be to incorporate the 1000 Genomes project, which offers a larger selection of SNPs can be incorporated into the information from known genetic sources as well. Other future analysis include testing non-binary outcomes, like the levels of biomarkers that are present in lung cancer (NNK), and extending this analysis to full GWAS studies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

2000 Mathematics Subject Classification: 62J12, 62F35

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In a sample of censored survival times, the presence of an immune proportion of individuals who are not subject to death, failure or relapse, may be indicated by a relatively high number of individuals with large censored survival times. In this paper the generalized log-gamma model is modified for the possibility that long-term survivors may be present in the data. The model attempts to separately estimate the effects of covariates on the surviving fraction, that is, the proportion of the population for which the event never occurs. The logistic function is used for the regression model of the surviving fraction. Inference for the model parameters is considered via maximum likelihood. Some influence methods, such as the local influence and total local influence of an individual are derived, analyzed and discussed. Finally, a data set from the medical area is analyzed under the log-gamma generalized mixture model. A residual analysis is performed in order to select an appropriate model.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Using data from a logging experiment in the eastern Brazilian Amazon region, we develop a matrix growth and yield model that captures the dynamic effects of harvest system choice on forest structure and composition. Multinomial logistic regression is used to estimate the growth transition parameters for a 10-year time step, while a Poisson regression model is used to estimate recruitment parameters. The model is designed to be easily integrated with an economic model of decisionmaking to perform tropical forest policy analysis. The model is used to compare the long-run structure and composition of a stand arising from the choice of implementing either conventional logging techniques or more carefully planned and executed reduced-impact logging (RIL) techniques, contrasted against a baseline projection of an unlogged forest. Results from log and leave scenarios show that a stand logged according to Brazilian management requirements will require well over 120 years to recover its initial commercial volume, regardless of logging technique employed. Implementing RIL, however, accelerates this recovery. Scenarios imposing a 40-year cutting cycle raise the possibility of sustainable harvest volumes, although at significantly lower levels than is implied by current regulations. Meeting current Brazilian forest policy goals may require an increase in the planned total area of permanent production forest or the widespread adoption of silvicultural practices that increase stand recovery and volume accumulation rates after RIL harvests. Published by Elsevier B.V.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

1. Although population viability analysis (PVA) is widely employed, forecasts from PVA models are rarely tested. This study in a fragmented forest in southern Australia contrasted field data on patch occupancy and abundance for the arboreal marsupial greater glider Petauroides volans with predictions from a generic spatially explicit PVA model. This work represents one of the first landscape-scale tests of its type. 2. Initially we contrasted field data from a set of eucalypt forest patches totalling 437 ha with a naive null model in which forecasts of patch occupancy were made, assuming no fragmentation effects and based simply on remnant area and measured densities derived from nearby unfragmented forest. The naive null model predicted an average total of approximately 170 greater gliders, considerably greater than the true count (n = 81). 3. Congruence was examined between field data and predictions from PVA under several metapopulation modelling scenarios. The metapopulation models performed better than the naive null model. Logistic regression showed highly significant positive relationships between predicted and actual patch occupancy for the four scenarios (P = 0.001-0.006). When the model-derived probability of patch occupancy was high (0.50-0.75, 0.75-1.00), there was greater congruence between actual patch occupancy and the predicted probability of occupancy. 4. For many patches, probability distribution functions indicated that model predictions for animal abundance in a given patch were not outside those expected by chance. However, for some patches the model either substantially over-predicted or under-predicted actual abundance. Some important processes, such as inter-patch dispersal, that influence the distribution and abundance of the greater glider may not have been adequately modelled. 5. Additional landscape-scale tests of PVA models, on a wider range of species, are required to assess further predictions made using these tools. This will help determine those taxa for which predictions are and are not accurate and give insights for improving models for applied conservation management.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A risk score model was developed based in a population of 1,224 individuals from the general population without known diabetes aging 35 years or more from an urban Brazilian population sample in order to select individuals who should be screened in subsequent testing and improve the efficacy of public health assurance. External validation was performed in a second, independent, population from a different city ascertained through a similar epidemiological protocol. The risk score was developed by multiple logistic regression and model performance and cutoff values were derived from a receiver operating characteristic curve. Model`s capacity of predicting fasting blood glucose levels was tested analyzing data from a 5-year follow-up protocol conducted in the general population. Items independently and significantly associated with diabetes were age, BMI and known hypertension. Sensitivity, specificity and proportion of further testing necessary for the best cutoff value were 75.9, 66.9 and 37.2%, respectively. External validation confirmed the model`s adequacy (AUC equal to 0.72). Finally, model score was also capable of predicting fasting blood glucose progression in non-diabetic individuals in a 5-year follow-up period. In conclusion, this simple diabetes risk score was able to identify individuals with an increased likelihood of having diabetes and it can be used to stratify subpopulations in which performing of subsequent tests is necessary and probably cost-effective.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Objective: To identify prediction factors for the development of leptospirosis-associated pulmonary hemorrhage syndrome (LPHS). Methods: We conducted a prospective cohort study. The study comprised of 203 patients, aged >= 14 years, admitted with complications of the severe form of leptospirosis at the Emilio Ribas Institute of Infectology (Sao Paulo, Brazil) between 1998 and 2004. Laboratory and demographic data were obtained and the severity of illness score and involvement of the lungs and others organs were determined. Logistic regression was performed to identify independent predictors of LPHS. A prospective validation cohort of 97 subjects with severe form of leptospirosis admitted at the same hospital between 2004 and 2006 was used to independently evaluate the predictive value of the model. Results: The overall mortality rate was 7.9%. Multivariate logistic regression revealed that five factors were independently associated with the development of LPHS: serum potassium (mmol/L) (OR = 2.6; 95% CI = 1.1-5.9); serum creatinine (mmol/L) (OR = 1.2; 95% CI = 1.1-1.4); respiratory rate (breaths/min) (OR = 1.1; 95% CI = 1.1-1.2); presenting shock (OR = 69.9; 95% CI = 20.1-236.4), and Glasgow Coma Scale Score (GCS) < 15 (OR = 7.7; 95% CI = 1.3-23.0). We used these findings to calculate the risk of LPHS by the use of a spreadsheet. In the validation cohort, the equation classified correctly 92% of patients (Kappa statistic = 0.80). Conclusions: We developed and validated a multivariate model for predicting LPHS. This tool should prove useful in identifying LPHS patients, allowing earlier management and thereby reducing mortality. (C) 2009 The British Infection Society. Published by Elsevier Ltd. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In many occupational safety interventions, the objective is to reduce the injury incidence as well as the mean claims cost once injury has occurred. The claims cost data within a period typically contain a large proportion of zero observations (no claim). The distribution thus comprises a point mass at 0 mixed with a non-degenerate parametric component. Essentially, the likelihood function can be factorized into two orthogonal components. These two components relate respectively to the effect of covariates on the incidence of claims and the magnitude of claims, given that claims are made. Furthermore, the longitudinal nature of the intervention inherently imposes some correlation among the observations. This paper introduces a zero-augmented gamma random effects model for analysing longitudinal data with many zeros. Adopting the generalized linear mixed model (GLMM) approach reduces the original problem to the fitting of two independent GLMMs. The method is applied to evaluate the effectiveness of a workplace risk assessment teams program, trialled within the cleaning services of a Western Australian public hospital.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background: Little is known about the risk of progression to hazardous alcohol use in people currently drinking at safe limits. We aimed to develop a prediction model (predictAL) for the development of hazardous drinking in safe drinkers. Methods: A prospective cohort study of adult general practice attendees in six European countries and Chile followed up over 6 months. We recruited 10,045 attendees between April 2003 to February 2005. 6193 European and 2462 Chilean attendees recorded AUDIT scores below 8 in men and 5 in women at recruitment and were used in modelling risk. 38 risk factors were measured to construct a risk model for the development of hazardous drinking using stepwise logistic regression. The model was corrected for over fitting and tested in an external population. The main outcome was hazardous drinking defined by an AUDIT score >= 8 in men and >= 5 in women. Results: 69.0% of attendees were recruited, of whom 89.5% participated again after six months. The risk factors in the final predictAL model were sex, age, country, baseline AUDIT score, panic syndrome and lifetime alcohol problem. The predictAL model's average c-index across all six European countries was 0.839 (95% CI 0.805, 0.873). The Hedge's g effect size for the difference in log odds of predicted probability between safe drinkers in Europe who subsequently developed hazardous alcohol use and those who did not was 1.38 (95% CI 1.25, 1.51). External validation of the algorithm in Chilean safe drinkers resulted in a c-index of 0.781 (95% CI 0.717, 0.846) and Hedge's g of 0.68 (95% CI 0.57, 0.78). Conclusions: The predictAL risk model for development of hazardous consumption in safe drinkers compares favourably with risk algorithms for disorders in other medical settings and can be a useful first step in prevention of alcohol misuse.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

OBJECTIVE: The objective of the study was to develop a model for estimating patient 28-day in-hospital mortality using 2 different statistical approaches. DESIGN: The study was designed to develop an outcome prediction model for 28-day in-hospital mortality using (a) logistic regression with random effects and (b) a multilevel Cox proportional hazards model. SETTING: The study involved 305 intensive care units (ICUs) from the basic Simplified Acute Physiology Score (SAPS) 3 cohort. PATIENTS AND PARTICIPANTS: Patients (n = 17138) were from the SAPS 3 database with follow-up data pertaining to the first 28 days in hospital after ICU admission. INTERVENTIONS: None. MEASUREMENTS AND RESULTS: The database was divided randomly into 5 roughly equal-sized parts (at the ICU level). It was thus possible to run the model-building procedure 5 times, each time taking four fifths of the sample as a development set and the remaining fifth as the validation set. At 28 days after ICU admission, 19.98% of the patients were still in the hospital. Because of the different sampling space and outcome variables, both models presented a better fit in this sample than did the SAPS 3 admission score calibrated to vital status at hospital discharge, both on the general population and in major subgroups. CONCLUSIONS: Both statistical methods can be used to model the 28-day in-hospital mortality better than the SAPS 3 admission model. However, because the logistic regression approach is specifically designed to forecast 28-day mortality, and given the high uncertainty associated with the assumption of the proportionality of risks in the Cox model, the logistic regression approach proved to be superior.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper investigates the usefulness of switching Gaussian state space models as a tool for implementing dynamic model selecting (DMS) or averaging (DMA) in time-varying parameter regression models. DMS methods allow for model switching, where a different model can be chosen at each point in time. Thus, they allow for the explanatory variables in the time-varying parameter regression model to change over time. DMA will carry out model averaging in a time-varying manner. We compare our exact approach to DMA/DMS to a popular existing procedure which relies on the use of forgetting factor approximations. In an application, we use DMS to select different predictors in an in ation forecasting application. We also compare different ways of implementing DMA/DMS and investigate whether they lead to similar results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Given the very large amount of data obtained everyday through population surveys, much of the new research again could use this information instead of collecting new samples. Unfortunately, relevant data are often disseminated into different files obtained through different sampling designs. Data fusion is a set of methods used to combine information from different sources into a single dataset. In this article, we are interested in a specific problem: the fusion of two data files, one of which being quite small. We propose a model-based procedure combining a logistic regression with an Expectation-Maximization algorithm. Results show that despite the lack of data, this procedure can perform better than standard matching procedures.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Capercaillie, Tetrao urogallus, is a threatened species in central Europe, and Swiss populations declined 40 to 50 % between 1970 and 1985. Capercaillie are sensitive to forest structure, and loss of habitat is a major cause of their decline. Knowledge of habitat characteristics is therefore essential for capercaillie conservation. Here, we present models predicting capercaillie probability of occurrence, based on relevant structural habitat variables. Models were built using multiple logistic regression analyses on capercaillie presence/absence data. Vegetation survey was carried out in July 1999 in a 170-km2 forested area (Jura mountains, canton de Vaud, western Switzerland) inhabited by capercaillie and presence/absence of the species was assessed according to dropping presence/absence. The survey was based on 10-m-radius sample plots each in a 1-km2 forest patch (n = 76 with capercaillie droppings, n = 80 without). A first model included seven out of 27 measured habitat variables and a second model only four. The latter model best represents practical needs. It includes three variables which had a negative impact on capercaillie presence: tree and shrub covers and spruce, Picea excelsa, shrub cover, and one which had a positive effect: bilberry, Vaccinium myrtillus, cover, highlighting that capaercaillie selected open forest with high bilberry abundance. The model can be used to map potential capercaillie habitat distribution and to manage the habitat in favour of capercaillie (protection and adapted forestry practices) in the Swiss Jura mountains.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

RATIONALE: An objective and simple prognostic model for patients with pulmonary embolism could be helpful in guiding initial intensity of treatment. OBJECTIVES: To develop a clinical prediction rule that accurately classifies patients with pulmonary embolism into categories of increasing risk of mortality and other adverse medical outcomes. METHODS: We randomly allocated 15,531 inpatient discharges with pulmonary embolism from 186 Pennsylvania hospitals to derivation (67%) and internal validation (33%) samples. We derived our prediction rule using logistic regression with 30-day mortality as the primary outcome, and patient demographic and clinical data routinely available at presentation as potential predictor variables. We externally validated the rule in 221 inpatients with pulmonary embolism from Switzerland and France. MEASUREMENTS: We compared mortality and nonfatal adverse medical outcomes across the derivation and two validation samples. MAIN RESULTS: The prediction rule is based on 11 simple patient characteristics that were independently associated with mortality and stratifies patients with pulmonary embolism into five severity classes, with 30-day mortality rates of 0-1.6% in class I, 1.7-3.5% in class II, 3.2-7.1% in class III, 4.0-11.4% in class IV, and 10.0-24.5% in class V across the derivation and validation samples. Inpatient death and nonfatal complications were <or= 1.1% among patients in class I and <or= 1.9% among patients in class II. CONCLUSIONS: Our rule accurately classifies patients with pulmonary embolism into classes of increasing risk of mortality and other adverse medical outcomes. Further validation of the rule is important before its implementation as a decision aid to guide the initial management of patients with pulmonary embolism.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

None of the current surveillance streams monitoring the presence of scrapie in Great Britain provide a comprehensive and unbiased estimate of the prevalence of the disease at the holding level. Previous work to estimate the under-ascertainment adjusted prevalence of scrapie in Great Britain applied multiple-list capture–recapture methods. The enforcement of new control measures on scrapie-affected holdings in 2004 has stopped the overlapping between surveillance sources and, hence, the application of multiple-list capture–recapture models. Alternative methods, still under the capture–recapture methodology, relying on repeated entries in one single list have been suggested in these situations. In this article, we apply one-list capture–recapture approaches to data held on the Scrapie Notifications Database to estimate the undetected population of scrapie-affected holdings with clinical disease in Great Britain for the years 2002, 2003, and 2004. For doing so, we develop a new diagnostic tool for indication of heterogeneity as well as a new understanding of the Zelterman and Chao’s lower bound estimators to account for potential unobserved heterogeneity. We demonstrate that the Zelterman estimator can be viewed as a maximum likelihood estimator for a special, locally truncated Poisson likelihood equivalent to a binomial likelihood. This understanding allows the extension of the Zelterman approach by means of logistic regression to include observed heterogeneity in the form of covariates—in case studied here, the holding size and country of origin. Our results confirm the presence of substantial unobserved heterogeneity supporting the application of our two estimators. The total scrapie-affected holding population in Great Britain is around 300 holdings per year. None of the covariates appear to inform the model significantly.