151 results for Logistic regression mixture models

in Universit


Relevance: 100.00%

Abstract:

A wide range of numerical models and tools has been developed over the last decades to support decision making in environmental applications, ranging from physical models to a variety of statistically based methods. In this study, a landslide susceptibility map of part of the Three Gorges Reservoir region of China was produced using binary logistic regression analyses. The available information includes the digital elevation model of the region, a geological map and different GIS layers, including land cover data obtained from satellite imagery. The landslides were observed and documented during field studies. A validation analysis is used to assess the quality of the mapping.

Relevance: 100.00%

Abstract:

The role of land cover change as a significant component of global change has become increasingly recognized in recent decades. Large databases measuring land cover change, and the data that can potentially be used to explain the observed changes, are also becoming more commonly available. When developing statistical models to investigate observed changes, it is important to be aware that the chosen sampling strategy and modelling techniques can influence results. We present a comparison of three sampling strategies and two forms of grouped logistic regression models (multinomial and ordinal) in the investigation of patterns of successional change after agricultural land abandonment in Switzerland. Results indicated that both ordinal and nominal transitional change occurs in the landscape and that the use of different sampling regimes and modelling techniques as investigative tools yields different results. Synthesis and applications. Our multimodel inference successfully identified a set of consistently selected indicators of land cover change, which can be used to predict further change, including annual average temperature, the number of already overgrown neighbouring areas of land and distance to historically destructive avalanche sites. This allows for more reliable decision making and planning with respect to landscape management. Although both model approaches gave similar results, ordinal regression yielded more parsimonious models that identified the important predictors of land cover change more efficiently. Thus, this approach is favourable where the land cover change pattern can be interpreted as an ordinal process. Otherwise, multinomial logistic regression is a viable alternative.
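For readers unfamiliar with the two model families compared in this abstract, their usual textbook forms are sketched below. The notation ($Y$ for the land cover class, $K$ for the number of classes, $\mathbf{x}$ for the predictors) is assumed here and is not taken from the study, and the proportional odds form is only one common ordinal model; the abstract does not say which variant was fitted.

```latex
% Multinomial (nominal) logit with reference class K:
% a separate coefficient vector \beta_k for every non-reference class
\log\frac{P(Y = k \mid \mathbf{x})}{P(Y = K \mid \mathbf{x})}
  = \alpha_k + \boldsymbol{\beta}_k^{\top}\mathbf{x},
  \qquad k = 1, \dots, K-1

% Ordinal (proportional odds) logit:
% class-specific thresholds \alpha_k but a single shared coefficient vector \beta
\log\frac{P(Y \le k \mid \mathbf{x})}{P(Y > k \mid \mathbf{x})}
  = \alpha_k - \boldsymbol{\beta}^{\top}\mathbf{x},
  \qquad k = 1, \dots, K-1
```

With $p$ predictors, the multinomial form estimates $(K-1)(p+1)$ parameters and the proportional odds form $(K-1)+p$, which is consistent with the abstract's remark that ordinal regression yielded more parsimonious models.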

Relevance: 100.00%

Abstract:

BACKGROUND: We sought to improve upon previously published statistical modeling strategies for binary classification of dyslipidemia for general population screening purposes based on the waist-to-hip circumference ratio and body mass index anthropometric measurements. METHODS: Study subjects were participants in WHO-MONICA population-based surveys conducted in two Swiss regions. Outcome variables were based on the total serum cholesterol to high density lipoprotein cholesterol ratio. The other potential predictor variables were gender, age, current cigarette smoking, and hypertension. The models investigated were: (i) linear regression; (ii) logistic classification; (iii) regression trees; (iv) classification trees (iii and iv are collectively known as "CART"). Binary classification performance of the region-specific models was externally validated by classifying the subjects from the other region. RESULTS: Waist-to-hip circumference ratio and body mass index remained modest predictors of dyslipidemia. Correct classification rates for all models were 60-80%, with marked gender differences. Gender-specific models provided only small gains in classification. The external validations provided assurance about the stability of the models. CONCLUSIONS: There were no striking differences between either the algebraic (i, ii) vs. non-algebraic (iii, iv), or the regression (i, iii) vs. classification (ii, iv) modeling approaches. Anticipated advantages of the CART vs. simple additive linear and logistic models were less than expected in this particular application with a relatively small set of predictor variables. CART models may be more useful when considering main effects and interactions between larger sets of predictor variables.

Relevance: 100.00%

Abstract:

Uncertainty quantification of petroleum reservoir models is one of the present challenges, and it is usually approached with a wide range of geostatistical tools linked with statistical optimisation and/or inference algorithms. The paper considers a data-driven approach to modelling uncertainty in spatial predictions. The proposed semi-supervised Support Vector Regression (SVR) model has demonstrated its capability to represent realistic features and describe stochastic variability and non-uniqueness of spatial properties. It is able to capture and preserve key spatial dependencies such as connectivity, which is often difficult to achieve with two-point geostatistical models. Semi-supervised SVR is designed to integrate various kinds of conditioning data and learn dependencies from them. A stochastic semi-supervised SVR model is integrated into a Bayesian framework to quantify uncertainty with multiple models fitted to dynamic observations. The developed approach is illustrated with a reservoir case study. The resulting probabilistic production forecasts are described by uncertainty envelopes.

Relevance: 100.00%

Abstract:

Prediction of species' distributions is central to diverse applications in ecology, evolution and conservation science. There is increasing electronic access to vast sets of occurrence records in museums and herbaria, yet little effective guidance on how best to use this information in the context of numerous approaches for modelling distributions. To meet this need, we compared 16 modelling methods over 226 species from 6 regions of the world, creating the most comprehensive set of model comparisons to date. We used presence-only data to fit models, and independent presence-absence data to evaluate the predictions. Along with well-established modelling methods such as generalised additive models and GARP and BIOCLIM, we explored methods that either have been developed recently or have rarely been applied to modelling species' distributions. These include machine-learning methods and community models, both of which have features that may make them particularly well suited to noisy or sparse information, as is typical of species' occurrence data. Presence-only data were effective for modelling species' distributions for many species and regions. The novel methods consistently outperformed more established methods. The results of our analysis are promising for the use of data from museums and herbaria, especially as methods suited to the noise inherent in such data improve.

Relevance: 100.00%

Abstract:

BACKGROUND: Multiple logistic regression is precluded from many practical applications in ecology that aim to predict the geographic distributions of species because it requires absence data, which are rarely available or are unreliable. In order to use multiple logistic regression, many studies have simulated "pseudo-absences" through a number of strategies, but it is unknown how the choice of strategy influences models and their geographic predictions of species. In this paper we evaluate the effect of several prevailing pseudo-absence strategies on the predictions of the geographic distribution of a virtual species whose "true" distribution and relationship to three environmental predictors was predefined. We evaluated the effect of using (a) real absences, (b) pseudo-absences selected randomly from the background, and (c) two-step approaches: pseudo-absences selected from low suitability areas predicted by either Ecological Niche Factor Analysis (ENFA) or BIOCLIM. We compared how the choice of pseudo-absence strategy affected model fit, predictive power, and information-theoretic model selection results. RESULTS: Models built with true absences had the best predictive power, best discriminatory power, and the "true" model (the one that contained the correct predictors) was supported by the data according to AIC, as expected. Models based on random pseudo-absences had among the lowest fit, but yielded the second highest AUC value (0.97), and the "true" model was also supported by the data. Models based on two-step approaches had intermediate fit, the lowest predictive power, and the "true" model was not supported by the data. CONCLUSION: If ecologists wish to build parsimonious GLM models that will allow them to make robust predictions, a reasonable approach is to use a large number of randomly selected pseudo-absences, and perform model selection based on an information-theoretic approach. However, the resulting models can be expected to have limited fit.
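The random pseudo-absence strategy recommended in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function name, the grid-cell representation and the decision to exclude known presence cells from the draw are all assumptions made for the example.

```python
import random

def sample_pseudo_absences(background_cells, presence_cells, n, seed=0):
    """Draw n pseudo-absences uniformly at random from the background.

    Hypothetical helper: cells with a recorded presence are excluded,
    which is an assumption of this sketch rather than a detail from the abstract.
    """
    presence = set(presence_cells)
    candidates = [cell for cell in background_cells if cell not in presence]
    if n > len(candidates):
        raise ValueError("not enough background cells to sample from")
    rng = random.Random(seed)          # fixed seed so the draw is reproducible
    return rng.sample(candidates, n)   # sampling without replacement

# Toy 10 x 10 study area with three presence records (made-up data)
grid = [(row, col) for row in range(10) for col in range(10)]
presences = [(1, 1), (2, 3), (5, 5)]
pseudo_absences = sample_pseudo_absences(grid, presences, n=20)
```

The presences and sampled pseudo-absences would then be combined into a 1/0 response for a logistic regression, with candidate models compared by AIC as the abstract describes.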

Relevance: 100.00%

Abstract:

The predictive potential of six selected factors was assessed in 72 patients with primary myelodysplastic syndrome using univariate and multivariate logistic regression analysis of survival at 18 months. Factors were age (above the median of 69 years), dysplastic features in the three myeloid bone marrow cell lineages, presence of chromosome defects, all metaphases abnormal, double or complex chromosome defects (C23), and a Bournemouth score of 2, 3, or 4 (B234). In the multivariate approach, B234 and C23 proved to be significantly associated with a reduction in the survival probability. The similarity of the regression coefficients associated with these two factors means that they have about the same weight. Consequently, the model was simplified by counting the number of factors (0, 1, or 2) present in each patient, thus generating a scoring system called the Lausanne-Bournemouth score (LB score). The LB score combines the well-recognized and easy-to-use Bournemouth score (B score) with the chromosome defect complexity, C23 constituting an additional indicator of patient outcome. The predicted risk of death within 18 months calculated from the model is as follows: 7.1% (confidence interval: 1.7-24.8) for patients with an LB score of 0, 60.1% (44.7-73.8) for an LB score of 1, and 96.8% (84.5-99.4) for an LB score of 2. The scoring system presented here has several interesting features. The LB score may improve the predictive value of the B score, as it is able to recognize two prognostic groups in the intermediate risk category of patients with B scores of 2 or 3. It also has the ability to identify two distinct prognostic subclasses among RAEB and possibly CMML patients. In addition to its usefulness in prognostic evaluation, the LB score may bring new insights into the understanding of evolution patterns in MDS. We used the combination of the B score and chromosome complexity to define four classes which may be considered four possible states of myelodysplasia and which describe two distinct evolutional pathways.
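The counting rule behind the LB score can be written out as a short Python sketch. The risk figures are copied from the abstract; the function and variable names are hypothetical and chosen only for this illustration.

```python
# Predicted risk of death within 18 months by LB score, as reported in the
# abstract: score -> (risk in %, (lower, upper) confidence limits in %)
LB_RISK = {
    0: (7.1, (1.7, 24.8)),
    1: (60.1, (44.7, 73.8)),
    2: (96.8, (84.5, 99.4)),
}

def lb_score(b234: bool, c23: bool) -> int:
    """Count how many of the two factors are present (0, 1, or 2).

    b234: Bournemouth score of 2, 3, or 4
    c23:  double or complex chromosome defects
    """
    return int(b234) + int(c23)

def predicted_risk(b234: bool, c23: bool):
    """Look up the reported risk and confidence interval for a patient."""
    return LB_RISK[lb_score(b234, c23)]

# Example: a patient with a Bournemouth score of 3 and no complex defects
risk, interval = predicted_risk(b234=True, c23=False)
```

Because the two regression coefficients were of similar size, an unweighted count is a reasonable simplification; with clearly different coefficients a weighted sum would be needed instead.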

Relevance: 100.00%

Abstract:

A wide range of modelling algorithms is used by ecologists, conservation practitioners, and others to predict species ranges from point locality data. Unfortunately, the amount of data available is limited for many taxa and regions, making it essential to quantify the sensitivity of these algorithms to sample size. This is the first study to address this need by rigorously evaluating a broad suite of algorithms with independent presence-absence data from multiple species and regions. We evaluated predictions from 12 algorithms for 46 species (from six different regions of the world) at three sample sizes (100, 30, and 10 records). We used data from natural history collections to run the models, and evaluated the quality of model predictions with area under the receiver operating characteristic curve (AUC). With decreasing sample size, model accuracy decreased and variability increased across species and between models. Novel modelling methods that incorporate both interactions between predictor variables and complex response shapes (i.e. GBM, MARS-INT, BRUTO) performed better than most methods at large sample sizes but not at the smallest sample sizes. Other algorithms were much less sensitive to sample size, including an algorithm based on maximum entropy (MAXENT) that had among the best predictive power across all sample sizes. Relative to other algorithms, a distance metric algorithm (DOMAIN) and a genetic algorithm (OM-GARP) had intermediate performance at the largest sample size and among the best performance at the lowest sample size. No algorithm predicted consistently well with small sample size (n < 30) and this should encourage highly conservative use of predictions based on small sample size and restrict their use to exploratory modelling.
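The AUC used here to score predictions has a simple rank interpretation: it is the probability that a randomly chosen presence site receives a higher predicted value than a randomly chosen absence site. The sketch below computes it directly from that definition; the function name and the toy scores are assumptions for illustration, not data from the study.

```python
def auc(scores_presence, scores_absence):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (presence, absence) pairs in which the presence
    site has the higher predicted score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))

# Made-up model predictions at 3 presence sites and 4 absence sites
example = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2, 0.1])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of sites, which is fine for a sketch but slow for large evaluation sets.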

Relevance: 100.00%

Abstract:

BACKGROUND: A healthy lifestyle including sufficient physical activity may mitigate or prevent adverse long-term effects of childhood cancer. We described daily physical activities and sports in childhood cancer survivors and controls, and assessed determinants of both activity patterns. METHODOLOGY/PRINCIPAL FINDINGS: The Swiss Childhood Cancer Survivor Study is a questionnaire survey including all children diagnosed with cancer 1976-2003 at age 0-15 years, registered in the Swiss Childhood Cancer Registry, who survived ≥5 years and reached adulthood (≥20 years). Controls came from the population-based Swiss Health Survey. We compared the two populations and determined risk factors for both outcomes in separate multivariable logistic regression models. The sample included 1058 survivors and 5593 controls (response rates 78% and 66%). Sufficient daily physical activities were reported by 52% (n = 521) of survivors and 37% (n = 2069) of controls (p<0.001). In contrast, 62% (n = 640) of survivors and 65% (n = 3635) of controls reported engaging in sports (p = 0.067). Risk factors for insufficient daily activities in both populations were: older age (OR for ≥35 years: 1.5, 95% CI 1.2-2.0), female gender (OR 1.6, 95% CI 1.3-1.9), French/Italian speaking (OR 1.4, 95% CI 1.1-1.7), and higher education (OR for university education: 2.0, 95% CI 1.5-2.6). Risk factors for no sports were: being a survivor (OR 1.3, 95% CI 1.1-1.6), older age (OR for ≥35 years: 1.4, 95% CI 1.1-1.8), migration background (OR 1.5, 95% CI 1.3-1.8), French/Italian speaking (OR 1.4, 95% CI 1.2-1.7), lower education (OR for compulsory schooling only: 1.6, 95% CI 1.2-2.2), being married (OR 1.7, 95% CI 1.5-2.0), having children (OR 1.3, 95% CI 1.4-1.9), obesity (OR 2.4, 95% CI 1.7-3.3), and smoking (OR 1.7, 95% CI 1.5-2.1). Type of diagnosis was only associated with sports. CONCLUSIONS/SIGNIFICANCE: Physical activity levels in survivors were lower than recommended, but comparable to controls and mainly determined by socio-demographic and cultural factors. Strategies to improve physical activity levels could be similar to those for the general population.

Relevance: 100.00%

Abstract:

Species distribution models (SDMs) are increasingly used to predict environmentally induced range shifts of habitats of plant and animal species. Consequently, SDMs are valuable tools for scientifically based conservation decisions. The aims of this paper are (1) to identify important drivers of butterfly species persistence or extinction, and (2) to analyse the responses of endangered butterfly species of dry grasslands and wetlands to likely future landscape changes in Switzerland. Future land use was represented by four scenarios describing: (1) ongoing land use changes as observed at the end of the last century; (2) a liberalisation of the agricultural markets; (3) a slightly lowered agricultural production; and (4) a strongly lowered agricultural production. Two model approaches were applied. The first (logistic regression with principal components) explains which environmental variables have a significant impact on species presence (and absence). The second (predictive SDM) is used to project species distribution under current and likely future land uses. The results of the explanatory analyses reveal that four principal components related to urbanisation, abandonment of open land and intensive agricultural practices, as well as two climate parameters, are primary drivers of species occurrence (decline). The scenario analyses show that lowered agricultural production is likely to favour dry grassland species due to an increase in non-intensively used land, open canopy forests, and overgrown areas. In the liberalisation scenario, dry grassland species show a decrease in abundance due to a strong increase in forested patches. Wetland butterfly species would decrease under all four scenarios as their habitats become overgrown

Relevance: 100.00%

Abstract:

BACKGROUND: Hypotension, a common intra-operative incident, carries a substantial potential for morbidity. It is most often manageable and sometimes preventable, which makes its study important. We therefore aimed to examine hospital variations in the occurrence of intra-operative hypotension and its predictors. As secondary endpoints, we determined to what extent hypotension relates to the risk of post-operative incidents and death. METHODS: We used the Anaesthesia Databank Switzerland, built on routinely and prospectively collected data on all anaesthesias in 21 hospitals. The three outcomes were assessed using multi-level logistic regression models. RESULTS: Among 147,573 anaesthesias, hypotension ranged from 0.6% to 5.2% in participating hospitals, and from 0.3% up to 12% in different surgical specialties. Most (73.4%) were minor single events. Age, ASA status, combined general and regional anaesthesia techniques, duration of surgery and hospitalization were significantly associated with hypotension. Although significantly associated, the emergency status of the surgery had a weaker effect. Hospitals' odds ratios for hypotension varied between 0.12 and 2.50 (P ≤ 0.001), even after adjusting for patient and anaesthesia factors, and for type of surgery. At least one post-operative incident occurred in 9.7% of the procedures, including 0.03% deaths. Intra-operative hypotension was associated with a higher risk of post-operative incidents and death. CONCLUSION: Wide variations remain in the occurrence of hypotension among hospitals after adjustment for risk factors. Although differential reporting from hospitals may exist, variations in anaesthesia techniques and blood pressure maintenance may also have contributed. Intra-operative hypotension is associated with morbidities and sometimes death, and constant vigilance must thus be advocated.
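A multi-level logistic model of the kind mentioned under METHODS is often written as a random-intercept model for patients nested in hospitals. The form below is a generic sketch under that assumption; the abstract does not give the exact model structure, and all symbols are introduced here for illustration.

```latex
% y_{ij}: outcome (e.g. hypotension) for anaesthesia i in hospital j
% x_{ij}: patient, anaesthesia and surgery covariates
% u_j:    hospital-level random intercept
\operatorname{logit} P(y_{ij} = 1 \mid \mathbf{x}_{ij}, u_j)
  = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
```

In such a model, $\exp(u_j)$ can be read as the odds ratio of hospital $j$ relative to an average hospital after covariate adjustment, which is one way that hospital-level odds ratios like those reported under RESULTS could arise.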

Relevance: 100.00%

Abstract:

BACKGROUND: Only a few studies have explored the relation between coffee and tea intake and head and neck cancers, with inconsistent results. METHODS: We pooled individual-level data from nine case-control studies of head and neck cancers, including 5,139 cases and 9,028 controls. Logistic regression was used to estimate odds ratios (OR) and 95% confidence intervals (95% CI), adjusting for potential confounders. RESULTS: Caffeinated coffee intake was inversely related with the risk of cancer of the oral cavity and pharynx: the ORs were 0.96 (95% CI, 0.94-0.98) for an increment of 1 cup per day and 0.61 (95% CI, 0.47-0.80) in drinkers of >4 cups per day versus nondrinkers. This latter estimate was consistent for different anatomic sites (OR, 0.46; 95% CI, 0.30-0.71 for oral cavity; OR, 0.58; 95% CI, 0.41-0.82 for oropharynx/hypopharynx; and OR, 0.61; 95% CI, 0.37-1.01 for oral cavity/pharynx not otherwise specified) and across strata of selected covariates. No association of caffeinated coffee drinking was found with laryngeal cancer (OR, 0.96; 95% CI, 0.64-1.45 in drinkers of >4 cups per day versus nondrinkers). Data on decaffeinated coffee were too sparse for detailed analysis, but indicated no increased risk. Tea intake was not associated with head and neck cancer risk (OR, 0.99; 95% CI, 0.89-1.11 for drinkers versus nondrinkers). CONCLUSIONS: This pooled analysis of case-control studies supports the hypothesis of an inverse association between caffeinated coffee drinking and risk of cancer of the oral cavity and pharynx. IMPACT: Given widespread use of coffee and the relatively high incidence and low survival of head and neck cancers, the observed inverse association may have appreciable public health relevance.
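To make the per-cup odds ratio concrete: in a logistic regression with intake entered as a continuous term, the OR for an increment of k cups is the per-cup OR raised to the power k. The short sketch below applies this to the reported 0.96; the log-linear dose-response assumption and the function name belong to this illustration, not to the abstract.

```python
import math

OR_PER_CUP = 0.96  # reported OR for an increment of 1 cup per day

def implied_or(cups: float, or_per_cup: float = OR_PER_CUP) -> float:
    """OR implied for `cups` extra cups per day under a log-linear trend.

    beta is the logistic regression coefficient per cup (log odds ratio);
    effects add on the log-odds scale, so they multiply on the OR scale.
    """
    beta = math.log(or_per_cup)
    return math.exp(beta * cups)

or_four_cups = implied_or(4)  # equals 0.96 ** 4, roughly 0.85
```

The abstract's estimate for drinkers of >4 cups per day (0.61) comes from a separate categorical comparison against nondrinkers, so it need not match this extrapolation.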

Relevance: 100.00%

Abstract:

AIM: To confirm the accuracy of the sentinel node biopsy (SNB) procedure and its morbidity, and to investigate predictive factors for SN status and prognostic factors for disease-free survival (DFS) and disease-specific survival (DSS). MATERIALS AND METHODS: Between October 1997 and December 2004, 327 consecutive patients in one centre with clinically node-negative primary skin melanoma underwent an SNB by the triple technique, i.e. lymphoscintigraphy, blue-dye and gamma-probe. Multivariate logistic regression analyses and Kaplan-Meier survival analyses were performed. RESULTS: Twenty-three percent of the patients had at least one metastatic SN, which was significantly associated with Breslow thickness (p<0.001). The success rate of SNB was 99.1% and its morbidity was 7.6%. With a median follow-up of 33 months, the 5-year DFS/DSS were 43%/49% for patients with positive SN and 83.5%/87.4% for patients with negative SN, respectively. The false-negative rate of SNB was 8.6% and the sensitivity 91.4%. On multivariate analysis, DFS was significantly worsened by Breslow thickness (RR=5.6, p<0.001), positive SN (RR=5.0, p<0.001) and male sex (RR=2.9, p=0.001). The presence of a metastatic SN (RR=8.4, p<0.001), male sex (RR=6.1, p<0.001), Breslow thickness (RR=3.2, p=0.013) and ulceration (RR=2.6, p=0.015) were significantly associated with a poorer DSS. CONCLUSION: SNB is a reliable procedure with high sensitivity (91.4%) and low morbidity. Breslow thickness was the only statistically significant parameter predictive of SN status. DFS was worsened, in decreasing order, by Breslow thickness, metastatic SN and male gender. Similarly, DSS was significantly worsened by a metastatic SN, male gender, Breslow thickness and ulceration. These data reinforce the SN status as a powerful staging procedure
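The two accuracy figures quoted in RESULTS are complementary, as the standard definitions show. The symbols TP (true positives) and FN (false negatives) are introduced here for illustration; the abstract does not report the underlying counts.

```latex
\text{sensitivity} = \frac{TP}{TP + FN}, \qquad
\text{false-negative rate} = \frac{FN}{TP + FN} = 1 - \text{sensitivity}

% With the reported values:
1 - 0.086 = 0.914 \quad\text{i.e. } 100\% - 8.6\% = 91.4\%
```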