876 results for classification and regression tree
Abstract:
Background: Tuberculosis (TB) remains a public health issue worldwide. The lack of specific clinical symptoms for diagnosing TB makes the decision to admit patients to respiratory isolation a difficult task for the clinician. Isolation of patients without the disease is common and increases health costs. Decision models for the diagnosis of TB in patients attending hospitals can increase the quality of care and decrease costs, without the risk of hospital transmission. We present a model for predicting pulmonary TB in hospitalized patients in a high prevalence area in order to contribute to a more rational use of isolation rooms without increasing the risk of transmission. Methods: Cross-sectional study of patients admitted to CFFH from March 2003 to December 2004. A classification and regression tree (CART) model was generated and validated. The area under the ROC curve (AUC), sensitivity, specificity, and positive and negative predictive values were used to evaluate the performance of the model. Validation of the model was performed with a different sample of patients admitted to the same hospital from January to December 2005. Results: We studied 290 patients admitted with clinical suspicion of TB. Diagnosis was confirmed in 26.5% of them. Pulmonary TB was present in 83.7% of the patients with TB (62.3% with positive sputum smear) and HIV/AIDS was present in 56.9% of patients. The validated CART model showed sensitivity, specificity, positive predictive value and negative predictive value of 60.00%, 76.16%, 33.33%, and 90.55%, respectively. The AUC was 79.70%. Conclusions: The CART model developed for these hospitalized patients with clinical suspicion of TB had fair to good predictive performance for pulmonary TB. The most important variable for prediction of TB diagnosis was chest radiograph results.
Prospective validation is still necessary, but our model offers an alternative for deciding whether to isolate patients with clinical suspicion of TB in tertiary health facilities in countries with limited resources.
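As an illustration of how the four reported validation metrics relate to a 2x2 confusion matrix, here is a minimal Python sketch. The abstract does not report the underlying counts; the counts below (18 true positives, 12 false negatives, 36 false positives, 115 true negatives) are hypothetical values chosen only because they reproduce the reported percentages, and the function name `diagnostic_metrics` is an assumption for this example.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients correctly flagged
        "specificity": tn / (tn + fp),  # non-diseased patients correctly cleared
        "ppv": tp / (tp + fp),          # flagged patients who are truly diseased
        "npv": tn / (tn + fn),          # cleared patients who are truly non-diseased
    }

# Hypothetical counts, NOT taken from the abstract: chosen to be consistent
# with the reported 60.00%, 76.16%, 33.33% and 90.55%.
metrics = diagnostic_metrics(tp=18, fn=12, fp=36, tn=115)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# sensitivity: 60.00%
# specificity: 76.16%
# ppv: 33.33%
# npv: 90.55%
```

The low positive predictive value alongside a high negative predictive value is what one would expect when the disease is present in a minority of the patients tested.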
Abstract:
Background: Smear-negative pulmonary tuberculosis (SNPT) accounts for 30% of pulmonary tuberculosis cases reported yearly in Brazil. This study aimed to develop a prediction model for SNPT for outpatients in areas with scarce resources. Methods: The study enrolled 551 patients with clinical-radiological suspicion of SNPT, in Rio de Janeiro, Brazil. The original data was divided into two equivalent samples for generation and validation of the prediction models. Symptoms, physical signs and chest X-rays were used for constructing logistic regression and classification and regression tree models. From the logistic regression, we generated a clinical and radiological prediction score. The area under the receiver operator characteristic curve, sensitivity, and specificity were used to evaluate the model's performance in both generation and validation samples. Results: It was possible to generate predictive models for SNPT with sensitivity ranging from 64% to 71% and specificity ranging from 58% to 76%. Conclusion: The results suggest that those models might be useful as screening tools for estimating the risk of SNPT, optimizing the utilization of more expensive tests, and avoiding costs of unnecessary anti-tuberculosis treatment. Those models might be cost-effective tools in a health care network with hierarchical distribution of scarce resources.
Abstract:
Background: Strategies for cancer reduction and management are targeted at both individual and area levels. Area-level strategies require careful understanding of geographic differences in cancer incidence, in particular the association with factors such as socioeconomic status, ethnicity and accessibility. This study aimed to identify the complex interplay of area-level factors associated with high area-specific incidence of Australian priority cancers using a classification and regression tree (CART) approach. Methods: Area-specific smoothed standardised incidence ratios were estimated for priority-area cancers across 478 statistical local areas in Queensland, Australia (1998-2007, n=186,075). For those cancers with significant spatial variation, CART models were used to identify whether area-level accessibility, socioeconomic status and ethnicity were associated with high area-specific incidence. Results: The accessibility of a person’s residence had the most consistent association with the risk of cancer diagnosis across the specific cancers. Many cancers were likely to have high incidence in more urban areas, although male lung cancer and cervical cancer tended to have high incidence in more remote areas. The impact of socioeconomic status and ethnicity on these associations differed by type of cancer. Conclusions: These results highlight the complex interactions between accessibility, socioeconomic status and ethnicity in determining cancer incidence risk.
Abstract:
Risk assessment systems for introduced species are being developed and applied globally, but methods for rigorously evaluating them are still in their infancy. We explore classification and regression tree models as an alternative to the current Australian Weed Risk Assessment system, and demonstrate how the performance of screening tests for unwanted alien species may be quantitatively compared using receiver operating characteristic (ROC) curve analysis. The optimal classification tree model for predicting weediness included just four out of a possible 44 attributes of introduced plants examined, namely: (i) intentional human dispersal of propagules; (ii) evidence of naturalization beyond native range; (iii) evidence of being a weed elsewhere; and (iv) a high level of domestication. Intentional human dispersal of propagules in combination with evidence of naturalization beyond a plant's native range led to the strongest prediction of weediness. A high level of domestication in combination with no evidence of naturalization mitigated the likelihood of an introduced plant becoming a weed resulting from intentional human dispersal of propagules. Unlikely intentional human dispersal of propagules combined with no evidence of being a weed elsewhere led to the lowest predicted probability of weediness. The failure to include intrinsic plant attributes in the model suggests that either these attributes are not useful general predictors of weediness, or data and analysis were inadequate to elucidate the underlying relationship(s). This concurs with the historical pessimism about whether we will ever be able to accurately predict invasive plants. Given the apparent importance of propagule pressure (the number of individuals of a species released), future attempts at evaluating screening model performance for identifying unwanted plants need to account for propagule pressure when collating and/or analysing datasets.
The classification tree had a cross-validated sensitivity of 93.6% and specificity of 36.7%. Based on the area under the ROC curve, the performance of the classification tree in correctly classifying plants as weeds or non-weeds was slightly inferior (area under ROC curve = 0.83 +/- 0.021 (+/- SE)) to that of the current risk assessment system in use (area under ROC curve = 0.89 +/- 0.018 (+/- SE)), although it requires far fewer questions to be answered.
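For readers unfamiliar with the area under the ROC curve, the following Python sketch computes it through its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The scores and the names `auc`, `weed_scores` and `non_weed_scores` are invented for illustration and are not data from the study.

```python
def auc(scores_pos, scores_neg):
    """AUC as P(score of a random positive > score of a random negative), ties count 1/2."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical screening scores (higher = more weed-like); not from the study.
weed_scores = [0.9, 0.8, 0.7, 0.6]
non_weed_scores = [0.7, 0.4, 0.3, 0.2]

example_auc = auc(weed_scores, non_weed_scores)
print(example_auc)  # 0.90625
```

An AUC of 0.5 corresponds to a screening test that ranks weeds and non-weeds no better than chance, and 1.0 to a perfect ranking, which is why two screening systems can be compared on this single number regardless of the cut-off each one uses.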
Abstract:
Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point process methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are second-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.
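The abstract does not give the exact definition of the entropy measure, so the following Python sketch is only one plausible reading, restricted to order 2: count which pairs of marks co-occur within a chosen distance, treat the relative frequencies as a multinomial distribution, and take its Shannon entropy. The function name `cooccurrence_entropy`, the `radius` parameter and the sample points are assumptions made for this example.

```python
import math
from collections import Counter
from itertools import combinations

def cooccurrence_entropy(points, radius):
    """Shannon entropy (in nats) of unordered mark pairs lying within `radius` of each other.

    `points` is a list of (x, y, mark) tuples. This is an order-2 illustration,
    not necessarily the measure defined in the paper.
    """
    counts = Counter()
    for (x1, y1, m1), (x2, y2, m2) in combinations(points, 2):
        if math.hypot(x1 - x2, y1 - y2) <= radius:
            counts[tuple(sorted((m1, m2)))] += 1
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no co-occurrences at this scale
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Hypothetical marked point pattern; not data from the paper.
sample_points = [(0, 0, "a"), (1, 0, "b"), (0, 1, "a"), (5, 5, "b")]
h = cooccurrence_entropy(sample_points, radius=1.5)
print(round(h, 4))  # 0.6365
```

In this toy pattern three pairs fall within the radius (two of type a-b and one of type a-a), so the entropy is that of the distribution (2/3, 1/3). Lower values indicate that a few mark combinations dominate the co-occurrences at the chosen scale.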
Abstract:
This research assesses the potential impact of weekly weather variability on the incidence of cryptosporidiosis disease using time series zero-inflated Poisson (ZIP) and classification and regression tree (CART) models. Data on weather variables, notified cryptosporidiosis cases and population size in Brisbane were supplied by the Australian Bureau of Meteorology, Queensland Department of Health, and Australian Bureau of Statistics, respectively. Both time series ZIP and CART models show a clear association between weather variables (maximum temperature, relative humidity, rainfall and wind speed) and cryptosporidiosis disease. The time series CART models indicated that, when weekly maximum temperature exceeded 31°C and relative humidity was less than 63%, the relative risk of cryptosporidiosis rose by 13.64 (expected morbidity: 39.4; 95% confidence interval: 30.9–47.9). These findings may have applications as a decision support tool in planning disease control and risk management programs for cryptosporidiosis disease.
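As background on the zero-inflated Poisson distribution used for weekly case counts, here is a short Python sketch of its probability mass function in the standard textbook form: with probability `pi` the count is a structural zero, otherwise it follows a Poisson distribution with mean `lam`. The parameter values are arbitrary illustrations, not estimates from the study, and the time series regression structure of the actual models is not reproduced here.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with Poisson mean `lam` and zero-inflation `pi`."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros come from the structural-zero component or from the Poisson itself.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson

# Arbitrary illustrative parameters; not fitted to any data.
lam, pi = 2.0, 0.3
total_probability = sum(zip_pmf(k, lam, pi) for k in range(60))
mean_count = sum(k * zip_pmf(k, lam, pi) for k in range(60))
print(round(total_probability, 6), round(mean_count, 6))  # 1.0 1.4
```

The mean of the distribution is `(1 - pi) * lam`, which is why the example prints 1.4. The extra mass at zero is what makes this family suited to notification series with many weeks in which no cases are reported.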
Abstract:
This study examined the distribution of major mosquito species and their roles in the transmission of Ross River virus (RRV) infection for coastline and inland areas in Brisbane, Australia (27°28′ S, 153°2′ E). We obtained data on the monthly counts of RRV cases in Brisbane between November 1998 and December 2001 by statistical local areas from the Queensland Department of Health and the monthly mosquito abundance from the Brisbane City Council. Correlation analysis was used to assess the pairwise relationships between mosquito density and the incidence of RRV disease. This study showed that the abundances of Aedes vigilax (Skuse), Culex annulirostris (Skuse), and Aedes vittiger (Skuse) were significantly associated with the monthly incidence of RRV in the coastline area, whereas those of Aedes vigilax, Culex annulirostris, and Aedes notoscriptus (Skuse) were significantly associated with the monthly incidence of RRV in the inland area. The results of the classification and regression tree (CART) analysis show that both occurrence and incidence of RRV were influenced by interactions between species in both coastal and inland regions. We found that there was an 89% chance of an occurrence of RRV if the abundance of Ae. vigilax was between 64 and 90 in the coastline region. There was an 80% chance of an occurrence of RRV if the density of Cx. annulirostris was between 53 and 74 in the inland area. The results of this study may have applications as a decision support tool in planning disease control of RRV and other mosquito-borne diseases.
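Threshold rules of this kind come from the way a classification tree chooses its splits. The Python sketch below shows a single split step using Gini impurity, the criterion commonly associated with CART; the abstract does not state which criterion or software the authors used, and the abundance values, labels and function names are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best rule `value <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, parent_gini) if no split improves on the unsplit node.
    """
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, gini(labels))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical monthly mosquito abundance and RRV occurrence (1 = cases reported).
abundance = [10, 20, 30, 70, 80, 85]
occurrence = [0, 0, 0, 1, 1, 1]
split = best_split(abundance, occurrence)
print(split)  # (50.0, 0.0)
```

A full tree applies this step recursively to each resulting subset, which is how an interval such as "between 64 and 90" can arise from two successive splits on the same variable.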
Abstract:
Background: It remains unclear whether it is possible to develop a spatiotemporal epidemic prediction model for cryptosporidiosis disease. This paper examined the impact of social economic and weather factors on cryptosporidiosis and explored the possibility of developing such a model using social economic and weather data in Queensland, Australia. Methods: Data on weather variables, notified cryptosporidiosis cases and social economic factors in Queensland were supplied by the Australian Bureau of Meteorology, Queensland Department of Health, and Australian Bureau of Statistics, respectively. Three-stage spatiotemporal classification and regression tree (CART) models were developed to examine the association between social economic and weather factors and monthly incidence of cryptosporidiosis in Queensland, Australia. The spatiotemporal CART model was used for predicting the outbreak of cryptosporidiosis in Queensland, Australia. Results: The results of the classification tree model (with incidence rates defined as binary presence/absence) showed that there was an 87% chance of an occurrence of cryptosporidiosis in a local government area (LGA) if the socio-economic index for the area (SEIFA) exceeded 1021, while the results of the regression tree model (based on non-zero incidence rates) showed that when SEIFA was between 892 and 945, and temperature exceeded 32°C, the relative risk (RR) of cryptosporidiosis was 3.9 (mean morbidity: 390.6/100,000, standard deviation (SD): 310.5), compared with the monthly average incidence of cryptosporidiosis. When SEIFA was less than 892 the RR of cryptosporidiosis was 4.3 (mean morbidity: 426.8/100,000, SD: 319.2). A prediction map for the cryptosporidiosis outbreak was made according to the outputs of spatiotemporal CART models.
Conclusions: The results of this study suggest that spatiotemporal CART models based on social economic and weather variables can be used for predicting the outbreak of cryptosporidiosis in Queensland, Australia.