905 results for Classification Methods


Relevance:

30.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-04

Relevance:

30.00%

Publisher:

Abstract:

Most of the modern developments with classification trees are aimed at improving their predictive capacity. This article considers a curiously neglected aspect of classification trees, namely the reliability of predictions that come from a given classification tree. In the sense that a node of a tree represents, in the limit, a point in the predictor space, the aim of this article is the development of localized assessment of the reliability of prediction rules. A classification tree may be used either to provide a probability forecast, where for each node the membership probabilities for each class constitute the prediction, or a true classification, where each new observation is predictively assigned to a unique class. Correspondingly, two types of reliability measure will be derived, namely prediction reliability and classification reliability. We use bootstrapping methods as the main tool to construct these measures. We also provide a suite of graphical displays by which they may be easily appreciated. In addition to providing some estimate of the reliability of specific forecasts of each type, these measures can also be used to guide future data collection to improve the effectiveness of the tree model. The motivating example we give has a binary response, namely the presence or absence of a species of Eucalypt, Eucalyptus cloeziana, at a given sampling location in response to a suite of environmental covariates (although the methods are not restricted to binary response data).
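The bootstrap construction described above can be sketched in a few lines: refit the tree on resamples and examine the spread of its forecasts at a new point. This is an illustration of the general idea, not the authors' exact procedure; the data and tree settings below are invented.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy stand-in for presence/absence of a species against covariates.
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

x_new = np.array([[0.2, -0.1, 0.4]])  # a new "sampling location"

# Refit the tree on B bootstrap resamples; record the class-1
# probability forecast for x_new from each refitted tree.
B = 100
probs = np.empty(B)
for b in range(B):
    idx = rng.integers(0, len(y), len(y))
    tree = DecisionTreeClassifier(max_depth=3, random_state=0)
    probs[b] = tree.fit(X[idx], y[idx]).predict_proba(x_new)[0, 1]

# Prediction reliability: spread of the bootstrap probability forecasts.
pred_se = probs.std()
# Classification reliability: how often the bootstrap trees agree
# with the majority class for x_new.
class_agreement = max((probs > 0.5).mean(), (probs <= 0.5).mean())
print(pred_se, class_agreement)
```

A tight spread of bootstrap probabilities (small `pred_se`) and near-unanimous class agreement indicate a prediction that can be trusted locally; wide spread flags a region where more data collection would help.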

Relevance:

30.00%

Publisher:

Abstract:

The expectation-maximization (EM) algorithm has been of considerable interest in recent years as the basis for various algorithms in application areas of neural networks such as pattern recognition. However, there exist some misconceptions concerning its application to neural networks. In this paper, we clarify these misconceptions and consider how the EM algorithm can be adopted to train multilayer perceptron (MLP) and mixture of experts (ME) networks in applications to multiclass classification. We identify some situations where the application of the EM algorithm to train MLP networks may be of limited value and discuss some ways of handling the difficulties. For ME networks, it is reported in the literature that networks trained by the EM algorithm using the iteratively reweighted least squares (IRLS) algorithm in the inner loop of the M-step often performed poorly in multiclass classification. However, we found that the convergence of the IRLS algorithm is stable and that the log likelihood increases monotonically when a learning rate smaller than one is adopted. Also, we propose the use of an expectation-conditional maximization (ECM) algorithm to train ME networks. Its performance is demonstrated to be superior to that of the IRLS algorithm on some simulated and real data sets.
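The effect of a damped inner loop can be illustrated on the simplest IRLS case, a single logistic model: each update is a Newton step scaled by a learning rate below one. The data, learning rate, and iteration count here are assumptions for the sketch, not the ME-network implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(300), rng.normal(size=(300, 2))])
true_w = np.array([0.5, 1.0, -2.0])
y = (rng.random(300) < 1 / (1 + np.exp(-X @ true_w))).astype(float)

def log_lik(w):
    z = X @ w
    return float(y @ z - np.logaddexp(0, z).sum())

w = np.zeros(3)
lr = 0.5  # learning rate < 1 damps each IRLS (Newton) step
lls = [log_lik(w)]
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ w))
    W = p * (1 - p)                     # IRLS weights
    grad = X.T @ (y - p)
    H = X.T @ (X * W[:, None])          # Fisher information
    w = w + lr * np.linalg.solve(H, grad)
    lls.append(log_lik(w))

print(lls[0], lls[-1])  # the log likelihood climbs steadily under damping
```

With `lr = 1` (a full Newton step) the same loop can overshoot on badly scaled or near-separable data; the damped step trades a little speed for the stable, monotone behaviour described above.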

Relevance:

30.00%

Publisher:

Abstract:

Recent research suggests that retrospective review of the International Classification of Disease (ICD-9-CM) codes assigned to a patient episode will identify a similar number of healthcare-acquired surgical-site infections (SSIs) as prospective surveillance by infection control practitioners (ICPs). We tested this finding by replicating the methods for 380 surgical procedures. The sensitivity and specificity of the ICPs undertaking prospective surveillance were 80% and 100%, and the sensitivity and specificity of the review of ICD-10-AM codes were 60% and 98.9%. Based on these results, we do not support retrospective review of ICD-10-AM codes in preference to prospective surveillance for SSIs. (C) 2004 The Hospital Infection Society. Published by Elsevier Ltd. All rights reserved.
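For reference, the reported figures follow from the standard definitions of sensitivity and specificity; the raw counts below are illustrative values chosen only to be consistent with the quoted percentages, not the study's actual tallies.

```python
def sens_spec(tp, fn, tn, fp):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Illustrative counts matching the ICD-10-AM review figures:
# 3 of 5 true SSIs detected, 4 false alarms among 375 uninfected episodes.
sensitivity, specificity = sens_spec(tp=3, fn=2, tn=371, fp=4)
print(sensitivity, specificity)  # 0.6 and ~0.989
```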

Relevance:

30.00%

Publisher:

Abstract:

Background and Purpose - Although the Australian National Subacute and Nonacute Patient (AN-SNAP) Casemix Classification was implemented in 1998, no research has examined how well it predicts length of stay (LOS), discharge destination, and functional improvement in public hospital stroke rehabilitation units in Australia. Methods - 406 consecutive admissions to 3 stroke rehabilitation units in Queensland, Australia were studied. Sociodemographic, clinical, and functional data were collected. General linear modeling and logistic regression were used to assess the ability of AN-SNAP to predict outcomes. Results - AN-SNAP significantly predicted each outcome. There were clear relationships between the outcomes of longer LOS, poorer functional improvement, and discharge into care and the AN-SNAP classes that reflected poorer functional ability and older age. Other predictors included living situation, acute LOS, comorbidity, and stroke type. Conclusions - AN-SNAP is a consistent predictor of LOS, functional change, and discharge destination, and has utility in assisting clinicians to set rehabilitation goals and plan discharge.

Relevance:

30.00%

Publisher:

Abstract:

Risk assessment systems for introduced species are being developed and applied globally, but methods for rigorously evaluating them are still in their infancy. We explore classification and regression tree models as an alternative to the current Australian Weed Risk Assessment system, and demonstrate how the performance of screening tests for unwanted alien species may be quantitatively compared using receiver operating characteristic (ROC) curve analysis. The optimal classification tree model for predicting weediness included just four out of a possible 44 attributes of introduced plants examined, namely: (i) intentional human dispersal of propagules; (ii) evidence of naturalization beyond the native range; (iii) evidence of being a weed elsewhere; and (iv) a high level of domestication. Intentional human dispersal of propagules in combination with evidence of naturalization beyond a plant's native range led to the strongest prediction of weediness. A high level of domestication in combination with no evidence of naturalization reduced the likelihood of an intentionally dispersed introduced plant becoming a weed. A low likelihood of intentional human dispersal of propagules combined with no evidence of being a weed elsewhere led to the lowest predicted probability of weediness. The failure to include intrinsic plant attributes in the model suggests either that these attributes are not useful general predictors of weediness, or that the data and analysis were inadequate to elucidate the underlying relationship(s). This concurs with the historical pessimism about whether we will ever be able to accurately predict invasive plants. Given the apparent importance of propagule pressure (the number of individuals of a species released), future attempts at evaluating screening model performance for identifying unwanted plants need to account for propagule pressure when collating and/or analysing datasets.
The classification tree had a cross-validated sensitivity of 93.6% and specificity of 36.7%. Based on the area under the ROC curve, the performance of the classification tree in correctly classifying plants as weeds or non-weeds (area under ROC curve = 0.83 +/- 0.021 SE) was slightly inferior to that of the current risk assessment system in use (area under ROC curve = 0.89 +/- 0.018 SE), although it requires many fewer questions to be answered.
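An ROC comparison of this kind can be reproduced in miniature with scikit-learn; the labels and scores here are toy values, not the weed-risk data.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

y_true = np.array([0, 0, 1, 1])                # non-weed / weed
scores_tree = np.array([0.1, 0.4, 0.35, 0.8])  # e.g. tree probabilities
scores_wra = np.array([0.2, 0.3, 0.6, 0.9])    # e.g. rival screening scores

auc_tree = roc_auc_score(y_true, scores_tree)
auc_wra = roc_auc_score(y_true, scores_wra)
print(auc_tree, auc_wra)  # 0.75 vs 1.0: the larger area discriminates better

# Each ROC point is a (1 - specificity, sensitivity) pair, such as the
# tree's reported 93.6% sensitivity at 36.7% specificity.
fpr, tpr, thresholds = roc_curve(y_true, scores_tree)
```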

Relevance:

30.00%

Publisher:

Abstract:

We consider the statistical problem of catalogue matching from a machine learning perspective, with the goals of producing probabilistic outputs and using all available information. A framework is provided that unifies two existing approaches to producing probabilistic outputs in the literature, one based on combining distribution estimates and the other based on combining probabilistic classifiers. We apply both of these to the problem of matching the HI Parkes All Sky Survey radio catalogue, which has large positional uncertainties, to the much denser SuperCOSMOS catalogue, which has much smaller positional uncertainties. We demonstrate the utility of probabilistic outputs through a controllable completeness and efficiency trade-off and by identifying objects that have a high probability of being rare. Finally, possible biasing effects in the output of these classifiers are also highlighted and discussed.
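The completeness/efficiency trade-off from probabilistic match outputs can be sketched minimally: raising the acceptance threshold trades completeness (fraction of true matches retained) for efficiency (purity of the accepted set, in the usual astronomical sense). The probabilities below are invented, not HIPASS/SuperCOSMOS values.

```python
import numpy as np

p_match = np.array([0.95, 0.9, 0.8, 0.6, 0.4, 0.3, 0.1])  # classifier outputs
is_true = np.array([1,    1,   1,   0,   1,   0,   0])    # true match or not

def completeness_efficiency(threshold):
    """Accept candidates with match probability >= threshold."""
    accepted = p_match >= threshold
    completeness = is_true[accepted].sum() / is_true.sum()  # true matches kept
    efficiency = is_true[accepted].mean()                   # purity of accepted set
    return completeness, efficiency

print(completeness_efficiency(0.5))   # loose cut: more complete, less pure
print(completeness_efficiency(0.85))  # tight cut: purer, less complete
```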

Relevance:

30.00%

Publisher:

Abstract:

Objective: To demonstrate properties of the International Classification of the External Cause of Injury (ICECI) as a tool for use in injury prevention research. Methods: The Childhood Injury Prevention Study (CHIPS) is a prospective longitudinal follow up study of a cohort of 871 children aged 5-12 years, with a nested case-crossover component. The ICECI is the latest tool in the International Classification of Diseases (ICD) family and has been designed to improve the precision of coding injury events. The details of all injury events recorded in the study, as well as all measured injury related exposures, were coded using the ICECI. This paper reports a substudy on the utility and practicability of using the ICECI in the CHIPS to record exposures. Interrater reliability was quantified for a sample of injured participants using the kappa statistic to measure concordance between codes independently assigned by two research staff. Results: There were 767 diaries collected at baseline, with event details from 563 injuries and exposure details from the injury crossover periods. There were no event, location, or activity details which could not be coded using the ICECI. Kappa statistics for concordance between raters within each of the dimensions ranged from 0.31 to 0.93 for the injury events and from 0.94 to 0.97 for activity and location in the control periods. Discussion: This study represents the first detailed account of the properties of the ICECI revealed by its use in a primary analytic epidemiological study of injury prevention. The results of this study provide considerable support for the ICECI and its further use.
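The kappa statistic quoted above measures chance-corrected agreement between the two coders; a minimal illustration on made-up category labels (not actual ICECI codes):

```python
from sklearn.metrics import cohen_kappa_score

# Two raters' codes for six injury events (invented categories).
rater1 = ["fall", "burn", "fall", "cut", "fall", "burn"]
rater2 = ["fall", "burn", "cut",  "cut", "fall", "fall"]

kappa = cohen_kappa_score(rater1, rater2)
print(kappa)  # ~0.478: raw agreement is 4/6, but kappa discounts chance
```

Kappa is (observed agreement - chance agreement) / (1 - chance agreement), which is why the 0.31-0.93 range above is more informative than raw percent agreement.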

Relevance:

30.00%

Publisher:

Abstract:

Invasive vertebrate pests together with overabundant native species cause significant economic and environmental damage in the Australian rangelands. Access to artificial watering points, created for the pastoral industry, has been a major factor in the spread and survival of these pests. Existing methods of controlling watering points are mechanical and cannot discriminate between target species. This paper describes an intelligent system for controlling watering points based on machine vision technology. Initial test results clearly demonstrate proof of concept for machine vision in this application. These initial experiments were carried out as part of a 3-year project using machine vision software to manage all large vertebrates in the Australian rangelands. Concurrent work is testing the use of automated gates and innovative laneway and enclosure design. The system will have application in any habitat throughout the world where a resource is limited and can be enclosed for the management of livestock or wildlife.

Relevance:

30.00%

Publisher:

Abstract:

A system for the NDT testing of the integrity of composite materials and of adhesive bonds has been developed to meet industrial requirements. The vibration techniques used were found to be applicable to the development of fluid measuring transducers. The vibrational spectra of thin rectangular bars were used for the NDT work. A machined cut in a bar had a significant effect on the spectrum, but a genuine crack gave an unambiguous response at high amplitudes. This was the generation of fretting crack noise at frequencies far above that of the drive. A specially designed vibrational decrement meter which, in effect, measures mechanical energy loss enabled a numerical classification of material adhesion to be obtained. This was used to study bars which had been flame or plasma sprayed with a variety of materials. It has become a useful tool in optimising coating methods. A direct industrial application was to classify piston rings of high performance I.C. engines. Each consists of a cast iron ring with a channel into which molybdenum, a good bearing surface, is sprayed. The NDT classification agreed quite well with the destructive test normally used. The techniques and equipment used for the NDT work were applied to the development of the tuning fork transducers investigated by Hassan into commercial density and viscosity devices. Using narrowly spaced, large area tines, a thin lamina of fluid is trapped between them. It stores a large fraction of the vibrational energy which, acting as an inertia load, reduces the frequency. Magnetostrictive and piezoelectric effects, separately or in combination, enable the fork to be operated through a flange. This allows it to be used in pipeline or 'dipstick' applications. Using a different tine geometry, the viscosity loading can be made predominant. This, as well as the signal decrement of the density transducer, makes a practical viscometer.
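Assuming the vibrational decrement meter measures something akin to the logarithmic decrement of a decaying vibration, the quantity can be sketched from successive peak amplitudes; the numbers below are synthetic, not the instrument's output.

```python
import math

# Successive peak amplitudes of a damped oscillation,
# following x_n = A * exp(-delta * n) for some decrement delta.
peaks = [1.0, 0.8, 0.64, 0.512, 0.4096]

# Log decrement over n cycles: delta = (1/n) * ln(x_0 / x_n).
# Higher delta means more energy lost per cycle (poorer adhesion,
# in the classification scheme described above).
n = len(peaks) - 1
delta = math.log(peaks[0] / peaks[-1]) / n
print(delta)  # ln(1.25) per cycle for this synthetic series
```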

Relevance:

30.00%

Publisher:

Abstract:

In any investigation in optometry involving more than two treatment or patient groups, an investigator should be using ANOVA to analyse the results, assuming that the data conform reasonably well to the assumptions of the analysis. Ideally, specific null hypotheses should be built into the experiment from the start so that the treatment variation can be partitioned to test these effects directly. If 'post-hoc' tests are used, then an experimenter should examine the degree of protection offered by the test against the possibilities of making either a type 1 or a type 2 error. All experimenters should be aware of the complexity of ANOVA. The present article describes only one common form of the analysis, viz., that which applies to a single classification of the treatments in a randomised design. There are many different forms of the analysis, each of which is appropriate to a specific experimental design. The uses of some of the most common forms of ANOVA in optometry have been described in a further article. If in any doubt, an investigator should consult a statistician with experience of the analysis of experiments in optometry, since once embarked upon an experiment with an unsuitable design, there may be little that a statistician can do to help.
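A single-classification (one-way) ANOVA of the kind described can be sketched with scipy; the three treatment groups and their measurements below are made up for illustration.

```python
from scipy.stats import f_oneway

# Made-up measurements for three treatment groups.
group_a = [12.1, 11.8, 12.4, 12.0]
group_b = [13.5, 13.9, 13.2, 13.7]
group_c = [12.2, 12.5, 11.9, 12.3]

# One-way ANOVA: partitions variation into between-group and
# within-group components and tests equality of the group means.
f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f_stat, p_value)  # small p: at least one group mean differs
```

Note that a significant F only says some means differ; identifying which pairs differ is the job of the planned contrasts or post-hoc tests discussed above.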

Relevance:

30.00%

Publisher:

Abstract:

Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and the subsequent manipulation of data often result in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experimental results that demonstrate some advantages and some limitations of our approach. © 2008 Springer-Verlag Berlin Heidelberg.
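The partition-then-classify idea can be sketched as follows; the grouping variable, data, and classifier choice are all invented stand-ins, not the study's asthma data or its knowledge source.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4))                 # toy clinical features
expert_group = rng.integers(0, 3, 300)        # expert-defined partition label
y = (X[:, 0] + expert_group > 1).astype(int)  # toy severity outcome

# Fit one classifier per expert-defined set, so each model only has
# to capture the characteristics of its own coherent subset.
models = {}
for g in np.unique(expert_group):
    mask = expert_group == g
    models[g] = DecisionTreeClassifier(max_depth=3, random_state=0).fit(
        X[mask], y[mask]
    )

def classify(x_row, g):
    """Route a record to the classifier for its expert-assigned set."""
    return int(models[g].predict(x_row.reshape(1, -1))[0])

print(classify(X[0], expert_group[0]))
```

The per-set models stay interpretable in terms of the knowledge source, which is the stated motivation for partitioning rather than fitting one global classifier.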

Relevance:

30.00%

Publisher:

Abstract:

Urban regions present some of the most challenging areas for the remote sensing community. Many different types of land cover have similar spectral responses, making them difficult to distinguish from one another. Traditional per-pixel classification techniques suffer particularly badly because they only use these spectral properties to determine a class, and no other properties of the image, such as context. This project presents the results of the classification of a deeply urban area of Dudley, West Midlands, using 4 methods: Supervised Maximum Likelihood, SMAP, ECHO and Unsupervised Maximum Likelihood. An accuracy assessment method is then developed to allow a fair representation of each procedure and a direct comparison between them. Subsequently, a classification procedure is developed that makes use of the context in the image, through a per-polygon classification. The imagery is broken up into a series of polygons extracted from the Marr-Hildreth zero-crossing edge detector. These polygons are then refined using a region-growing algorithm, and then classified according to the mean class of the fine polygons. The imagery produced by this technique is shown to be of better quality and of a higher accuracy than that of other conventional methods. Further refinements are suggested and examined to improve the aesthetic appearance of the imagery. Finally a comparison with the results produced from a previous study of the James Bridge catchment, in Darlaston, West Midlands, is made, showing that the polygon-classified ATM imagery performs significantly better than the Maximum Likelihood classified videography used in the initial study, despite the presence of geometric correction errors.
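The per-polygon step can be sketched as assigning each polygon a single class from the per-pixel labels inside it. The modal-class vote below is an assumption for illustration (the text describes the mean class of the fine polygons), and polygon extraction via edge detection and region growing is taken as already done upstream.

```python
import numpy as np

# Per-pixel maximum likelihood labels and a pixel-to-polygon mapping
# (toy values; real inputs would come from the edge-detection and
# region-growing stages described above).
pixel_class = np.array([1, 1, 2, 1, 3, 3, 3, 2, 2, 2])
polygon_id  = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])

polygon_class = {}
for pid in np.unique(polygon_id):
    labels = pixel_class[polygon_id == pid]
    values, counts = np.unique(labels, return_counts=True)
    polygon_class[int(pid)] = int(values[counts.argmax()])  # modal class

print(polygon_class)  # {0: 1, 1: 3, 2: 2}
```

Aggregating to polygons is what injects spatial context: isolated misclassified pixels are outvoted by their polygon, which is why the per-polygon imagery looks cleaner than per-pixel output.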