6 results for decision tree
in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast
Abstract:
This paper proposes a new hierarchical learning structure, namely the holistic triple learning (HTL), for extending the binary support vector machine (SVM) to multi-classification problems. For an N-class problem, an HTL constructs a decision tree up to a depth of A leaf node of the decision tree is allowed to be placed with a holistic triple learning unit whose generalisation abilities are assessed and approved. Meanwhile, the remaining nodes in the decision tree each accommodate a standard binary SVM classifier. The holistic triple classifier is a regression model trained on three classes, whose training algorithm originates from a recently proposed implementation technique, namely the least-squares support vector machine (LS-SVM). A major novelty of the holistic triple classifier is the reduced number of support vectors in the solution. For the resultant HTL-SVM, an upper bound of the generalisation error can be obtained. The time complexity of training the HTL-SVM is analysed, and is shown to be comparable to that of training the one-versus-one (1-vs-1) SVM, particularly on small-scale datasets. Empirical studies show that the proposed HTL-SVM achieves competitive classification accuracy with a reduced number of support vectors compared to the popular 1-vs-1 alternative.
Abstract:
The Magellanic Clouds are uniquely placed to study the stellar contribution to dust emission. Individual stars can be resolved in these systems even in the mid-infrared, and they are close enough to allow detection of infrared excess caused by dust. We have searched the Spitzer Space Telescope data archive for all Infrared Spectrograph (IRS) staring-mode observations of the Small Magellanic Cloud (SMC) and found that 209 Infrared Array Camera (IRAC) point sources within the footprint of the Surveying the Agents of Galaxy Evolution in the Small Magellanic Cloud (SAGE-SMC) Spitzer Legacy programme were targeted, within a total of 311 staring-mode observations. We classify these point sources using a decision tree method of object classification, based on infrared spectral features, continuum and spectral energy distribution shape, bolometric luminosity, cluster membership and variability information. We find 58 asymptotic giant branch (AGB) stars, 51 young stellar objects, 4 post-AGB objects, 22 red supergiants, 27 stars (of which 23 are dusty OB stars), 24 planetary nebulae (PNe), 10 Wolf-Rayet stars, 3 H II regions, 3 R Coronae Borealis stars, 1 Blue Supergiant and 6 other objects, including 2 foreground AGB stars. We use these classifications to evaluate the success of photometric classification methods reported in the literature.
Abstract:
Clusters of text documents output by clustering algorithms are often hard to interpret. We describe motivating real-world scenarios that necessitate reconfigurability and high interpretability of clusters and outline the problem of generating clusterings with interpretable and reconfigurable cluster models. We develop two clustering algorithms toward the outlined goal of building interpretable and reconfigurable cluster models. They generate clusters with associated rules that are composed of conditions on word occurrences or nonoccurrences. The proposed approaches vary in the complexity of the format of the rules; RGC employs disjunctions and conjunctions in rule generation, whereas RGC-D rules are simple disjunctions of conditions signifying the presence of various words. In both cases, each cluster comprises precisely the set of documents that satisfy the corresponding rule. Rules of the latter kind are easy to interpret, whereas the former leads to more accurate clustering. We show that our approaches outperform the unsupervised decision tree approach for rule-generating clustering and also an approach we provide for generating interpretable models for general clusterings, both by significant margins. We empirically show that the purity and F-measure losses incurred to achieve interpretability can be as little as 3% and 5%, respectively, using the algorithms presented herein.
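The statement that each cluster is exactly the set of documents satisfying its rule can be illustrated with a minimal sketch. It assumes documents are represented as sets of words and that an RGC-D-style rule is a set of words, any one of which triggers membership. The function names, documents and rules are hypothetical, and the sketch only applies given rules; it does not reproduce how the authors' algorithms generate them.

```python
def matches_disjunction(doc_words, rule_words):
    """True if the document contains at least one word of the rule."""
    return not rule_words.isdisjoint(doc_words)


def cluster_by_rules(docs, rules):
    """Map each cluster name to the ids of the documents satisfying its rule."""
    return {
        name: {
            doc_id
            for doc_id, words in docs.items()
            if matches_disjunction(words, rule_words)
        }
        for name, rule_words in rules.items()
    }


# Hypothetical toy data: document id -> set of words it contains.
docs = {
    "d1": {"tree", "forest", "leaf"},
    "d2": {"galaxy", "star", "dust"},
    "d3": {"star", "tree"},
}
# Hypothetical rules: cluster name -> words whose presence admits a document.
rules = {"botany": {"forest", "leaf"}, "astronomy": {"galaxy", "dust"}}

clusters = cluster_by_rules(docs, rules)
```

In this toy data, document d3 satisfies neither rule and so belongs to no cluster, which shows that membership is determined by the rule alone.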
Abstract:
Algorithms for concept drift handling are important for various applications, including video analysis and smart grids. In this paper we present a decision tree ensemble classification method based on the Random Forest algorithm for concept drift. A weighted majority voting ensemble aggregation rule is employed, based on the ideas of the Accuracy Weighted Ensemble (AWE) method. In our case, the weight of each base learner is computed for each sample evaluation using the base learner's accuracy and the intrinsic proximity measure of Random Forest. Our algorithm exploits both temporal weighting of samples and ensemble pruning as a forgetting strategy. We present the results of an empirical comparison of our method with the original random forest with incorporated replace-the-loser forgetting and other state-of-the-art concept-drift classifiers such as AWE2.
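The weighted majority voting rule mentioned in this abstract can be sketched generically as follows. This is a minimal illustration, assuming each base learner has already produced a label and a non-negative weight for the sample. How the paper combines accuracy and Random Forest proximity into that weight is not reproduced here, and the function name and example values are hypothetical.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting learners have the largest total weight."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical votes from three base learners for a single sample.
preds = ["drift", "stable", "stable"]
weights = [0.9, 0.3, 0.4]  # assumed per-sample learner weights

winner = weighted_majority_vote(preds, weights)
unweighted = weighted_majority_vote(preds, [1.0, 1.0, 1.0])
```

With these assumed weights, the single high-weight learner outvotes the two low-weight ones, whereas equal weights reduce the rule to plain majority voting.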
Abstract:
Background: Sepsis can lead to multiple organ failure and death. Timely and appropriate treatment can reduce in-hospital mortality and morbidity. Objectives: To determine the clinical effectiveness and cost-effectiveness of three tests [LightCycler SeptiFast Test MGRADE® (Roche Diagnostics, Risch-Rotkreuz, Switzerland); SepsiTest™ (Molzym Molecular Diagnostics, Bremen, Germany); and the IRIDICA BAC BSI assay (Abbott Diagnostics, Lake Forest, IL, USA)] for the rapid identification of bloodstream bacteria and fungi in patients with suspected sepsis compared with standard practice (blood culture with or without matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry). Data sources: Thirteen electronic databases (including MEDLINE, EMBASE and The Cochrane Library) were searched from January 2006 to May 2015 and supplemented by hand-searching relevant articles. Review methods: A systematic review and meta-analysis of effectiveness studies were conducted. A review of published economic analyses was undertaken and a de novo health economic model was constructed. A decision tree was used to estimate the costs and quality-adjusted life-years (QALYs) associated with each test; all other parameters were estimated from published sources. The model was populated with evidence from the systematic review or individual studies, if this was considered more appropriate (base case 1). In a secondary analysis, estimates (based on experience and opinion) from seven clinicians regarding the benefits of earlier test results were sought (base case 2). An NHS and Personal Social Services perspective was taken, and costs and benefits were discounted at 3.5% per annum. Scenario analyses were used to assess uncertainty. Results: For the review of diagnostic test accuracy, 62 studies of varying methodological quality were included. 
A meta-analysis of 54 studies comparing SeptiFast with blood culture found that SeptiFast had an estimated summary specificity of 0.86 [95% credible interval (CrI) 0.84 to 0.89] and sensitivity of 0.65 (95% CrI 0.60 to 0.71). Four studies comparing SepsiTest with blood culture found that SepsiTest had an estimated summary specificity of 0.86 (95% CrI 0.78 to 0.92) and sensitivity of 0.48 (95% CrI 0.21 to 0.74), and four studies comparing IRIDICA with blood culture found that IRIDICA had an estimated summary specificity of 0.84 (95% CrI 0.71 to 0.92) and sensitivity of 0.81 (95% CrI 0.69 to 0.90). Owing to the deficiencies in study quality for all interventions, diagnostic accuracy data should be treated with caution. No randomised clinical trial evidence was identified that indicated that any of the tests significantly improved key patient outcomes, such as mortality or duration in an intensive care unit or hospital. Base case 1 estimated that none of the three tests provided a benefit to patients compared with standard practice and thus all tests were dominated. In contrast, in base case 2 it was estimated that all cost per QALY-gained values were below £20,000; the IRIDICA BAC BSI assay had the highest estimated incremental net benefit, but results from base case 2 should be treated with caution as these are not evidence based. Limitations: Robust data to accurately assess the clinical effectiveness and cost-effectiveness of the interventions are currently unavailable. Conclusions: The clinical effectiveness and cost-effectiveness of the interventions cannot be reliably determined with the current evidence base. Appropriate studies, which allow information from the tests to be implemented in clinical practice, are required.
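The two economic quantities named in this abstract, discounting at 3.5% per annum and incremental net benefit at a £20,000 per QALY threshold, can be sketched with the standard textbook formulas. This is a hedged illustration, not the report's model: it assumes values accrue yearly with year 0 undiscounted, and the function names and all input numbers are hypothetical.

```python
def discounted_total(values_per_year, rate=0.035):
    """Present value of a yearly stream; year 0 is left undiscounted (assumption)."""
    return sum(v / (1 + rate) ** t for t, v in enumerate(values_per_year))


def incremental_net_benefit(delta_qalys, delta_cost, threshold=20_000):
    """Monetary net benefit of a test over the comparator at a given threshold."""
    return threshold * delta_qalys - delta_cost


# Hypothetical inputs, chosen only to show the arithmetic.
costs = discounted_total([100.0, 100.0])         # 100 now, 100 one year later
inb = incremental_net_benefit(0.01, 150.0)       # 0.01 extra QALYs, 150 extra cost
dominated_inb = incremental_net_benefit(0.0, 150.0)  # extra cost, no QALY gain
```

A test with extra cost and no QALY gain has a negative incremental net benefit at any positive threshold, which is the sense in which the tests are described as dominated in base case 1.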
Abstract:
Adaptive Multiple-Input Multiple-Output (MIMO) systems achieve a much higher information rate than conventional fixed schemes due to their ability to adapt their configurations according to the wireless communications environment. However, current adaptive MIMO detection schemes exhibit either low performance (and hence low spectral efficiency) or huge computational complexity. In particular, whilst deterministic Sphere Decoder (SD) detection schemes are well established for static MIMO systems, exhibiting a deterministic parallel structure, low computational complexity and quasi-ML detection performance, there are no corresponding adaptive schemes. This paper solves this problem, describing a hybrid tree-based adaptive modulation detection scheme. Fixed Complexity Sphere Decoding (FSD) and Real-Valued FSD (RFSD) are modified and combined into a hybrid scheme exploited at low and medium SNR to provide the highest possible information rate with quasi-ML Bit Error Rate (BER) performance, while Reduced Complexity RFSD, BChase and Decision Feedback (DFE) schemes are exploited in the high-SNR regions. This algorithm provides the facility to balance detection complexity against BER performance, with a compatible information rate, in dynamic, adaptive MIMO communications environments.