66 results for Supervised classifiers
Abstract:
In many domains, when several competing classifiers are available, we want to combine all or some of them through a combination function to obtain a more accurate classifier. In this paper we propose a ‘class-indifferent’ method for combining classifier decisions represented by evidential structures called triplet and quartet, using Dempster's rule of combination. This method is unique in that it distinguishes important elements from trivial ones in representing classifier decisions, uses more information than other methods in calculating the support for class labels, and provides a practical way to apply the theoretically appealing Dempster–Shafer theory of evidence to the problem of ensemble learning. We present a formalism for modelling classifier decisions as triplet mass functions and establish a range of formulae for combining these mass functions to arrive at a consensus decision. In addition, we carry out a comparative study with the alternative simplet and dichotomous structures and compare two combination methods, Dempster's rule and majority voting, over the UCI benchmark data to demonstrate the advantage our approach offers. (A continuation of the work in this area that was published in IEEE Trans on KDE, and conferences)
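As a rough illustration of the combination step named in this abstract, the sketch below applies the standard form of Dempster's rule to two mass functions. It is not the paper's own formulae: the function name `dempster_combine`, the three-class frame, and the mass values are hypothetical, and the sketch assumes that a triplet holds a classifier's two best-supported classes plus the whole frame of discernment.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with the standard Dempster's rule.

    Each mass function maps a frozenset of class labels (a focal element)
    to its mass; the masses of one function are expected to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical frame and triplet-style mass functions (assumed values).
FRAME = frozenset({"a", "b", "c"})
m1 = {frozenset({"a"}): 0.6, frozenset({"b"}): 0.3, FRAME: 0.1}
m2 = {frozenset({"a"}): 0.5, frozenset({"c"}): 0.3, FRAME: 0.2}

fused = dempster_combine(m1, m2)
best = max(fused, key=fused.get)
print(sorted(best), round(fused[best], 4))
```

The mass left on the whole frame represents each classifier's residual uncertainty, which is why evidence for a class that only one classifier ranks is kept after fusion rather than discarded.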
Abstract:
The diagnosis of myelodysplastic syndrome (MDS) currently relies primarily on the morphologic assessment of the patient's bone marrow and peripheral blood cells. Moreover, prognostic scoring systems rely on observer-dependent assessments of blast percentage and dysplasia. Gene expression profiling could enhance current diagnostic and prognostic systems by providing a set of standardized, objective gene signatures. Within the Microarray Innovations in LEukemia study, a diagnostic classification model was investigated to distinguish the distinct subclasses of pediatric and adult leukemia, as well as MDS. Overall, the accuracy of the diagnostic classification model for subtyping leukemia was approximately 93%, but this was not matched for the MDS samples, which reached only approximately 50% accuracy. Discordant samples of MDS were classified either into acute myeloid leukemia (AML) or
Abstract:
This paper presents a lookup circuit with advanced memory techniques and algorithms that examines network packet headers at high throughput rates. Hardware solutions and test scenarios are introduced to evaluate the proposed approach. The experimental results show that the proposed lookup circuit achieves at least 39 million packet header lookups per second, which facilitates the application of next-generation stateful packet classification at internet traffic throughput rates beyond 20 Gbps.
Abstract:
Support vector machines (SVMs), though accurate, are not preferred in applications requiring high classification speed or when deployed in systems with limited computational resources, because of the large number of support vectors involved in the model. To overcome this problem we have devised a primal SVM method with the following properties: (1) it solves for the SVM representation without the need to invoke the representer theorem, (2) forward and backward selections are combined to approach the final globally optimal solution, and (3) a criterion is introduced for identification of support vectors, leading to a much reduced support vector set. In addition to introducing this method, the paper analyzes the complexity of the algorithm and presents test results on three public benchmark problems and a human activity recognition application. These applications demonstrate the effectiveness and efficiency of the proposed algorithm.
Abstract:
Mobile malware has continued to grow at an alarming rate despite ongoing mitigation efforts. It is much more prevalent on Android, an open platform that is rapidly overtaking competing platforms in the mobile smart device market. Recently, a new generation of Android malware families has emerged with advanced evasion capabilities that make them much more difficult to detect using conventional methods. This paper proposes and investigates a parallel machine-learning-based classification approach for early detection of Android malware. Using real malware samples and benign applications, a composite classification model is developed from a parallel combination of heterogeneous classifiers. The empirical evaluation of the model under different combination schemes demonstrates its efficacy and potential to improve detection accuracy. More importantly, by utilizing several classifiers with diverse characteristics, their strengths can be harnessed not only for enhanced Android malware detection but also for quicker white-box analysis by means of the more interpretable constituent classifiers.
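The abstract does not name the combination schemes it evaluates, so the sketch below only illustrates two common ways of fusing the outputs of classifiers that run in parallel: majority voting over predicted labels and averaging of predicted probabilities. The function names, the labels "malware" and "benign", and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by most classifiers.

    Ties are resolved in favour of the label that appears first in
    `labels`, because Counter.most_common keeps first-seen order for
    equal counts.
    """
    if not labels:
        raise ValueError("at least one prediction is required")
    return Counter(labels).most_common(1)[0][0]


def average_probability(malware_probs, threshold=0.5):
    """Flag an app as malware when the mean predicted probability
    across classifiers reaches the (assumed) threshold."""
    if not malware_probs:
        raise ValueError("at least one probability is required")
    mean = sum(malware_probs) / len(malware_probs)
    return "malware" if mean >= threshold else "benign"


# Hypothetical outputs from three heterogeneous classifiers for one app.
votes = ["malware", "benign", "malware"]
probs = [0.9, 0.4, 0.7]

print(majority_vote(votes))
print(average_probability(probs))
```

Keeping each constituent classifier's individual output available alongside the fused decision is what would let an analyst inspect the more interpretable models, as the abstract suggests.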
Abstract:
Many modeling problems require estimating a scalar output from one or more time series. Such problems are usually tackled by extracting a fixed number of features from the time series (such as their statistical moments), with a consequent loss of information that leads to suboptimal predictive models. Moreover, feature extraction techniques usually make assumptions that are not met in real-world settings (e.g. uniformly sampled time series of constant length), and fail to deliver a thorough methodology for dealing with noisy data. In this paper a methodology based on functional learning is proposed to overcome the aforementioned problems; the proposed Supervised Aggregative Feature Extraction (SAFE) approach makes it possible to derive continuous, smooth estimates of time series data (yielding aggregate local information), while simultaneously estimating a continuous shape function yielding optimal predictions. The SAFE paradigm enjoys several properties, such as a closed-form solution, incorporation of first- and second-order derivative information into the regressor matrix, interpretability of the generated functional predictor, and the possibility of exploiting a Reproducing Kernel Hilbert Space setting to yield nonlinear predictive models. Simulation studies are provided to highlight the strengths of the new methodology with respect to standard unsupervised feature selection approaches. © 2012 IEEE.
Abstract:
In this paper a multiple classifier machine learning methodology for Predictive Maintenance (PdM) is presented. PdM is a prominent strategy for dealing with maintenance issues given the increasing need to minimize downtime and associated costs. One of the challenges with PdM is generating so-called ‘health factors’, or quantitative indicators of the status of a system associated with a given maintenance issue, and determining their relationship to operating costs and failure risk. The proposed PdM methodology allows dynamic decision rules to be adopted for maintenance management and can be used with high-dimensional and censored data problems. This is achieved by training multiple classification modules with different prediction horizons to provide different performance trade-offs in terms of frequency of unexpected breaks and unexploited lifetime, and then employing this information in an operating-cost-based maintenance decision system to minimise expected costs. The effectiveness of the methodology is demonstrated using a simulated example and a benchmark semiconductor manufacturing maintenance problem.
Abstract:
The effectiveness of brief/minimal contact self-activation interventions that encourage participation in physical activity (PA) for chronic low back pain (CLBP >12 weeks) is unproven. The primary objective of this assessor-blinded randomized controlled trial was to investigate the difference between an individualized walking programme (WP), group exercise class (EC), and usual physiotherapy (UP, control) in mean change in functional disability at 6 months. A sample of 246 participants with CLBP aged 18 to 65 years (79 men and 167 women; mean age ± SD: 45.4 ± 11.4 years) was recruited from 5 outpatient physiotherapy departments in Dublin, Ireland. Consenting participants completed self-report measures of functional disability, pain, quality of life, psychosocial beliefs, and PA, were randomly allocated to the WP (n = 82), EC (n = 83), or UP (n = 81), and were followed up at 3 (81%; n = 200), 6 (80.1%; n = 197), and 12 months (76.4%; n = 188). Cost diaries were completed at all follow-ups. An intention-to-treat analysis using a mixed between-within repeated-measures analysis of covariance found significant improvements over time on the Oswestry Disability Index (Primary Outcome), the Numerical Rating Scale, Fear Avoidance-PA scale, and the EuroQol EQ-5D-3L Weighted Health Index (P < 0.05), but no significant between-group differences and small between-group effect sizes (WP: mean difference at 6 months, 6.89 Oswestry Disability Index points, 95% confidence interval [CI] -3.64 to -10.15; EC: -5.91, CI: -2.68 to -9.15; UP: -5.09, CI: -1.93 to -8.24). The WP had the lowest mean costs and the highest level of adherence. Supervised walking provides an effective alternative to current forms of CLBP management.