59 results for Associative Classifiers
Abstract:
Support vector machines (SVMs), though accurate, are not preferred in applications requiring high classification speed or when deployed in systems with limited computational resources, due to the large number of support vectors involved in the model. To overcome this problem we have devised a primal SVM method with the following properties: (1) it solves for the SVM representation without the need to invoke the representer theorem, (2) forward and backward selections are combined to approach the final globally optimal solution, and (3) a criterion is introduced for identification of support vectors leading to a much reduced support vector set. In addition to introducing this method, the paper analyzes the complexity of the algorithm and presents test results on three public benchmark problems and a human activity recognition application. These applications demonstrate the effectiveness and efficiency of the proposed algorithm.
--------------------------------------------------------------------------------
Abstract:
Across a range of domains in psychology different theories assume different mental representations of knowledge. For example, in the literature on category-based inductive reasoning, certain theories (e.g., Rogers & McClelland, 2004; Sloutsky & Fisher, 2008) assume that the knowledge upon which inductive inferences are based is associative, whereas others (e.g., Heit & Rubinstein, 1994; Kemp & Tenenbaum, 2009; Osherson, Smith, Wilkie, López, & Shafir, 1990) assume that knowledge is structured. In this article we investigate whether associative and structured knowledge underlie inductive reasoning to different degrees under different processing conditions. We develop a measure of knowledge about the degree of association between categories and show that it dissociates from measures of structured knowledge. In Experiment 1 participants rated the strength of inductive arguments whose categories were either taxonomically or causally related. A measure of associative strength predicted reasoning when people had to respond fast, whereas causal and taxonomic knowledge explained inference strength when people responded slowly. In Experiment 2, we also manipulated whether the causal link between the categories was predictive or diagnostic. Participants preferred predictive to diagnostic arguments except when they responded under cognitive load. In Experiment 3, using an open-ended induction paradigm, people generated and evaluated their own conclusion categories. Inductive strength was predicted by associative strength under heavy cognitive load, whereas an index of structured knowledge was more predictive of inductive strength under minimal cognitive load. Together these results suggest that associative and structured models of reasoning apply best under different processing conditions and that the application of structured knowledge in reasoning is often effortful.
Abstract:
Mobile malware has continued to grow at an alarming rate despite ongoing mitigation efforts. This has been much more prevalent on Android due to being an open platform that is rapidly overtaking other competing platforms in the mobile smart devices market. Recently, a new generation of Android malware families has emerged with advanced evasion capabilities which make them much more difficult to detect using conventional methods. This paper proposes and investigates a parallel machine learning based classification approach for early detection of Android malware. Using real malware samples and benign applications, a composite classification model is developed from a parallel combination of heterogeneous classifiers. The empirical evaluation of the model under different combination schemes demonstrates its efficacy and potential to improve detection accuracy. More importantly, by utilizing several classifiers with diverse characteristics, their strengths can be harnessed not only for enhanced Android malware detection but also quicker white box analysis by means of the more interpretable constituent classifiers.
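As an illustration of what a parallel combination of classifiers can look like, the sketch below merges the labels produced by several constituent classifiers with a simple majority vote. The abstract does not say which combination schemes were evaluated, so majority voting, the function name `majority_vote`, and the example labels are all assumptions made for this example.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine labels from several classifiers by simple majority vote.

    `predictions` holds one label per constituent classifier.
    Ties go to the tied label that appears first in the list.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical outputs of three heterogeneous classifiers for one app.
votes = ["malware", "benign", "malware"]
combined = majority_vote(votes)
```

Weighted voting or a rule that trusts the most interpretable classifier first would be equally plausible schemes; the vote above is only the simplest one to state.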
Abstract:
In this paper a multiple classifier machine learning methodology for Predictive Maintenance (PdM) is presented. PdM is a prominent strategy for dealing with maintenance issues given the increasing need to minimize downtime and associated costs. One of the challenges with PdM is generating so-called ‘health factors’, or quantitative indicators of the status of a system associated with a given maintenance issue, and determining their relationship to operating costs and failure risk. The proposed PdM methodology allows dynamical decision rules to be adopted for maintenance management and can be used with high-dimensional and censored data problems. This is achieved by training multiple classification modules with different prediction horizons to provide different performance trade-offs in terms of frequency of unexpected breaks and unexploited lifetime and then employing this information in an operating cost based maintenance decision system to minimise expected costs. The effectiveness of the methodology is demonstrated using a simulated example and a benchmark semiconductor manufacturing maintenance problem.
Abstract:
Many types of non-invasive brain stimulation alter corticospinal excitability (CSE). Paired associative stimulation (PAS) has attracted particular attention as its effects ostensibly adhere to Hebbian principles of neural plasticity. In prototypical form, a single electrical stimulus is directed to a peripheral nerve in close temporal contiguity with transcranial magnetic stimulation delivered to the contralateral primary motor cortex (M1). Repeated pairing of the two discrete stimulus events (i.e. association) over an extended period either increases or decreases the excitability of corticospinal projections from M1, contingent on the interstimulus interval. We studied a novel form of associative stimulation, consisting of brief trains of peripheral afferent stimulation paired with short bursts of high frequency (≥80 Hz) transcranial alternating current stimulation (tACS) over contralateral M1. Elevations in the excitability of corticospinal projections to the forearm were observed for a range of tACS frequency (80, 140 and 250 Hz), current (1, 2 and 3 mA) and duration (500 and 1000 ms) parameters. The effects were at least as reliable as those brought about by PAS or transcranial direct current stimulation. When paired with tACS, muscle tendon vibration also induced elevations of CSE. No such changes were brought about by the tACS or peripheral afferent stimulation alone. In demonstrating that associative effects are expressed when the timing of the peripheral and cortical events is not precisely circumscribed, these findings suggest that multiple cellular pathways may contribute to a long term potentiation-type response. Their relative contributions will differ depending on the nature of the induction protocol that is used.
Abstract:
Paired Associative Stimulation (PAS) has come to prominence as a potential therapeutic intervention for the treatment of brain injury/disease, and as an experimental method with which to investigate Hebbian principles of neural plasticity in humans. Prototypically, a single electrical stimulus is directed to a peripheral nerve in advance of transcranial magnetic stimulation (TMS) delivered to the contralateral primary motor cortex (M1). Repeated pairing of the stimuli (i.e., association) over an extended period may increase or decrease the excitability of corticospinal projections from M1, in a manner that depends on the interstimulus interval (ISI). It has been suggested that these effects represent a form of associative long-term potentiation (LTP) and depression (LTD) that bears resemblance to spike-timing dependent plasticity (STDP) as it has been elaborated in animal models. With a large body of empirical evidence having emerged since the cardinal features of PAS were first described, and in light of the variations from the original protocols that have been implemented, it is opportune to consider whether the phenomenology of PAS remains consistent with the characteristic features that were initially disclosed. This assessment necessarily has bearing upon interpretation of the effects of PAS in relation to the specific cellular pathways that are putatively engaged, including those that adhere to the rules of STDP. The balance of evidence suggests that the mechanisms that contribute to the LTP- and LTD-type responses to PAS differ depending on the precise nature of the induction protocol that is used. In addition to emphasizing the requirement for additional explanatory models, in the present analysis we highlight the key features of the PAS phenomenology that require interpretation.
Abstract:
The introduction of predictive molecular markers has radically enhanced the identification of which patients may benefit from a given treatment. Despite recent controversies, KRAS mutation is currently the most recognized molecular predictive marker in colorectal cancer (CRC), predicting efficacy of anti-epidermal growth factor receptor (anti-EGFR) antibodies. However, other relevant markers have been reported and claimed to identify patients that will benefit from anti-EGFR therapies. This group of markers includes BRAF mutations, PIK3CA mutations, and loss of PTEN expression. Similarly, molecular markers for cytotoxic agents' efficacy also may predict outcome in patients with CRC. This review aims to summarize the most important predictive molecular classifiers in patients with CRC and further discuss any inconsistent or conflicting findings for these molecular classifiers.
Abstract:
In this study, we introduce an original distance definition for graphs, called the Markov-inverse-F measure (MiF). This measure enables the integration of classical graph theory indices with new knowledge pertaining to structural feature extraction from semantic networks. MiF improves the conventional Jaccard and/or Simpson indices, and reconciles both the geodesic information (random walk) and co-occurrence adjustment (degree balance and distribution). We measure the effectiveness of graph-based coefficients by applying linguistic graph information to neural activity recorded during conceptual processing in the human brain. Specifically, the MiF distance is computed between each of the nouns used in a previous neural experiment and each of the in-between words in a subgraph derived from the Edinburgh Word Association Thesaurus of English. From the MiF-based information matrix, a machine learning model can accurately obtain a scalar parameter that specifies the degree to which each voxel in (the MRI image of) the brain is activated by each word or each principal component of the intermediate semantic features. Furthermore, correlating the voxel information with the MiF-based principal components, a new computational neurolinguistics model with a network connectivity paradigm is created. This allows two dimensions of context space to be incorporated with both semantic and neural distributional representations.
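For orientation, the two conventional indices that MiF is said to improve on can be computed from the neighbor sets of two nodes, as sketched below. This is not the MiF measure, whose definition the abstract does not give. The Simpson index is taken here in its overlap-coefficient form, and the word sets are invented for the example.

```python
def jaccard_index(a, b):
    """Jaccard index: |A ∩ B| / |A ∪ B| (0.0 if both sets are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def simpson_index(a, b):
    """Simpson (overlap) index: |A ∩ B| / min(|A|, |B|) (0.0 if either is empty)."""
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0


# Hypothetical neighbor sets of two words in a word-association graph.
neighbors_dog = {"cat", "bone", "bark", "pet"}
neighbors_cat = {"dog", "pet", "milk"}

jaccard = jaccard_index(neighbors_dog, neighbors_cat)  # 1 shared of 6 total
simpson = simpson_index(neighbors_dog, neighbors_cat)  # 1 shared of 3 (smaller set)
```

Both indices use only direct co-occurrence of neighbors, which is the limitation the abstract says MiF addresses by also using random-walk and degree information.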
Abstract:
Feature selection and feature weighting are useful techniques for improving the classification accuracy of the K-nearest-neighbor (K-NN) rule. The term feature selection refers to algorithms that select the best subset of the input feature set. In feature weighting, each feature is multiplied by a weight value proportional to the ability of the feature to distinguish pattern classes. In this paper, a novel hybrid approach is proposed for simultaneous feature selection and feature weighting for the K-NN rule, based on the Tabu Search (TS) heuristic. The proposed TS heuristic in combination with the K-NN classifier is compared with several classifiers on various available data sets. The results indicate a significant improvement in classification accuracy. The proposed TS heuristic is also compared with various feature selection algorithms. The experiments revealed that the proposed hybrid TS heuristic is superior to both simple TS and sequential search algorithms. We also present results for the classification of prostate cancer using multispectral images, an important problem in biomedicine.
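The effect of feature weighting on the K-NN rule can be sketched as follows: each feature difference is scaled by its weight before the Euclidean distance is taken, and a weight of zero removes the feature, which is how selection and weighting can be handled together. The Tabu Search that would find the weights is not shown. The weights, the data, and the function names below are hypothetical and chosen by hand for illustration.

```python
import math
from collections import Counter


def weighted_distance(x, y, weights):
    """Euclidean distance after multiplying each feature by its weight."""
    return math.sqrt(sum((w * (xi - yi)) ** 2 for xi, yi, w in zip(x, y, weights)))


def knn_predict(train, query, weights, k=3):
    """Predict the majority label among the k nearest training points.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(
        train, key=lambda item: weighted_distance(item[0], query, weights)
    )[:k]
    labels = [label for _, label in nearest]
    return Counter(labels).most_common(1)[0][0]


# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
train = [
    ((1.0, 100.0), "a"),
    ((1.2, -50.0), "a"),
    ((5.0, 90.0), "b"),
    ((5.1, -60.0), "b"),
]
query = (1.1, -55.0)

with_noise = knn_predict(train, query, weights=(1.0, 1.0))     # noisy feature dominates
without_noise = knn_predict(train, query, weights=(1.0, 0.0))  # noisy feature deselected
```

With equal weights the noisy second feature dominates the distance and the query is labeled "b"; setting its weight to zero recovers "a". A search heuristic such as TS would explore weight vectors like these and keep the one with the best classification accuracy.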
Abstract:
The multiplicative spectrum of a complex Banach space X is the class K(X) of all (automatically compact and Hausdorff) topological spaces appearing as spectra of Banach algebras (X,*) for all possible continuous multiplications on X turning X into a commutative associative complex algebra with the unity. The properties of the multiplicative spectrum are studied. In particular, we show that K(X^n) consists of countable compact spaces with at most n non-isolated points for any separable hereditarily indecomposable Banach space X. We prove that K(C[0,1]) coincides with the class of all metrizable compact spaces.
Abstract:
Support vector machine (SVM) is a powerful technique for data classification. Despite its good theoretical foundations and high classification accuracy, the standard SVM is not suitable for classification of large data sets, because the training complexity of SVM is highly dependent on the size of the data set. This paper presents a novel SVM classification approach for large data sets by using minimum enclosing ball clustering. After the training data are partitioned by the proposed clustering method, the centers of the clusters are used for the first-stage SVM classification. Then we use the clusters whose centers are support vectors, or those clusters which contain different classes, to perform the second-stage SVM classification. In this stage most data are removed. Several experimental results show that the approach proposed in this paper has good classification accuracy compared with classic SVM while the training is significantly faster than several other SVM classifiers.
Abstract:
A study was performed to determine if targeted metabolic profiling of cattle sera could be used to establish a predictive tool for identifying hormone misuse in cattle. Metabolites were assayed in heifers (n = 5) treated with nortestosterone decanoate (0.85 mg/kg body weight), untreated heifers (n = 5), steers (n = 5) treated with oestradiol benzoate (0.15 mg/kg body weight) and untreated steers (n = 5). Treatments were administered on days 0, 14, and 28 throughout a 42 day study period. Two support vector machines (SVMs) were trained, respectively, from heifer and steer data to identify hormone-treated animals. Performance of both SVM classifiers was evaluated by sensitivity and specificity of treatment prediction. The SVM trained on steer data achieved 97.33% sensitivity and 93.85% specificity while the one on heifer data achieved 94.67% sensitivity and 87.69% specificity. Solutions of SVM classifiers were further exploited to determine those days when classification accuracy of the SVM was most reliable. For heifers and steers, days 17-35 were determined to be the most selective. In summary, bioinformatics applied to targeted metabolic profiles generated from standard clinical chemistry analyses has yielded an accurate, inexpensive, high-throughput test for predicting steroid abuse in cattle.
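The two performance figures quoted above follow from the confusion-matrix counts: sensitivity is the fraction of treated animals correctly flagged, and specificity is the fraction of untreated animals correctly cleared. The sketch below computes both. The labels are invented and are not the study's data.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    A value is NaN when its denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical labels: 1 = hormone-treated, 0 = untreated.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```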
Abstract:
Logistic regression and Gaussian mixture model (GMM) classifiers have been trained to estimate the probability of acute myocardial infarction (AMI) in patients based upon the concentrations of a panel of cardiac markers. The panel consists of two new markers, fatty acid binding protein (FABP) and glycogen phosphorylase BB (GPBB), in addition to the traditional cardiac troponin I (cTnI), creatine kinase MB (CKMB) and myoglobin. The effect of using principal component analysis (PCA) and Fisher discriminant analysis (FDA) to preprocess the marker concentrations was also investigated. The need for classifiers to give an accurate estimate of the probability of AMI is argued and three categories of performance measure are described, namely discriminatory ability, sharpness, and reliability. Numerical performance measures for each category are given and applied. The optimum classifier, based solely upon the samples taken on admission, was the logistic regression classifier using FDA preprocessing. This gave an accuracy of 0.85 (95% confidence interval: 0.78-0.91) and a normalised Brier score of 0.89. When samples at both admission and a further time, 1-6 h later, were included, the performance increased significantly, showing that logistic regression classifiers can indeed use the information from the five cardiac markers to accurately and reliably estimate the probability of AMI. © Springer-Verlag London Limited 2008.
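To show how the quality of a probability estimate can be scored, the sketch below computes the standard Brier score, the mean squared difference between predicted probabilities and observed 0/1 outcomes (lower is better). The abstract does not define its normalisation, so the normalised score it reports is not reproduced here. The probabilities and outcomes are invented.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Hypothetical predicted AMI probabilities and observed outcomes (1 = AMI).
predicted = [0.9, 0.2, 0.7, 0.1]
observed = [1, 0, 1, 0]
score = brier_score(predicted, observed)
```

A classifier that always outputs 0.5 scores 0.25 on any data set, which gives a simple baseline for judging a value like the one computed here.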