878 results for Classification, Decimal
Abstract:
This paper is concerned with the use of a genetic algorithm to select financial ratios for corporate distress classification models. For this purpose, the fitness value associated with a set of ratios is made to reflect the requirements of maximizing the amount of information available for the model and minimizing the collinearity between the model inputs. A case study involving 60 failed and continuing British firms in the period 1997-2000 is used for illustration. The classification model based on ratios selected by the genetic algorithm compares favorably with a model employing ratios usually found in the financial distress literature.
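The abstract does not give the fitness formula, so the following is only a minimal sketch of how such a trade-off could be scored. The information proxy (summed absolute correlation with the failed/continuing label), the collinearity proxy (mean absolute pairwise correlation between selected ratios), and the `penalty` weight are all assumptions made for illustration, not the authors' definitions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def fitness(mask, ratios, labels, penalty=1.0):
    """Score a candidate subset of ratios (hypothetical fitness, see lead-in).

    mask    -- list of 0/1 flags, one per ratio (the GA chromosome)
    ratios  -- list of columns; ratios[j][i] is the value of ratio j for firm i
    labels  -- 0/1 per firm (continuing / failed)
    penalty -- assumed weight on the collinearity term
    """
    chosen = [col for flag, col in zip(mask, ratios) if flag]
    if not chosen:
        return float("-inf")  # an empty subset carries no information
    # Information proxy (assumption): summed |correlation| with the class label.
    information = sum(abs(pearson(col, labels)) for col in chosen)
    # Collinearity proxy (assumption): mean |correlation| over chosen pairs.
    pairs = list(combinations(chosen, 2))
    collinearity = (
        sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    )
    return information - penalty * collinearity
```

A genetic algorithm would evaluate `fitness` on each chromosome in the population and favour high-scoring masks during selection. Under this scoring, adding a ratio that merely duplicates one already selected contributes less than adding an equally informative ratio that is uncorrelated with it.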
Abstract:
Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multifactorial studies with complex experimental designs are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for bioinformatics.
Abstract:
Deep Brain Stimulation has been used both to study and to treat Parkinson’s Disease (PD) tremor symptoms since the 1980s. In the research reported here we have carried out a comparative analysis to classify tremor onset based on intraoperative microelectrode recordings of a PD patient’s brain Local Field Potential (LFP) signals. In particular, we compared the performance of a Support Vector Machine (SVM) with two well-known artificial neural network classifiers, namely a Multilayer Perceptron (MLP) and a Radial Basis Function Network (RBN). The results show that in this study, which used PD data specifically, the SVM provided a better overall classification rate, achieving an accuracy of 81%.
Abstract:
Obesity is a key factor in the development of the metabolic syndrome (MetS), which is associated with increased cardiometabolic risk. We investigated whether obesity classification by body mass index (BMI) and body fat percentage (BF%) influences cardiometabolic profile and dietary responsiveness in 486 MetS subjects (LIPGENE dietary intervention study). Anthropometric measures, markers of inflammation and glucose metabolism, lipid profiles, adhesion molecules and haemostatic factors were determined at baseline and after 12 weeks of 4 dietary interventions (high saturated fat (SFA), high monounsaturated fat (MUFA) and 2 low fat high complex carbohydrate (LFHCC) diets, 1 supplemented with long chain n-3 polyunsaturated fatty acids (LC n-3 PUFAs)). Of the subjects classified as normal weight and overweight by BMI, 39% and 87%, respectively, were obese according to their BF%. Individuals classified as obese by BMI (≥ 30 kg/m²) and BF% (≥ 25% (men) and ≥ 35% (women)) (OO, n = 284) had larger waist and hip measurements, higher BMI and were heavier (P < 0.001) than those classified as non-obese by BMI but obese by BF% (NOO, n = 92). OO individuals displayed a more pro-inflammatory (higher C reactive protein (CRP) and leptin), pro-thrombotic (higher plasminogen activator inhibitor-1 (PAI-1)), pro-atherogenic (higher leptin/adiponectin ratio) and more insulin resistant (higher HOMA-IR) metabolic profile relative to the NOO group (P < 0.001). Interestingly, tumour necrosis factor alpha (TNF-α) concentrations were lower post-intervention in NOO individuals compared to OO subjects (P < 0.001). In conclusion, assessing BF% and BMI together as part of a metabotype may identify individuals at greater cardiometabolic risk more effectively than BMI alone.
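The two-criterion grouping described above can be sketched as a small function. This is an illustrative reading, assuming the stated cut-offs are "at or above" thresholds (BMI of 30 kg/m2; BF% of 25 for men and 35 for women). The function name, the `sex` encoding, and the `"other"` label for subjects outside the two named groups are hypothetical.

```python
def classify_obesity(bmi, body_fat_pct, sex):
    """Return 'OO', 'NOO', or 'other' for one subject.

    OO  -- obese by both BMI and body fat percentage
    NOO -- non-obese by BMI but obese by body fat percentage
    'other' is a hypothetical label for everyone else.
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    # Cut-offs taken from the abstract, read as inclusive lower bounds (assumption).
    bf_cutoff = 25.0 if sex == "male" else 35.0
    obese_by_bmi = bmi >= 30.0
    obese_by_bf = body_fat_pct >= bf_cutoff
    if obese_by_bmi and obese_by_bf:
        return "OO"
    if obese_by_bf:
        return "NOO"
    return "other"
```

For example, a man with a BMI of 27 kg/m2 and 28% body fat would be overweight by BMI alone but falls into the NOO group here, which is the kind of case the abstract argues BMI by itself misses.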
Abstract:
This paper reviews the ways that quality can be assessed in standing waters, a subject that has hitherto attracted little attention but which is now a legal requirement in Europe. It describes a scheme for the assessment and monitoring of water and ecological quality in standing waters greater than about 1 ha in area in England & Wales, although it is generally relevant to North-west Europe. Thirteen hydrological, chemical and biological variables are used to characterise the standing water body in any current sampling. These are lake volume, maximum depth, conductivity, Secchi disc transparency, pH, total alkalinity, calcium ion concentration, total N concentration, winter total oxidised inorganic nitrogen (effectively nitrate) concentration, total P concentration, potential maximum chlorophyll a concentration, a score based on the nature of the submerged and emergent plant community, and the presence or absence of a fish community. Inter alia, these variables are key indicators of the state of eutrophication, acidification, salinisation and infilling of a water body.
Abstract:
In a world where massive amounts of data are recorded on a large scale, we need data mining technologies to gain knowledge from the data in a reasonable time. The Top Down Induction of Decision Trees (TDIDT) algorithm is a very widely used technology to predict the classification of newly recorded data. However, alternative technologies have been derived that often produce better rules but do not scale well on large datasets. One such alternative to TDIDT is the PrismTCS algorithm. PrismTCS performs particularly well on noisy data but does not scale well on large datasets. In this paper we introduce Prism and investigate its scaling behaviour. We describe how we improved the scalability of the serial version of Prism and investigate its limitations. We then describe our work to overcome these limitations by developing a framework to parallelise algorithms of the Prism family and similar algorithms. We also present the scale-up results of a first prototype implementation.
Abstract:
The Distributed Rule Induction (DRI) project at the University of Portsmouth is concerned with distributed data mining algorithms for automatically generating rules of all kinds. In this paper we present a system architecture and its implementation for inducing modular classification rules in parallel in a local area network using a distributed blackboard system. We present initial results of a prototype implementation based on the Prism algorithm.
Abstract:
Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unseen data. Alternative algorithms have been developed, such as the Prism algorithm. Prism constructs modular rules that are qualitatively better than those induced by TDIDT. However, along with the increasing size of databases, many existing rule learning algorithms have proved to be computationally expensive on large datasets. To tackle the problem of scalability, parallel classification rule induction algorithms have been introduced. As TDIDT is the most popular classifier, even though there are strongly competitive alternative algorithms, most parallel approaches to inducing classification rules are based on TDIDT. In this paper we describe work on a distributed classifier that induces classification rules in a parallel manner based on Prism.
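For readers unfamiliar with Prism, the following is a minimal serial sketch of the basic covering idea for categorical attributes: for each class, repeatedly grow one rule by adding the attribute-value test with the highest proportion of target-class instances, then remove the instances that rule covers. It is not the distributed classifier the abstract describes, and the data layout (a list of dicts with a `class` key), the tie-breaking, and the handling of clashes are assumptions made for illustration.

```python
def prism(instances, class_key="class"):
    """Induce modular rules from a list of dicts with categorical attributes.

    Returns a list of (conditions, label) pairs, where conditions maps
    attribute -> required value.
    """
    attributes = [k for k in instances[0] if k != class_key]
    rules = []
    for label in sorted({row[class_key] for row in instances}):
        remaining = list(instances)  # each class starts from the full data
        while any(row[class_key] == label for row in remaining):
            conditions = {}
            covered = remaining
            # Specialise until the rule covers only `label`, or attributes run
            # out (a clash; this sketch simply emits the impure rule).
            while (any(row[class_key] != label for row in covered)
                   and len(conditions) < len(attributes)):
                best = None  # (precision, positives, attribute, value)
                for attr in attributes:
                    if attr in conditions:
                        continue
                    for value in sorted({row[attr] for row in covered}):
                        subset = [r for r in covered if r[attr] == value]
                        positives = sum(r[class_key] == label for r in subset)
                        score = (positives / len(subset), positives)
                        # Ties keep the first candidate seen (assumed rule).
                        if best is None or score > best[:2]:
                            best = (*score, attr, value)
                _, _, attr, value = best
                conditions[attr] = value
                covered = [r for r in covered if r[attr] == value]
            rules.append((conditions, label))
            # Drop every instance the new rule covers before inducing the next.
            remaining = [
                r for r in remaining
                if not all(r[a] == v for a, v in conditions.items())
            ]
    return rules
```

Because each rule is grown independently and may test different attributes, the rules do not have to fit into a single tree. The repeated scans over `covered` for every candidate test also hint at why the cost grows quickly on large datasets.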
Abstract:
Induction of classification rules is one of the most important technologies in data mining. Most of the work in this field has concentrated on the Top Down Induction of Decision Trees (TDIDT) approach. However, alternative approaches have been developed such as the Prism algorithm for inducing modular rules. Prism often produces qualitatively better rules than TDIDT but suffers from higher computational requirements. We investigate approaches that have been developed to minimize the computational requirements of TDIDT, in order to find analogous approaches that could reduce the computational requirements of Prism.