Digital Library

848 results for JEL Classification Q5

Identifying problematic classes in text classification

Abstract:

Real-world text classification tasks often suffer from poor class structure with many overlapping classes and blurred boundaries. Training data pooled from multiple sources tend to be inconsistent and contain erroneous labelling, leading to poor performance of standard text classifiers. The classification of health service products to specialized procurement classes is used to examine and quantify the extent of these problems. A novel method is presented to analyze the labelled data by selectively merging classes where there is not enough information for the classifier to distinguish them. Initial results show the method can identify the most problematic classes, which can be used either as a focus to improve the training data or to merge classes to increase confidence in the predicted results of the classifier.
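The abstract above does not say which criterion decides that two classes cannot be told apart. As one possible illustration only, the sketch below assumes the criterion is mutual confusion: it counts how often a classifier swaps two labels and proposes the most-confused pair for merging. The function names and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def most_confused_pair(y_true, y_pred):
    """Return the pair of classes the classifier swaps most often.

    Assumed criterion (not from the paper): the number of items of either
    class predicted as the other, divided by the total number of items
    in the two classes.
    """
    confusion = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    support = Counter(y_true)                 # true label -> count
    labels = sorted(support)
    best_pair, best_rate = None, 0.0
    for i, a in enumerate(labels):
        for b in labels[i + 1:]:
            mixed = confusion[(a, b)] + confusion[(b, a)]
            rate = mixed / (support[a] + support[b])
            if rate > best_rate:
                best_pair, best_rate = (a, b), rate
    return best_pair, best_rate


def merge_labels(labels, pair):
    """Relabel every item of either class in `pair` with a combined label."""
    merged = f"{pair[0]}+{pair[1]}"
    return [merged if label in pair else label for label in labels]


# Hypothetical procurement classes, for illustration only.
y_true = ["gloves", "gloves", "gloves", "masks", "masks", "syringes", "syringes"]
y_pred = ["gloves", "masks", "masks", "gloves", "masks", "syringes", "syringes"]
pair, rate = most_confused_pair(y_true, y_pred)
merged_true = merge_labels(y_true, pair)
```

A pair flagged this way could either be merged, as in `merge_labels`, or used as a pointer to the part of the training data that most needs relabelling.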

Toward a classification system of relational activity in consumer electronic communities: the moderators' tale

Variable selection for financial distress classification using a genetic algorithm

Abstract:

This paper is concerned with the use of a genetic algorithm to select financial ratios for corporate distress classification models. For this purpose, the fitness value associated with a set of ratios is made to reflect the requirements of maximizing the amount of information available for the model and minimizing the collinearity between the model inputs. A case study involving 60 failed and continuing British firms in the period 1997-2000 is used for illustration. The classification model based on ratios selected by the genetic algorithm compares favorably with a model employing ratios usually found in the financial distress literature.
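The abstract gives the two goals of the fitness value but not its formula. The sketch below is therefore only an assumed stand-in: it scores a subset of ratios by the mean absolute correlation with the distress label (a proxy for information) minus the mean absolute pairwise correlation between the chosen ratios (a proxy for collinearity), and runs a very small genetic algorithm over bit masks. All names, parameters and the toy data are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def fitness(mask, columns, target):
    """Assumed fitness: relevance to the target minus redundancy among inputs."""
    chosen = [col for col, keep in zip(columns, mask) if keep]
    if not chosen:
        return float("-inf")  # an empty model is never acceptable
    relevance = sum(abs(pearson(col, target)) for col in chosen) / len(chosen)
    pairs = [(a, b) for i, a in enumerate(chosen) for b in chosen[i + 1:]]
    redundancy = (
        sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    )
    return relevance - redundancy


def select_ratios(columns, target, pop_size=20, generations=30, seed=0):
    """Tiny GA: keep the better half, refill by one-point crossover and mutation."""
    rng = random.Random(seed)
    n = len(columns)
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, columns, target), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:  # occasional single-bit mutation
                i = rng.randrange(n)
                child[i] = not child[i]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, columns, target))


# Toy data: the second ratio is a rescaled copy of the first (fully collinear).
ratios = [[1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 12], [3, 1, 4, 1, 5, 9]]
label = [1, 2, 3, 4, 5, 6]
best_mask = select_ratios(ratios, label)
```

In the toy data, selecting both collinear ratios scores lower than selecting one of them alone, which is the behaviour the collinearity penalty is meant to produce.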

Re-conceptualizing Bartlett and Ghoshal's classification of national subsidiary roles in the multinational enterprise

Towards a classification system of relational activity in consumer electronic communities: the moderators’ tale

On accounting classification and the international harmonisation debate

Corpus, concordance, classification: young learners in the L1 classroom

Biomarker discovery and redundancy reduction towards classification using a multi-factorial MALDI-TOF MS T2DM mouse model dataset

Abstract:

Diabetes, like many diseases and biological processes, is not mono-causal. On the one hand, multifactorial studies with complex experimental designs are required for its comprehensive analysis. On the other hand, the data from these studies often include a substantial amount of redundancy, such as proteins that are typically represented by a multitude of peptides. Coping simultaneously with both complexities (experimental and technological) makes data analysis a challenge for bioinformatics.

Comparative work organisation, managerial hierarchies and occupational classification

Parkinson’s Disease tremor classification – a comparison between Support Vector Machines and neural networks

Abstract:

Deep Brain Stimulation has been used to study and to treat Parkinson’s Disease (PD) tremor symptoms since the 1980s. In the research reported here we have carried out a comparative analysis to classify tremor onset based on intraoperative microelectrode recordings of a PD patient’s brain Local Field Potential (LFP) signals. In particular, we compared the performance of a Support Vector Machine (SVM) with two well-known artificial neural network classifiers, namely a Multiple Layer Perceptron (MLP) and a Radial Basis Function Network (RBN). The results show that in this study, using specifically PD data, the SVM provided the better overall classification rate, achieving a recognition accuracy of 81%.

Obesity and body fat classification in the metabolic syndrome: impact on cardiometabolic risk metabotype

Abstract:

Obesity is a key factor in the development of the metabolic syndrome (MetS), which is associated with increased cardiometabolic risk. We investigated whether obesity classification by body mass index (BMI) and body fat percentage (BF%) influences cardiometabolic profile and dietary responsiveness in 486 MetS subjects (LIPGENE dietary intervention study). Anthropometric measures, markers of inflammation and glucose metabolism, lipid profiles, adhesion molecules and haemostatic factors were determined at baseline and after 12 weeks of 4 dietary interventions (high saturated fat (SFA), high monounsaturated fat (MUFA) and 2 low fat high complex carbohydrate (LFHCC) diets, 1 supplemented with long chain n-3 polyunsaturated fatty acids (LC n-3 PUFAs)). 39% and 87% of subjects classified as normal and overweight by BMI were obese according to their BF%. Individuals classified as obese by BMI (≥ 30 kg/m²) and BF% (≥ 25% (men) and ≥ 35% (women)) (OO, n = 284) had larger waist and hip measurements, higher BMI and were heavier (P < 0.001) than those classified as non-obese by BMI but obese by BF% (NOO, n = 92). OO individuals displayed a more pro-inflammatory (higher C reactive protein (CRP) and leptin), pro-thrombotic (higher plasminogen activator inhibitor-1 (PAI-1)), pro-atherogenic (higher leptin/adiponectin ratio) and more insulin resistant (higher HOMA-IR) metabolic profile relative to the NOO group (P < 0.001). Interestingly, tumour necrosis factor alpha (TNF-α) concentrations were lower post-intervention in NOO individuals compared to OO subjects (P < 0.001). In conclusion, assessing BF% and BMI as part of a metabotype may help identify individuals at greater cardiometabolic risk than BMI alone.
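The two-way grouping described in the abstract can be written as a small rule. This is a sketch under one stated assumption: the cut-offs are read as lower bounds, i.e. a BMI of at least 30 kg/m² and a BF% of at least 25% for men or 35% for women. The function name and the "other" label, for subjects who are not obese by BF%, are hypothetical; the abstract names only the OO and NOO groups.

```python
def obesity_group(bmi: float, body_fat_pct: float, sex: str) -> str:
    """Classify a subject as 'OO', 'NOO' or 'other'.

    OO    : obese by BMI and by body fat percentage
    NOO   : non-obese by BMI but obese by body fat percentage
    other : not obese by body fat percentage (not named in the abstract)
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    bf_cutoff = 25.0 if sex == "male" else 35.0  # assumed lower bounds
    obese_by_bf = body_fat_pct >= bf_cutoff
    obese_by_bmi = bmi >= 30.0
    if obese_by_bf and obese_by_bmi:
        return "OO"
    if obese_by_bf:
        return "NOO"
    return "other"


groups = [
    obesity_group(32.0, 30.0, "male"),    # obese on both measures
    obesity_group(27.0, 28.0, "male"),    # overweight by BMI, obese by BF%
    obesity_group(27.0, 30.0, "female"),  # below the female BF% cut-off
]
```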

The monitoring of ecological quality and the classification of standing waters in temperate regions: a review and proposal based on a worked scheme for British waters

Abstract:

This paper reviews the ways that quality can be assessed in standing waters, a subject that has hitherto attracted little attention but which is now a legal requirement in Europe. It describes a scheme for the assessment and monitoring of water and ecological quality in standing waters greater than about 1 ha in area in England & Wales, although it is generally relevant to North-west Europe. Thirteen hydrological, chemical and biological variables are used to characterise the standing water body in any current sampling. These are lake volume, maximum depth, conductivity, Secchi disc transparency, pH, total alkalinity, calcium ion concentration, total N concentration, winter total oxidised inorganic nitrogen (effectively nitrate) concentration, total P concentration, potential maximum chlorophyll a concentration, a score based on the nature of the submerged and emergent plant community, and the presence or absence of a fish community. Inter alia, these variables are key indicators of the state of eutrophication, acidification, salinisation and infilling of a water body.

PMCRI: a parallel modular classification rule induction framework

Abstract:

In a world where massive amounts of data are recorded on a large scale, we need data mining technologies to gain knowledge from the data in a reasonable time. The Top Down Induction of Decision Trees (TDIDT) algorithm is a very widely used technology to predict the classification of newly recorded data. However, alternative technologies have been derived that often produce better rules but do not scale well on large datasets. One such alternative to TDIDT is the PrismTCS algorithm. PrismTCS performs particularly well on noisy data but does not scale well on large datasets. In this paper we introduce Prism and investigate its scaling behaviour. We describe how we improved the scalability of the serial version of Prism and investigate its limitations. We then describe our work to overcome these limitations by developing a framework to parallelise algorithms of the Prism family and similar algorithms. We also present the scale-up results of a first prototype implementation.

Parallel induction of modular classification rules

Abstract:

The Distributed Rule Induction (DRI) project at the University of Portsmouth is concerned with distributed data mining algorithms for automatically generating rules of all kinds. In this paper we present a system architecture and its implementation for inducing modular classification rules in parallel in a local area network using a distributed blackboard system. We present initial results of a prototype implementation based on the Prism algorithm.

P-Prism: a computationally efficient approach to scaling up classification rule induction

Abstract:

Top Down Induction of Decision Trees (TDIDT) is the most commonly used method of constructing a model from a dataset in the form of classification rules to classify previously unseen data. Alternative algorithms have been developed, such as the Prism algorithm. Prism constructs modular rules that are qualitatively better than the rules induced by TDIDT. However, along with the increasing size of databases, many existing rule learning algorithms have proved to be computationally expensive on large datasets. To tackle the problem of scalability, parallel classification rule induction algorithms have been introduced. As TDIDT is the most popular classifier, even though there are strongly competitive alternative algorithms, most parallel approaches to inducing classification rules are based on TDIDT. In this paper we describe work on a distributed classifier that induces classification rules in a parallel manner based on Prism.
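For readers unfamiliar with Prism, the following is a minimal sketch of the basic serial covering loop as it is commonly described: for each class, repeatedly add the attribute-value test with the highest probability of that class until the covered instances are pure, emit the rule, and remove the instances it covers. It assumes categorical attributes, uses a simple tie-break (higher coverage, then first found), and ignores clash handling. It is not the PrismTCS or parallel P-Prism variant, and the function names and toy data are hypothetical.

```python
def matches(row, conditions):
    """True when the row satisfies every attribute = value test of a rule."""
    return all(row[attr] == value for attr, value in conditions.items())


def prism(rows, attributes, target):
    """Induce modular rules as a list of (conditions, class) pairs."""
    rules = []
    for cls in sorted({row[target] for row in rows}):
        remaining = list(rows)  # each class starts from the full training set
        while any(row[target] == cls for row in remaining):
            conditions = {}
            covered = remaining
            # Specialise the rule until it covers only instances of cls
            # (or no unused attribute is left).
            while (any(row[target] != cls for row in covered)
                   and len(conditions) < len(attributes)):
                best = None  # ((precision, hits), attribute, value)
                for attr in attributes:
                    if attr in conditions:
                        continue
                    for value in sorted({row[attr] for row in covered}):
                        subset = [r for r in covered if r[attr] == value]
                        hits = sum(r[target] == cls for r in subset)
                        score = (hits / len(subset), hits)
                        if best is None or score > best[0]:
                            best = (score, attr, value)
                _, attr, value = best
                conditions[attr] = value
                covered = [r for r in covered if r[attr] == value]
            rules.append((conditions, cls))
            # Drop everything the new rule covers before inducing the next rule.
            remaining = [r for r in remaining if not matches(r, conditions)]
    return rules


# Toy dataset, for illustration only.
data = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
rules = prism(data, ["outlook", "windy"], "play")
```

Each rule is induced independently of the others, which is the sense in which the rules are modular and do not have to fit into a single tree.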
