949 results for decision tree
Abstract:
The aims of the project were twofold: 1) to investigate classification procedures for remotely sensed digital data, in order to develop modifications to existing algorithms and propose novel classification procedures; and 2) to investigate and develop algorithms for contextual enhancement of classified imagery in order to increase classification accuracy. The following classifiers were examined: box, decision tree, minimum distance and maximum likelihood. In addition to these, the following algorithms were developed during the course of the research: deviant distance, look-up table and an automated decision tree classifier using expert systems technology. Clustering techniques for unsupervised classification were also investigated. The contextual enhancements investigated were: mode filters, small area replacement and Wharton's CONAN algorithm. Additionally, methods for noise- and edge-based declassification and contextual reclassification, non-probabilistic relaxation and relaxation based on Markov chain theory were developed. The advantages of per-field classifiers and Geographical Information Systems were investigated. The conclusions presented suggest suitable combinations of classifier and contextual enhancement, given user accuracy requirements and time constraints. These were then tested for validity using a different data set. A brief examination of the utility of the recommended contextual algorithms for reducing the effects of data noise was also carried out.
Abstract:
Tonal, textural and contextual properties are used in manual photointerpretation of remotely sensed data. This study has used these three attributes to produce a lithological map of semi-arid northwest Argentina by semi-automatic computer classification procedures applied to remotely sensed data. Three different types of satellite data were investigated: LANDSAT MSS, TM and SIR-A imagery. Supervised classification procedures using tonal features only produced poor classification results. LANDSAT MSS produced classification accuracies in the range of 40 to 60%, while accuracies of 50 to 70% were achieved using LANDSAT TM data. The addition of SIR-A data produced increases in classification accuracy. The increased classification accuracy of TM over MSS is due to the better discrimination of geological materials afforded by the middle infrared bands of the TM sensor. The maximum likelihood classifier consistently produced classification accuracies 10 to 15% higher than either the minimum distance to means or decision tree classifier; this improved accuracy was obtained at the cost of greatly increased processing time. A new type of classifier, the spectral shape classifier, which is computationally as fast as the minimum distance to means classifier, is described. However, the results for this classifier were disappointing, being lower in most cases than those of the minimum distance or decision tree procedures. The classification results using only tonal features were felt to be unacceptably poor, therefore textural attributes were investigated. Texture is an important attribute used by photogeologists to discriminate lithology. In the case of TM data, texture measures were found to increase the classification accuracy by up to 15%. However, in the case of the LANDSAT MSS data the use of texture measures did not provide any significant increase in the accuracy of classification.
For TM data, it was found that second-order texture, especially the SGLDM-based measures, produced the highest classification accuracy. Contextual post-processing was found to increase classification accuracy and improve the visual appearance of classified output by removing isolated misclassified pixels, which tend to clutter classified images. Simple contextual features, such as mode filters, were found to outperform more complex features such as the gravitational filter or minimal area replacement methods. Generally, the larger the size of the filter, the greater the increase in accuracy. Production rules were used to build a knowledge-based system which used tonal and textural features to identify sedimentary lithologies in each of the two test sites. The knowledge-based system was able to identify six out of ten lithologies correctly.
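The mode filter described above can be sketched in a few lines: each pixel of a classified image is replaced by the most common class label in its square neighbourhood, which removes isolated misclassified pixels. This is a minimal illustration; the window size and the toy image are invented for demonstration, not taken from the study's data.

```python
from collections import Counter

def mode_filter(classified, size=3):
    """Replace each pixel with the modal class label of its size x size window."""
    h, w = len(classified), len(classified[0])
    r = size // 2
    out = [row[:] for row in classified]
    for y in range(h):
        for x in range(w):
            # Collect the class labels in the (clipped) neighbourhood window.
            window = [classified[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = Counter(window).most_common(1)[0][0]
    return out

# A lone misclassified pixel (class 2) inside a homogeneous class-1 field
# is removed by the filter:
image = [[1, 1, 1],
         [1, 2, 1],
         [1, 1, 1]]
print(mode_filter(image))  # every pixel becomes class 1
```

A larger `size` smooths more aggressively, matching the observation above that bigger filters tend to give a greater accuracy increase.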
Abstract:
This paper presents a novel prosody model in the context of computer text-to-speech synthesis applications for tone languages. We have demonstrated its applicability using the Standard Yorùbá (SY) language. Our approach is motivated by the theory that abstract and realised forms of various prosody dimensions should be modelled within a modular and unified framework [Coleman, J.S., 1994. Polysyllabic words in the YorkTalk synthesis system. In: Keating, P.A. (Ed.), Phonological Structure and Forms: Papers in Laboratory Phonology III, Cambridge University Press, Cambridge, pp. 293–324]. We have implemented this framework using the Relational Tree (R-Tree) technique. R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. The underlying assumption of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. To implement the intonation dimension, fuzzy logic based rules were developed using speech data from native speakers of Yorùbá. The Fuzzy Decision Tree (FDT) and the Classification and Regression Tree (CART) techniques were tested in modelling the duration dimension. For practical reasons, we have selected the FDT for implementing the duration dimension of our prosody model. To establish the effectiveness of our prosody model, we have also developed a Stem-ML prosody model for SY. We have performed both quantitative and qualitative evaluations on our implemented prosody models. The results suggest that, although the R-Tree model does not predict the numerical speech prosody data as accurately as the Stem-ML model, it produces synthetic speech prosody with better intelligibility and naturalness. The R-Tree model is particularly suitable for speech prosody modelling for languages with limited language resources and expertise, e.g. 
African languages. Furthermore, the R-Tree model is easy to implement, interpret and analyse.
Abstract:
The studies presented in this thesis were carried out because of a lack of previous research with respect to (a) the habits and attitudes towards retinoscopy and (b) the relative accuracy of dedicated retinoscopes compared to combined types in which changing the bulb allows use in spot or streak mode. An online British survey received responses from 298 optometrists. Decision tree analyses revealed that optometrists working in multiple practices tended to rely less on retinoscopy than those in the independent sector. Only half of the respondents used dynamic retinoscopy. The majority, however, agreed that retinoscopy was an important test. The university attended also influenced the type of retinoscope used and the use of autorefractors. Combined retinoscopes were used most by the more recently qualified optometrists, and few agreed that combined retinoscopes were less accurate. A trial indicated that combined and dedicated retinoscopes were equally accurate. Here, 4 optometrists (2 using spot and 2 using streak retinoscopes) tested one eye of 6 patients using combined and dedicated retinoscopes. This trial also demonstrated the utility of the relatively unknown '15 degrees of freedom' rule that exploits replication in factorial ANOVA designs to achieve sufficient statistical power when recruitment is limited. An opportunistic international survey explored the use of retinoscopy by 468 practitioners (134 ophthalmologists, 334 optometrists) attending contact-related courses. Decision tree analyses found (a) no differences in the habits of optometrists and ophthalmologists, (b) differences in the reliance on retinoscopy and use of dynamic techniques across the participating countries and (c) some evidence that younger practitioners were using static and dynamic retinoscopy least often.
In conclusion, this study has revealed infrequent use of static and dynamic retinoscopy by some optometrists, even though these techniques may be the only means of determining refractive error and evaluating accommodation in patients with communication difficulties.
Abstract:
The objective of this study was to investigate the effects of circularity, comorbidity, prevalence and presentation variation on the accuracy of differential diagnoses made in optometric primary care using a modified form of naïve Bayesian sequential analysis. No such investigation has been reported before. Data were collected for 1422 cases seen over one year. Positive test outcomes were recorded for case history (ethnicity, age, symptoms and ocular and medical history) and clinical signs in relation to each diagnosis. For this reason only positive likelihood ratios were used for this modified form of Bayesian analysis, which was carried out with Laplacian correction and Chi-square filtration. Accuracy was expressed as the percentage of cases for which the diagnoses made by the clinician appeared at the top of a list generated by Bayesian analysis. Preliminary analyses were carried out on 10 diagnoses and 15 test outcomes. Accuracy of 100% was achieved in the absence of presentation variation but dropped by 6% when variation existed. Circularity artificially elevated accuracy by 0.5%. Surprisingly, removal of Chi-square filtering increased accuracy by 0.4%. Decision tree analysis showed that accuracy was influenced primarily by prevalence, followed by presentation variation and comorbidity. Analysis of 35 diagnoses and 105 test outcomes followed. This explored the use of positive likelihood ratios, derived from the case history, to recommend signs to look for. Accuracy of 72% was achieved when all clinical signs were entered. The drop in accuracy, compared to the preliminary analysis, was attributed to the fact that some diagnoses lacked strong diagnostic signs; the accuracy increased by 1% when only recommended signs were entered. Chi-square filtering improved recommended test selection. Decision tree analysis showed that accuracy was again influenced primarily by prevalence, followed by comorbidity and presentation variation.
Future work will explore the use of likelihood ratios based on positive and negative test findings prior to considering naïve Bayesian analysis as a form of artificial intelligence in optometric practice.
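The core of the analysis above, ranking diagnoses by positive likelihood ratios with a Laplacian (add-one) correction, can be sketched as follows. The function names, the count layout and all the figures are invented for illustration; the study's actual test battery and prevalence data are not reproduced here.

```python
def positive_lr(tp, fp, n_pos, n_neg):
    """Laplace-corrected positive likelihood ratio: sensitivity / false-positive rate.

    tp/fp are positive test outcomes among cases with/without the diagnosis;
    n_pos/n_neg are the corresponding case totals. Add-one correction keeps
    unseen finding/diagnosis pairs from zeroing out a score.
    """
    sensitivity = (tp + 1) / (n_pos + 2)
    fp_rate = (fp + 1) / (n_neg + 2)
    return sensitivity / fp_rate

def rank_diagnoses(prevalence, counts, findings):
    """Rank diagnoses by prevalence multiplied by the positive LR of each finding."""
    scores = {}
    for dx, prev in prevalence.items():
        score = prev
        for finding in findings:
            tp, fp, n_pos, n_neg = counts[(dx, finding)]
            score *= positive_lr(tp, fp, n_pos, n_neg)
        scores[dx] = score
    return sorted(scores, key=scores.get, reverse=True)

# Invented counts: (true positives, false positives, cases with dx, cases without dx)
counts = {
    ("dry eye", "redness"):  (80, 100, 100, 1300),
    ("glaucoma", "redness"): (5, 175, 50, 1350),
}
prevalence = {"dry eye": 0.10, "glaucoma": 0.03}
print(rank_diagnoses(prevalence, counts, ["redness"]))  # dry eye ranks first
```

The accuracy measure described above would then be the fraction of cases where the clinician's diagnosis appears first in the returned list.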
Abstract:
Today, owing to globalization, the size of data sets is increasing, and it has become necessary to discover the knowledge they contain. Discovered knowledge typically takes the form of association rules, classification rules, clusterings, frequent episodes and deviation detection. Fast and accurate classifiers for large databases are an important task in data mining. There is growing evidence that integrating classification and association rule mining can outperform classification approaches based on heuristic, greedy search such as decision tree induction. Emerging associative classification algorithms have shown good promise in producing accurate classifiers. In this paper we focus on the performance of associative classification and present a parallel model for classifier building. Some parallel-distributed algorithms have been proposed for decision tree induction, but so far no such work has been reported for associative classification.
Abstract:
Electrocardiography (ECG) has recently been proposed as a biometric trait for identification purposes. Intra-individual variations of the ECG might affect identification performance. These variations are mainly due to Heart Rate Variability (HRV). In particular, HRV causes changes in the QT intervals along the ECG waveforms. This work is aimed at analysing the influence of seven QT interval correction methods (based on population models) on the performance of ECG-fiducial-based identification systems. In addition, we have also considered the influence of training set size, classifier, classifier ensemble and the number of consecutive heartbeats in a majority voting scheme. The ECG signals used in this study were collected from thirty-nine subjects within the Physionet open access database. Public domain software was used for fiducial point detection. Results suggested that QT correction is indeed required to improve performance. However, there is no clear choice among the seven explored approaches for QT correction (identification rate between 0.97 and 0.99). MultiLayer Perceptron and Support Vector Machine classifiers seemed to have better generalization capabilities, in terms of classification performance, than Decision Tree-based classifiers. No strong influence of training set size or of the number of consecutive heartbeats in the majority voting scheme was observed.
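Two of the ingredients above can be sketched compactly: a population-model QT correction and the majority voting over consecutive heartbeats. Bazett's formula is one commonly used population-model correction, shown here only as an example; it is not necessarily among the seven methods the study compared, and the per-beat identity labels below are invented.

```python
import math
from collections import Counter

def bazett_qtc(qt_s, rr_s):
    """Bazett's population-model correction: QTc = QT / sqrt(RR), intervals in seconds."""
    return qt_s / math.sqrt(rr_s)

def majority_vote(beat_labels):
    """Assign the identity that wins a majority vote over consecutive per-beat decisions."""
    return Counter(beat_labels).most_common(1)[0][0]

print(bazett_qtc(0.36, 0.81))  # ~0.40 s
# Seven consecutive beats, two of them misclassified by the per-beat classifier:
print(majority_vote(["s07", "s07", "s12", "s07", "s07", "s31", "s07"]))  # s07
```

In a fiducial-based pipeline, the corrected QT value would be one feature per heartbeat, and the voting step aggregates the per-beat classifier outputs into a single identification decision.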
Abstract:
The thesis discusses problems of the cognitive development of the subject of "perception": from the object being studied and the means of action to the single system "subject - modus operandi of subject - object". Problems of increasing the adequacy of models of "live" nature are analyzed. The evolution of decision-making support systems from expert systems to the personal device of a decision-maker is discussed. The experience of developing qualitative prediction on the basis of polyvalent dependences, represented by a decision tree, which realizes the concept of "plural subjective determinism", is analyzed. Examples of applied systems for the prediction of ecological-economic and social processes are given, and ways of developing them are discussed.
Abstract:
Refraction simulators used for undergraduate training at Aston University did not realistically reflect variations in the relationship between vision and ametropia. This was because they used an algorithm, taken from the research literature, that strictly only applied to myopes or older hyperopes and did not factor in age and pupil diameter. The aim of this study was to generate new algorithms that overcame these limitations. Clinical data were collected from the healthy right eyes of 873 white subjects aged between 20 and 70 years. Vision and refractive error were recorded along with age and pupil diameter. Re-examination of 34 subjects enabled the calculation of coefficients of repeatability. The study population was slightly biased towards females and included many contact lens wearers. Sex and contact lens wear were, therefore, recorded in order to determine whether these might influence the findings. In addition, iris colour and cylinder axis orientation were recorded as these might also be influential. A novel Blur Sensitivity Ratio (BSR) was derived by dividing vision (expressed as minimum angle of resolution) by refractive error (expressed as a scalar vector, U). Alteration of the scalar vector, to account for additional vision reduction due to oblique cylinder axes, was not found to be useful. Decision tree analysis showed that sex, contact lens wear, iris colour and cylinder axis orientation did not influence the BSR. The following algorithms arose from two stepwise multiple linear regressions:

BSR (myopes) = 1.13 + (0.24 x pupil diameter) + (0.14 x U)
BSR (hyperopes) = (0.11 x pupil diameter) + (0.03 x age) - 0.22

These algorithms together accounted for 84% of the observed variance. They showed that pupil diameter influenced vision in both forms of ametropia. They also showed the age-related decline in the ability to accommodate in order to overcome reduced vision in hyperopia.
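The two regression algorithms quoted above transcribe directly into code. The function names are ours, and the units are assumptions (millimetres for pupil diameter, years for age, dioptres for the scalar vector U); only the coefficients come from the abstract.

```python
def bsr_myope(pupil_mm, u_dioptres):
    """BSR (myopes) = 1.13 + (0.24 x pupil diameter) + (0.14 x U)"""
    return 1.13 + 0.24 * pupil_mm + 0.14 * u_dioptres

def bsr_hyperope(pupil_mm, age_years):
    """BSR (hyperopes) = (0.11 x pupil diameter) + (0.03 x age) - 0.22"""
    return 0.11 * pupil_mm + 0.03 * age_years - 0.22

# Illustrative inputs, not study data:
print(bsr_myope(4.0, 2.0))    # 1.13 + 0.96 + 0.28 ≈ 2.37
print(bsr_hyperope(4.0, 50))  # 0.44 + 1.50 - 0.22 ≈ 1.72
```

A refraction simulator could use these functions to predict the blur sensitivity ratio from a simulated patient's pupil diameter, age and refractive error, as the study intended.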
Abstract:
Accurate colour vision testing requires using the correct illumination. With the plethora of 'daylight' lamps available, is there a cost-effective alternative to the discontinued MacBeth Easel lamp? Smoking is a known risk factor for macular degeneration. As the macula is responsible for colour discrimination, any toxin that affects it has the potential to influence colour discrimination. Aims: To find a cost-effective light source for colour vision testing. To investigate the effect of smoking on colour discrimination. To explore how deuteranomalous trichromats compare with normal trichromats. Methods: Using the Ishihara colour vision test, subjects were classified into the groups 'Normal/Control', 'Smoker/Test', and 'Case Study' (subjects who failed the screening test and did not smoke). They completed the Farnsworth Munsell 100 Hue test under each of three light sources: Phillips EcoHalo Twist (tungsten halogen - THL), Kosnic KCF07ALU/GU10-865 (compact fluorescent - CFL), and Deal Guardian Ltd. GU103X2WA4B-60 (light-emitting diode - LED). Results: 42 subjects took part in the study: 18 in the Normal/Control group, 18 in the Smoker/Test group, and 6 in the Case Study group. For the Normal/Control group the total error scores (TESs) were significantly lower with the CFL than with the THL (p = 0.017), as was the case for the Case Study group (p = 0.009). No significant differences were found between the Normal/Control group and the Smoker/Test group for any light source. Decision tree analysis found pack years to be a significant variable for TES. Discussion: All three light sources were comparable with those in previous studies. The CFL provided better colour discrimination than the LED despite both being 6500 K. Deuteranomalous trichromats showed greater deviation than normal trichromats using the LED. Conclusions: The Kosnic KCF07ALU/GU10-865 is a cost-effective alternative for colour vision testing.
Smoking appears to have an effect on colour vision, but requires further investigation.
Abstract:
* The work is supported by RFBR, grant 04-01-00858-a
Abstract:
A system for predicting the development of unstable processes is presented, based on a decision-tree method. A technique for processing the expert information, which is indispensable for constructing and evaluating the decision tree, is offered; in particular, the data are given in fuzzy form. Original search algorithms for optimal paths of development of the forecast process are described; they are oriented towards processing trees of large dimension with vector estimations on the arcs.
Abstract:
In this paper, issues of the new Ukrainian three-level pension system are discussed. First, the paper presents a mathematical model for calculating the optimal size of contributions to a non-state pension fund. Next, the non-state pension fund chooses an asset management company; to do so, an approach based on Kohonen networks is proposed for classifying the asset management companies operating in the Ukrainian market. Once the asset management company is chosen, it receives the pension contributions of the fund's participants and has to invest them profitably. This paper proposes an approach for choosing the most profitable investment project using decision trees. The new pension system was lawfully ratified only four years ago and is still developing, which makes this analysis timely.
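Selecting the most profitable investment project with a decision tree typically means rolling expected monetary value back from the chance nodes to the root and picking the branch with the highest value. The sketch below illustrates that standard rollback under invented figures; it is not the paper's actual model or data.

```python
def expected_value(outcomes):
    """Expected monetary value of a chance node: sum of probability x payoff."""
    return sum(p * payoff for p, payoff in outcomes)

# Each project is a chance node with (probability, payoff) branches (invented figures):
projects = {
    "project A": [(0.6, 120.0), (0.4, -30.0)],  # EMV = 72.0 - 12.0 = 60.0
    "project B": [(0.9, 55.0), (0.1, 10.0)],    # EMV = 49.5 + 1.0 = 50.5
}
# The decision node picks the branch with the highest expected value:
best = max(projects, key=lambda name: expected_value(projects[name]))
print(best)  # project A
```

Real applications extend this with multi-stage trees and discounting, but the rollback principle is the same.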
Abstract:
This article considers problems of forecasting economic macroparameters, first and foremost the inflation index. A concept is offered for developing synthetic forecasting methods that use directly specified expert information as well as calculation results from objective economic and mathematical models for forecasting separate "slowly changeable parameters". The article also discusses problems of operating on macroparameters on the basis of an analysis of the obtained prognostic values.
Abstract:
Allergy is an overreaction by the immune system to a previously encountered, ordinarily harmless substance - typically proteins - resulting in skin rash, swelling of mucous membranes, sneezing or wheezing, or other abnormal conditions. The use of modified proteins is increasingly widespread: their presence in food, commercial products, such as washing powder, and medical therapeutics and diagnostics, makes predicting and identifying potential allergens a crucial societal issue. The prediction of allergens has been explored widely using bioinformatics, with many tools being developed in the last decade; many of these are freely available online. Here, we report a set of novel models for allergen prediction utilizing amino acid E-descriptors, auto- and cross-covariance transformation, and several machine learning methods for classification, including logistic regression (LR), decision tree (DT), naïve Bayes (NB), random forest (RF), multilayer perceptron (MLP) and k nearest neighbours (kNN). The best performing method was kNN with 85.3% accuracy at 5-fold cross-validation. The resulting model has been implemented in a revised version of the AllerTOP server (http://www.ddg-pharmfac.net/AllerTOP). © Springer-Verlag 2014.
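The auto- and cross-covariance (ACC) transform mentioned above turns a variable-length protein sequence into a fixed-length vector of averaged descriptor products, which is what makes standard classifiers such as kNN applicable. The sketch below uses a toy two-component descriptor per residue and a small lag limit purely for illustration; the published model uses five E-descriptors per amino acid and its own lag range.

```python
def acc_transform(descriptors, max_lag=2):
    """ACC feature: for each lag and descriptor pair (j, k),
    the mean over positions i of E_j(i) * E_k(i + lag)."""
    n = len(descriptors)       # number of residues
    m = len(descriptors[0])    # descriptors per residue
    features = []
    for lag in range(1, max_lag + 1):
        for j in range(m):
            for k in range(m):
                total = sum(descriptors[i][j] * descriptors[i + lag][k]
                            for i in range(n - lag))
                features.append(total / (n - lag))
    return features

# Toy descriptor values for a 4-residue peptide (2 descriptors per residue):
peptide = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2], [0.6, 0.0]]
vec = acc_transform(peptide)
print(len(vec))  # 2 lags x 2 x 2 descriptor pairs = 8 features
```

Because the output length depends only on `max_lag` and the number of descriptors, peptides of any length map to vectors of the same dimension, ready for kNN or any of the other classifiers listed above.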