897 results for Decision tree method
Abstract:
The aims of the project were twofold: 1) to investigate classification procedures for remotely sensed digital data, in order to develop modifications to existing algorithms and propose novel classification procedures; and 2) to investigate and develop algorithms for contextual enhancement of classified imagery in order to increase classification accuracy. The following classifiers were examined: box, decision tree, minimum distance and maximum likelihood. In addition to these, the following algorithms were developed during the course of the research: deviant distance, look-up table and an automated decision tree classifier using expert systems technology. Clustering techniques for unsupervised classification were also investigated. The contextual enhancements investigated were: mode filters, small area replacement and Wharton's CONAN algorithm. Additionally, methods for noise- and edge-based declassification and contextual reclassification, non-probabilistic relaxation and relaxation based on Markov chain theory were developed. The advantages of per-field classifiers and Geographical Information Systems were investigated. The conclusions presented suggest suitable combinations of classifier and contextual enhancement, given user accuracy requirements and time constraints. These were then tested for validity using a different data set. A brief examination of the utility of the recommended contextual algorithms for reducing the effects of data noise was also carried out.
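Of the classifiers examined above, the minimum distance to means classifier is the simplest to illustrate. The sketch below is a generic implementation of that technique, not the thesis's own code; the array shapes and names are assumptions for illustration.

```python
import numpy as np

def train_min_distance(samples, labels):
    """Mean spectral vector per class from labelled training pixels."""
    return {c: samples[labels == c].mean(axis=0) for c in np.unique(labels)}

def classify_min_distance(pixels, class_means):
    """Assign each pixel to the class whose mean is nearest in feature space."""
    classes = list(class_means)
    centres = np.stack([class_means[c] for c in classes])  # (n_classes, n_bands)
    dists = np.linalg.norm(pixels[:, None, :] - centres[None], axis=2)
    return np.array(classes)[dists.argmin(axis=1)]

# toy usage: 6 training pixels with 3 spectral bands, 2 classes
X = np.array([[10, 40, 30], [12, 38, 29], [11, 41, 31],
              [80, 20, 60], [78, 22, 61], [82, 19, 59]], float)
y = np.array(["water", "water", "water", "rock", "rock", "rock"])
means = train_min_distance(X, y)
print(classify_min_distance(np.array([[79.0, 21.0, 60.0]]), means))  # ['rock']
```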
Abstract:
Tonal, textural and contextual properties are used in manual photointerpretation of remotely sensed data. This study used these three attributes to produce a lithological map of semi-arid northwest Argentina by semi-automatic computer classification procedures of remotely sensed data. Three different types of satellite data were investigated: LANDSAT MSS, TM and SIR-A imagery. Supervised classification procedures using tonal features only produced poor classification results. LANDSAT MSS produced classification accuracies in the range of 40 to 60%, while accuracies of 50 to 70% were achieved using LANDSAT TM data. The addition of SIR-A data produced increases in the classification accuracy. The increased classification accuracy of TM over the MSS is due to the better discrimination of geological materials afforded by the middle infrared bands of the TM sensor. The maximum likelihood classifier consistently produced classification accuracies 10 to 15% higher than either the minimum distance to means or decision tree classifier; this improved accuracy was obtained at the cost of greatly increased processing time. A new type of classifier, the spectral shape classifier, which is computationally as fast as the minimum distance to means classifier, is described. However, the results for this classifier were disappointing, being lower in most cases than the minimum distance or decision tree procedures. The classification results using only tonal features were felt to be unacceptably poor, so textural attributes were investigated. Texture is an important attribute used by photogeologists to discriminate lithology. In the case of TM data, texture measures were found to increase the classification accuracy by up to 15%. However, in the case of the LANDSAT MSS data the use of texture measures did not provide any significant increase in the accuracy of classification. For TM data, it was found that second-order texture, especially the SGLDM-based measures, produced the highest classification accuracy. Contextual post-processing was found to increase classification accuracy and improve the visual appearance of classified output by removing isolated misclassified pixels, which tend to clutter classified images. Simple contextual features, such as mode filters, were found to outperform more complex features such as the gravitational filter or minimal area replacement methods. Generally, the larger the size of the filter, the greater the increase in accuracy. Production rules were used to build a knowledge-based system which used tonal and textural features to identify sedimentary lithologies in each of the two test sites. The knowledge-based system was able to identify six out of ten lithologies correctly.
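The SGLDM measures mentioned here are second-order statistics computed from a grey-level co-occurrence matrix. The following is a minimal sketch of the idea; the quantisation level, pixel offset and the two features shown are illustrative choices, not those of the thesis.

```python
import numpy as np

def sgldm(window, levels=16, dx=1, dy=0):
    """Grey-level co-occurrence matrix for one pixel offset (dx, dy)."""
    q = (window.astype(float) * levels / (window.max() + 1)).astype(int)
    m = np.zeros((levels, levels))
    h, w = q.shape
    for yy in range(h - dy):
        for xx in range(w - dx):
            m[q[yy, xx], q[yy + dy, xx + dx]] += 1
    m += m.T                       # make the counts symmetric
    return m / m.sum()             # normalise to joint probabilities

def texture_features(p):
    i, j = np.indices(p.shape)
    contrast = ((i - j) ** 2 * p).sum()   # high for coarse texture
    energy = (p ** 2).sum()               # angular second moment
    return contrast, energy

window = np.random.randint(0, 256, (15, 15))   # one image window
print(texture_features(sgldm(window)))
```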
Abstract:
This paper presents a novel prosody model in the context of computer text-to-speech synthesis applications for tone languages. We have demonstrated its applicability using the Standard Yorùbá (SY) language. Our approach is motivated by the theory that abstract and realised forms of various prosody dimensions should be modelled within a modular and unified framework [Coleman, J.S., 1994. Polysyllabic words in the YorkTalk synthesis system. In: Keating, P.A. (Ed.), Phonological Structure and Forms: Papers in Laboratory Phonology III, Cambridge University Press, Cambridge, pp. 293–324]. We have implemented this framework using the Relational Tree (R-Tree) technique. R-Tree is a sophisticated data structure for representing a multi-dimensional waveform in the form of a tree. The underlying assumption of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques which combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. To implement the intonation dimension, fuzzy logic based rules were developed using speech data from native speakers of Yorùbá. The Fuzzy Decision Tree (FDT) and the Classification and Regression Tree (CART) techniques were tested in modelling the duration dimension. For practical reasons, we have selected the FDT for implementing the duration dimension of our prosody model. To establish the effectiveness of our prosody model, we have also developed a Stem-ML prosody model for SY. We have performed both quantitative and qualitative evaluations on our implemented prosody models. The results suggest that, although the R-Tree model does not predict the numerical speech prosody data as accurately as the Stem-ML model, it produces synthetic speech prosody with better intelligibility and naturalness. The R-Tree model is particularly suitable for speech prosody modelling for languages with limited language resources and expertise, e.g. African languages. Furthermore, the R-Tree model is easy to implement, interpret and analyse.
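For readers unfamiliar with CART-based duration modelling, the sketch below shows the general shape of such a model using scikit-learn's CART implementation; the features and durations are invented placeholders, not the Standard Yorùbá data used in the paper.

```python
from sklearn.tree import DecisionTreeRegressor

# hypothetical feature rows: [tone (0=low, 1=mid, 2=high),
#                             syllable position in word, phrase-final flag]
X = [[0, 1, 0], [2, 1, 0], [1, 2, 0], [0, 2, 1], [2, 3, 1], [1, 1, 0]]
y = [105.0, 148.0, 121.0, 170.0, 196.0, 118.0]   # phone durations in ms

cart = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
print(cart.predict([[2, 2, 1]]))   # predicted duration for an unseen context
```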
Abstract:
This thesis examined solar thermal collectors for use in alternative hybrid solar-biomass power plant applications in Gujarat, India. Following a preliminary review, the cost-effective selection and design of the solar thermal field were identified as critical factors underlying the success of hybrid plants. Consequently, the existing solar thermal technologies were reviewed and ranked for use in India by means of a multi-criteria decision-making method, the Analytical Hierarchy Process (AHP). Informed by the outcome of the AHP, the thesis went on to pursue the Linear Fresnel Reflector (LFR), the design of which was optimised with the help of ray-tracing. To further enhance collector performance, LFR concepts incorporating novel mirror spacing and drive mechanisms were evaluated. Subsequently, a new variant, termed the Elevation Linear Fresnel Reflector (ELFR), was designed, constructed and tested at Aston University, UK, thereby allowing theoretical models of solar thermal field performance to be verified. Based on the resulting characteristics of the LFR, and data gathered for the other hybrid system components, models of hybrid LFR- and ELFR-biomass power plants were developed and analysed in TRNSYS®. The techno-economic and environmental consequences of varying the size of the solar field in relation to the total plant capacity were modelled for a series of case studies to evaluate different applications: tri-generation (electricity, ice and heat), electricity-only generation and process heat. The case studies also encompassed varying site locations, capacities, operational conditions and financial situations. For a hybrid tri-generation plant in Gujarat, it was recommended to use an LFR solar thermal field of 14,000 m² aperture with a 3 tonne biomass boiler, generating 815 MWh per annum of electricity for nearby villages and 12,450 tonnes of ice per annum for local fisheries and food industries. However, at the expense of a 0.3 ¢/kWh increase in levelised energy costs, the ELFR increased savings of biomass (100 t/a) and land (9 ha/a). For solar thermal applications in areas with high land cost, the ELFR reduced levelised energy costs. Off-grid hybrid plants for tri-generation were determined to be the most feasible application in India. Although biomass-only plants were found to be more economically viable, it was concluded that hybrid systems will soon become cost-competitive and can considerably ease current energy security and biomass supply chain issues in India.
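The AHP step works by eliciting pairwise comparisons between criteria and extracting priority weights from the principal eigenvector of the comparison matrix. The sketch below uses an invented 3×3 matrix; the thesis's actual criteria and judgements are not given in the abstract.

```python
import numpy as np

# Hypothetical pairwise comparisons of three collector criteria
# (cost, efficiency, land use) on Saaty's 1-9 scale.
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = eigvals.real.argmax()
weights = np.abs(eigvecs[:, k].real)
weights /= weights.sum()                 # criterion priorities, summing to 1

n = len(A)
ci = (eigvals.real.max() - n) / (n - 1)  # consistency index
print(weights, ci / 0.58)                # consistency ratio; RI = 0.58 for n = 3
```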
Abstract:
The studies presented in this thesis were carried out because of a lack of previous research into (a) habits and attitudes towards retinoscopy and (b) the relative accuracy of dedicated retinoscopes compared to combined types, in which changing the bulb allows use in spot or streak mode. An online British survey received responses from 298 optometrists. Decision tree analyses revealed that optometrists working in multiple practices tended to rely less on retinoscopy than those in the independent sector. Only half of the respondents used dynamic retinoscopy. The majority, however, agreed that retinoscopy was an important test. The university attended also influenced the type of retinoscope used and the use of autorefractors. Combined retinoscopes were used most by the more recently qualified optometrists, and few agreed that combined retinoscopes were less accurate. A trial indicated that combined and dedicated retinoscopes were equally accurate: 4 optometrists (2 using spot and 2 using streak retinoscopes) tested one eye of 6 patients using combined and dedicated retinoscopes. This trial also demonstrated the utility of the relatively unknown '15 degrees of freedom' rule, which exploits replication in factorial ANOVA designs to achieve sufficient statistical power when recruitment is limited. An opportunistic international survey explored the use of retinoscopy by 468 practitioners (134 ophthalmologists, 334 optometrists) attending contact lens related courses. Decision tree analyses found (a) no differences in the habits of optometrists and ophthalmologists, (b) differences in the reliance on retinoscopy and use of dynamic techniques across the participating countries and (c) some evidence that younger practitioners were using static and dynamic retinoscopy least often. In conclusion, this study has revealed infrequent use of static and dynamic retinoscopy by some optometrists, even though these techniques may be the only means of determining refractive error and evaluating accommodation in patients with communication difficulties.
Abstract:
The objective of this study was to investigate the effects of circularity, comorbidity, prevalence and presentation variation on the accuracy of differential diagnoses made in optometric primary care using a modified form of naïve Bayesian sequential analysis. No such investigation has been reported before. Data were collected for 1422 cases seen over one year. Positive test outcomes were recorded for case history (ethnicity, age, symptoms and ocular and medical history) and clinical signs in relation to each diagnosis. For this reason, only positive likelihood ratios were used for this modified form of Bayesian analysis, which was carried out with Laplacian correction and Chi-square filtration. Accuracy was expressed as the percentage of cases for which the diagnoses made by the clinician appeared at the top of a list generated by Bayesian analysis. Preliminary analyses were carried out on 10 diagnoses and 15 test outcomes. Accuracy of 100% was achieved in the absence of presentation variation but dropped by 6% when variation existed. Circularity artificially elevated accuracy by 0.5%. Surprisingly, removal of Chi-square filtering increased accuracy by 0.4%. Decision tree analysis showed that accuracy was influenced primarily by prevalence, followed by presentation variation and comorbidity. Analysis of 35 diagnoses and 105 test outcomes followed. This explored the use of positive likelihood ratios, derived from the case history, to recommend signs to look for. Accuracy of 72% was achieved when all clinical signs were entered. The drop in accuracy, compared to the preliminary analysis, was attributed to the fact that some diagnoses lacked strong diagnostic signs; the accuracy increased by 1% when only recommended signs were entered. Chi-square filtering improved recommended test selection. Decision tree analysis showed that accuracy was again influenced primarily by prevalence, followed by comorbidity and presentation variation. Future work will explore the use of likelihood ratios based on positive and negative test findings prior to considering naïve Bayesian analysis as a form of artificial intelligence in optometric practice.
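The sequential analysis described here rests on the odds form of Bayes' theorem: posterior odds = prior odds × the product of positive likelihood ratios for each positive finding, under the naive independence assumption. A minimal sketch follows; the prevalences and ratios are invented for illustration.

```python
def lr_plus(tp, fn, fp, tn):
    """Positive likelihood ratio with Laplacian (add-one) correction."""
    sens = (tp + 1) / (tp + fn + 2)
    spec = (tn + 1) / (tn + fp + 2)
    return sens / (1 - spec)

def rank_diagnoses(positive_findings, prevalence, lrs):
    """Rank diagnoses by prior odds times the product of LR+ values."""
    scores = {}
    for dx, p in prevalence.items():
        odds = p / (1 - p)
        for f in positive_findings:
            odds *= lrs[dx].get(f, 1.0)  # unrecorded findings leave odds unchanged
        scores[dx] = odds
    return sorted(scores, key=scores.get, reverse=True)

# illustrative numbers only
prevalence = {"dry eye": 0.15, "cataract": 0.10, "glaucoma": 0.02}
lrs = {"dry eye": {"gritty": 6.0, "age>60": 1.2},
       "cataract": {"age>60": 3.5, "glare": 4.0},
       "glaucoma": {"age>60": 2.0, "field loss": 12.0}}
print(rank_diagnoses(["age>60", "glare"], prevalence, lrs))
```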
Abstract:
Today, as globalization drives rapid growth in the size of data sets, knowledge discovery has become a necessity. Discovered knowledge typically takes the form of association rules, classification rules, clusterings, frequent episodes and deviations. Fast and accurate classifiers for large databases are an important task in data mining. There is growing evidence that integrating classification and association rule mining can produce more accurate classifiers than heuristic, greedy-search approaches such as decision tree induction. Emerging associative classification algorithms have shown good promise in producing accurate classifiers. In this paper we focus on the performance of associative classification and present a parallel model for classifier building. Several parallel and distributed algorithms have been proposed for decision tree induction, but so far no such work has been reported for associative classification.
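As background, an associative classifier mines class association rules and orders them by confidence and support before applying the first matching rule, in the style of CBA. The following is a toy sequential sketch of that pipeline (the paper's parallel model is not reproduced here); the thresholds and data are illustrative.

```python
from collections import Counter
from itertools import combinations

def mine_cars(rows, labels, min_sup=0.3, min_conf=0.7, max_size=2):
    """Mine class association rules (itemset -> label) up to max_size items."""
    n = len(rows)
    rule_counts, body_counts = Counter(), Counter()
    for items, y in zip(rows, labels):
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(items), size):
                rule_counts[(combo, y)] += 1
                body_counts[combo] += 1
    rules = [(combo, y, c / body_counts[combo], c / n)
             for (combo, y), c in rule_counts.items()
             if c / n >= min_sup and c / body_counts[combo] >= min_conf]
    return sorted(rules, key=lambda r: (-r[2], -r[3]))  # CBA-style ordering

def classify(items, rules, default):
    for combo, y, conf, sup in rules:
        if set(combo) <= set(items):   # first matching rule wins
            return y
    return default

rows = [{"rain", "cold"}, {"rain", "warm"}, {"sun", "warm"}, {"sun", "cold"}]
labels = ["stay", "stay", "go", "go"]
rules = mine_cars(rows, labels)
print(classify({"sun", "warm"}, rules, default="stay"))   # 'go'
```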
Abstract:
Electrocardiography (ECG) has recently been proposed as a biometric trait for identification purposes. Intra-individual variations of the ECG might affect identification performance. These variations are mainly due to Heart Rate Variability (HRV). In particular, HRV causes changes in the QT intervals along the ECG waveforms. This work analyses the influence of seven QT interval correction methods (based on population models) on the performance of ECG-fiducial-based identification systems. In addition, we also considered the influence of training-set size, classifier, classifier ensemble and the number of consecutive heartbeats in a majority voting scheme. The ECG signals used in this study were collected from thirty-nine subjects within the Physionet open access database. Public domain software was used for fiducial point detection. The results suggest that QT correction is indeed required to improve performance; however, there is no clear choice among the seven explored approaches (identification rate between 0.97 and 0.99). The MultiLayer Perceptron and Support Vector Machine appeared to have better generalization capabilities, in terms of classification performance, than Decision Tree-based classifiers. Neither the training-set size nor the number of consecutive heartbeats in the majority voting scheme showed a comparably strong influence.
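The abstract does not name the seven population-model corrections, but well-known examples of this family include Bazett, Fridericia and the Framingham linear model, sketched below; the intervals are in seconds.

```python
import numpy as np

def correct_qt(qt, rr, method="bazett"):
    """Population-model QT corrections; qt and rr are in seconds."""
    if method == "bazett":
        return qt / np.sqrt(rr)           # QTc = QT / RR^(1/2)
    if method == "fridericia":
        return qt / np.cbrt(rr)           # QTc = QT / RR^(1/3)
    if method == "framingham":
        return qt + 0.154 * (1.0 - rr)    # linear regression model
    raise ValueError(f"unknown method: {method}")

# a QT of 0.36 s at 75 bpm (RR = 0.8 s) under each correction
for m in ("bazett", "fridericia", "framingham"):
    print(m, round(float(correct_qt(0.36, 0.8, m)), 3))
```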
Abstract:
The thesis discusses problems in the cognitive development of the subject of perception: from the object under study and the means of action to the single system "subject – modus operandi of the subject – object". Problems of increasing the adequacy of models of "live" nature are analysed. The evolution of decision-making support systems, from expert systems to the personal device of a decision-maker, is discussed. The experience of developing qualitative prediction on the basis of polyvalent dependences, represented by a decision tree that realises the concept of "plural subjective determinism", is analysed. Examples of applied systems for the prediction of ecological-economic and social processes are given, and ways of developing them are discussed.
Abstract:
Refraction simulators used for undergraduate training at Aston University did not realistically reflect variations in the relationship between vision and ametropia. This was because they used an algorithm, taken from the research literature, that strictly only applied to myopes or older hyperopes and did not factor in age and pupil diameter. The aim of this study was to generate new algorithms that overcame these limitations. Clinical data were collected from the healthy right eyes of 873 white subjects aged between 20 and 70 years. Vision and refractive error were recorded along with age and pupil diameter. Re-examination of 34 subjects enabled the calculation of coefficients of repeatability. The study population was slightly biased towards females and included many contact lens wearers. Sex and contact lens wear were, therefore, recorded in order to determine whether these might influence the findings. In addition, iris colour and cylinder axis orientation were recorded as these might also be influential. A novel Blur Sensitivity Ratio (BSR) was derived by dividing vision (expressed as minimum angle of resolution) by refractive error (expressed as a scalar vector, U). Alteration of the scalar vector, to account for additional vision reduction due to oblique cylinder axes, was not found to be useful. Decision tree analysis showed that sex, contact lens wear, iris colour and cylinder axis orientation did not influence the BSR. The following algorithms arose from two stepwise multiple linear regressions:

BSR (myopes) = 1.13 + (0.24 × pupil diameter) + (0.14 × U)
BSR (hyperopes) = (0.11 × pupil diameter) + (0.03 × age) − 0.22

These algorithms together accounted for 84% of the observed variance. They showed that pupil diameter influenced vision in both forms of ametropia. They also showed the age-related decline in the ability to accommodate in order to overcome reduced vision in hyperopia.
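Since the BSR is vision (as minimum angle of resolution, MAR) divided by the scalar vector U, the fitted equations can be inverted to predict vision. A direct transcription of the two regressions above, with argument names assumed for illustration:

```python
def bsr(pupil_mm, age_years, u, myope):
    """Blur Sensitivity Ratio from the regression equations above."""
    if myope:
        return 1.13 + 0.24 * pupil_mm + 0.14 * u
    return 0.11 * pupil_mm + 0.03 * age_years - 0.22

def predicted_mar(pupil_mm, age_years, u, myope):
    """BSR = MAR / U, so predicted minimum angle of resolution = BSR * U."""
    return bsr(pupil_mm, age_years, u, myope) * u

# e.g. a 2.00 D myope (scalar vector U = 2.0) with a 4 mm pupil
print(predicted_mar(pupil_mm=4.0, age_years=30, u=2.0, myope=True))
```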
Abstract:
Accurate colour vision testing requires the correct illumination. With the plethora of 'daylight' lamps available, is there a cost-effective alternative to the discontinued MacBeth Easel lamp? Smoking is a known risk factor for macular degeneration. As the macula is responsible for colour discrimination, any toxin that affects it has the potential to influence colour discrimination. Aims: To find a cost-effective light source for colour vision testing. To investigate the effect of smoking on colour discrimination. To explore how deuteranomalous trichromats compare with normal trichromats. Methods: Using the Ishihara colour vision test, subjects were classified into the groups 'Normal/Control', 'Smoker/Test' and 'Case Study' (subjects who failed the screening test and did not smoke). They completed the Farnsworth Munsell 100 Hue test under each of three light sources: Philips EcoHalo Twist (tungsten halogen - THL), Kosnic KCF07ALU/GU10-865 (compact fluorescent - CFL) and Deal Guardian Ltd. GU103X2WA4B-60 (light-emitting diode - LED). Results: 42 subjects took part in the study: 18 in the Normal/Control group, 18 in the Smoker/Test group and 6 in the Case Study group. For the Normal/Control group the total error scores (TESs) were significantly lower with the CFL than with the THL (p = 0.017), as was the case for the Case Study group (p = 0.009). No significant differences were found between the Normal/Control group and the Smoker/Test group for any light source. Decision tree analysis found pack years to be a significant variable for TES. Discussion: Results with all three light sources were comparable with previous studies. The CFL provided better colour discrimination than the LED despite both being 6500 K. Deuteranomalous trichromats showed greater deviation than normal trichromats using the LED. Conclusions: The Kosnic KCF07ALU/GU10-865 is a cost-effective alternative for colour vision testing. Smoking appears to have an effect on colour vision, but this requires further investigation.
Abstract:
* The work is supported by RFBR, grant 04-01-00858-a
Abstract:
This paper discusses issues in Ukraine's new three-level pension system. First, it presents a mathematical model for calculating the optimal size of contributions to a non-state pension fund. Next, the non-state pension fund chooses an asset management company; for this step, an approach based on Kohonen networks is proposed to classify the asset management companies operating in the Ukrainian market. Once chosen, the asset management company receives the pension contributions of the fund's participants and has to invest them profitably. The paper proposes an approach for choosing the most profitable investment project using decision trees. The new pension system was ratified only four years ago and is still developing, which makes this work timely.
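Choosing among investment projects with a decision tree typically reduces to evaluating the expected monetary value at each chance node and taking the branch with the highest value. A toy sketch of that idea, with invented probabilities and returns rather than the paper's data:

```python
# Each project is a chance node: a list of (probability, annual return) branches.
# All figures are illustrative placeholders, not data from the paper.
projects = {
    "government bonds": [(1.00, 0.06)],
    "equity portfolio": [(0.60, 0.15), (0.40, -0.05)],
    "real estate":      [(0.50, 0.10), (0.50, 0.02)],
}

def expected_return(branches):
    assert abs(sum(p for p, _ in branches) - 1.0) < 1e-9  # probabilities sum to 1
    return sum(p * r for p, r in branches)

best = max(projects, key=lambda name: expected_return(projects[name]))
print(best, expected_return(projects[best]))   # equity portfolio 0.07
```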
Abstract:
Background - The aim was to derive equations for the relationship between unaided vision and age, pupil diameter, iris colour and sphero-cylindrical refractive error. Methods - Data were collected from 663 healthy right eyes of white subjects aged 20 to 70 years. Subjective sphero-cylindrical refractive errors ranged from -6.8 to +9.4 D (mean spherical equivalent), -1.5 to +1.9 D (orthogonal component, J0) and -0.8 to 1.0 D (oblique component, J45). Cylinder axis orientation was orthogonal in 46 per cent of the eyes and oblique in 18 per cent. Unaided vision (-0.3 to +1.3 logMAR), pupil diameter (2.3 to 7.5 mm) and iris colour (67 per cent light/blue irides) were recorded. The sample included mostly females (60 per cent) and many contact lens wearers (42 per cent), so the influences of these parameters were also investigated. Results - Decision tree analysis showed that sex, iris colour, contact lens wear and cylinder axis orientation did not influence the relationship between unaided vision and refractive error. New equations for the dependence of the minimum angle of resolution on age and pupil diameter arose from backward stepwise multiple linear regressions carried out separately on the myopes (2.91 × scalar vector + 0.51 × pupil diameter − 3.14) and hyperopes (1.55 × scalar vector + 0.06 × age − 3.45). Conclusion - The new equations may be useful in simulators designed for teaching purposes, as they accounted for 81 per cent (for myopes) and 53 per cent (for hyperopes) of the variance in measured data. In comparison, previously published equations accounted for no more than 76 per cent (for myopes) and 24 per cent (for hyperopes) of the variance, depending on whether they included pupil size. The new equations are, as far as is known to the authors, the first to include age. The age-related decline in accommodation is reflected in the equation for hyperopes.
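The two fitted regressions transcribe directly into code; the argument names are assumed, and the scalar vector is the refractive error in dioptres as above. A minimal check:

```python
def predicted_mar(scalar_vector, pupil_mm, age_years, myope):
    """Minimum angle of resolution from the regressions in the abstract."""
    if myope:
        return 2.91 * scalar_vector + 0.51 * pupil_mm - 3.14
    return 1.55 * scalar_vector + 0.06 * age_years - 3.45

# e.g. a 2.00 D myope with a 4 mm pupil
print(predicted_mar(scalar_vector=2.0, pupil_mm=4.0, age_years=30, myope=True))  # 4.72
```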
Abstract:
Background: Allergy is a form of hypersensitivity to normally innocuous substances, such as dust, pollen, foods or drugs. Allergens are small antigens that commonly provoke an IgE antibody response. There are two types of bioinformatics-based allergen prediction. The first approach follows the FAO/WHO Codex alimentarius guidelines and searches for sequence similarity. The second approach is based on identifying conserved allergenicity-related linear motifs. Both approaches assume that allergenicity is a linearly coded property. In the present study, we applied ACC pre-processing to sets of known allergens, developing alignment-independent models for allergen recognition based on the main chemical properties of amino acid sequences. Results: A set of 684 food, 1,156 inhalant and 555 toxin allergens was collected from several databases. A set of non-allergens from the same species was selected to mirror the allergen set. The amino acids in the protein sequences were described by three z-descriptors (z1, z2 and z3) and converted into uniform vectors by auto- and cross-covariance (ACC) transformation. Each protein was represented as a vector of 45 variables. Five machine learning methods for classification were applied to derive models for allergen prediction: discriminant analysis by partial least squares (DA-PLS), logistic regression (LR), decision tree (DT), naïve Bayes (NB) and k nearest neighbours (kNN). The best performing model was derived by kNN at k = 3. It was optimised, cross-validated and implemented in a server named AllerTOP, freely accessible at http://www.pharmfac.net/allertop. AllerTOP also predicts the most probable route of exposure, and outperforms other servers for allergen prediction, achieving 94% sensitivity. Conclusions: AllerTOP is the first alignment-free server for in silico prediction of allergens based on the main physicochemical properties of proteins. Significantly, as well as allergenicity, AllerTOP is able to predict the route of allergen exposure: food, inhalant or toxin.
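The ACC step maps a variable-length protein sequence onto a fixed-length vector: with three z-descriptors and lags 1 to 5, there are 3 × 3 × 5 = 45 auto- and cross-covariance terms, matching the 45 variables above. A sketch follows, using approximate Hellberg z-scale values for a few residues (treat the numbers as placeholders):

```python
import numpy as np

# approximate z1, z2, z3 values for four amino acids (placeholders)
Z = {"A": (0.07, -1.73, 0.09), "G": (2.23, -5.36, 0.30),
     "K": (2.84, 1.41, -3.14), "W": (-4.36, 3.94, 0.59)}

def acc_transform(seq, max_lag=5):
    """Auto- and cross-covariance transform of a z-descriptor sequence."""
    z = np.array([Z[a] for a in seq])            # shape (n_residues, 3)
    n = len(z)
    terms = []
    for j in range(3):                           # descriptor of residue i
        for k in range(3):                       # descriptor of residue i + lag
            for lag in range(1, max_lag + 1):
                terms.append((z[:n - lag, j] * z[lag:, k]).sum() / (n - lag))
    return np.array(terms)

vec = acc_transform("AGKWAGKWAGKW")
print(vec.shape)    # (45,) -- one fixed-length vector per protein
```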