897 results for Decision tree method
Abstract:
To deliver sample estimates with the probability foundation necessary to permit generalization from the sample data subset to the whole target population being sampled, probability sampling strategies are required to satisfy three necessary but not sufficient conditions: (i) all inclusion probabilities must be greater than zero in the target population to be sampled (if some sampling units have an inclusion probability of zero, then a map accuracy assessment does not represent the entire target region depicted in the map to be assessed); (ii) the inclusion probabilities must be (a) knowable for nonsampled units and (b) known for those units selected in the sample: since the inclusion probability determines the weight attached to each sampling unit in the accuracy estimation formulas, if the inclusion probabilities are unknown, so are the estimation weights. This original work presents a novel (to the best of these authors' knowledge, the first) probability sampling protocol for quality assessment and comparison of thematic maps generated from spaceborne/airborne Very High Resolution (VHR) images, where: (I) an original Categorical Variable Pair Similarity Index (CVPSI, proposed in two different formulations) is estimated as a fuzzy degree of match between a reference and a test semantic vocabulary, which may not coincide, and (II) both symbolic pixel-based thematic quality indicators (TQIs) and sub-symbolic object-based spatial quality indicators (SQIs) are estimated with a degree of uncertainty in measurement, in compliance with the well-known Quality Assurance Framework for Earth Observation (QA4EO) guidelines. Like a decision tree, any protocol (guidelines for best practice) comprises a set of rules, equivalent to structural knowledge, and an order of presentation of the rule set, known as procedural knowledge. The combination of these two levels of knowledge makes an original protocol worth more than the sum of its parts. The several degrees of novelty of the proposed probability sampling protocol are highlighted in this paper, at the levels of understanding of both structural and procedural knowledge, in comparison with related multi-disciplinary works selected from the existing literature. In the experimental session the proposed protocol is tested for accuracy validation of preliminary classification maps automatically generated by the Satellite Image Automatic Mapper™ (SIAM™) software product from two WorldView-2 images and one QuickBird-2 image provided by DigitalGlobe for testing purposes. In these experiments, collected TQIs and SQIs are statistically valid, statistically significant, consistent across maps and in agreement with theoretical expectations, visual (qualitative) evidence and quantitative quality indexes of operativeness (OQIs) claimed for SIAM™ by related papers. As a subsidiary conclusion, the statistically consistent and statistically significant accuracy validation of the SIAM™ pre-classification maps proposed in this contribution, together with the OQIs claimed for SIAM™ by related works, makes the operational (automatic, accurate, near real-time, robust, scalable) SIAM™ software product eligible for opening up new inter-disciplinary research and market opportunities, in accordance with the visionary goal of the Global Earth Observation System of Systems (GEOSS) initiative and the QA4EO international guidelines.
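Condition (ii) is what makes design-based estimation possible: the inverse of each unit's inclusion probability becomes its estimation weight. A minimal Python sketch of this Horvitz-Thompson-style weighting, with purely hypothetical numbers:

```python
import numpy as np

# Hypothetical sample: for each sampled map unit, its inclusion probability
# (known by design, condition ii-b) and whether map label == reference label.
inclusion_prob = np.array([0.10, 0.10, 0.02, 0.05, 0.05])   # all > 0 (condition i)
correct        = np.array([1,    1,    0,    1,    0   ])   # 1 = map agrees with reference

# The estimation weight of each unit is the inverse of its inclusion
# probability, so low-probability units stand for many unsampled units.
weights = 1.0 / inclusion_prob

# Horvitz-Thompson-style estimate of overall map accuracy.
overall_accuracy = np.sum(weights * correct) / np.sum(weights)
print(f"estimated overall accuracy: {overall_accuracy:.3f}")
```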
Abstract:
This doctoral thesis falls within the field of membrane computing, a bio-inspired model of computation based on the cells of living organisms, in which many reactions take place simultaneously. From the structure and operation of cells, different formal models, called P systems, have been defined. These models do not attempt to reproduce the biological behavior of a cell; rather, they abstract its basic principles in order to find new computational paradigms. P systems are non-deterministic, massively parallel models of computation, which explains the interest they have attracted in recent years for solving complex problems: in many cases they can, in theory, solve NP-complete problems in polynomial or linear time. Membrane computing has also been applied successfully in many other research fields, especially those related to biology. A large number of these computational models have by now been studied from a theoretical point of view; how they can be implemented, however, is still an open research challenge. Several lines of work exist in this direction, based on distributed architectures or on dedicated hardware, which try to approximate their non-deterministic and massively parallel character as closely as possible within a context of viability and efficiency. This thesis proposes a static analysis of the P system as a way to optimize its execution on such platforms: the information collected at analysis time is used to configure the platform on which the P system will subsequently run, improving performance as a result. Specifically, Transition P systems are taken as the reference model for this static analysis. More concretely, the static analysis proposed here aims to let each membrane determine its active rules efficiently at every evolution step, i.e., the rules that meet the conditions required to be applied. Along this line, the thesis addresses the problem of the usefulness states of a given membrane, which at run time allow it to know at every moment the membranes with which it can communicate, a question that determines which rules can be applied at any given time. The static analysis also draws on other features of the P system, such as the membrane structure, rule antecedents, rule consequents and priorities among rules. Once all this information has been obtained at analysis time, it is organized as a decision tree, so that at run time the membrane can obtain its active rules as efficiently as possible. The thesis also surveys a significant number of hardware and software architectures that different authors have proposed for implementing P systems: essentially distributed architectures, dedicated hardware based on FPGA boards, and platforms based on PIC microcontrollers. The aim is to propose solutions for deploying the results of the static analysis (usefulness states and decision trees for active rules) on these architectures.
In general, the conclusions are positive, in the sense that these optimizations integrate well into the architectures without significant penalties.
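As a rough illustration of the idea behind usefulness states and precomputed rule lookup (a minimal sketch, not the thesis's actual algorithm), with a hypothetical membrane and rule set:

```python
from itertools import product

# Hypothetical Transition P system fragment: each rule of a membrane has an
# antecedent (multiset of objects) and a set of target membranes it needs.
rules = {
    "r1": {"antecedent": {"a": 2},         "targets": set()},       # internal rewrite
    "r2": {"antecedent": {"a": 1, "b": 1}, "targets": {"child1"}},  # sends objects inward
    "r3": {"antecedent": {"b": 1},         "targets": {"child2"}},
}
children = ["child1", "child2"]

# Static analysis: for every usefulness state (which children are still present
# and reachable), precompute the rules whose target membranes are all available.
# This table plays the role of the decision structure consulted at run time.
active_by_state = {}
for state in product([False, True], repeat=len(children)):
    present = {c for c, alive in zip(children, state) if alive}
    active_by_state[state] = [
        name for name, r in rules.items() if r["targets"] <= present
    ]

def active_rules(state, multiset):
    """Run-time step: look up the state, then filter by antecedent availability."""
    candidates = active_by_state[state]
    return [name for name in candidates
            if all(multiset.get(obj, 0) >= n
                   for obj, n in rules[name]["antecedent"].items())]

# Example evolution step: child2 dissolved, membrane holds {a: 2, b: 1}.
print(active_rules((True, False), {"a": 2, "b": 1}))  # -> ['r1', 'r2']
```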
Abstract:
Acquired brain injury (ABI) is one of the leading causes of death and disability in the world and is associated with high health care costs as a result of the acute treatment and long-term rehabilitation involved. Different algorithms and methods have been proposed to predict the effectiveness of rehabilitation programs. In general, research has focused on predicting the overall improvement of patients with ABI. The purpose of this study is the novel application of data mining (DM) techniques to predict the outcomes of cognitive rehabilitation in patients with ABI. We generate three predictive models that allow us to obtain new knowledge to evaluate and improve the effectiveness of the cognitive rehabilitation process. A decision tree (DT), a multilayer perceptron (MLP) and a general regression neural network (GRNN) were used to construct the prediction models. 10-fold cross validation was carried out to test the algorithms, using the Institut Guttmann Neurorehabilitation Hospital (IG) patient database. Performance of the models was tested through specificity, sensitivity and accuracy analysis and confusion matrix analysis. The experimental results obtained by the DT are clearly superior, with an average prediction accuracy of 90.38%, while the MLP and GRNN obtained 78.7% and 75.96%, respectively. This study increases knowledge about the factors contributing to the recovery of ABI patients and allows treatment efficacy to be estimated in individual patients.
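A minimal sketch of this kind of comparison in Python with scikit-learn, assuming a synthetic stand-in for the IG database (scikit-learn provides no GRNN, so only the DT and MLP baselines are shown):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_predict, StratifiedKFold
from sklearn.metrics import accuracy_score, confusion_matrix

# Synthetic stand-in for the patient table: X holds pre-treatment features,
# y the binary rehabilitation outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
for name, model in [("DT", DecisionTreeClassifier(random_state=0)),
                    ("MLP", MLPClassifier(max_iter=2000, random_state=0))]:
    pred = cross_val_predict(model, X, y, cv=cv)   # 10-fold cross validation
    print(name, f"accuracy={accuracy_score(y, pred):.3f}")
    print(confusion_matrix(y, pred))
```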
Abstract:
Ubiquitous computing software needs to be autonomous so that essential decisions, such as how to configure its particular execution, are self-determined. Moreover, data mining serves an important role for ubiquitous computing by providing intelligence to several types of ubiquitous computing applications. Thus, automating ubiquitous data mining is also crucial. We focus on the problem of automatically configuring the execution of a ubiquitous data mining algorithm. In our solution, we generate configuration decisions in a resource-aware and context-aware manner, since the algorithm executes in an environment in which the context often changes and computing resources are often severely limited. We propose to analyze the execution behavior of the data mining algorithm by mining its past executions. By doing so, we discover the effects of resource and context states as well as parameter settings on the data mining quality. We argue that a classification model is appropriate for predicting the behavior of an algorithm's execution, and we concentrate on the decision tree classifier. We also define a taxonomy of data mining quality so that, for each behavior model classifying by a different abstraction of quality, the trade-off between prediction accuracy and classification specificity is scored for model selection. Behavior model constituents and class label transformations are formally defined, and experimental validation of the proposed approach is also performed.
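A minimal sketch of such a behavior model in Python, assuming a hypothetical execution log with resource, context and parameter columns:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical log of past executions of the ubiquitous mining algorithm:
# resource state, context state and a parameter setting, with the observed
# quality discretized into class labels (the behavior model's training set).
log = pd.DataFrame({
    "free_memory_mb": [512, 64, 256, 32, 1024, 128],
    "battery_pct":    [90,  15, 60,  10, 80,   40],
    "context_mobile": [0,   1,  0,   1,  0,    1],    # 1 = device in motion
    "window_size":    [500, 50, 200, 50, 1000, 100],  # algorithm parameter
    "quality_class":  ["high", "low", "medium", "low", "high", "medium"],
})

X = log.drop(columns="quality_class")
y = log["quality_class"]
behavior_model = DecisionTreeClassifier(max_depth=3).fit(X, y)

# Configuration decision: predict the quality each candidate parameter value
# would yield under the current resource/context state, then pick the best.
current = {"free_memory_mb": 200, "battery_pct": 50, "context_mobile": 1}
for window in (50, 200, 1000):
    row = pd.DataFrame([{**current, "window_size": window}])
    print(window, behavior_model.predict(row)[0])
```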
Abstract:
Diabetes is nowadays one of the most common diseases, affecting all populations and all age groups. Diabetes contributes to heart disease and increases the risk of developing kidney disease, blindness, nerve damage and blood vessel damage. Diabetes diagnosis via proper interpretation of diabetes data is an important classification problem. Different artificial intelligence techniques have been applied to the diabetes problem. The purpose of this study is to apply artificial metaplasticity on a multilayer perceptron (AMMLP) as a data mining (DM) technique for diabetes diagnosis. The Pima Indians diabetes dataset was used to test the proposed AMMLP model. The results obtained by AMMLP were compared with a decision tree (DT), a Bayesian classifier (BC) and other algorithms recently proposed by other researchers and applied to the same database. The robustness of the algorithms is examined using classification accuracy, sensitivity and specificity analysis, and the confusion matrix. The results obtained by AMMLP are superior to those obtained by DT and BC.
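A minimal scikit-learn sketch of this comparison, assuming network access to the OpenML copy of the Pima Indians data and using a plain MLP as a stand-in for the custom AMMLP algorithm:

```python
from sklearn.datasets import fetch_openml
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix

# Assumption: the OpenML dataset named "diabetes" is the Pima Indians copy.
data = fetch_openml(name="diabetes", version=1, as_frame=True)
X, y = data.data, (data.target == "tested_positive").astype(int)

for name, model in [("MLP (AMMLP stand-in)", MLPClassifier(max_iter=2000, random_state=0)),
                    ("DT", DecisionTreeClassifier(random_state=0)),
                    ("BC", GaussianNB())]:
    pred = cross_val_predict(model, X, y, cv=10)
    tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    print(f"{name}: acc={accuracy:.3f} sens={sensitivity:.3f} spec={specificity:.3f}")
```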
Abstract:
The main objective is to develop the real options methodology to evaluate the possible start-up of a mining project. The work is divided into two parts. In the first, theoretical part, investments are analyzed from the traditional point of view, examining the problems such valuations face under uncertainty and operational flexibility. Financial options are analyzed and compared with real options, in terms of both their similarities and their difficulties. The stochastic processes affecting the variables of the investment project are also developed, and the methodologies for valuing real options, including the calculation of their volatility, are explained. In the second part, the Corcoesto gold deposit is studied: a business plan is simulated according to the characteristics required for the exploitation, where revenues are modelled by a geometric Brownian motion simulating the behavior of the price of an ounce of gold. A binomial tree approach is chosen to estimate the future value of the project, and an interval of option prices for acquiring the mining project is established. This interval is determined by the project uncertainties calculated following the methodologies of Copeland and Antikarov, and Herath and Park.
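A minimal Python sketch of the two quantitative ingredients described above, geometric Brownian motion for the gold price and a Cox-Ross-Rubinstein binomial tree for the acquisition option, with illustrative figures rather than the Corcoesto business-plan inputs:

```python
import numpy as np

mu, sigma, S0, T, steps = 0.03, 0.20, 1800.0, 5.0, 60   # hypothetical GBM inputs
dt = T / steps

# 1) Geometric Brownian motion path for the gold price (the revenue driver).
rng = np.random.default_rng(1)
shocks = rng.normal((mu - 0.5 * sigma**2) * dt, sigma * np.sqrt(dt), steps)
price_path = S0 * np.exp(np.cumsum(shocks))
print(f"simulated gold price after {T:.0f} years: {price_path[-1]:.0f}")

# 2) Cox-Ross-Rubinstein binomial tree for the option to acquire the project:
# V0 is today's project value, K the acquisition cost, r the risk-free rate.
def binomial_option(V0, K, r, sigma, T, n):
    dt = T / n
    u, d = np.exp(sigma * np.sqrt(dt)), np.exp(-sigma * np.sqrt(dt))
    p = (np.exp(r * dt) - d) / (u - d)            # risk-neutral probability
    values = np.maximum(V0 * u**np.arange(n, -1, -1) * d**np.arange(0, n + 1) - K, 0)
    for _ in range(n):                            # backward induction
        values = np.exp(-r * dt) * (p * values[:-1] + (1 - p) * values[1:])
    return values[0]

print(f"option value: {binomial_option(100.0, 90.0, 0.03, 0.35, 5.0, 200):.2f}")
```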
Abstract:
Objective The main purpose of this research is the novel use of artificial metaplasticity on a multilayer perceptron (AMMLP) as a data mining tool for predicting the outcome of patients with acquired brain injury (ABI) after cognitive rehabilitation. The final goal is to increase knowledge in the field of rehabilitation theory based on cognitive affectation. Methods and materials The data set used in this study contains records belonging to 123 ABI patients with moderate to severe cognitive affectation (according to the Glasgow Coma Scale) who underwent rehabilitation at the Institut Guttmann Neurorehabilitation Hospital (IG) using the tele-rehabilitation platform PREVIRNEC©. The variables included in the analysis comprise the initial neuropsychological evaluation of the patient (cognitive affectation profile), the results of the rehabilitation tasks performed by the patient in PREVIRNEC© and the outcome of the patient after a 3–5 month treatment. To achieve the treatment outcome prediction, we apply and compare three different data mining techniques: the AMMLP model, a backpropagation neural network (BPNN) and a C4.5 decision tree. Results The prediction performance of the models was measured by ten-fold cross validation and several architectures were tested. The results obtained by the AMMLP model are clearly superior, with an average predictive performance of 91.56%; the BPNN and C4.5 models have average prediction accuracies of 80.18% and 89.91%, respectively. The best single AMMLP model provided a specificity of 92.38%, a sensitivity of 91.76% and a prediction accuracy of 92.07%. Conclusions The prediction model presented in this study allows us to increase knowledge about the factors contributing to the recovery of ABI patients and to estimate treatment efficacy in individual patients. The ability to predict treatment outcomes may provide new insights toward improving effectiveness and creating personalized therapeutic interventions based on clinical evidence.
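For the C4.5 baseline, scikit-learn offers no direct implementation; a CART tree with the entropy criterion is the usual stand-in, sketched here on synthetic placeholders for the 123-patient feature table:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholders for the neuropsychological features and the binary
# treatment outcome (123 rows, matching the cohort size only by convention).
rng = np.random.default_rng(0)
X = rng.normal(size=(123, 15))
y = (X[:, 0] > 0).astype(int)

# CART with entropy splits approximates C4.5's information-gain criterion.
c45_like = DecisionTreeClassifier(criterion="entropy", min_samples_leaf=5)
scores = cross_val_score(c45_like, X, y, cv=10)   # ten-fold cross validation
print(f"10-fold accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```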
Abstract:
The emergence of new horizons in the field of travel assistance management leads to the development of cutting-edge systems focused on improving existing ones. New opportunities are also arising as systems tend to become more reliable and autonomous. In this paper, a self-learning embedded system for object identification based on adaptive-cooperative dynamic approaches is presented for intelligent sensor infrastructures. The proposed system is able to detect and identify moving objects using a dynamic decision tree, combining machine learning algorithms and cooperative strategies to make the system more adaptive to changing environments. The proposed system may therefore be very useful for many applications, such as shadow tolls (since several types of vehicles can be distinguished), parking optimization systems, improved traffic-condition systems, etc.
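One simple way to realize a "dynamic" decision tree, sketched below under the assumption that cooperating sensors occasionally supply confirmed labels, is to re-fit a CART classifier on a sliding window of recent observations (window size and features are hypothetical):

```python
from collections import deque
from sklearn.tree import DecisionTreeClassifier

WINDOW, REFIT_EVERY = 500, 50
buffer = deque(maxlen=WINDOW)          # bounded memory of labelled detections
tree, seen = DecisionTreeClassifier(max_depth=6), 0

def observe(features, vehicle_class):
    """Called whenever a cooperating sensor confirms an object's true class."""
    global tree, seen
    buffer.append((features, vehicle_class))
    seen += 1
    if seen % REFIT_EVERY == 0 and len(buffer) > 10:
        X, y = zip(*buffer)            # re-fit on the recent window only,
        tree = DecisionTreeClassifier(max_depth=6).fit(list(X), list(y))

def identify(features):
    """Classify a moving object (e.g. car vs truck for shadow-toll billing)."""
    if not hasattr(tree, "tree_"):     # not fitted yet
        return None
    return tree.predict([features])[0]

# Demo with hypothetical length/axle-style features:
for i in range(60):
    observe([float(i % 7), float(i % 3)], "car" if i % 2 else "truck")
print(identify([3.0, 1.0]))
```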
Abstract:
Three-phase induction motors are the main elements converting electrical energy into mechanical power in many production sectors. Identifying a defect in an operating motor, before it fails, can provide greater safety in decision-making about machine maintenance, reduce costs and increase availability. This thesis first presents a literature review and the general methodology for reproducing motor defects and for applying a discretization technique to the current and voltage signals in the time domain. A comparative study is then carried out between pattern classification methods for identifying defects in these machines, namely: Naive Bayes, k-Nearest Neighbor, Support Vector Machine (Sequential Minimal Optimization), Artificial Neural Network (Multilayer Perceptron), Repeated Incremental Pruning to Produce Error Reduction, and the C4.5 Decision Tree. The concept of Multi-Agent Systems (MAS) was also applied to support the distributed use of multiple concurrent methods for recognizing defect patterns in faulty bearings, broken rotor squirrel-cage bars and short circuits between stator winding coils of three-phase induction motors. In addition, some strategies for defining the severity of the above defects were explored, including an investigation of the influence of voltage unbalance in the machine supply on the determination of these anomalies. The experimental data were acquired on a laboratory test bench with 1 and 2 hp motors connected directly to the mains, operating under several voltage unbalance conditions and variations of the mechanical load applied to the motor shaft.
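A minimal scikit-learn sketch of such a classifier comparison, on synthetic stand-ins for the discretized current/voltage features (RIPPER has no scikit-learn equivalent, and C4.5 is approximated by an entropy-criterion CART):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholders for time-domain signal features and fault labels
# (healthy / bearing / rotor bar / stator short circuit).
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 20))
y = rng.integers(0, 4, size=300)

classifiers = {
    "Naive Bayes": GaussianNB(),
    "k-NN": KNeighborsClassifier(),
    "SVM (SMO-like)": SVC(),
    "MLP": MLPClassifier(max_iter=2000, random_state=0),
    "C4.5-like tree": DecisionTreeClassifier(criterion="entropy"),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=10)
    print(f"{name}: {scores.mean():.3f}")
```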
Abstract:
In progressive companies, succession planning plays a vital role in leadership development; some companies have neither succession planning nor ways to fill leadership gaps. On the academic side, this Capstone Project focuses on the role of succession planning in leadership development through various approaches and theories. Its purpose is to provide HR professionals with solid mechanisms for identifying and preparing leaders for leadership roles through proper succession mapping within a company. Through the analysis of case studies and research, recommendations are made on the formulation of a decision tree so that succession planning may take a new place in leadership development in an organization.
Abstract:
Paleoceanographic archives derived from 17 marine sediment cores reconstruct the response of the Southwest Pacific Ocean to the peak interglacial, Marine Isotope Stage (MIS) 5e (ca. 125 ka). Paleo-sea surface temperature (SST) estimates were obtained from the Random Forest model (an ensemble decision tree tool) applied to core-top planktonic foraminiferal faunas calibrated to modern SSTs. The reconstructed geographic pattern of the SST anomaly (maximum SST between 120 and 132 ka minus mean modern SST) indicates that MIS 5e conditions were generally warmer in the Southwest Pacific, especially in the western Tasman Sea, where a strengthened East Australian Current (EAC) likely extended subtropical influence to ca. 45°S off Tasmania. In contrast, the eastern Tasman Sea may have experienced modest cooling except around 45°S. The observed pattern resembles that developing under the present warming trend in the region. An increase in wind stress curl over the modern South Pacific is hypothesized to have spun up the South Pacific Subtropical Gyre, with a concurrent increase in subtropical flow in the western boundary currents, which include the EAC. However, warmer temperatures along the Subtropical Front and Campbell Plateau to the south suggest that the relative influence of the boundary inflows to eastern New Zealand may have differed in MIS 5e, and these currents may have followed different paths compared to today.
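A minimal sketch of the transfer-function step in Python with scikit-learn, using synthetic placeholders for the core-top calibration set and the fossil assemblages:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic core-top calibration set: relative species abundances per sample
# paired with modern SSTs (all arrays here are placeholders, not real data).
rng = np.random.default_rng(3)
n_coretops, n_species = 150, 30
modern_fauna = rng.dirichlet(np.ones(n_species), size=n_coretops)
modern_sst = 5 + 20 * modern_fauna[:, 0] + rng.normal(0, 0.5, n_coretops)

model = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
model.fit(modern_fauna, modern_sst)
print(f"out-of-bag R^2 of the calibration: {model.oob_score_:.2f}")

# Apply the calibrated model to fossil samples from the MIS 5e interval,
# then form the anomaly against the modern mean.
fossil_fauna = rng.dirichlet(np.ones(n_species), size=17)
paleo_sst = model.predict(fossil_fauna)
sst_anomaly = paleo_sst.max() - modern_sst.mean()   # per-core in the real study
print(f"MIS 5e SST anomaly (max minus modern mean): {sst_anomaly:.1f} °C")
```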
Abstract:
Collaborative Filtering is one of the most popular recommendation algorithms. Most Collaborative Filtering algorithms work with a static set of data. This paper introduces a novel approach to providing recommendations using Collaborative Filtering when user ratings are received over an incoming data stream. In an incoming stream, massive amounts of data arrive rapidly, making it impossible to save all the records for later analysis. By dynamically building a decision tree for every item as data arrive, the incoming data stream is used effectively, although an inevitable trade-off between accuracy and the amount of memory used is introduced. By adding a simple personalization step using a hierarchy of the items, it is possible to improve the ratings predicted by each decision tree and generate recommendations in real time. Empirical studies with the dynamically built decision trees show that the personalization step improves the overall prediction accuracy.
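A minimal Python sketch of the per-item scheme: a plain CART re-fit on a fixed-size buffer stands in for a true incremental tree, which keeps the memory/accuracy trade-off explicit (buffer size, features and the hierarchy fallback are all hypothetical):

```python
from collections import defaultdict, deque
from sklearn.tree import DecisionTreeRegressor

BUFFER = 1000                                    # memory cap per item
buffers = defaultdict(lambda: deque(maxlen=BUFFER))
trees = {}

def on_rating(item_id, user_features, rating):
    """Consume one (user, item, rating) event from the stream."""
    buf = buffers[item_id]
    buf.append((user_features, rating))
    if len(buf) >= 20:                           # re-fit once enough data arrived
        X, y = zip(*buf)
        trees[item_id] = DecisionTreeRegressor(max_depth=8).fit(list(X), list(y))

def predict(item_id, user_features, item_parent=None):
    """Predict a rating; fall back to the item's hierarchy parent if unseen."""
    if item_id in trees:
        return trees[item_id].predict([user_features])[0]
    if item_parent is not None and item_parent in trees:
        return trees[item_parent].predict([user_features])[0]
    return None
```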
Abstract:
Machine learning techniques are used for prediction and for rule extraction from artificial neural network methods. The hypothesis that market sentiment and IPO-specific attributes are equally responsible for first-day IPO returns in the US stock market is tested. The machine learning methods used are Bayesian classification, support vector machines, decision tree techniques, rule learners and artificial neural networks. The outcomes of the research are predictions and rules associated with the first-day returns of technology IPOs. The hypothesis that first-day returns of technology IPOs are equally determined by IPO-specific and market sentiment attributes is rejected. Instead, lower-yielding IPOs are determined by both IPO-specific and market sentiment attributes, while higher-yielding IPOs are largely dependent on IPO-specific attributes.
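A minimal sketch of the rule-extraction step with scikit-learn, on hypothetical IPO-specific and market sentiment attributes:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical attributes: two IPO-specific, two market-sentiment features;
# the target marks high vs low first-day return (synthetic relationship).
rng = np.random.default_rng(4)
features = ["offer_size", "underwriter_rank", "nasdaq_30d_return", "news_tone"]
X = rng.normal(size=(400, len(features)))
y = (X[:, 0] + X[:, 2] > 0).astype(int)          # 1 = high first-day return

# Fit a shallow tree and read the learned decision rules off its branches.
clf = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(export_text(clf, feature_names=features))
```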
Abstract:
Risks and uncertainties are part and parcel of any project, as projects are planned with many assumptions; managing those risks is therefore the key to project success. Although risk is present in almost all projects, large-scale construction projects are the most vulnerable. Risk is by nature subjective, but managing risk subjectively poses the danger of not achieving project goals. This study introduces an analytical framework for managing risk in projects. All the risk factors are identified, their effects are analyzed, and alternative responses are derived, with cost implications, for mitigating the identified risks. A decision-making framework is then formulated using a decision tree. The expected monetary values are derived for each alternative, and the response requiring the least cost is selected. The entire methodology is explained through a case study of an oil pipeline project in India, and its effectiveness in managing projects is demonstrated.
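A minimal Python sketch of the expected-monetary-value comparison over response alternatives (probabilities and costs are illustrative, not taken from the pipeline case study):

```python
# Each response alternative is one branch of the decision tree, with chance
# nodes listed as (probability, cost) outcome pairs.
alternatives = {
    "reroute pipeline": [(0.9, 1.0e6), (0.1, 4.0e6)],
    "extra insurance":  [(0.7, 1.5e6), (0.3, 2.5e6)],
    "accept risk":      [(0.5, 0.2e6), (0.5, 8.0e6)],
}

def emv(outcomes):
    """Expected monetary value of one branch of the decision tree."""
    return sum(p * cost for p, cost in outcomes)

for name, outcomes in alternatives.items():
    print(f"{name}: EMV = {emv(outcomes):,.0f}")

# Select the response with the least expected cost.
best = min(alternatives, key=lambda a: emv(alternatives[a]))
print("selected response:", best)
```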
Abstract:
This work examines prosody modelling for the Standard Yorùbá (SY) language in the context of computer text-to-speech synthesis applications. The thesis of this research is that it is possible to develop a practical prosody model by using appropriate computational tools and techniques that combine acoustic data with an encoding of the phonological and phonetic knowledge provided by experts. Our prosody model is conceptualised around a modular, holistic framework. The framework is implemented using the Relational Tree (R-Tree) technique (Ehrich and Foith, 1976). The R-Tree is a sophisticated data structure that provides a multi-dimensional description of a waveform. A Skeletal Tree (S-Tree) is first generated using algorithms based on the tone phonological rules of SY. Subsequent steps update the S-Tree by computing the numerical values of the prosody dimensions. To implement the intonation dimension, fuzzy control rules were developed based on data from native speakers of Yorùbá. The Classification And Regression Tree (CART) and Fuzzy Decision Tree (FDT) techniques were tested for modelling the duration dimension; the FDT was selected on the basis of its better performance. An important feature of our R-Tree framework is its flexibility: it facilitates the independent implementation of the different dimensions of prosody, i.e. duration and intonation, using different techniques, and their subsequent integration. Our approach provides us with a flexible and extensible model that can also be used to implement, study and explain the theory behind aspects of the phenomena observed in speech prosody.
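A minimal sketch of the duration dimension with a plain CART regressor (fuzzy decision trees have no standard library implementation), on hypothetical phone-level features:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical features: tone class (0=low, 1=mid, 2=high), syllable position,
# and phrase length; the target is the segment duration in milliseconds.
rng = np.random.default_rng(5)
X = np.column_stack([rng.integers(0, 3, 500),
                     rng.integers(0, 6, 500),
                     rng.integers(2, 12, 500)])
duration_ms = 80 + 10 * X[:, 0] + 4 * X[:, 2] + rng.normal(0, 5, 500)

# CART learns piecewise-constant duration predictions over feature splits;
# the predicted value would then populate the duration dimension of the R-Tree.
cart = DecisionTreeRegressor(min_samples_leaf=20).fit(X, duration_ms)
print(cart.predict([[2, 1, 6]]))   # predicted duration for one phone context
```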