7 resultados para MACHINE LEARNING CLASSIFIERS
em Universidade Federal do Rio Grande do Norte(UFRN)
Resumo:
Equipment maintenance is the major cost factor in industrial plants, it is very important the development of fault predict techniques. Three-phase induction motors are key electrical equipments used in industrial applications mainly because presents low cost and large robustness, however, it isn t protected from other fault types such as shorted winding and broken bars. Several acquisition ways, processing and signal analysis are applied to improve its diagnosis. More efficient techniques use current sensors and its signature analysis. In this dissertation, starting of these sensors, it is to make signal analysis through Park s vector that provides a good visualization capability. Faults data acquisition is an arduous task; in this way, it is developed a methodology for data base construction. Park s transformer is applied into stationary reference for machine modeling of the machine s differential equations solution. Faults detection needs a detailed analysis of variables and its influences that becomes the diagnosis more complex. The tasks of pattern recognition allow that systems are automatically generated, based in patterns and data concepts, in the majority cases undetectable for specialists, helping decision tasks. Classifiers algorithms with diverse learning paradigms: k-Neighborhood, Neural Networks, Decision Trees and Naïves Bayes are used to patterns recognition of machines faults. Multi-classifier systems are used to improve classification errors. It inspected the algorithms homogeneous: Bagging and Boosting and heterogeneous: Vote, Stacking and Stacking C. Results present the effectiveness of constructed model to faults modeling, such as the possibility of using multi-classifiers algorithm on faults classification
Resumo:
Nowadays, classifying proteins in structural classes, which concerns the inference of patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason for this is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to be obtained experimentally in laboratory. Thus, this problem has drawn the attention of many researchers in Bioinformatics. Considering the great difference between the number of protein sequences already known and the number of three-dimensional structures determined experimentally, the demand of automated techniques for structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential to deal with this problem. In this work, ML techniques are used in the recognition of protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods have been chosen because they represent different paradigms of learning and have been widely used in the Bioinfornmatics literature. Aiming to obtain an improvment in the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multiclassification systems are used. Moreover, since the protein database used in this work presents the problem of imbalanced classes, artificial techniques for class balance (Undersampling Random, Tomek Links, CNN, NCL and OSS) are used to minimize such a problem. In order to evaluate the ML methods, a cross-validation procedure is applied, where the accuracy of the classifiers is measured using the mean of classification error rate, on independent test sets. These means are compared, two by two, by the hypothesis test aiming to evaluate if there is, statistically, a significant difference between them. With respect to the results obtained with the individual classifiers, Support Vector Machine presented the best accuracy. In terms of the multi-classification systems (homogeneous and heterogeneous), they showed, in general, a superior or similar performance when compared to the one achieved by the individual classifiers used - especially Boosting with Decision Tree and the StackingC with Linear Regression as meta classifier. The Voting method, despite of its simplicity, has shown to be adequate for solving the problem presented in this work. The techniques for class balance, on the other hand, have not produced a significant improvement in the global classification error. Nevertheless, the use of such techniques did improve the classification error for the minority class. In this context, the NCL technique has shown to be more appropriated
Resumo:
The pattern classification is one of the machine learning subareas that has the most outstanding. Among the various approaches to solve pattern classification problems, the Support Vector Machines (SVM) receive great emphasis, due to its ease of use and good generalization performance. The Least Squares formulation of SVM (LS-SVM) finds the solution by solving a set of linear equations instead of quadratic programming implemented in SVM. The LS-SVMs provide some free parameters that have to be correctly chosen to achieve satisfactory results in a given task. Despite the LS-SVMs having high performance, lots of tools have been developed to improve them, mainly the development of new classifying methods and the employment of ensembles, in other words, a combination of several classifiers. In this work, our proposal is to use an ensemble and a Genetic Algorithm (GA), search algorithm based on the evolution of species, to enhance the LSSVM classification. In the construction of this ensemble, we use a random selection of attributes of the original problem, which it splits the original problem into smaller ones where each classifier will act. So, we apply a genetic algorithm to find effective values of the LS-SVM parameters and also to find a weight vector, measuring the importance of each machine in the final classification. Finally, the final classification is obtained by a linear combination of the decision values of the LS-SVMs with the weight vector. We used several classification problems, taken as benchmarks to evaluate the performance of the algorithm and compared the results with other classifiers
Resumo:
Traditional applications of feature selection in areas such as data mining, machine learning and pattern recognition aim to improve the accuracy and to reduce the computational cost of the model. It is done through the removal of redundant, irrelevant or noisy data, finding a representative subset of data that reduces its dimensionality without loss of performance. With the development of research in ensemble of classifiers and the verification that this type of model has better performance than the individual models, if the base classifiers are diverse, comes a new field of application to the research of feature selection. In this new field, it is desired to find diverse subsets of features for the construction of base classifiers for the ensemble systems. This work proposes an approach that maximizes the diversity of the ensembles by selecting subsets of features using a model independent of the learning algorithm and with low computational cost. This is done using bio-inspired metaheuristics with evaluation filter-based criteria
Resumo:
Although some individual techniques of supervised Machine Learning (ML), also known as classifiers, or algorithms of classification, to supply solutions that, most of the time, are considered efficient, have experimental results gotten with the use of large sets of pattern and/or that they have a expressive amount of irrelevant data or incomplete characteristic, that show a decrease in the efficiency of the precision of these techniques. In other words, such techniques can t do an recognition of patterns of an efficient form in complex problems. With the intention to get better performance and efficiency of these ML techniques, were thought about the idea to using some types of LM algorithms work jointly, thus origin to the term Multi-Classifier System (MCS). The MCS s presents, as component, different of LM algorithms, called of base classifiers, and realized a combination of results gotten for these algorithms to reach the final result. So that the MCS has a better performance that the base classifiers, the results gotten for each base classifier must present an certain diversity, in other words, a difference between the results gotten for each classifier that compose the system. It can be said that it does not make signification to have MCS s whose base classifiers have identical answers to the sames patterns. Although the MCS s present better results that the individually systems, has always the search to improve the results gotten for this type of system. Aim at this improvement and a better consistency in the results, as well as a larger diversity of the classifiers of a MCS, comes being recently searched methodologies that present as characteristic the use of weights, or confidence values. These weights can describe the importance that certain classifier supplied when associating with each pattern to a determined class. These weights still are used, in associate with the exits of the classifiers, during the process of recognition (use) of the MCS s. Exist different ways of calculating these weights and can be divided in two categories: the static weights and the dynamic weights. The first category of weights is characterizes for not having the modification of its values during the classification process, different it occurs with the second category, where the values suffers modifications during the classification process. In this work an analysis will be made to verify if the use of the weights, statics as much as dynamics, they can increase the perfomance of the MCS s in comparison with the individually systems. Moreover, will be made an analysis in the diversity gotten for the MCS s, for this mode verify if it has some relation between the use of the weights in the MCS s with different levels of diversity
Resumo:
The objective of the researches in artificial intelligence is to qualify the computer to execute functions that are performed by humans using knowledge and reasoning. This work was developed in the area of machine learning, that it s the study branch of artificial intelligence, being related to the project and development of algorithms and techniques capable to allow the computational learning. The objective of this work is analyzing a feature selection method for ensemble systems. The proposed method is inserted into the filter approach of feature selection method, it s using the variance and Spearman correlation to rank the feature and using the reward and punishment strategies to measure the feature importance for the identification of the classes. For each ensemble, several different configuration were used, which varied from hybrid (homogeneous) to non-hybrid (heterogeneous) structures of ensemble. They were submitted to five combining methods (voting, sum, sum weight, multiLayer Perceptron and naïve Bayes) which were applied in six distinct database (real and artificial). The classifiers applied during the experiments were k- nearest neighbor, multiLayer Perceptron, naïve Bayes and decision tree. Finally, the performance of ensemble was analyzed comparatively, using none feature selection method, using a filter approach (original) feature selection method and the proposed method. To do this comparison, a statistical test was applied, which demonstrate that there was a significant improvement in the precision of the ensembles
Resumo:
In the world we are constantly performing everyday actions. Two of these actions are frequent and of great importance: classify (sort by classes) and take decision. When we encounter problems with a relatively high degree of complexity, we tend to seek other opinions, usually from people who have some knowledge or even to the extent possible, are experts in the problem domain in question in order to help us in the decision-making process. Both the classification process as the process of decision making, we are guided by consideration of the characteristics involved in the specific problem. The characterization of a set of objects is part of the decision making process in general. In Machine Learning this classification happens through a learning algorithm and the characterization is applied to databases. The classification algorithms can be employed individually or by machine committees. The choice of the best methods to be used in the construction of a committee is a very arduous task. In this work, it will be investigated meta-learning techniques in selecting the best configuration parameters of homogeneous committees for applications in various classification problems. These parameters are: the base classifier, the architecture and the size of this architecture. We investigated nine types of inductors candidates for based classifier, two methods of generation of architecture and nine medium-sized groups for architecture. Dimensionality reduction techniques have been applied to metabases looking for improvement. Five classifiers methods are investigated as meta-learners in the process of choosing the best parameters of a homogeneous committee.