6 results for bagging
at Universidade Federal do Rio Grande do Norte (UFRN)
Abstract:
Equipment maintenance is a major cost factor in industrial plants, so the development of fault prediction techniques is very important. Three-phase induction motors are key electrical equipment in industrial applications, mainly because of their low cost and high robustness; however, they are not immune to faults such as shorted windings and broken bars. Several methods of signal acquisition, processing and analysis are applied to improve their diagnosis. The most efficient techniques use current sensors and current signature analysis. In this dissertation, starting from these sensors, signal analysis is performed through Park's vector, which provides good visualization capability. Acquiring fault data is an arduous task; therefore, a methodology for building the database is developed. Park's transformation is applied in the stationary reference frame for machine modeling and for solving the machine's differential equations. Fault detection requires a detailed analysis of the variables and their influences, which makes the diagnosis more complex. Pattern recognition tasks allow systems to be generated automatically, based on patterns and concepts in the data that are in most cases undetectable by specialists, supporting decision making. Classification algorithms with diverse learning paradigms (k-Nearest Neighbors, Neural Networks, Decision Trees and Naïve Bayes) are used for pattern recognition of machine faults. Multi-classifier systems are used to reduce classification errors. Both homogeneous algorithms (Bagging and Boosting) and heterogeneous ones (Vote, Stacking and StackingC) were investigated. The results show the effectiveness of the constructed model for fault modeling, as well as the feasibility of using multi-classifier algorithms for fault classification.
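The abstract does not give the formula it uses for Park's vector. As a minimal sketch, assuming the form commonly used in current signature analysis, the two components can be computed from the three phase currents as shown here. The function names (park_vector, balanced_currents, locus_radii) are illustrative and do not come from the dissertation. For ideal balanced currents the locus of (i_d, i_q) is a circle, which is what makes deviations caused by faults easy to see.

```python
import math

SQRT6 = math.sqrt(6.0)


def park_vector(ia, ib, ic):
    """Map instantaneous phase currents to Park's vector components (i_d, i_q).

    Assumed form (not stated in the abstract):
        i_d = sqrt(2/3)*ia - ib/sqrt(6) - ic/sqrt(6)
        i_q = (ib - ic)/sqrt(2)
    """
    i_d = math.sqrt(2.0 / 3.0) * ia - ib / SQRT6 - ic / SQRT6
    i_q = (ib - ic) / math.sqrt(2.0)
    return i_d, i_q


def balanced_currents(peak, theta):
    """Ideal balanced three-phase currents at electrical angle theta (radians)."""
    shift = 2.0 * math.pi / 3.0
    return (
        peak * math.cos(theta),
        peak * math.cos(theta - shift),
        peak * math.cos(theta + shift),
    )


def locus_radii(peak, samples=360):
    """Radius of Park's vector at evenly spaced angles over one electrical cycle."""
    radii = []
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        i_d, i_q = park_vector(*balanced_currents(peak, theta))
        radii.append(math.hypot(i_d, i_q))
    return radii
```

Under these assumptions every radius returned by locus_radii(peak) equals (sqrt(6)/2) * peak; an asymmetry between phases distorts the circle into an ellipse-like shape.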
Abstract:
One of the most important goals of bioinformatics is the ability to identify genes in uncharacterized DNA sequences in worldwide databases. Gene expression in prokaryotes begins when the RNA polymerase enzyme interacts with DNA regions called promoters, where the main regulatory elements of the transcription process are located. Despite improvements in in vitro techniques for molecular biology analysis, characterizing and identifying a large number of promoters in a genome is a complex task. Moreover, the main drawback is the absence of a large set of promoters from which to identify conserved patterns among species. Hence, an in silico method to predict them in any species is a challenge. Improved promoter prediction methods can be one step towards developing more reliable ab initio gene prediction methods. In this work, we present an empirical comparison of Machine Learning (ML) techniques, namely Naïve Bayes, Decision Trees, Support Vector Machines, Neural Networks, Voted Perceptron, PART and k-NN, together with ensemble approaches (Bagging and Boosting), on the task of predicting Bacillus subtilis promoters. To do so, we first built two data sets of promoter and non-promoter sequences for B. subtilis, as well as a hybrid one. A cross-validation procedure is applied to evaluate the ML methods. Good results were obtained with ML methods such as SVM and Naïve Bayes on the B. subtilis data. However, we did not achieve good results on the hybrid database.
Abstract:
Nowadays, classifying proteins into structural classes, which concerns the inference of patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to obtain experimentally in the laboratory. Thus, this problem has drawn the attention of many researchers in Bioinformatics. Given the large gap between the number of known protein sequences and the number of experimentally determined three-dimensional structures, the demand for automated techniques for the structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential to deal with this problem. In this work, the following ML techniques are used to recognize protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods were chosen because they represent different learning paradigms and have been widely used in the Bioinformatics literature. To improve the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multi-classification systems are used. Moreover, since the protein database used in this work suffers from imbalanced classes, artificial class-balancing techniques (Random Undersampling, Tomek Links, CNN, NCL and OSS) are used to minimize this problem. To evaluate the ML methods, a cross-validation procedure is applied, in which the accuracy of the classifiers is measured by the mean classification error rate on independent test sets. These means are compared pairwise using a hypothesis test to evaluate whether there is a statistically significant difference between them.
With respect to the results obtained with the individual classifiers, Support Vector Machine presented the best accuracy. The multi-classification systems (homogeneous and heterogeneous) showed, in general, superior or similar performance compared to that of the individual classifiers, especially Boosting with Decision Tree and StackingC with Linear Regression as the meta-classifier. The Voting method, despite its simplicity, proved adequate for solving the problem presented in this work. The class-balancing techniques, on the other hand, did not produce a significant improvement in the global classification error. Nevertheless, the use of such techniques did improve the classification error for the minority class. In this context, the NCL technique proved to be the most appropriate.
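Two of the class-balancing techniques named in this abstract, Random Undersampling and Tomek Links, can be sketched briefly. This is only an illustration under stated assumptions: Euclidean distance, a brute-force neighbor search, at least two points in the data, and hypothetical function names. It is not the implementation used in the work. A Tomek link is a pair of points from different classes that are each other's nearest neighbor; removing the majority-class member of each link cleans the class boundary.

```python
import math
import random


def nearest_index(points, i):
    """Index of the nearest neighbor of points[i] (brute force, Euclidean)."""
    best, best_dist = None, math.inf
    for j, q in enumerate(points):
        if j != i:
            d = math.dist(points[i], q)
            if d < best_dist:
                best, best_dist = j, d
    return best


def tomek_links(points, labels):
    """Pairs (i, j), i < j, of mutual nearest neighbors with different labels."""
    links = []
    for i in range(len(points)):
        j = nearest_index(points, i)
        if i < j and labels[i] != labels[j] and nearest_index(points, j) == i:
            links.append((i, j))
    return links


def remove_majority_in_links(points, labels, majority):
    """Drop the majority-class member of every Tomek link."""
    drop = set()
    for i, j in tomek_links(points, labels):
        drop.add(i if labels[i] == majority else j)
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


def random_undersample(points, labels, majority, n_keep, seed=0):
    """Keep n_keep randomly chosen majority examples and all other examples."""
    rng = random.Random(seed)
    majority_idx = [i for i, y in enumerate(labels) if y == majority]
    kept_majority = set(rng.sample(majority_idx, n_keep))
    keep = [i for i, y in enumerate(labels) if y != majority or i in kept_majority]
    return [points[i] for i in keep], [labels[i] for i in keep]
```

Random undersampling discards majority examples blindly, whereas Tomek-link removal only discards those sitting right next to a minority example, so the two are often viewed as complementary.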
Abstract:
Significant advances have emerged in research on Classifier Committees. The models that receive the most attention in the literature are those of a static nature, also known as ensembles. Among the algorithms in this class, we highlight the methods that use resampling of the training data: Bagging, Boosting and Multiboosting. Choosing the architecture and the base components to be recruited is not a trivial task and has motivated new proposals that attempt to build such models automatically, many of them based on optimization methods. Many of these contributions have not shown satisfactory results when applied to more complex problems of different natures. In contrast, this thesis proposes three new hybrid approaches for the automatic construction of ensembles: Increment of Diversity, Adaptive Fitness Function, and Meta-learning for the development of systems that automatically configure the parameters of ensemble models. In the first approach, we propose a solution that combines different diversity techniques in a single conceptual framework, in an attempt to achieve higher levels of diversity in ensembles and thereby better performance of such systems. The second approach uses a genetic algorithm for the automatic design of ensembles; its contribution is to combine filter and wrapper techniques adaptively to evolve a better distribution of the feature space to be presented to the ensemble components. Finally, the last approach proposes new techniques for recommending the architecture and base components of an ensemble, using traditional meta-learning and multi-label meta-learning. In general, the results are encouraging and corroborate the thesis that hybrid tools are a powerful solution for building effective ensembles for pattern classification problems.
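Bagging, the resampling method named in this abstract and the search term of this listing, can be illustrated with a short sketch. The assumptions are loud ones: the base learner is a simple nearest-centroid classifier chosen only for brevity, the function names are hypothetical, and none of this reproduces the thesis's hybrid approaches. Each ensemble member is trained on a bootstrap sample (n draws with replacement from the n training examples), and predictions are combined by majority vote.

```python
import math
import random
from collections import Counter


def fit_nearest_centroid(X, y):
    """Base learner: store the mean feature vector of each class present in y."""
    sums, counts = {}, Counter(y)
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for k, value in enumerate(x):
            acc[k] += value
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def predict_nearest_centroid(model, x):
    """Predict the class whose centroid is closest to x (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], x))


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train n_estimators base learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # n draws with replacement
        models.append(
            fit_nearest_centroid([X[i] for i in idx], [y[i] for i in idx])
        )
    return models


def predict_bagging(models, x):
    """Combine the base learners' predictions by majority vote."""
    votes = Counter(predict_nearest_centroid(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

Boosting differs in that the resampling (or reweighting) for each new member depends on the errors of the previous members, whereas the bootstrap samples above are drawn independently.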
Abstract:
This work obtained a composite of screen-type flat-woven cotton fabric (TPA) used for bagging, with a weight of 207.9 g/m2, in an orthophthalic crystal polyester resin matrix, and studied the feasibility of its use. The process for obtaining the composite was tested to find the maximum number of layers that could be used without compromising processability and the fabrication of test specimens (CPs) in a compression mold. Five configurations/formulations were selected and tested, with 1, 4, 8, 10 and 12 layers of the cotton fabric (TPA). The TPA was not subjected to chemical treatment; it only went through a mechanical washing process. The composite in its various configurations/formulations was characterized to determine its physical properties. The most promising properties of the composite were its bending strength, which approached that of the matrix, and its impact resistance, which was superior to that of the polyester resin. Water absorption also showed good results compared with other composites. Considering all the properties together, the most viable configurations/formulations were TA8 and TA10, which combine good processability with higher mechanical strength and a lower loss relative to the polyester resin matrix. The composite showed lower mechanical performance than the resin matrix for all the formulations studied, except in impact resistance. SEM showed good adhesion between the TPA layers and the polyester resin matrix, without micro voids in the matrix, confirming the efficiency of the process used to manufacture the samples for characterization. The proposed composite proved to be viable for the fabrication of structures subject to low mechanical stresses, as demonstrated by the manufacture of solar and wind prototypes, as well as packaging, shelving, decorative items and crafts, with good visual appearance.