8 resultados para Machine learning,Keras,Tensorflow,Data parallelism,Model parallelism,Container,Docker
em Universidade Federal do Rio Grande do Norte(UFRN)
Resumo:
The Support Vector Machines (SVM) has attracted increasing attention in machine learning area, particularly on classification and patterns recognition. However, in some cases it is not easy to determinate accurately the class which given pattern belongs. This thesis involves the construction of a intervalar pattern classifier using SVM in association with intervalar theory, in order to model the separation of a pattern set between distinct classes with precision, aiming to obtain an optimized separation capable to treat imprecisions contained in the initial data and generated during the computational processing. The SVM is a linear machine. In order to allow it to solve real-world problems (usually nonlinear problems), it is necessary to treat the pattern set, know as input set, transforming from nonlinear nature to linear problem. The kernel machines are responsible to do this mapping. To create the intervalar extension of SVM, both for linear and nonlinear problems, it was necessary define intervalar kernel and the Mercer s theorem (which caracterize a kernel function) to intervalar function
Resumo:
Equipment maintenance is the major cost factor in industrial plants, it is very important the development of fault predict techniques. Three-phase induction motors are key electrical equipments used in industrial applications mainly because presents low cost and large robustness, however, it isn t protected from other fault types such as shorted winding and broken bars. Several acquisition ways, processing and signal analysis are applied to improve its diagnosis. More efficient techniques use current sensors and its signature analysis. In this dissertation, starting of these sensors, it is to make signal analysis through Park s vector that provides a good visualization capability. Faults data acquisition is an arduous task; in this way, it is developed a methodology for data base construction. Park s transformer is applied into stationary reference for machine modeling of the machine s differential equations solution. Faults detection needs a detailed analysis of variables and its influences that becomes the diagnosis more complex. The tasks of pattern recognition allow that systems are automatically generated, based in patterns and data concepts, in the majority cases undetectable for specialists, helping decision tasks. Classifiers algorithms with diverse learning paradigms: k-Neighborhood, Neural Networks, Decision Trees and Naïves Bayes are used to patterns recognition of machines faults. Multi-classifier systems are used to improve classification errors. It inspected the algorithms homogeneous: Bagging and Boosting and heterogeneous: Vote, Stacking and Stacking C. Results present the effectiveness of constructed model to faults modeling, such as the possibility of using multi-classifiers algorithm on faults classification
Resumo:
Traditional applications of feature selection in areas such as data mining, machine learning and pattern recognition aim to improve the accuracy and to reduce the computational cost of the model. It is done through the removal of redundant, irrelevant or noisy data, finding a representative subset of data that reduces its dimensionality without loss of performance. With the development of research in ensemble of classifiers and the verification that this type of model has better performance than the individual models, if the base classifiers are diverse, comes a new field of application to the research of feature selection. In this new field, it is desired to find diverse subsets of features for the construction of base classifiers for the ensemble systems. This work proposes an approach that maximizes the diversity of the ensembles by selecting subsets of features using a model independent of the learning algorithm and with low computational cost. This is done using bio-inspired metaheuristics with evaluation filter-based criteria
Resumo:
One of the most important goals of bioinformatics is the ability to identify genes in uncharacterized DNA sequences on world wide database. Gene expression on prokaryotes initiates when the RNA-polymerase enzyme interacts with DNA regions called promoters. In these regions are located the main regulatory elements of the transcription process. Despite the improvement of in vitro techniques for molecular biology analysis, characterizing and identifying a great number of promoters on a genome is a complex task. Nevertheless, the main drawback is the absence of a large set of promoters to identify conserved patterns among the species. Hence, a in silico method to predict them on any species is a challenge. Improved promoter prediction methods can be one step towards developing more reliable ab initio gene prediction methods. In this work, we present an empirical comparison of Machine Learning (ML) techniques such as Na¨ýve Bayes, Decision Trees, Support Vector Machines and Neural Networks, Voted Perceptron, PART, k-NN and and ensemble approaches (Bagging and Boosting) to the task of predicting Bacillus subtilis. In order to do so, we first built two data set of promoter and nonpromoter sequences for B. subtilis and a hybrid one. In order to evaluate of ML methods a cross-validation procedure is applied. Good results were obtained with methods of ML like SVM and Naïve Bayes using B. subtilis. However, we have not reached good results on hybrid database
Resumo:
Reinforcement learning is a machine learning technique that, although finding a large number of applications, maybe is yet to reach its full potential. One of the inadequately tested possibilities is the use of reinforcement learning in combination with other methods for the solution of pattern classification problems. It is well documented in the literature the problems that support vector machine ensembles face in terms of generalization capacity. Algorithms such as Adaboost do not deal appropriately with the imbalances that arise in those situations. Several alternatives have been proposed, with varying degrees of success. This dissertation presents a new approach to building committees of support vector machines. The presented algorithm combines Adaboost algorithm with a layer of reinforcement learning to adjust committee parameters in order to avoid that imbalances on the committee components affect the generalization performance of the final hypothesis. Comparisons were made with ensembles using and not using the reinforcement learning layer, testing benchmark data sets widely known in area of pattern classification
Resumo:
Although some individual techniques of supervised Machine Learning (ML), also known as classifiers, or algorithms of classification, to supply solutions that, most of the time, are considered efficient, have experimental results gotten with the use of large sets of pattern and/or that they have a expressive amount of irrelevant data or incomplete characteristic, that show a decrease in the efficiency of the precision of these techniques. In other words, such techniques can t do an recognition of patterns of an efficient form in complex problems. With the intention to get better performance and efficiency of these ML techniques, were thought about the idea to using some types of LM algorithms work jointly, thus origin to the term Multi-Classifier System (MCS). The MCS s presents, as component, different of LM algorithms, called of base classifiers, and realized a combination of results gotten for these algorithms to reach the final result. So that the MCS has a better performance that the base classifiers, the results gotten for each base classifier must present an certain diversity, in other words, a difference between the results gotten for each classifier that compose the system. It can be said that it does not make signification to have MCS s whose base classifiers have identical answers to the sames patterns. Although the MCS s present better results that the individually systems, has always the search to improve the results gotten for this type of system. Aim at this improvement and a better consistency in the results, as well as a larger diversity of the classifiers of a MCS, comes being recently searched methodologies that present as characteristic the use of weights, or confidence values. These weights can describe the importance that certain classifier supplied when associating with each pattern to a determined class. These weights still are used, in associate with the exits of the classifiers, during the process of recognition (use) of the MCS s. Exist different ways of calculating these weights and can be divided in two categories: the static weights and the dynamic weights. The first category of weights is characterizes for not having the modification of its values during the classification process, different it occurs with the second category, where the values suffers modifications during the classification process. In this work an analysis will be made to verify if the use of the weights, statics as much as dynamics, they can increase the perfomance of the MCS s in comparison with the individually systems. Moreover, will be made an analysis in the diversity gotten for the MCS s, for this mode verify if it has some relation between the use of the weights in the MCS s with different levels of diversity
Resumo:
The techniques of Machine Learning are applied in classification tasks to acquire knowledge through a set of data or information. Some learning methods proposed in literature are methods based on semissupervised learning; this is represented by small percentage of labeled data (supervised learning) combined with a quantity of label and non-labeled examples (unsupervised learning) during the training phase, which reduces, therefore, the need for a large quantity of labeled instances when only small dataset of labeled instances is available for training. A commom problem in semi-supervised learning is as random selection of instances, since most of paper use a random selection technique which can cause a negative impact. Much of machine learning methods treat single-label problems, in other words, problems where a given set of data are associated with a single class; however, through the requirement existent to classify data in a lot of domain, or more than one class, this classification as called multi-label classification. This work presents an experimental analysis of the results obtained using semissupervised learning in troubles of multi-label classification using reliability parameter as an aid in the classification data. Thus, the use of techniques of semissupervised learning and besides methods of multi-label classification, were essential to show the results
Resumo:
Educational Data Mining is an application domain in artificial intelligence area that has been extensively explored nowadays. Technological advances and in particular, the increasing use of virtual learning environments have allowed the generation of considerable amounts of data to be investigated. Among the activities to be treated in this context exists the prediction of school performance of the students, which can be accomplished through the use of machine learning techniques. Such techniques may be used for student’s classification in predefined labels. One of the strategies to apply these techniques consists in their combination to design multi-classifier systems, which efficiency can be proven by results achieved in other studies conducted in several areas, such as medicine, commerce and biometrics. The data used in the experiments were obtained from the interactions between students in one of the most used virtual learning environments called Moodle. In this context, this paper presents the results of several experiments that include the use of specific multi-classifier systems systems, called ensembles, aiming to reach better results in school performance prediction that is, searching for highest accuracy percentage in the student’s classification. Therefore, this paper presents a significant exploration of educational data and it shows analyzes of relevant results about these experiments.