32 results for Machine Learning (Aprendizagem automática)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Concept drift is a problem of increasing importance in machine learning and data mining. Data sets under analysis are no longer only static databases, but also data streams in which concepts and data distributions may not be stable over time. However, most learning algorithms produced so far assume that data comes from a fixed distribution, so they are not suited to handling concept drift. Moreover, some concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But labeling data is usually expensive and/or time consuming compared to acquiring unlabeled data, so only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them also assume that the data is static. Therefore, semi-supervised learning with concept drift is still an open challenge in machine learning. Recently, a particle competition and cooperation approach was used to realize graph-based semi-supervised learning from static data. In this paper, we extend that approach to handle data streams and concept drift. The result is a passive algorithm using a single classifier, which naturally adapts to concept changes without any explicit drift detection mechanism. Its built-in mechanisms provide a natural way of learning from new data, gradually forgetting older knowledge as older labeled data items become less influential on the classification of newer data items. Computer simulations are presented, showing the effectiveness of the proposed method.
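As a rough illustration of passive adaptation through gradual forgetting, the sketch below lets every labeled item vote with a weight that decays with its age, so older labels lose influence without any drift detector. It is not the particle competition and cooperation algorithm described in the abstract; the class name `DecayingVoteClassifier`, the `decay` parameter and the 1-D similarity-weighted vote are assumptions made only for this example.

```python
import math


class DecayingVoteClassifier:
    """Toy stream classifier: labeled items vote with a weight that decays with age."""

    def __init__(self, decay=0.9):
        self.decay = decay   # assumed forgetting factor, 0 < decay < 1
        self.memory = []     # list of (x, label, weight) tuples

    def learn(self, x, label):
        # Age every stored item, then store the new one at full weight.
        self.memory = [(xi, yi, w * self.decay) for xi, yi, w in self.memory]
        self.memory.append((x, label, 1.0))

    def predict(self, x):
        # Closer and newer labeled items contribute more to the vote.
        votes = {}
        for xi, yi, w in self.memory:
            similarity = math.exp(-abs(x - xi))
            votes[yi] = votes.get(yi, 0.0) + w * similarity
        if not votes:
            return None
        return max(votes, key=votes.get)


clf = DecayingVoteClassifier(decay=0.5)

# Old concept: the region around 0.0 belongs to class "a".
for _ in range(5):
    clf.learn(0.0, "a")
before = clf.predict(0.0)

# Concept drift: the same region now belongs to class "b".
for _ in range(5):
    clf.learn(0.0, "b")
after = clf.predict(0.0)
```

With these settings the prediction at 0.0 switches from "a" to "b" once the newer labels outweigh the decayed older ones.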
Abstract:
Due to the increased incidence of skin cancer, computational methods based on intelligent approaches have been developed to aid dermatologists in the diagnosis of skin lesions. This paper proposes a method to classify texture in images, since texture is an important feature for the successful identification of skin lesions. To this end, a feature vector is defined from the fractal dimension of the images, computed with the box-counting method (BCM), and used with an SVM to classify the texture of the lesions as non-irregular or irregular. The proposed solution achieved an accuracy of 72.84%. © 2012 AISTI.
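The box-counting step can be sketched as follows. This is a minimal example that assumes a binary image stored as a list of rows; the abstract does not say how the lesion images are binarized or how the feature vector is assembled, so the function names and the choice of box sizes are illustrative assumptions. The estimate is the least-squares slope of log N(s) against log(1/s), where N(s) is the number of boxes of side s that contain foreground pixels.

```python
import math


def box_count(image, size):
    """Count the size-by-size boxes that contain at least one foreground pixel."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r0 in range(0, rows, size):
        for c0 in range(0, cols, size):
            if any(
                image[r][c]
                for r in range(r0, min(r0 + size, rows))
                for c in range(c0, min(c0 + size, cols))
            ):
                count += 1
    return count


def fractal_dimension(image, sizes=(1, 2, 4, 8, 16)):
    """Least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


n = 64
filled = [[1] * n for _ in range(n)]  # solid square
line = [[1 if r == n // 2 else 0 for _ in range(n)] for r in range(n)]  # single row

dim_filled = fractal_dimension(filled)  # close to 2 for a filled region
dim_line = fractal_dimension(line)      # close to 1 for a straight line
```

A value computed this way would be one entry of the feature vector handed to the classifier; irregular textures would be expected to fall between the two extremes shown here.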
Abstract:
Semi-supervised learning is applied to classification problems where only a small portion of the data items is labeled. In these cases, the reliability of the labels is a crucial factor, because mislabeled items may propagate wrong labels to a large portion of the data set, or even to all of it. This paper addresses this problem by presenting a graph-based (network-based) semi-supervised learning method specifically designed to handle data sets with mislabeled samples. The method uses teams of walking particles, with competitive and cooperative behavior, for label propagation in the network constructed from the input data set. The proposed model is nature-inspired and incorporates features that make it robust to a considerable amount of mislabeled data items. Computer simulations show the performance of the method in the presence of different percentages of mislabeled data, in networks of different sizes and average node degrees. Importantly, these simulations reveal the existence of critical points of the mislabeled subset size, below which the network is free of wrong label contamination, but above which the mislabeled samples start to propagate their labels to the rest of the network. Moreover, numerical comparisons have been made between the proposed method and other representative graph-based semi-supervised learning methods, using both artificial and real-world data sets. Interestingly, the proposed method performs increasingly better than the others as the percentage of mislabeled samples grows. © 2012 IEEE.
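To make the contamination problem concrete, the sketch below runs a plain clamped label propagation on a small graph. This is a simpler stand-in, not the particle-based method of the abstract, and the graph, the node numbering and the helper names `clique` and `propagate_labels` are assumptions chosen for illustration. Two four-node cliques are joined by one edge; the run is repeated with one seed in the first clique deliberately mislabeled.

```python
def clique(nodes):
    """Adjacency lists for a fully connected group of nodes."""
    return {n: [m for m in nodes if m != n] for n in nodes}


def propagate_labels(adjacency, seeds, classes, iterations=100):
    """Repeatedly average neighbor scores while keeping seed nodes clamped."""
    scores = {
        node: {c: (1.0 if seeds.get(node) == c else 0.0) for c in classes}
        for node in adjacency
    }
    for _ in range(iterations):
        updated = {}
        for node, neighbors in adjacency.items():
            if node in seeds:
                updated[node] = scores[node]  # labeled nodes keep their label
                continue
            updated[node] = {
                c: sum(scores[m][c] for m in neighbors) / len(neighbors)
                for c in classes
            }
        scores = updated
    return {node: max(classes, key=lambda c: scores[node][c]) for node in adjacency}


# Two communities (0-3 and 4-7) joined by the single edge 3-4.
adjacency = {**clique([0, 1, 2, 3]), **clique([4, 5, 6, 7])}
adjacency[3].append(4)
adjacency[4].append(3)

# Correct seeds: one label per community.
clean = propagate_labels(adjacency, {0: "A", 7: "B"}, ["A", "B"])

# Node 2 sits in the "A" community but carries the wrong label "B".
noisy = propagate_labels(adjacency, {0: "A", 2: "B", 7: "B"}, ["A", "B"])
```

With correct seeds each community receives its own label. With the single wrong seed, the unlabeled nodes 1 and 3 of the first community are assigned "B", which is the kind of spreading error that a method robust to label noise aims to contain.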
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Graduate Program in Education - FCT
Abstract:
Graduate Program in Psychology - FCLAS
Abstract:
Both Semi-Supervised Learning and Active Learning are techniques used when unlabeled data is abundant, but the process of labeling it is expensive and/or time consuming. In this paper, those two machine learning techniques are combined into a single nature-inspired method. It features particles walking on a network built from the data set, using a unique random-greedy rule to select neighbors to visit. The particles, which have both competitive and cooperative behavior, are created on the network as the result of label queries. They may be created as the algorithm executes, and only nodes affected by the new particles have to be updated. Therefore, the method saves execution time compared to traditional active learning frameworks, in which the learning algorithm has to be executed several times. The data items to be queried are selected based on information extracted from the temporal dynamics of the nodes and particles. Two different query rules are explored in this paper: one is based on querying-by-uncertainty approaches, and the other is based on the distribution of the data and of the labeled nodes. Either of them may perform better than the other, depending on the peculiarities of the data set. Experimental results on some real-world data sets are provided, and the proposed method outperforms the semi-supervised learning method from which it is derived on all of them.
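The general idea behind querying by uncertainty can be sketched as below: among the unlabeled items, ask for the label of the one whose current class estimate has the highest entropy. The abstract does not give the exact rule used by the method, so the entropy criterion, the function names and the example scores are assumptions for illustration only.

```python
import math


def entropy(probs):
    """Shannon entropy of a discrete class distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_query(class_probs):
    """Return the id of the unlabeled item whose class estimate is most uncertain."""
    return max(class_probs, key=lambda item: entropy(class_probs[item]))


# Hypothetical per-class scores for three unlabeled items.
class_probs = {
    "x1": [0.95, 0.05],
    "x2": [0.55, 0.45],
    "x3": [0.80, 0.20],
}

query = select_query(class_probs)  # "x2", the item closest to a 50/50 split
```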
Abstract:
Concept drift, which refers to learning problems that are non-stationary over time, has increasing importance in machine learning and data mining. Many concept drift applications require fast response, which means an algorithm must always be (re)trained with the latest available data. But labeling data is usually expensive and/or time consuming compared to acquiring unlabeled data, so usually only a small fraction of the incoming data may be effectively labeled. Semi-supervised learning methods may help in this scenario, as they use both labeled and unlabeled data in the training process. However, most of them assume that the data is static. Therefore, semi-supervised learning with concept drift is still an open challenge in machine learning. Recently, a particle competition and cooperation approach has been developed to realize graph-based semi-supervised learning from static data. We have extended that approach to handle data streams and concept drift. The result is a passive algorithm that uses a single classifier and naturally adapts to concept changes without any explicit drift detection mechanism. It has built-in mechanisms that provide a natural way of learning from new data, gradually "forgetting" older knowledge as older data items are no longer useful for the classification of newer data items. The proposed algorithm is applied to the KDD Cup 1999 network intrusion data set, showing its effectiveness.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
It is known that some children fail to learn in school, and among them are those who present learning difficulties and/or learning problems. These difficulties and/or problems are attributed to various causes: neurological, environmental, or both. The areas most likely to be affected are reading, writing and mathematics. This situation is frequent in the daily life of schools, and teachers should be prepared to work with diversity and enable the development of all students, which includes ensuring that this particular part of the school population achieves meaningful learning and appropriate social relationships. The aim of this study is therefore to investigate teachers' conceptions of the theme, and their possible practices with children who cannot learn, in public elementary schools in Bauru, a medium-sized city in the state of São Paulo. Because this situation often occurs in schools, it is necessary to investigate how teachers are working with these students, since the school should ensure quality education for all and pedagogical efficiency. To obtain the data we used a questionnaire with multiple-choice and essay questions, which was applied at a municipal school. It was noted that there is confusion between the terms learning difficulties and learning problems, which affects the work of teachers with these students, and that the working conditions of educators hinder their practices.
Abstract:
The topic of this work, the resolution of problems with multiplicative structure, emerged from discussions on the difficulties that students in the first cycle of elementary school encounter in Mathematics, mainly in arithmetic. The research aimed to investigate the main difficulties these students present when faced with the task of solving problems with multiplicative structure. The participants in the first stage of the study were 20 fifth-year students of a public state elementary school in the State of São Paulo. These students took an assessment containing ten problems with multiplicative structure and answered a questionnaire regarding mathematics. In the second stage, two students were selected to participate in a think-aloud session. The data analysis showed that the difficulties presented by the participants were: 1 - reading and interpreting the problem statements; 2 - selecting the correct operation; 3 - operating correctly; 4 - writing.
Abstract:
In this project, the pattern recognition problem is approached with the Support Vector Machine (SVM) technique, a binary classification method that, as a machine learning solution, seeks the hyperplane that best separates the data, together with an extension of the input space dimension. The system aims to classify two classes of pixels chosen by the user in the interface, during the interest selection phase and the background selection phase, generating all the data to be used by LibSVM, a library that implements the SVM, and illustrating the operation of the library in an informal way. The data provided by the interface is organized in three types: RGB (the Red, Green and Blue color system), texture (calculated), or RGB + texture. The project showed successful results, with each image pixel classified as belonging to one of the two classes: the interest selection area or the background selection area. For the user, the simplest way to view the classification results is the RGB data arrangement, because it is the most concrete form of data acquisition.
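As an illustration of preparing pixel data for LibSVM, the sketch below writes RGB samples in LibSVM's sparse text format, one sample per line as `<label> <index>:<value> ...` with 1-based feature indices. The helper name `to_libsvm_line`, the scaling of channels to the range 0 to 1, and the use of label 1 for the interest area and -1 for the background are assumptions; the abstract does not describe how the project encodes its data.

```python
def to_libsvm_line(label, rgb):
    """Format one pixel as a LibSVM text line with features scaled to [0, 1]."""
    scaled = [channel / 255.0 for channel in rgb]
    features = " ".join(f"{i}:{v:.4f}" for i, v in enumerate(scaled, start=1))
    return f"{label} {features}"


# Hypothetical user selections: (label, (R, G, B)).
samples = [
    (1, (255, 0, 0)),    # pixel picked in the interest selection phase
    (-1, (0, 0, 255)),   # pixel picked in the background selection phase
]

lines = [to_libsvm_line(label, rgb) for label, rgb in samples]
# lines[0] == "1 1:1.0000 2:0.0000 3:0.0000"
# lines[1] == "-1 1:0.0000 2:0.0000 3:1.0000"
```

For the RGB + texture arrangement, the calculated texture values would simply be appended as further indexed features on the same line.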
Abstract:
In the pattern recognition research field, Support Vector Machines (SVM) have been an effective tool for classification purposes, being successfully employed in many applications. The SVM input data is transformed into a high-dimensional space using kernel functions, where linear separation is more likely. However, there are some computational drawbacks associated with SVM. One of them is the computational burden required to find the most adequate parameters for the kernel mapping for each non-linearly separable input data space, which affects the performance of SVM. This paper introduces the Polynomial Powers of Sigmoid for SVM kernel mapping, and shows their advantages over well-known kernel functions using real and synthetic datasets.
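The claim that a kernel mapping makes linear separation more likely can be illustrated with a standard degree-2 polynomial kernel. The sketch below does not implement the Polynomial Powers of Sigmoid kernel, whose definition is not given in the abstract; the function names and the XOR-like data are assumptions for illustration. It checks that the kernel equals a dot product in an explicit feature space and that the data becomes linearly separable there.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: k(x, z) = (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def poly2_features(x):
    """Explicit feature map whose dot product equals poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


# XOR-like data: no straight line in the input plane separates the two classes.
points = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
labels = [1, 1, -1, -1]

# In the mapped space, the second coordinate alone separates them.
w = (0.0, 1.0, 0.0)
predictions = [1 if dot(w, poly2_features(p)) > 0 else -1 for p in points]

# The kernel computes the same inner product without building the features.
x, z = (1, 2), (3, -1)
kernel_value = poly2_kernel(x, z)
feature_value = dot(poly2_features(x), poly2_features(z))
```

An SVM only ever needs the kernel values, which is why the choice of kernel and its parameters dominates both the cost and the quality of the resulting classifier.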
Abstract:
Experimentation has been used in Science Education since the beginning of the 19th century, and its origins are linked to the laboratory classes held in universities. These classes used, and in many cases still use, the Scientific Method initially proposed by Descartes in the 17th century for the construction of scientific knowledge. One of the claims is that the method would be the fastest and cheapest way of generating scientific information, although it is based on empiricism-positivism, which considers that all people have the same learning ability and can start from the same point. This paper does not intend to contest the scientific methodology, or even its importance in the history of science, but to identify and describe other possibilities for using the teaching laboratory that can make learning easier for a much larger number of students, accommodating different cognitive capabilities, producing better learning of scientific knowledge and its transfer to practical situations in life, and providing more meaningful learning. Throughout the text, four different proposals are presented, ranging from the use of the laboratory to confirm theory, which fails to make students use the learned knowledge outside school, to one that develops in students the capability to argue scientifically about everyday themes.