8 results for Learning set
in Repositório Institucional UNESP - Universidade Estadual Paulista "Julio de Mesquita Filho"
Abstract:
Semi-supervised learning is applied to classification problems where only a small portion of the data items is labeled. In these cases, the reliability of the labels is a crucial factor, because mislabeled items may propagate wrong labels to a large portion of the data set, or even all of it. This paper addresses the problem by presenting a graph-based (network-based) semi-supervised learning method specifically designed to handle data sets with mislabeled samples. The method uses teams of walking particles, with competitive and cooperative behavior, for label propagation in the network constructed from the input data set. The proposed model is nature-inspired and incorporates features that make it robust to a considerable number of mislabeled data items. Computer simulations show the performance of the method in the presence of different percentages of mislabeled data, in networks of different sizes and average node degrees. Importantly, these simulations reveal the existence of critical points in the mislabeled subset size, below which the network is free of wrong-label contamination, but above which the mislabeled samples start to propagate their labels to the rest of the network. Moreover, numerical comparisons have been made between the proposed method and other representative graph-based semi-supervised learning methods, using both artificial and real-world data sets. Interestingly, the proposed method performs increasingly better than the others as the percentage of mislabeled samples grows. © 2012 IEEE.
Abstract:
Both Semi-Supervised Learning and Active Learning are techniques used when unlabeled data is abundant but labeling it is expensive and/or time consuming. In this paper, these two machine learning techniques are combined into a single nature-inspired method. It features particles walking on a network built from the data set, using a unique random-greedy rule to select neighbors to visit. The particles, which have both competitive and cooperative behavior, are created on the network as the result of label queries. They may be created as the algorithm executes, and only nodes affected by the new particles have to be updated. The method therefore saves execution time compared to traditional active learning frameworks, in which the learning algorithm has to be executed several times. The data items to be queried are selected based on information extracted from the temporal dynamics of nodes and particles. Two different query rules are explored in this paper: one is based on querying-by-uncertainty approaches, and the other on the distribution of data and labeled nodes. Either may perform better than the other, depending on the peculiarities of the data set. Experimental results on some real-world data sets are provided, and the proposed method outperforms the semi-supervised learning method from which it is derived on all of them.
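The abstract does not spell out either query rule. As a rough illustration of the querying-by-uncertainty idea only, the sketch below picks the unlabeled node whose class distribution has the highest entropy. This is a generic uncertainty-sampling rule, not the paper's rule; the names `class_levels`, `labeled`, and `pick_query` are hypothetical, and the per-node class distributions are assumed to come from some earlier propagation step.

```python
import math


def entropy(dist):
    """Shannon entropy of a discrete distribution (zero entries are skipped)."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def pick_query(class_levels, labeled):
    """Return the unlabeled node with the most uncertain class distribution.

    class_levels: dict mapping node id -> list of class probabilities
                  (hypothetical input, assumed to sum to 1 per node).
    labeled:      set of node ids whose labels are already known.
    """
    candidates = [node for node in class_levels if node not in labeled]
    if not candidates:
        raise ValueError("no unlabeled nodes left to query")
    # Highest entropy means the classes are closest to evenly matched.
    return max(candidates, key=lambda node: entropy(class_levels[node]))


levels = {
    "a": [0.9, 0.1],
    "b": [0.5, 0.5],
    "c": [0.6, 0.4],
    "d": [0.5, 0.5],
}
query = pick_query(levels, labeled={"d"})
```

Here node `d` is as uncertain as `b` but is already labeled, so it is excluded from the candidates.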
Abstract:
This work reports an experience of continuing teacher education in Cartography, conducted from 03/11/2009 to 03/11/2010 by the Center for Continuing Education in Mathematics Education, Science and Environment (CECEMCA) - UNESP - Rio Claro, in distance learning (DL) mode. The experience took place through an extension course hosted on the TelEduc platform. The course, titled Introduction to Cartography, had three main aims: to present concepts of systematic and thematic mapping and their potential application in teaching practices; to increase knowledge in the areas of Geography, Cartography and Environment; and to offer alternatives for implementing mapping content in the classroom.
Abstract:
Graduate Program in Digital Television: Information and Knowledge - FAAC
Abstract:
Objective: Is it feasible to learn the basics of wet mount microscopy of vaginal fluid in 10 hours? Materials and Methods: This is a pilot project in which 6 students with different levels of education were invited to be tested on their ability to read wet mount microscopic slides before and after 10 hours of hands-on training. Microscopy was performed according to a standard protocol (Femicare, Tienen, Belgium). Before and after training, all students had to evaluate a different set of 50 digital slides. Different diagnoses and microscopic patterns had to be scored. Kappa indices were calculated against the expert reading. Results: All readers improved their mean scores significantly, especially for the most important types of altered flora (p < .0001). The mean increase in reading concordance (kappa from 0.64 to 0.75) of 1 student with solid previous experience in microscopy did not reach statistical significance, but the remaining 5 students all improved their scores from poor performance (all kappa < 0.20) to moderate (kappa = 0.53, n = 1) or good (kappa > 0.61, n = 4) concordance. Reading quality improved and reached fair to good concordance on all microscopic items studied, except for the detection of parabasal cells and cytolytic flora. Conclusions: Although further improvement is still possible, a short training course of 10 hours enables substantial improvement in wet mount microscopy accuracy and results in fair to good concordance with a reference reader on the most important variables of the vaginal flora.
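The abstract reports kappa indices without naming the variant. Assuming the standard unweighted Cohen's kappa, $\kappa = (p_o - p_e) / (1 - p_e)$, a minimal sketch of how a student's readings could be scored against the expert's is given below. The function name and the example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(reader, expert):
    """Unweighted Cohen's kappa between two equal-length lists of labels."""
    if len(reader) != len(expert) or not reader:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(reader)
    # Observed agreement: fraction of slides scored identically.
    p_o = sum(r == e for r, e in zip(reader, expert)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    reader_counts = Counter(reader)
    expert_counts = Counter(expert)
    p_e = sum(reader_counts[c] * expert_counts[c] for c in reader_counts) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical readings of four slides (illustrative labels only).
student = ["normal", "normal", "altered", "altered"]
expert = ["normal", "altered", "altered", "altered"]
kappa = cohen_kappa(student, expert)
```

In this toy case the raters agree on 3 of 4 slides (p_o = 0.75) and chance agreement is 0.5, so kappa is 0.5.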
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Research on image processing has shown that combining segmentation methods may lead to a solid approach for extracting semantic information from different sorts of images. Within this context, the Normalized Cut (NCut) is usually used as a final partitioning tool for graphs modeled by some chosen method. This work explores the Watershed Transform as a modeling tool, using different criteria of the hierarchical Watershed to convert an image into an adjacency graph. The Watershed is combined with an unsupervised distance learning step that redistributes the graph weights and redefines the similarity matrix before the final segmentation step using NCut. Using the Berkeley Segmentation Data Set and Benchmark as a reference, our goal is to compare the results obtained by this method with previous work to validate its performance.
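For readers unfamiliar with the NCut criterion mentioned above, the sketch below evaluates the standard normalized-cut value of a given two-way partition, $\mathrm{Ncut}(A,B) = \mathrm{cut}(A,B)/\mathrm{assoc}(A,V) + \mathrm{cut}(A,B)/\mathrm{assoc}(B,V)$, on a small symmetric similarity matrix. It only scores a partition; it does not perform the spectral optimization, the Watershed modeling, or the distance learning step described in the abstract. The function name and the toy matrix are hypothetical.

```python
def ncut_value(weights, in_a):
    """Normalized-cut value of a bipartition of a weighted graph.

    weights: symmetric similarity matrix as a list of lists.
    in_a:    list of booleans, True if node i belongs to side A.
    """
    n = len(weights)
    cut = assoc_a = assoc_b = 0.0
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            if in_a[i]:
                assoc_a += w          # total weight leaving nodes in A
                if not in_a[j]:
                    cut += w          # weight crossing from A to B
            else:
                assoc_b += w          # total weight leaving nodes in B
    if assoc_a == 0 or assoc_b == 0:
        raise ValueError("each side must have non-zero total connection weight")
    return cut / assoc_a + cut / assoc_b


# Toy graph: nodes 0-1 and 2-3 are strongly tied, with weak links across.
W = [
    [0.0, 1.0, 0.1, 0.0],
    [1.0, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 1.0],
    [0.0, 0.1, 1.0, 0.0],
]
good = ncut_value(W, [True, True, False, False])   # cuts only weak links
bad = ncut_value(W, [True, False, True, False])    # cuts the strong links
```

The partition that separates the two tightly connected pairs gets the lower value, which is the behavior NCut-based segmentation looks for.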
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)