Biblioteca Digital

953 resultados para Semi-supervised classification

Uso de confiabilidade na rotulação de exemplos em problemas de classificação multirrótulo com aprendizado semissupervisionado

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The techniques of Machine Learning are applied in classification tasks to acquire knowledge through a set of data or information. Some learning methods proposed in literature are methods based on semissupervised learning; this is represented by small percentage of labeled data (supervised learning) combined with a quantity of label and non-labeled examples (unsupervised learning) during the training phase, which reduces, therefore, the need for a large quantity of labeled instances when only small dataset of labeled instances is available for training. A commom problem in semi-supervised learning is as random selection of instances, since most of paper use a random selection technique which can cause a negative impact. Much of machine learning methods treat single-label problems, in other words, problems where a given set of data are associated with a single class; however, through the requirement existent to classify data in a lot of domain, or more than one class, this classification as called multi-label classification. This work presents an experimental analysis of the results obtained using semissupervised learning in troubles of multi-label classification using reliability parameter as an aid in the classification data. Thus, the use of techniques of semissupervised learning and besides methods of multi-label classification, were essential to show the results

Investigando a combinação de técnicas de aprendizado semissupervisionado e classificação hierárquica multirrótulo

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Data classification is a task with high applicability in a lot of areas. Most methods for treating classification problems found in the literature dealing with single-label or traditional problems. In recent years has been identified a series of classification tasks in which the samples can be labeled at more than one class simultaneously (multi-label classification). Additionally, these classes can be hierarchically organized (hierarchical classification and hierarchical multi-label classification). On the other hand, we have also studied a new category of learning, called semi-supervised learning, combining labeled data (supervised learning) and non-labeled data (unsupervised learning) during the training phase, thus reducing the need for a large amount of labeled data when only a small set of labeled samples is available. Thus, since both the techniques of multi-label and hierarchical multi-label classification as semi-supervised learning has shown favorable results with its use, this work is proposed and used to apply semi-supervised learning in hierarchical multi-label classication tasks, so eciently take advantage of the main advantages of the two areas. An experimental analysis of the proposed methods found that the use of semi-supervised learning in hierarchical multi-label methods presented satisfactory results, since the two approaches were statistically similar results

Optimizing optimum-path forest classification for huge datasets

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Traditional pattern recognition techniques can not handle the classification of large datasets with both efficiency and effectiveness. In this context, the Optimum-Path Forest (OPF) classifier was recently introduced, trying to achieve high recognition rates and low computational cost. Although OPF was much faster than Support Vector Machines for training, it was slightly slower for classification. In this paper, we present the Efficient OPF (EOPF), which is an enhanced and faster version of the traditional OPF, and validate it for the automatic recognition of white matter and gray matter in magnetic resonance images of the human brain. © 2010 IEEE.

Improving the accuracy of the optimum-path forest supervised classifier for large datasets

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this work, a new approach for supervised pattern recognition is presented which improves the learning algorithm of the Optimum-Path Forest classifier (OPF), centered on detection and elimination of outliers in the training set. Identification of outliers is based on a penalty computed for each sample in the training set from the corresponding number of imputable false positive and false negative classification of samples. This approach enhances the accuracy of OPF while still gaining in classification time, at the expense of a slight increase in training time. © 2010 Springer-Verlag.

Fuzzy community structure detection by particle competition and cooperation

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Identification and classification of overlapping nodes in networks are important topics in data mining. In this paper, a network-based (graph-based) semi-supervised learning method is proposed. It is based on competition and cooperation among walking particles in a network to uncover overlapping nodes by generating continuous-valued outputs (soft labels), corresponding to the levels of membership from the nodes to each of the communities. Moreover, the proposed method can be applied to detect overlapping data items in a data set of general form, such as a vector-based data set, once it is transformed to a network. Usually, label propagation involves risks of error amplification. In order to avoid this problem, the proposed method offers a mechanism to identify outliers among the labeled data items, and consequently prevents error propagation from such outliers. Computer simulations carried out for synthetic and real-world data sets provide a numeric quantification of the performance of the method. © 2012 Springer-Verlag.

Signal classification of submerged aquatic vegetation based on the hemispherical-conical reflectance factor spectrum shape in the yellow and red regions

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The water column overlying the submerged aquatic vegetation (SAV) canopy presents difficulties when using remote sensing images for mapping such vegetation. Inherent and apparent water optical properties and its optically active components, which are commonly present in natural waters, in addition to the water column height over the canopy, and plant characteristics are some of the factors that affect the signal from SAV mainly due to its strong energy absorption in the near-infrared. By considering these interferences, a hypothesis was developed that the vegetation signal is better conserved and less absorbed by the water column in certain intervals of the visible region of the spectrum; as a consequence, it is possible to distinguish the SAV signal. To distinguish the signal from SAV, two types of classification approaches were selected. Both of these methods consider the hemispherical-conical reflectance factor (HCRF) spectrum shape, although one type was supervised and the other one was not. The first method adopts cluster analysis and uses the parameters of the band (absorption, asymmetry, height and width) obtained by continuum removal as the input of the classification. The spectral angle mapper (SAM) was adopted as the supervised classification approach. Both approaches tested different wavelength intervals in the visible and near-infrared spectra. It was demonstrated that the 585 to 685-nm interval, corresponding to the green, yellow and red wavelength bands, offered the best results in both classification approaches. However, SAM classification showed better results relative to cluster analysis and correctly separated all spectral curves with or without SAV. Based on this research, it can be concluded that it is possible to discriminate areas with and without SAV using remote sensing. © 2013 by the authors; licensee MDPI, Basel, Switzerland.

Estimation of croplands using indicator kriging and fuzzy classification

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Algoritmos de aprendizado semi-supervisionado baseados em grafos aplicados na bioinformática

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Segmentação interativa de imagens utilizando competição e cooperação entre partículas

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Pós-graduação em Ciência da Computação - IBILCE

Efficient ultrasonic signal processing techniques for aided medical diagnostics

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Ultrasound imaging is widely used in medical diagnostics as it is the fastest, least invasive, and least expensive imaging modality. However, ultrasound images are intrinsically difficult to be interpreted. In this scenario, Computer Aided Detection (CAD) systems can be used to support physicians during diagnosis providing them a second opinion. This thesis discusses efficient ultrasound processing techniques for computer aided medical diagnostics, focusing on two major topics: (i) Ultrasound Tissue Characterization (UTC), aimed at characterizing and differentiating between healthy and diseased tissue; (ii) Ultrasound Image Segmentation (UIS), aimed at detecting the boundaries of anatomical structures to automatically measure organ dimensions and compute clinically relevant functional indices. Research on UTC produced a CAD tool for Prostate Cancer detection to improve the biopsy protocol. In particular, this thesis contributes with: (i) the development of a robust classification system; (ii) the exploitation of parallel computing on GPU for real-time performance; (iii) the introduction of both an innovative Semi-Supervised Learning algorithm and a novel supervised/semi-supervised learning scheme for CAD system training that improve system performance reducing data collection effort and avoiding collected data wasting. The tool provides physicians a risk map highlighting suspect tissue areas, allowing them to perform a lesion-directed biopsy. Clinical validation demonstrated the system validity as a diagnostic support tool and its effectiveness at reducing the number of biopsy cores requested for an accurate diagnosis. For UIS the research developed a heart disease diagnostic tool based on Real-Time 3D Echocardiography. Thesis contributions to this application are: (i) the development of an automated GPU based level-set segmentation framework for 3D images; (ii) the application of this framework to the myocardium segmentation. Experimental results showed the high efficiency and flexibility of the proposed framework. Its effectiveness as a tool for quantitative analysis of 3D cardiac morphology and function was demonstrated through clinical validation.

Content Classification of Multimedia Documents using Partitions of Low-Level Features

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end usual text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities word analogues are generated: sequences of syllables for speech, “video words” based on low level color features (color moments, color correlogram and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. Such audio and video words provide a means to represent the different modalities in a uniform way. The frequencies of the word analogues represent audio-visual documents: the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1 vs. n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification. Classification is further improved by supplementing speech and non-speech audio with video words. Optimal F-scores range between 62% and 94% corresponding to 50% - 84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provide a good basis for the integration of speech, non-speech audio and video.

Lithological mapping of Northwest Argentina with remote sensing data using tonal, textural and contextual features

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Tonal, textural and contextual properties are used in manual photointerpretation of remotely sensed data. This study has used these three attributes to produce a lithological map of semi arid northwest Argentina by semi automatic computer classification procedures of remotely sensed data. Three different types of satellite data were investigated, these were LANDSAT MSS, TM and SIR-A imagery. Supervised classification procedures using tonal features only produced poor classification results. LANDSAT MSS produced classification accuracies in the range of 40 to 60%, while accuracies of 50 to 70% were achieved using LANDSAT TM data. The addition of SIR-A data produced increases in the classification accuracy. The increased classification accuracy of TM over the MSS is because of the better discrimination of geological materials afforded by the middle infra red bands of the TM sensor. The maximum likelihood classifier consistently produced classification accuracies 10 to 15% higher than either the minimum distance to means or decision tree classifier, this improved accuracy was obtained at the cost of greatly increased processing time. A new type of classifier the spectral shape classifier, which is computationally as fast as a minimum distance to means classifier is described. However, the results for this classifier were disappointing, being lower in most cases than the minimum distance or decision tree procedures. The classification results using only tonal features were felt to be unacceptably poor, therefore textural attributes were investigated. Texture is an important attribute used by photogeologists to discriminate lithology. In the case of TM data, texture measures were found to increase the classification accuracy by up to 15%. However, in the case of the LANDSAT MSS data the use of texture measures did not provide any significant increase in the accuracy of classification. For TM data, it was found that second order texture, especially the SGLDM based measures, produced highest classification accuracy. Contextual post processing was found to increase classification accuracy and improve the visual appearance of classified output by removing isolated misclassified pixels which tend to clutter classified images. Simple contextual features, such as mode filters were found to out perform more complex features such as gravitational filter or minimal area replacement methods. Generally the larger the size of the filter, the greater the increase in the accuracy. Production rules were used to build a knowledge based system which used tonal and textural features to identify sedimentary lithologies in each of the two test sites. The knowledge based system was able to identify six out of ten lithologies correctly.

A Multi-View Learning Algorithm

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2014

Análise dos impactos das mudanças do uso do solo na futura gestão e conservação da Reserva da Chimanimani - Moçambique

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Dissertação de Mestrado, Gestão e Conservação da Natureza, 27 de Outubro de 2015, Universidade dos Açores.

Nonlinear mixture model for hyperspectral unmixing

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Proceedings of International Conference - SPIE 7477, Image and Signal Processing for Remote Sensing XV - 28 September 2009

«
1
2
3
4
5
6
7
8
...
63
64
»