820 resultados para Data classification
Resumo:
The data acquired by Remote Sensing systems allow obtaining thematic maps of the earth's surface, by means of the registered image classification. This implies the identification and categorization of all pixels into land cover classes. Traditionally, methods based on statistical parameters have been widely used, although they show some disadvantages. Nevertheless, some authors indicate that those methods based on artificial intelligence, may be a good alternative. Thus, fuzzy classifiers, which are based on Fuzzy Logic, include additional information in the classification process through based-rule systems. In this work, we propose the use of a genetic algorithm (GA) to select the optimal and minimum set of fuzzy rules to classify remotely sensed images. Input information of GA has been obtained through the training space determined by two uncorrelated spectral bands (2D scatter diagrams), which has been irregularly divided by five linguistic terms defined in each band. The proposed methodology has been applied to Landsat-TM images and it has showed that this set of rules provides a higher accuracy level in the classification process
Resumo:
The application of thematic maps obtained through the classification of remote images needs the obtained products with an optimal accuracy. The registered images from the airplanes display a very satisfactory spatial resolution, but the classical methods of thematic classification not always give better results than when the registered data from satellite are used. In order to improve these results of classification, in this work, the LIDAR sensor data from first return (Light Detection And Ranging) registered simultaneously with the spectral sensor data from airborne are jointly used. The final results of the thematic classification of the scene object of study have been obtained, quantified and discussed with and without LIDAR data, after applying different methods: Maximum Likehood Classification, Support Vector Machine with four different functions kernel and Isodata clustering algorithm (ML, SVM-L, SVM-P, SVM-RBF, SVM-S, Isodata). The best results are obtained for SVM with Sigmoide kernel. These allow the correlation with others different physical parameters with great interest like Manning hydraulic coefficient, for their incorporation in a GIS and their application in hydraulic modeling.
Resumo:
Managing large medical image collections is an increasingly demanding important issue in many hospitals and other medical settings. A huge amount of this information is daily generated, which requires robust and agile systems. In this paper we present a distributed multi-agent system capable of managing very large medical image datasets. In this approach, agents extract low-level information from images and store them in a data structure implemented in a relational database. The data structure can also store semantic information related to images and particular regions. A distinctive aspect of our work is that a single image can be divided so that the resultant sub-images can be stored and managed separately by different agents to improve performance in data accessing and processing. The system also offers the possibility of applying some region-based operations and filters on images, facilitating image classification. These operations can be performed directly on data structures in the database.
Resumo:
A land classification method was designed for the Community of Madrid (CM), which has lands suitable for either agriculture use or natural spaces. The process started from an extensive previous CM study that contains sets of land attributes with data for 122 types and a minimum-requirements method providing a land quality classification (SQ) for each land. Borrowing some tools from Operations Research (OR) and from Decision Science, that SQ has been complemented by an additive valuation method that involves a more restricted set of 13 representative attributes analysed using Attribute Valuation Functions to obtain a quality index, QI, and by an original composite method that uses a fuzzy set procedure to obtain a combined quality index, CQI, that contains relevant information from both the SQ and the QI methods.
Resumo:
We present a novel approach using both sustained vowels and connected speech, to detect obstructive sleep apnea (OSA) cases within a homogeneous group of speakers. The proposed scheme is based on state-of-the-art GMM-based classifiers, and acknowledges specifically the way in which acoustic models are trained on standard databases, as well as the complexity of the resulting models and their adaptation to specific data. Our experimental database contains a suitable number of utterances and sustained speech from healthy (i.e control) and OSA Spanish speakers. Finally, a 25.1% relative reduction in classification error is achieved when fusing continuous and sustained speech classifiers. Index Terms: obstructive sleep apnea (OSA), gaussian mixture models (GMMs), background model (BM), classifier fusion.
Resumo:
There are many situations where input feature vectors are incomplete and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method to perform categorical missing data imputation from numerical and categorical variables. The imputations are based on Simpson’s fuzzy min-max neural networks where the input variables for learning and classification are just numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation and a new architecture. The procedure is tested and compared with others using opinion poll data.
Resumo:
This paper addresses the question of maximizing classifier accuracy for classifying task-related mental activity from Magnetoencelophalography (MEG) data. We propose the use of different sources of information and introduce an automatic channel selection procedure. To determine an informative set of channels, our approach combines a variety of machine learning algorithms: feature subset selection methods, classifiers based on regularized logistic regression, information fusion, and multiobjective optimization based on probabilistic modeling of the search space. The experimental results show that our proposal is able to improve classification accuracy compared to approaches whose classifiers use only one type of MEG information or for which the set of channels is fixed a priori.
Resumo:
In the last decade, complex networks have widely been applied to the study of many natural and man-made systems, and to the extraction of meaningful information from the interaction structures created by genes and proteins. Nevertheless, less attention has been devoted to metabonomics, due to the lack of a natural network representation of spectral data. Here we define a technique for reconstructing networks from spectral data sets, where nodes represent spectral bins, and pairs of them are connected when their intensities follow a pattern associated with a disease. The structural analysis of the resulting network can then be used to feed standard data-mining algorithms, for instance for the classification of new (unlabeled) subjects. Furthermore, we show how the structure of the network is resilient to the presence of external additive noise, and how it can be used to extract relevant knowledge about the development of the disease.
Resumo:
Ubiquitous computing software needs to be autonomous so that essential decisions such as how to configure its particular execution are self-determined. Moreover, data mining serves an important role for ubiquitous computing by providing intelligence to several types of ubiquitous computing applications. Thus, automating ubiquitous data mining is also crucial. We focus on the problem of automatically configuring the execution of a ubiquitous data mining algorithm. In our solution, we generate configuration decisions in a resource aware and context aware manner since the algorithm executes in an environment in which the context often changes and computing resources are often severely limited. We propose to analyze the execution behavior of the data mining algorithm by mining its past executions. By doing so, we discover the effects of resource and context states as well as parameter settings on the data mining quality. We argue that a classification model is appropriate for predicting the behavior of an algorithm?s execution and we concentrate on decision tree classifier. We also define taxonomy on data mining quality so that tradeoff between prediction accuracy and classification specificity of each behavior model that classifies by a different abstraction of quality, is scored for model selection. Behavior model constituents and class label transformations are formally defined and experimental validation of the proposed approach is also performed.