919 results for Optical pattern recognition Data processing
Abstract:
Neural networks and wavelet transforms have recently been seen as attractive tools for developing efficient solutions to many real-world problems in function approximation. Function approximation is a very important task in environments where computation has to be based on extracting information from data samples of real-world processes. Mathematical modeling is therefore a very important tool for supporting the development of the neural network area. In this article we introduce a series of mathematical demonstrations that guarantee the wavelet properties of the PPS functions. As an application, we show the use of PPS-wavelets in handwritten digit recognition problems through function approximation techniques.
Abstract:
Clustering data is a very important task in data mining, image processing and pattern recognition problems. One of the most popular clustering algorithms is Fuzzy C-Means (FCM). This thesis proposes a new way of calculating the cluster centers in the FCM procedure, called ckMeans, and applies it to some variants of FCM, in particular those that use other distances. The goal of this change is to reduce the number of iterations and the processing time of these algorithms without affecting the quality of the partition, or even to improve the number of correct classifications in some cases. We also developed an algorithm based on ckMeans to handle interval data using interval membership degrees. This algorithm allows data to be represented without converting interval data into point data, as happens in other extensions of FCM that deal with interval data. To validate the proposed methodologies, a comparison was made between the clusterings produced by the ckMeans, K-Means and FCM algorithms (since the algorithm proposed in this work to calculate the centers is similar to K-Means), considering three different distances. Several well-known databases were used. For the interval case, the results of Interval ckMeans were compared with the results of other clustering algorithms when applied to an interval database containing the minimum and maximum monthly temperatures for a given year in 37 cities distributed across continents.
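For orientation, the baseline that this abstract modifies can be sketched as follows. This is a minimal sketch of the standard FCM iteration only, assuming Euclidean distance and fuzzifier m = 2; it is not the ckMeans center calculation, which the abstract does not specify. The function names, the toy points and the stopping tolerance are illustrative assumptions.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update: u[i][j] is the degree of point i in cluster j."""
    u = []
    for x in points:
        dists = [math.dist(x, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # A point that coincides with a center belongs fully to that center.
            zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(zeros)
            u.append([z / total for z in zeros])
            continue
        inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(inv)
        u.append([v / total for v in inv])
    return u

def fcm_centers(points, u, m=2.0):
    """Standard FCM center update: mean of the points weighted by u[i][j] ** m."""
    n_clusters = len(u[0])
    dim = len(points[0])
    centers = []
    for j in range(n_clusters):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers

def fcm(points, centers, m=2.0, tol=1e-6, max_iter=100):
    """Alternate both updates until the centers move less than tol."""
    for iteration in range(1, max_iter + 1):
        u = fcm_memberships(points, centers, m)
        new_centers = fcm_centers(points, u, m)
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # u holds the memberships computed in the last iteration.
    return centers, u, iteration

# Toy data (assumed for illustration): two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centers, u, iterations = fcm(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

The returned iteration count is the quantity the thesis aims to reduce; a modified center update would replace `fcm_centers` while leaving the rest of the loop unchanged.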
Abstract:
Data clustering is applied in various fields such as data mining, image processing and pattern recognition. Clustering algorithms split a data set into clusters such that elements within the same cluster have a high degree of similarity, while elements belonging to different clusters have a high degree of dissimilarity. The Fuzzy C-Means algorithm (FCM) is the fuzzy clustering algorithm most used and discussed in the literature. The performance of FCM is strongly affected by the selection of the initial cluster centers, so the choice of a good set of initial centers is very important for the performance of the algorithm. However, in FCM the initial centers are chosen randomly, which makes it difficult to find a good set. This paper proposes three new methods to obtain initial cluster centers deterministically for the FCM algorithm; they can also be used in variants of FCM. In this work these initialization methods were applied to the ckMeans variant. With the proposed methods, we intend to obtain a set of initial centers that are close to the real cluster centers. With these new initialization approaches, we aim to reduce the number of iterations these algorithms need to converge, as well as the processing time, without affecting the quality of the clustering, or even to improve the quality in some cases. Accordingly, cluster validation indices were used to measure the quality of the clusters obtained by the modified FCM and ckMeans algorithms with the proposed initialization methods when applied to various data sets.
Abstract:
This paper describes a data mining environment for knowledge discovery in bioinformatics applications. The system has a generic kernel that implements the mining functions to be applied to input primary databases, with a warehouse architecture, of biomedical information. Both supervised and unsupervised classification can be implemented within the kernel and applied to data extracted from the primary database, with the results being suitably stored in a complex object database for knowledge discovery. The kernel also includes a specific high-performance library that allows designing and applying the mining functions in parallel machines. The experimental results obtained by the application of the kernel functions are reported. © 2003 Elsevier Ltd. All rights reserved.
Abstract:
Until mid 2006, SCIAMACHY data processors for the operational retrieval of nitrogen dioxide (NO2) column data were based on the historical version 2 of the GOME Data Processor (GDP). On top of known problems inherent to GDP 2, ground-based validations of SCIAMACHY NO2 data revealed issues specific to SCIAMACHY, like a large cloud-dependent offset occurring at Northern latitudes. In 2006, the GDOAS prototype algorithm of the improved GDP version 4 was transferred to the off-line SCIAMACHY Ground Processor (SGP) version 3.0. In parallel, the calibration of SCIAMACHY radiometric data was upgraded. Before operational switch-on of SGP 3.0 and public release of upgraded SCIAMACHY NO2 data, we have investigated the accuracy of the algorithm transfer: (a) by checking the consistency of SGP 3.0 with prototype algorithms; and (b) by comparing SGP 3.0 NO2 data with ground-based observations reported by the WMO/GAW NDACC network of UV-visible DOAS/SAOZ spectrometers. This delta-validation study concludes that SGP 3.0 is a significant improvement with respect to the previous processor IPF 5.04. For three particular SCIAMACHY states, the study reveals unexplained features in the slant columns and air mass factors, although the quantitative impact on SGP 3.0 vertical columns is not significant.
Abstract:
The Optimum-Path Forest (OPF) classifier is a recent and promising method for pattern recognition, with a fast training algorithm and good accuracy. The investigation of a method for combining classifiers of this kind can therefore be important for many applications. In this paper we report a fast method to combine OPF-based classifiers trained with disjoint training subsets. Given a fixed number of subsets, the algorithm chooses random samples, without replacement, from the original training set. The accuracy of each subset is improved by a learning procedure. The final decision is given by majority vote. Experiments with simulated and real data sets showed that the proposed combining method is more efficient and effective than the naive approach, provided some conditions are met. It was also shown that the OPF training step runs faster for a series of small subsets than for the whole training set. The combining scheme was also designed to support parallel or distributed processing, speeding up the procedure even more. © 2011 Springer-Verlag.
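The combining scheme described in this abstract (disjoint random subsets, one classifier per subset, majority vote) can be sketched roughly as below. This is a sketch under stated assumptions: a 1-nearest-neighbor rule stands in for the OPF classifier, the per-subset learning procedure is omitted, and all function names and the toy data are hypothetical.

```python
import math
import random
from collections import Counter

def disjoint_subsets(samples, n_subsets, seed=0):
    """Shuffle once, then deal the samples out so that no sample is reused."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]

def nearest_neighbor_label(subset, x):
    """Stand-in base classifier (1-NN), used here in place of OPF."""
    return min(subset, key=lambda s: math.dist(s[0], x))[1]

def majority_vote(subsets, x):
    """Each subset's classifier casts one vote; the most common label wins."""
    votes = Counter(nearest_neighbor_label(subset, x) for subset in subsets)
    return votes.most_common(1)[0][0]

# Toy training set (assumed): class "a" near x = 0..5, class "b" near x = 20..25.
training = ([((float(i), 0.0), "a") for i in range(6)]
            + [((float(i), 0.0), "b") for i in range(20, 26)])
subsets = disjoint_subsets(training, n_subsets=3)
label = majority_vote(subsets, (1.0, 0.5))
```

Because the subsets share no samples, each base classifier could be trained on a separate worker, which is consistent with the parallel or distributed processing the abstract mentions.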
Abstract:
In this work, we propose an innovative methodology to extend the construction of minimum and non-minimum delay perfect codes as a subset of cyclic division algebras over ℚ(ζ3), where the signal constellations are isomorphic to the hexagonal A_2^n-rotated lattice, for any channel of any dimension n such that gcd(n, 3) = 1.
Abstract:
The increase in the amount of spatial data collected has motivated the development of geovisualisation techniques, which aim to provide an important resource to support knowledge extraction and decision making. One of these techniques is 3D graphs, which provide a dynamic and flexible enhancement of the analysis of results obtained by spatial data mining algorithms, principally when several georeferenced objects occur at the same location. As an original contribution, this work presents the enhancement of visual resources in a computational environment for spatial data mining; the efficiency of these techniques is then demonstrated using a real database. The application proved very useful for interpreting the results obtained, such as patterns that occurred at the same location, and for supporting activities that could be carried out based on the visualisation of results. © 2013 Springer-Verlag.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Graduate Program in Cartographic Sciences - FCT
Abstract:
Graduate Program in Materials Science and Technology - FC
Abstract:
Graduate Program in Mechanical Engineering - FEIS
Abstract:
In general, pattern recognition techniques carry a high computational burden for learning the discriminating functions that are responsible for separating samples from distinct classes. As such, several studies have sought to employ machine learning algorithms in the context of big data classification problems. Research in this area ranges from implementations based on Graphics Processing Units to mathematical optimizations, the main drawback of the former being their dependence on the graphics card. Here, we propose an architecture-independent optimization approach for the optimum-path forest (OPF) classifier, designed using a theoretical formulation that relates the minimum spanning tree to the minimum spanning forest generated by the OPF over the training dataset. The experiments have shown that the proposed approach can be faster than the traditional one on five public datasets, while being as accurate as the original OPF. © 2014 Elsevier B.V. All rights reserved.
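As background for the minimum spanning tree mentioned in this abstract, the sketch below computes an MST over a small set of samples with Prim's algorithm. It illustrates only the MST concept, assuming a complete graph with Euclidean edge weights; it is not the optimization proposed by the authors, whose formulation the abstract does not detail. The function name and sample points are hypothetical.

```python
import math

def prim_mst(points):
    """Return MST edges (i, j, weight) of the complete Euclidean graph over points."""
    n = len(points)
    in_tree = [False] * n
    best_cost = [math.inf] * n  # cheapest known edge linking each node to the tree
    parent = [-1] * n
    best_cost[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest node that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_cost[i])
        in_tree[u] = True
        if parent[u] != -1:
            edges.append((parent[u], u, best_cost[u]))
        # Relax the edges leaving the newly added node.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_cost[v]:
                    best_cost[v] = d
                    parent[v] = u
    return edges

# Toy samples (assumed for illustration).
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 3.0)]
mst = prim_mst(points)
total_weight = sum(w for _, _, w in mst)
```

This array-based variant runs in O(n²) time for n samples, which suits a complete graph where every pair of samples is connected.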
Abstract:
Graduate Program in Electrical Engineering - FEIS
Abstract:
Graduate Program in Electrical Engineering - FEIS