917 results for Pattern-recognition Methods
Abstract:
Given the widespread use of computers, the visual pattern recognition task has been automated in order to address the huge amount of available digital images. Many applications use image processing techniques, as well as feature extraction and visual pattern recognition algorithms, to identify people, ease the disease diagnosis process, classify objects, etc., based on digital images. Among the features that can be extracted and analyzed from images is the shape of objects or regions. In some cases, shape is the only feature that can be extracted with relatively high accuracy from the image. In this work we present some of the most important shape analysis methods and compare their performance when applied to three well-known shape image databases. Finally, we propose the development of a new shape descriptor based on the Hough Transform.
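Since the closing sentence only proposes the Hough-based descriptor, the following is merely an illustrative sketch of the underlying transform: each contour point votes for every line rho = x*cos(theta) + y*sin(theta) through it, and a shape signature could then be read off the accumulator. The binning and vote scheme here are assumptions, not the authors' descriptor.

```python
import numpy as np

def hough_line_accumulator(points, shape, n_theta=180, n_rho=100):
    """Vote-based Hough transform for lines: every point (x, y) votes for all
    lines rho = x*cos(theta) + y*sin(theta) passing through it. Peaks in the
    accumulator correspond to dominant straight segments of the shape."""
    h, w = shape
    rho_max = float(np.hypot(h, w))             # largest possible |rho|
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((n_rho, n_theta), dtype=int)
    for x, y in points:
        rhos = x * np.cos(thetas) + y * np.sin(thetas)
        idx = np.round((rhos + rho_max) / (2 * rho_max) * (n_rho - 1)).astype(int)
        acc[idx, np.arange(n_theta)] += 1       # one vote per (rho, theta) cell
    return acc
```

A descriptor could then be, for instance, a normalized histogram of the accumulator values.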
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Color texture classification is an important step in image segmentation and recognition. Color information is especially important in textures of natural scenes, such as leaf surfaces, terrain models, etc. In this paper, we propose a novel approach based on the fractal dimension for color texture analysis. The proposed approach investigates the complexity in the R, G and B color channels to characterize a texture sample. We also propose to study all channels in combination, taking into consideration the correlations between them. Both approaches use the volumetric version of the Bouligand-Minkowski fractal dimension method. The results show an advantage of the proposed method over other color texture analysis methods. (C) 2011 Elsevier Ltd. All rights reserved.
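As a rough sketch of the volumetric Bouligand-Minkowski idea for a single channel: the channel is viewed as a surface (x, y, z = intensity) in 3-D, dilated by spheres of growing radius, and the growth rate of the influence volume gives the dimension. The brute-force distance computation below stands in for the exact Euclidean distance transform normally used, so it is only feasible for tiny samples; radii and scaling are illustrative choices.

```python
import numpy as np

def minkowski_dimension(channel, radii=(2, 3, 4), z_scale=1.0):
    """Volumetric Bouligand-Minkowski estimate for one color channel: map the
    channel to a surface (x, y, z = intensity), dilate it by spheres of radius
    r, count the influence volume V(r), and return D = 3 - slope of
    log V(r) versus log r."""
    h, w = channel.shape
    pts = np.array([(x, y, z_scale * channel[x, y])
                    for x in range(h) for y in range(w)], dtype=float)
    rmax = max(radii)
    xs = np.arange(-rmax, h + rmax)
    ys = np.arange(-rmax, w + rmax)
    zs = np.arange(np.floor(pts[:, 2].min()) - rmax,
                   np.ceil(pts[:, 2].max()) + rmax + 1)
    grid = np.array(np.meshgrid(xs, ys, zs, indexing='ij'),
                    dtype=float).reshape(3, -1).T
    # squared distance from every lattice voxel to its nearest surface point
    d2 = ((grid[:, None, :] - pts[None, :, :]) ** 2).sum(-1).min(1)
    volumes = [np.count_nonzero(d2 <= r * r) for r in radii]
    slope = np.polyfit(np.log(radii), np.log(volumes), 1)[0]
    return 3.0 - slope
```

The paper's method would run this per channel (and jointly over channels); here it merely illustrates the dilation-and-slope mechanics.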
Abstract:
Methods from statistical physics, such as those involving complex networks, have been increasingly used in the quantitative analysis of linguistic phenomena. In this paper, we represented pieces of text with different levels of simplification in co-occurrence networks and found that topological regularity correlated negatively with textual complexity. Furthermore, in less complex texts the distance between concepts, represented as nodes, tended to decrease. The complex networks metrics were treated with multivariate pattern recognition techniques, which allowed us to distinguish between original texts and their simplified versions. For each original text, two simplified versions were generated manually with an increasing number of simplification operations. As expected, distinction was easier for the strongly simplified versions, where the most relevant metrics were node strength, shortest paths and diversity. Also, the discrimination of complex texts was improved with higher hierarchical network metrics, thus pointing to the usefulness of considering wider contexts around the concepts. Though the accuracy rate in the distinction was not as high as in methods using deep linguistic knowledge, the complex network approach is still useful for a rapid screening of texts whenever assessing complexity is essential to guarantee accessibility to readers with limited reading ability. Copyright (c) EPLA, 2012
Abstract:
Content-based image retrieval is still a challenging issue due to the inherent complexity of images and the choice of the most discriminant descriptors. Recent developments in the field have introduced multidimensional projections to boost accuracy in the retrieval process, but many issues, such as the introduction of pattern recognition tasks and deeper user intervention to assist the process of choosing the most discriminant features, still remain unaddressed. In this paper, we present a novel framework for CBIR that combines pattern recognition tasks, class-specific metrics, and multidimensional projection to devise an effective and interactive image retrieval system. User interaction plays an essential role in the computation of the final multidimensional projection from which image retrieval will be attained. Results have shown that the proposed approach outperforms existing methods, turning out to be a very attractive alternative for managing image data sets.
Abstract:
Thiosemicarbazones are cruzain inhibitors which have been identified as potential antitrypanosomal agents. In this work, several molecular properties were calculated at the density functional theory (DFT)/B3LYP/6-311G* level for a set of 44 thiosemicarbazones. Unsupervised and supervised pattern recognition techniques (hierarchical cluster analysis, principal component analysis, kth-nearest neighbors, and soft independent modeling by class analogy) were used to obtain structure-activity relationship models, which are able to classify unknown compounds according to their activities. The chemometric analyses performed here revealed that 12 descriptors can be considered responsible for the discrimination between high and low activity compounds. Classification models were validated with an external test set, showing that predictive classifications were achieved with the selected variable set. The results obtained here are in good agreement with previous findings from the literature, suggesting that our models can be useful in further investigations of the molecular determinants of the antichagasic activity. (C) 2012 Wiley Periodicals, Inc.
Abstract:
This paper compares the effectiveness of the Tsallis entropy over the classic Boltzmann-Gibbs-Shannon entropy for general pattern recognition, and proposes a multi-q approach to improve pattern analysis using entropy. A series of experiments were carried out for the problem of classifying image patterns. Given a dataset of 40 pattern classes, the goal of our image case study is to assess how well the different entropies can be used to determine the class of a newly given image sample. Our experiments show that the Tsallis entropy using the proposed multi-q approach has great advantages over the Boltzmann-Gibbs-Shannon entropy for pattern classification, boosting image recognition rates by a factor of 3. We discuss the reasons behind this success, shedding light on the usefulness of the Tsallis entropy and the multi-q approach. (C) 2012 Elsevier B.V. All rights reserved.
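For reference, the Tsallis entropy is S_q = (1 - sum_i p_i^q) / (q - 1), which recovers the Boltzmann-Gibbs-Shannon entropy in the limit q -> 1; the multi-q idea amounts to using a vector of S_q values over several q as a feature signature. A minimal sketch (the particular q grid below is an illustrative assumption):

```python
import numpy as np

def tsallis_entropy(p, q):
    """S_q = (1 - sum_i p_i^q) / (q - 1); Shannon entropy in the q -> 1 limit."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                                # 0 * log 0 = 0 convention
    if abs(q - 1.0) < 1e-12:
        return float(-(p * np.log(p)).sum())    # Boltzmann-Gibbs-Shannon limit
    return float((1.0 - (p ** q).sum()) / (q - 1.0))

def multi_q_signature(p, qs=(0.5, 1.0, 1.5, 2.0)):
    """Multi-q feature vector: Tsallis entropies over several q values."""
    return [tsallis_entropy(p, q) for q in qs]
```

Given a pixel-intensity histogram per image, the signature vectors would then feed an ordinary classifier.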
Abstract:
Traditional supervised data classification considers only physical features (e.g., distance or similarity) of the input data. Here, this type of learning is called low level classification. On the other hand, the human (animal) brain performs both low and high orders of learning, and it readily identifies patterns according to the semantic meaning of the input data. Data classification that considers not only physical attributes but also the pattern formation is, here, referred to as high level classification. In this paper, we propose a hybrid classification technique that combines both types of learning. The low level term can be implemented by any classification technique, while the high level term is realized by the extraction of features of the underlying network constructed from the input data. Thus, the former classifies the test instances by their physical features or class topologies, while the latter measures the compliance of the test instances to the pattern formation of the data. Our study shows that the proposed technique not only can realize classification according to the pattern formation, but also is able to improve the performance of traditional classification techniques. Furthermore, as the complexity of the class configuration increases, e.g., through the mixture among different classes, a larger portion of the high level term is required to achieve correct classification. This feature confirms that the high level classification has a special importance in complex situations of classification. Finally, we show how the proposed technique can be employed in a real-world application, where it is capable of identifying variations and distortions of handwritten digit images. As a result, it supplies an improvement in the overall pattern recognition rate.
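A toy version of the hybrid idea: the low level term is a plain distance score, and the high level term measures how little a class's network structure changes when the test instance is inserted. The concrete choices below (kNN graph, clustering coefficient as the network feature, the mixing weight lam) are illustrative stand-ins, not the paper's exact formulation.

```python
import numpy as np

def knn_adjacency(pts, k):
    """Symmetrized k-nearest-neighbor graph over a set of points."""
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]
    a = np.zeros((len(pts), len(pts)), dtype=bool)
    a[np.repeat(np.arange(len(pts)), k), nn.ravel()] = True
    return a | a.T

def avg_clustering(a):
    """Average local clustering coefficient of a boolean adjacency matrix."""
    cs = []
    for i in range(len(a)):
        nb = np.flatnonzero(a[i])
        if len(nb) < 2:
            continue
        links = a[np.ix_(nb, nb)].sum() / 2
        cs.append(links / (len(nb) * (len(nb) - 1) / 2))
    return float(np.mean(cs)) if cs else 0.0

def hybrid_classify(train, labels, x, lam=0.3, k=3):
    """Mix a low level term (mean distance from x to a class's k nearest
    points) with a high level term (change in the class kNN graph's
    clustering coefficient when x is inserted). Both terms are rescaled to
    sum to 1 across classes; the class with the smallest mixture wins."""
    classes = np.unique(labels)
    low, high = [], []
    for c in classes:
        pts = train[labels == c]
        low.append(np.sort(np.linalg.norm(pts - x, axis=1))[:k].mean())
        before = avg_clustering(knn_adjacency(pts, k))
        after = avg_clustering(knn_adjacency(np.vstack([pts, x]), k))
        high.append(abs(after - before))      # small change = good compliance
    low = np.array(low) / (np.sum(low) or 1.0)
    high = np.array(high) / (np.sum(high) or 1.0)
    return classes[np.argmin((1 - lam) * low + lam * high)]
```

As in the abstract, lam would be increased for strongly mixed class configurations.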
Abstract:
In this paper, we present a novel texture analysis method based on deterministic partially self-avoiding walks and fractal dimension theory. After finding the attractors of the image (sets of pixels) using deterministic partially self-avoiding walks, they are dilated toward the whole image by adding pixels according to their relevance. The relevance of each pixel is calculated as the shortest path between the pixel and the pixels that belong to the attractors. The proposed texture analysis method is demonstrated to outperform popular and state-of-the-art methods (e.g. Fourier descriptors, co-occurrence matrices, Gabor filters and local binary patterns) as well as the deterministic tourist walk method and recent fractal methods on well-known texture image datasets.
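The walk underlying the method can be sketched on a plain point set: from the current point, always jump to the nearest point not visited in the last mu steps; the trajectory then consists of a transient followed by a terminal cycle (the attractor). This is the generic point-set formulation, assuming Euclidean distances; on images the same rule runs over pixels with intensity-based distances.

```python
import numpy as np

def tourist_walk(points, start, mu):
    """Deterministic partially self-avoiding walk. Because the next step is
    fully determined by the current point plus the memory window of the last
    mu visited points, a repeated window state marks the attractor cycle.
    Returns (transient, attractor) as lists of point indices."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    path, seen = [start], {}
    while True:
        state = tuple(path[-(mu + 1):])        # current point + memory window
        if state in seen:                      # the walk is periodic from here
            i = seen[state]
            return path[:i], path[i:len(path) - 1]
        seen[state] = len(path) - 1
        forbidden = set(path[-(mu + 1):])      # self-avoidance over mu steps
        candidates = [j for j in range(len(points)) if j not in forbidden]
        path.append(min(candidates, key=lambda j: d[path[-1], j]))
```

The paper then goes further, dilating the attractor pixels by shortest-path relevance, which is not reproduced here.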
Abstract:
The measurement of mesozooplankton biomass in the ocean requires the use of analytical procedures that destroy the samples. Alternatively, the development of methods to estimate biomass from optical systems and appropriate conversion factors could be a compromise between the accuracy of analytical methods and the need to preserve the samples for further taxonomic studies. Converting the body area recorded by an optical counter or a camera, i.e., the digitized area of an organism, into individual biomass was suggested as a suitable method to estimate total biomass. In this study, crustacean mesozooplankton from subtropical waters were analyzed, and individual dry weight and body area were compared. The obtained relationships agreed with other measurements of biomass obtained from a previous study in Antarctic waters. Gelatinous mesozooplankton from subtropical and Antarctic waters were also sampled and processed for body area and biomass. As expected, differences between crustacean and gelatinous plankton were highly significant: transparent gelatinous organisms have a lower dry weight per unit area. Therefore, to estimate biomass from digitized images, pattern recognition that discriminates at least between crustaceans and gelatinous forms is required.
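Area-to-biomass conversions of this kind are typically allometric, W = a * A**b, fitted by least squares in log-log space, with separate coefficients per group (e.g., crustacean vs. gelatinous). A minimal sketch with synthetic numbers; the function names and the exact model form are assumptions, not the study's published regression:

```python
import numpy as np

def fit_area_to_weight(area, dry_weight):
    """Fit the allometric conversion W = a * A**b by ordinary least squares
    in log-log space; returns (a, b). Units are whatever the calibration
    data used (e.g., mm^2 and micrograms)."""
    b, log_a = np.polyfit(np.log(area), np.log(dry_weight), 1)
    return float(np.exp(log_a)), float(b)

def biomass_from_areas(areas, a, b):
    """Total biomass of a digitized sample: sum of per-individual W = a*A**b."""
    return float(np.sum(a * np.asarray(areas, dtype=float) ** b))
```

With group-specific (a, b) pairs, the pattern recognition step decides which pair to apply to each detected organism.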
Abstract:
In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has led to enormous progress in the life sciences. Among the most important innovations is microarray technology. It allows the expression of thousands of genes to be quantified simultaneously by measuring the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousands, and a sample size in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex diseases such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. These samples are hybridized to microarrays in an effort to find a small number of genes which are strongly correlated with the group of individuals. Even though methods to analyze the data are today well developed and close to reaching a standard organization (through the effort of proposed international projects like the Microarray Gene Expression Data -MGED- Society [1]), it is not infrequent to stumble upon a clinician's question for which there is no compelling statistical method that would permit answering it. The contribution of this dissertation in deciphering disease regards the development of new approaches aimed at handling open problems posed by clinicians in specific experimental designs. Chapter 1 starts from a necessary biological introduction, then reviews microarray technologies and all the important steps of an experiment, from the production of the array to quality control, ending with the preprocessing steps used in the data analysis in the rest of the dissertation.
Chapter 2 provides a critical review of standard analysis methods, stressing their main open problems. Chapter 3 introduces a method to address the issue of unbalanced designs in microarray experiments. In microarray experiments, experimental design is a crucial starting point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples should be collected for the two classes. However, in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists of a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC). A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each single probe is given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared to that of SAM and LIMMA [3] over two data sets simulated via beta and exponential distributions. The results of all three algorithms over low-noise data sets seem acceptable. However, on a real unbalanced two-channel data set regarding Chronic Lymphocytic Leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes. Although standard algorithms perform well over low-noise simulated data sets, MultiSAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. Chapter 4 describes a method to address similarity evaluation in a three-class problem by means of the Relevance Vector Machine [4].
In fact, looking at microarray data in a prognostic and diagnostic clinical framework, not only differences can have a crucial role. In some cases similarities can give useful and, sometimes, even more important information. The goal, given three classes, could be to establish, with a certain level of confidence, whether the third one is similar to the first or to the second one. In this work we show that the Relevance Vector Machine (RVM) [2] could be a possible solution to the limitations of standard supervised classification. In fact, RVM offers many advantages compared, for example, with its well-known precursor, the Support Vector Machine (SVM) [3]. Among these advantages, the estimate of the posterior probability of class membership represents a key feature to address the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on a three-class tumor-grade problem, with 67 samples of grade 1 (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, and then evaluate the third class G2 as a test set to obtain, for each sample of G2, the probability of being a member of class G1 or class G3. The analysis showed that breast cancer samples of grade 2 have a molecular profile more similar to breast cancer samples of grade 1. This result had been conjectured in the literature, but no measure of significance was given before.
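The class-balancing resampling behind MultiSAM (Chapter 3) can be sketched as follows. A plain Welch t-statistic with a fixed cutoff stands in for the SAM statistic and its significance threshold, and the iteration count is a parameter; these are simplifying assumptions, not the dissertation's exact procedure.

```python
import numpy as np

def multisam_scores(expr_lpc, expr_mpc, n_iter=1000, t_cut=2.0, seed=0):
    """Sketch of the MultiSAM idea: repeatedly subsample the more populated
    class (MPC, genes x samples) down to the size of the less populated class
    (LPC), flag each probe whose |t| exceeds t_cut, and score probes by how
    often they recur across iterations (0 .. n_iter)."""
    rng = np.random.default_rng(seed)
    n_lpc = expr_lpc.shape[1]
    scores = np.zeros(expr_lpc.shape[0], dtype=int)
    for _ in range(n_iter):
        cols = rng.choice(expr_mpc.shape[1], size=n_lpc, replace=False)
        sub = expr_mpc[:, cols]
        m1, m2 = expr_lpc.mean(1), sub.mean(1)
        v1, v2 = expr_lpc.var(1, ddof=1), sub.var(1, ddof=1)
        t = (m1 - m2) / np.sqrt(v1 / n_lpc + v2 / n_lpc)   # Welch t per probe
        scores += (np.abs(t) > t_cut)
    return scores
```

Probes with a high recurrence score (e.g., >300 out of 1,000 in the dissertation) are treated as robustly differential despite the unbalanced design.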
Abstract:
Facial expression recognition is one of the most challenging research areas in the image recognition field and has been actively studied since the 1970s. For instance, smile recognition has been studied because the smile is considered an important facial expression in human communication, and it is therefore likely useful for human–machine interaction. Moreover, if a smile can be detected and its intensity estimated, it will raise the possibility of new applications in the future.
Abstract:
Traffic is forecast to increase in the future, while at the same time there is a shortage of space and financial means to build further roads. The existing capacities must therefore be used more efficiently through better traffic control, e.g. by traffic management systems. This requires spatially resolved data, i.e. data reflecting the areal distribution of traffic, which are, however, lacking. So far, traffic data could only be collected where fixed measuring installations are located, but the missing data cannot be obtained this way. Remote sensing systems offer the possibility of acquiring these data area-wide with a view from above. After decades of experience with remote sensing methods for recording and investigating the most diverse phenomena on the Earth's surface, this methodology is now being applied to the field of traffic within a pilot project. Since the end of the 1990s, traffic has been observed with airborne optical and infrared imaging systems. However, under poor weather conditions, and especially under cloud cover, no usable images can be obtained. With an imaging radar method, data are acquired independently of weather, daylight, or cloud conditions. This thesis investigates to what extent traffic data can be acquired, processed, and usefully applied with airborne synthetic aperture radar (SAR). Not only are the new technique of along-track interferometry (ATI) and the processing of the acquired traffic data presented in detail; a data set produced with this methodology is also compared with a traffic simulation and evaluated. Finally, an outlook on future developments of radar remote sensing for traffic data acquisition is given.