915 results for Automatic classifier
Abstract:
In general, pattern recognition techniques carry a high computational burden for learning the discriminating functions that separate samples from distinct classes. As such, several studies have sought to employ machine learning algorithms in big data classification problems. Research in this area ranges from Graphics Processing Unit-based implementations to mathematical optimizations; the main drawback of the former is their dependence on the graphics card. Here, we propose an architecture-independent optimization approach for the optimum-path forest (OPF) classifier, designed using a theoretical formulation that relates the minimum spanning tree to the minimum spanning forest generated by the OPF over the training dataset. The experiments have shown that the proposed approach can be faster than the traditional one on five public datasets, while being as accurate as the original OPF. (C) 2014 Elsevier B. V. All rights reserved.
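As a rough illustration of how the minimum spanning tree enters OPF training, the sketch below builds an MST over labeled samples with Prim's algorithm and marks as prototypes the endpoints of MST edges that join different classes. The prototype rule comes from the standard supervised OPF formulation, not from this abstract, and the function names and the Euclidean distance are assumptions made for the example. It is not the optimization proposed in the paper.

```python
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph with Euclidean edge weights.
    Returns the MST as a list of (parent, child) index pairs."""
    n = len(points)
    in_tree = [False] * n
    best_cost = [math.inf] * n   # cheapest known edge linking each node to the tree
    parent = [None] * n
    best_cost[0] = 0.0           # arbitrary root
    edges = []
    for _ in range(n):
        # Pick the cheapest node not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_cost[i])
        in_tree[u] = True
        if parent[u] is not None:
            edges.append((parent[u], u))
        # Relax the edges from u to every node still outside the tree.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_cost[v]:
                    best_cost[v] = d
                    parent[v] = u
    return edges

def find_prototypes(points, labels):
    """Endpoints of MST edges whose two samples carry different labels."""
    prototypes = set()
    for u, v in minimum_spanning_tree(points):
        if labels[u] != labels[v]:
            prototypes.update((u, v))
    return prototypes

# Hypothetical toy data: two well-separated classes on a line.
toy_points = [(0.0,), (1.0,), (5.0,), (6.0,)]
toy_labels = [0, 0, 1, 1]
toy_prototypes = find_prototypes(toy_points, toy_labels)
```

In the toy data, only the MST edge between samples 1 and 2 crosses the class boundary, so those two samples become the prototypes from which the forest would be grown.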
Abstract:
This paper presents a Computer Aided Diagnosis (CAD) system that automatically classifies microcalcifications detected on digital mammograms into one of the five types proposed by Michele Le Gal, a classification scheme that allows radiologists to determine whether a breast tumor is malignant without the need for surgery. The developed system combines wavelets and Artificial Neural Networks (ANN) and runs on an Altera DE2-115 Development Kit, which contains a Field-Programmable Gate Array (FPGA) that allows the system to be smaller, cheaper and more energy efficient. Results have shown that the system correctly classified 96.67% of the test samples, so it can be used as a second opinion by radiologists in the early diagnosis of breast cancer. (C) 2013 The Authors. Published by Elsevier B.V.
Abstract:
In Computer-Aided Diagnosis-based schemes for mammography analysis, each module is interconnected, which directly affects the operation of the system as a whole. Identifying mammograms with and without masses is needed to reduce the false positive rate in the automatic selection of regions of interest for further image segmentation. This study evaluates the performance of three techniques in classifying regions of interest as containing masses or not (without clinical findings). The main contribution of this work is to introduce the Optimum-Path Forest (OPF) classifier in this context, which had not been done before. We compared OPF against two kinds of neural networks, Radial Basis Function and Multilayer Perceptron (MLP), on a private dataset composed of 120 images. Texture features were used for this purpose, and the experiments demonstrated that MLP networks were slightly more accurate than OPF, but OPF is much faster, which makes it a suitable tool for real-time recognition systems.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Image categorization by means of bag of visual words has received increasing attention from the image processing and vision communities in recent years. In these approaches, each image is represented by invariant points of interest, which are mapped to a Hilbert Space representing a visual dictionary that aims to comprise the most discriminative features in a set of images. The main problem of such approaches is finding a compact and representative dictionary. Finding such a dictionary automatically, with no user intervention, is an even more difficult task. In this paper, we propose a method to automatically find such a dictionary by employing a recently developed graph-based clustering algorithm called Optimum-Path Forest, which makes no assumption about the visual dictionary's size and is more efficient and effective than the state-of-the-art techniques used for dictionary generation.
Abstract:
The Princeton WordNet (WN.Pr) lexical database has motivated efficient compilations of bulky relational lexicons since its inception in the 1980s. The EuroWordNet project, the first multilingual initiative built upon WN.Pr, opened up ways of building individual wordnets and interrelating them by means of the so-called Inter-Lingual-Index, an unstructured list of the WN.Pr synsets. Another important initiative, relying on a slightly different method of building multilingual wordnets, is the MultiWordNet project, where the key strategy is to build language-specific wordnets while keeping as many as possible of the semantic relations available in WN.Pr. This paper stresses that an additional advantage of using the WN.Pr lexical database as a resource for building wordnets for other languages is the possibility of implementing an automatic procedure to map the WN.Pr conceptual relations, such as hyponymy, co-hyponymy, troponymy, meronymy, cause, and entailment, onto the lexical database of the wordnet under construction. This is viable because those are language-independent relations that hold between lexicalized concepts, not between lexical units. Accordingly, combining methods from both initiatives, this paper presents the ongoing implementation of the WN.Br lexical database and the aforementioned automation procedure, illustrated with a sample of the automatic encoding of the hyponymy and co-hyponymy relations.
Abstract:
This paper reports research evaluating the potential and the effects of using annotated Paraconsistent logic in automatic indexing. This logic attempts to deal with contradictions and is concerned with studying and developing inconsistency-tolerant systems of logic. Because it is flexible and contains logical states that go beyond the dichotomy of yes and no, it supports the hypothesis that the results of indexing could be better than those obtained by traditional methods. Interactions between different disciplines, such as information retrieval, automatic indexing, information visualization, and nonclassical logics, were considered in this research. Methodologically, an algorithm for the treatment of uncertainty and imprecision, developed under Paraconsistent logic, was used to modify the weights assigned to the indexing terms of the text collections. The tests were performed on an information visualization system named Projection Explorer (PEx), created at the Institute of Mathematics and Computer Science (ICMC - USP Sao Carlos), with available source code. PEx uses the traditional vector space model to represent the documents of a collection. The results were evaluated by criteria built into the information visualization system itself and demonstrated measurable gains in the quality of the displays, confirming the hypothesis that the use of the para-analyser under the conditions of the experiment can generate more effective clusters of similar documents. This point deserves attention, since more significant clusters can be used to enhance information indexing and retrieval. It can be argued that the adoption of non-dichotomous (non-exclusive) parameters provides new possibilities for relating similar information.
Abstract:
In this letter, a speech recognition algorithm based on the least-squares method is presented. In particular, the intention is to exemplify how such a traditional numerical technique can be applied to solve a signal processing problem that is usually treated with more elaborate formulations.
Abstract:
In vitro production has been employed for bovine embryos, and the quantification of lipids is fundamental to understanding the metabolism of these embryos. This paper presents an unsupervised segmentation method for histological images of bovine embryos. In this method, an anisotropic filter was applied to the different RGB components. After this pre-processing step, a thresholding technique based on maximum entropy was applied to separate lipid droplets in the histological slides at different stages: early cleavage, morula and blastocyst. In the post-processing step, false positives are removed using the connected components technique, which identifies regions with excess dye near the pellucid zone. The proposed segmentation method was applied to 30 histological images of bovine embryos. Sensitivity, specificity and accuracy were calculated against reference images (gold standard). The accuracy of the proposed method was 96%, with a standard deviation of 3%.
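The maximum-entropy thresholding step can be sketched as follows. This is a minimal illustration assuming the usual Kapur-style criterion (choose the gray level that maximizes the sum of the entropies of the two classes it separates). The abstract does not specify which variant the authors used, and the function names and toy histogram are hypothetical.

```python
import math

def class_entropy(counts):
    """Shannon entropy of the distribution obtained by normalizing counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

def max_entropy_threshold(histogram):
    """Return the level t that maximizes entropy(levels <= t) + entropy(levels > t).
    On ties, the lowest such level is kept."""
    total = sum(histogram)
    best_t, best_score = None, float("-inf")
    below = 0
    for t in range(len(histogram) - 1):
        below += histogram[t]
        if below == 0 or below == total:
            continue  # one of the two classes would be empty
        score = class_entropy(histogram[:t + 1]) + class_entropy(histogram[t + 1:])
        if score > best_score:
            best_t, best_score = t, score
    return best_t

# Hypothetical bimodal histogram with 9 gray levels.
toy_histogram = [5, 5, 5, 0, 0, 0, 5, 5, 5]
toy_threshold = max_entropy_threshold(toy_histogram)
```

For the toy histogram, any level in the empty valley between the two modes gives the same score, and the sketch returns the first of them.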
Abstract:
In this paper we present a classification system that uses a combination of texture features from stromal regions: Haralick features and Local Binary Patterns (LBP) in the wavelet domain. The system classifies the tissues in five steps. First, the stromal regions were detected and extracted using segmentation techniques based on thresholding and the RGB colour space. Second, wavelet decomposition was applied to the extracted regions to obtain the wavelet coefficients. Third, the Haralick and LBP features were extracted from the coefficients. Fourth, relevant features were selected using the ANOVA statistical method. Fifth, classification was performed with Radial Basis Function (RBF) networks. The system was tested on 105 prostate images, divided into three groups of 35 images: normal, hyperplastic and cancerous. The system performance was evaluated using the area under the ROC curve, which was 0.98 for normal versus cancer, 0.95 for hyperplasia versus cancer and 0.96 for normal versus hyperplasia. Our results suggest that texture features can be used as discriminators for prostate images of stromal tissue. Furthermore, the system was effective at classifying prostate images, especially the hyperplastic class, which is the most difficult type in diagnosis and prognosis.
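To make the LBP feature concrete, here is a minimal sketch of the basic 8-neighbour operator on a 2D grid of values. The abstract applies LBP to wavelet coefficients, whereas this example works on any numeric grid. The neighbour ordering, the `>=` comparison and the function names are assumptions, since the paper's exact variant is not given here.

```python
def lbp_code(image, row, col):
    """8-bit LBP code of an interior pixel: bit k is set when the k-th
    neighbour (clockwise from the top-left) is >= the centre value."""
    center = image[row][col]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist

# Hypothetical 3x3 patch with a single interior pixel.
toy_patch = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
toy_code = lbp_code(toy_patch, 1, 1)
```

The histogram, rather than the individual codes, is what would typically be passed on as the texture feature vector.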
Automatic method to classify images based on multiscale fractal descriptors and paraconsistent logic
Abstract:
This study presents an automatic method to classify images using fractal descriptors, such as multiscale fractal dimension and lacunarity, as decision rules. The proposed methodology was divided into three steps: quantification of the regions of interest with fractal dimension and lacunarity, both under a multiscale approach; definition of reference patterns, which are the limits of each studied group; and classification of each group, combining the reference patterns with signal maximization (an approach commonly considered in paraconsistent logic). The proposed method was used to classify histological prostatic images, aiming at the diagnosis of prostate cancer. The accuracy levels were notable, surpassing those obtained with Support Vector Machine (SVM) and Best-First Decision Tree (BFTree) classifiers. The proposed approach allows patterns to be recognized and classified, offering the advantage of giving comprehensive results to the specialists.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Background: This paper addresses the prediction of the free energy of binding of a drug candidate with the enzyme InhA, associated with Mycobacterium tuberculosis. This problem arises in rational drug design, where interactions between drug candidates and target proteins are verified through molecular docking simulations. In this application, it is important not only to correctly predict the free energy of binding, but also to provide a comprehensible model that can be validated by a domain specialist. Decision-tree induction algorithms have been successfully used in drug-design related applications, especially because decision trees are simple to understand, interpret, and validate. There are several decision-tree induction algorithms available for general use, but each one has a bias that makes it more suitable for a particular data distribution. In this article, we propose and investigate the automatic design of decision-tree induction algorithms tailored to particular drug-enzyme binding data sets. We investigate the performance of our new method for evaluating binding conformations of different drug candidates to InhA, and we analyze our findings with respect to decision tree accuracy, comprehensibility, and biological relevance. Results: The empirical analysis indicates that our method is capable of automatically generating decision-tree induction algorithms that significantly outperform the traditional C4.5 algorithm with respect to both accuracy and comprehensibility. In addition, we provide the biological interpretation of the rules generated by our approach, reinforcing the importance of comprehensible predictive models in this particular bioinformatics application. Conclusions: We conclude that automatically designing a decision-tree algorithm tailored to molecular docking data is a promising alternative for predicting the free energy of binding of a drug candidate with a flexible receptor.
Abstract:
The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspect weighting was proposed in the literature. However, SCAD may fail and halt under certain conditions. To fix this problem, its steps are modified and then reordered to reduce the number of parameters that the user must set. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost function with respect to the argument being optimized. The asymptotic analysis of ASCAD leads to a time complexity that is the same as that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that considers aspect weights in order to estimate the number of clusters are also described. The proposed method is assessed on several artificial and real data sets.
Abstract:
In this manuscript, an automatic setup for screening microcystins in surface waters using photometric detection is described. Microcystins are toxins released by cyanobacteria into aquatic environments and are considered strongly poisonous for humans. For that reason, the World Health Organization (WHO) has proposed a provisional guideline value for drinking water of 1 µg L⁻¹. In this work, we developed an automated setup that allows water to be screened for microcystin concentrations below 0.1 µg L⁻¹. The photometric method was based on the enzyme-linked immunosorbent assay (ELISA), and the analytical signal was monitored at 458 nm using a homemade LED-based photometer. The proposed system was employed for the detection of microcystins in river and lake waters. Accuracy was assessed by processing samples using a reference method and applying the paired t-test to the results. No significant difference at the 95% confidence level was observed. Other useful features were also achieved, including a linear response ranging from 0.05 up to 2.00 µg L⁻¹ (R² = 0.999) and a detection limit of 0.03 µg L⁻¹ microcystins. (C) 2011 Elsevier B.V. All rights reserved.
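The paired t-test used to compare the proposed setup with the reference method can be sketched as below. The function name and the sample values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the paper.

```python
import math

def paired_t_statistic(method_a, method_b):
    """t statistic for paired samples: mean of the pairwise differences
    divided by its standard error (n - 1 degrees of freedom)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    variance = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(variance / n)

# Hypothetical concentrations (µg/L) for four samples measured by both methods.
proposed = [1.0, 2.0, 3.0, 4.0]
reference = [0.0, 2.0, 2.0, 4.0]
t_value = paired_t_statistic(proposed, reference)
```

With these made-up values the statistic is about 1.73 with 3 degrees of freedom, below the two-sided 95% critical value of roughly 3.18, so the two methods would not be judged significantly different.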