53 resultados para Máquina de vetor de suporte
em Universidade Federal do Rio Grande do Norte(UFRN)
Resumo:
The use of the maps obtained from remote sensing orbital images submitted to digital processing became fundamental to optimize conservation and monitoring actions of the coral reefs. However, the accuracy reached in the mapping of submerged areas is limited by variation of the water column that degrades the signal received by the orbital sensor and introduces errors in the final result of the classification. The limited capacity of the traditional methods based on conventional statistical techniques to solve the problems related to the inter-classes took the search of alternative strategies in the area of the Computational Intelligence. In this work an ensemble classifiers was built based on the combination of Support Vector Machines and Minimum Distance Classifier with the objective of classifying remotely sensed images of coral reefs ecosystem. The system is composed by three stages, through which the progressive refinement of the classification process happens. The patterns that received an ambiguous classification in a certain stage of the process were revalued in the subsequent stage. The prediction non ambiguous for all the data happened through the reduction or elimination of the false positive. The images were classified into five bottom-types: deep water; under-water corals; inter-tidal corals; algal and sandy bottom. The highest overall accuracy (89%) was obtained from SVM with polynomial kernel. The accuracy of the classified image was compared through the use of error matrix to the results obtained by the application of other classification methods based on a single classifier (neural network and the k-means algorithm). In the final, the comparison of results achieved demonstrated the potential of the ensemble classifiers as a tool of classification of images from submerged areas subject to the noise caused by atmospheric effects and the water column
Resumo:
The skin cancer is the most common of all cancers and the increase of its incidence must, in part, caused by the behavior of the people in relation to the exposition to the sun. In Brazil, the non-melanoma skin cancer is the most incident in the majority of the regions. The dermatoscopy and videodermatoscopy are the main types of examinations for the diagnosis of dermatological illnesses of the skin. The field that involves the use of computational tools to help or follow medical diagnosis in dermatological injuries is seen as very recent. Some methods had been proposed for automatic classification of pathology of the skin using images. The present work has the objective to present a new intelligent methodology for analysis and classification of skin cancer images, based on the techniques of digital processing of images for extraction of color characteristics, forms and texture, using Wavelet Packet Transform (WPT) and learning techniques called Support Vector Machine (SVM). The Wavelet Packet Transform is applied for extraction of texture characteristics in the images. The WPT consists of a set of base functions that represents the image in different bands of frequency, each one with distinct resolutions corresponding to each scale. Moreover, the characteristics of color of the injury are also computed that are dependants of a visual context, influenced for the existing colors in its surround, and the attributes of form through the Fourier describers. The Support Vector Machine is used for the classification task, which is based on the minimization principles of the structural risk, coming from the statistical learning theory. The SVM has the objective to construct optimum hyperplanes that represent the separation between classes. The generated hyperplane is determined by a subset of the classes, called support vectors. For the used database in this work, the results had revealed a good performance getting a global rightness of 92,73% for melanoma, and 86% for non-melanoma and benign injuries. The extracted describers and the SVM classifier became a method capable to recognize and to classify the analyzed skin injuries
Resumo:
The human voice is an important communication tool and any disorder of the voice can have profound implications for social and professional life of an individual. Techniques of digital signal processing have been used by acoustic analysis of vocal disorders caused by pathologies in the larynx, due to its simplicity and noninvasive nature. This work deals with the acoustic analysis of voice signals affected by pathologies in the larynx, specifically, edema, and nodules on the vocal folds. The purpose of this work is to develop a classification system of voices to help pre-diagnosis of pathologies in the larynx, as well as monitoring pharmacological treatments and after surgery. Linear Prediction Coefficients (LPC), Mel Frequency cepstral coefficients (MFCC) and the coefficients obtained through the Wavelet Packet Transform (WPT) are applied to extract relevant characteristics of the voice signal. For the classification task is used the Support Vector Machine (SVM), which aims to build optimal hyperplanes that maximize the margin of separation between the classes involved. The hyperplane generated is determined by the support vectors, which are subsets of points in these classes. According to the database used in this work, the results showed a good performance, with a hit rate of 98.46% for classification of normal and pathological voices in general, and 98.75% in the classification of diseases together: edema and nodules
Sistema inteligente para detecção de manchas de óleo na superfície marinha através de imagens de SAR
Resumo:
Oil spill on the sea, accidental or not, generates enormous negative consequences for the affected area. The damages are ambient and economic, mainly with the proximity of these spots of preservation areas and/or coastal zones. The development of automatic techniques for identification of oil spots on the sea surface, captured through Radar images, assist in a complete monitoring of the oceans and seas. However spots of different origins can be visualized in this type of imaging, which is a very difficult task. The system proposed in this work, based on techniques of digital image processing and artificial neural network, has the objective to identify the analyzed spot and to discern between oil and other generating phenomena of spot. Tests in functional blocks that compose the proposed system allow the implementation of different algorithms, as well as its detailed and prompt analysis. The algorithms of digital image processing (speckle filtering and gradient), as well as classifier algorithms (Multilayer Perceptron, Radial Basis Function, Support Vector Machine and Committe Machine) are presented and commented.The final performance of the system, with different kind of classifiers, is presented by ROC curve. The true positive rates are considered agreed with the literature about oil slick detection through SAR images presents
Resumo:
Lung cancer is one of the most common types of cancer and has the highest mortality rate. Patient survival is highly correlated with early detection. Computed Tomography technology services the early detection of lung cancer tremendously by offering aminimally invasive medical diagnostic tool. However, the large amount of data per examination makes the interpretation difficult. This leads to omission of nodules by human radiologist. This thesis presents a development of a computer-aided diagnosis system (CADe) tool for the detection of lung nodules in Computed Tomography study. The system, called LCD-OpenPACS (Lung Cancer Detection - OpenPACS) should be integrated into the OpenPACS system and have all the requirements for use in the workflow of health facilities belonging to the SUS (Brazilian health system). The LCD-OpenPACS made use of image processing techniques (Region Growing and Watershed), feature extraction (Histogram of Gradient Oriented), dimensionality reduction (Principal Component Analysis) and classifier (Support Vector Machine). System was tested on 220 cases, totaling 296 pulmonary nodules, with sensitivity of 94.4% and 7.04 false positives per case. The total time for processing was approximately 10 minutes per case. The system has detected pulmonary nodules (solitary, juxtavascular, ground-glass opacity and juxtapleural) between 3 mm and 30 mm.
Resumo:
Several are the areas in which digital images are used in solving day-to-day problems. In medicine the use of computer systems have improved the diagnosis and medical interpretations. In dentistry it’s not different, increasingly procedures assisted by computers have support dentists in their tasks. Set in this context, an area of dentistry known as public oral health is responsible for diagnosis and oral health treatment of a population. To this end, oral visual inspections are held in order to obtain oral health status information of a given population. From this collection of information, also known as epidemiological survey, the dentist can plan and evaluate taken actions for the different problems identified. This procedure has limiting factors, such as a limited number of qualified professionals to perform these tasks, different diagnoses interpretations among other factors. Given this context came the ideia of using intelligent systems techniques in supporting carrying out these tasks. Thus, it was proposed in this paper the development of an intelligent system able to segment, count and classify teeth from occlusal intraoral digital photographic images. The proposed system makes combined use of machine learning techniques and digital image processing. We first carried out a color-based segmentation on regions of interest, teeth and non teeth, in the images through the use of Support Vector Machine. After identifying these regions were used techniques based on morphological operators such as erosion and transformed watershed for counting and detecting the boundaries of the teeth, respectively. With the border detection of teeth was possible to calculate the Fourier descriptors for their shape and the position descriptors. Then the teeth were classified according to their types through the use of the SVM from the method one-against-all used in multiclass problem. The multiclass classification problem has been approached in two different ways. In the first approach we have considered three class types: molar, premolar and non teeth, while the second approach were considered five class types: molar, premolar, canine, incisor and non teeth. The system presented a satisfactory performance in the segmenting, counting and classification of teeth present in the images.
Resumo:
Several are the areas in which digital images are used in solving day-to-day problems. In medicine the use of computer systems have improved the diagnosis and medical interpretations. In dentistry it’s not different, increasingly procedures assisted by computers have support dentists in their tasks. Set in this context, an area of dentistry known as public oral health is responsible for diagnosis and oral health treatment of a population. To this end, oral visual inspections are held in order to obtain oral health status information of a given population. From this collection of information, also known as epidemiological survey, the dentist can plan and evaluate taken actions for the different problems identified. This procedure has limiting factors, such as a limited number of qualified professionals to perform these tasks, different diagnoses interpretations among other factors. Given this context came the ideia of using intelligent systems techniques in supporting carrying out these tasks. Thus, it was proposed in this paper the development of an intelligent system able to segment, count and classify teeth from occlusal intraoral digital photographic images. The proposed system makes combined use of machine learning techniques and digital image processing. We first carried out a color-based segmentation on regions of interest, teeth and non teeth, in the images through the use of Support Vector Machine. After identifying these regions were used techniques based on morphological operators such as erosion and transformed watershed for counting and detecting the boundaries of the teeth, respectively. With the border detection of teeth was possible to calculate the Fourier descriptors for their shape and the position descriptors. Then the teeth were classified according to their types through the use of the SVM from the method one-against-all used in multiclass problem. The multiclass classification problem has been approached in two different ways. In the first approach we have considered three class types: molar, premolar and non teeth, while the second approach were considered five class types: molar, premolar, canine, incisor and non teeth. The system presented a satisfactory performance in the segmenting, counting and classification of teeth present in the images.
Resumo:
Reinforcement learning is a machine learning technique that, although finding a large number of applications, maybe is yet to reach its full potential. One of the inadequately tested possibilities is the use of reinforcement learning in combination with other methods for the solution of pattern classification problems. It is well documented in the literature the problems that support vector machine ensembles face in terms of generalization capacity. Algorithms such as Adaboost do not deal appropriately with the imbalances that arise in those situations. Several alternatives have been proposed, with varying degrees of success. This dissertation presents a new approach to building committees of support vector machines. The presented algorithm combines Adaboost algorithm with a layer of reinforcement learning to adjust committee parameters in order to avoid that imbalances on the committee components affect the generalization performance of the final hypothesis. Comparisons were made with ensembles using and not using the reinforcement learning layer, testing benchmark data sets widely known in area of pattern classification
Resumo:
The pattern classification is one of the machine learning subareas that has the most outstanding. Among the various approaches to solve pattern classification problems, the Support Vector Machines (SVM) receive great emphasis, due to its ease of use and good generalization performance. The Least Squares formulation of SVM (LS-SVM) finds the solution by solving a set of linear equations instead of quadratic programming implemented in SVM. The LS-SVMs provide some free parameters that have to be correctly chosen to achieve satisfactory results in a given task. Despite the LS-SVMs having high performance, lots of tools have been developed to improve them, mainly the development of new classifying methods and the employment of ensembles, in other words, a combination of several classifiers. In this work, our proposal is to use an ensemble and a Genetic Algorithm (GA), search algorithm based on the evolution of species, to enhance the LSSVM classification. In the construction of this ensemble, we use a random selection of attributes of the original problem, which it splits the original problem into smaller ones where each classifier will act. So, we apply a genetic algorithm to find effective values of the LS-SVM parameters and also to find a weight vector, measuring the importance of each machine in the final classification. Finally, the final classification is obtained by a linear combination of the decision values of the LS-SVMs with the weight vector. We used several classification problems, taken as benchmarks to evaluate the performance of the algorithm and compared the results with other classifiers
Resumo:
The automatic speech recognition by machine has been the target of researchers in the past five decades. In this period have been numerous advances, such as in the field of recognition of isolated words (commands), which has very high rates of recognition, currently. However, we are still far from developing a system that could have a performance similar to the human being (automatic continuous speech recognition). One of the great challenges of searches for continuous speech recognition is the large amount of pattern. The modern languages such as English, French, Spanish and Portuguese have approximately 500,000 words or patterns to be identified. The purpose of this study is to use smaller units than the word such as phonemes, syllables and difones units as the basis for the speech recognition, aiming to recognize any words without necessarily using them. The main goal is to reduce the restriction imposed by the excessive amount of patterns. In order to validate this proposal, the system was tested in the isolated word recognition in dependent-case. The phonemes characteristics of the Brazil s Portuguese language were used to developed the hierarchy decision system. These decisions are made through the use of neural networks SVM (Support Vector Machines). The main speech features used were obtained from the Wavelet Packet Transform. The descriptors MFCC (Mel-Frequency Cepstral Coefficient) are also used in this work. It was concluded that the method proposed in this work, showed good results in the steps of recognition of vowels, consonants (syllables) and words when compared with other existing methods in literature
Resumo:
The Support Vector Machines (SVM) has attracted increasing attention in machine learning area, particularly on classification and patterns recognition. However, in some cases it is not easy to determinate accurately the class which given pattern belongs. This thesis involves the construction of a intervalar pattern classifier using SVM in association with intervalar theory, in order to model the separation of a pattern set between distinct classes with precision, aiming to obtain an optimized separation capable to treat imprecisions contained in the initial data and generated during the computational processing. The SVM is a linear machine. In order to allow it to solve real-world problems (usually nonlinear problems), it is necessary to treat the pattern set, know as input set, transforming from nonlinear nature to linear problem. The kernel machines are responsible to do this mapping. To create the intervalar extension of SVM, both for linear and nonlinear problems, it was necessary define intervalar kernel and the Mercer s theorem (which caracterize a kernel function) to intervalar function
Resumo:
The classifier support vector machine is used in several problems in various areas of knowledge. Basically the method used in this classier is to end the hyperplane that maximizes the distance between the groups, to increase the generalization of the classifier. In this work, we treated some problems of binary classification of data obtained by electroencephalography (EEG) and electromyography (EMG) using Support Vector Machine with some complementary techniques, such as: Principal Component Analysis to identify the active regions of the brain, the periodogram method which is obtained by Fourier analysis to help discriminate between groups and Simple Moving Average to eliminate some of the existing noise in the data. It was developed two functions in the software R, for the realization of training tasks and classification. Also, it was proposed two weights systems and a summarized measure to help on deciding in classification of groups. The application of these techniques, weights and the summarized measure in the classier, showed quite satisfactory results, where the best results were an average rate of 95.31% to visual stimuli data, 100% of correct classification for epilepsy data and rates of 91.22% and 96.89% to object motion data for two subjects.
Resumo:
One of the most important goals of bioinformatics is the ability to identify genes in uncharacterized DNA sequences on world wide database. Gene expression on prokaryotes initiates when the RNA-polymerase enzyme interacts with DNA regions called promoters. In these regions are located the main regulatory elements of the transcription process. Despite the improvement of in vitro techniques for molecular biology analysis, characterizing and identifying a great number of promoters on a genome is a complex task. Nevertheless, the main drawback is the absence of a large set of promoters to identify conserved patterns among the species. Hence, a in silico method to predict them on any species is a challenge. Improved promoter prediction methods can be one step towards developing more reliable ab initio gene prediction methods. In this work, we present an empirical comparison of Machine Learning (ML) techniques such as Na¨ýve Bayes, Decision Trees, Support Vector Machines and Neural Networks, Voted Perceptron, PART, k-NN and and ensemble approaches (Bagging and Boosting) to the task of predicting Bacillus subtilis. In order to do so, we first built two data set of promoter and nonpromoter sequences for B. subtilis and a hybrid one. In order to evaluate of ML methods a cross-validation procedure is applied. Good results were obtained with methods of ML like SVM and Naïve Bayes using B. subtilis. However, we have not reached good results on hybrid database
Resumo:
Nowadays, classifying proteins in structural classes, which concerns the inference of patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason for this is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to be obtained experimentally in laboratory. Thus, this problem has drawn the attention of many researchers in Bioinformatics. Considering the great difference between the number of protein sequences already known and the number of three-dimensional structures determined experimentally, the demand of automated techniques for structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential to deal with this problem. In this work, ML techniques are used in the recognition of protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods have been chosen because they represent different paradigms of learning and have been widely used in the Bioinfornmatics literature. Aiming to obtain an improvment in the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multiclassification systems are used. Moreover, since the protein database used in this work presents the problem of imbalanced classes, artificial techniques for class balance (Undersampling Random, Tomek Links, CNN, NCL and OSS) are used to minimize such a problem. In order to evaluate the ML methods, a cross-validation procedure is applied, where the accuracy of the classifiers is measured using the mean of classification error rate, on independent test sets. These means are compared, two by two, by the hypothesis test aiming to evaluate if there is, statistically, a significant difference between them. With respect to the results obtained with the individual classifiers, Support Vector Machine presented the best accuracy. In terms of the multi-classification systems (homogeneous and heterogeneous), they showed, in general, a superior or similar performance when compared to the one achieved by the individual classifiers used - especially Boosting with Decision Tree and the StackingC with Linear Regression as meta classifier. The Voting method, despite of its simplicity, has shown to be adequate for solving the problem presented in this work. The techniques for class balance, on the other hand, have not produced a significant improvement in the global classification error. Nevertheless, the use of such techniques did improve the classification error for the minority class. In this context, the NCL technique has shown to be more appropriated
Resumo:
This work presents JFLoat, a software implementation of IEEE-754 standard for binary floating point arithmetic. JFloat was built to provide some features not implemented in Java, specifically directed rounding support. That feature is important for Java-XSC, a project developed in this Department. Also, Java programs should have same portability when using floating point operations, mainly because IEEE-754 specifies that programs should have exactly same behavior on every configuration. However, it was noted that programs using Java native floating point types may be machine and operating system dependent. Also, JFloat is a possible solution to that problem