999 resultados para Fusão de classificadores
Resumo:
A presente dissertação apresenta a análise dos classificadores nominais específicos chineses, embasada na Lingüística Cognitiva, tendo como arcabouço teórico a Semântica Cognitiva Experiencialista e a Teoria Prototípica, visando a revelar as motivações semânticas subjacentes e as propriedades de categorização dos classificadores nominais chineses, quando colocados junto a substantivos. Foram analisados todos os classificadores nominais, a partir dos modelos da Semântica Cognitiva Experiencialista, baseados em Lakoff (1987). A amostragem envolveu dados retirados de livros, revistas e internet e da própria experiência vivencial de pesquisadora. Estão descritas as análises de dez classificadores, selecionados pela relevância cultural e potencial de explicitação dos aspectos discutidos. O estudo revela que a combinação de classificadores com substantivos não é arbitrária, como alguns lingüistas chineses acreditam, mas, sim, um reflexo da interação humana com o mundo objetivo, baseada na cognição.
Resumo:
Uma significativa quantidade de proteínas vegetais apresenta-se compartimentalizada nas diversas estruturas celulares. A sua localização pode conduzir à elucidação do funcionamento dos processos biossintéticos e catabólicos e auxiliar na identificação de genes importantes. A fim de localizar produtos gênicos relacionados à resistência, foi utilizada a fusão de cDNAs de arroz (Oryza sativa L.) ao gene da proteína verde fluorescente (GFP). Os cDNAs foram obtidos a partir de uma biblioteca supressiva subtrativa de genes de arroz durante uma interação incompatível com o fungo Magnaporthe grisea. Estes cDNAs foram fusionados a uma versão intensificada de gfp e usados para transformar 500 plantas de Arabidopsis thaliana. Outras 50 plantas foram transformadas com o mesmo vetor, porém sem a fusão (vetor vazio). Foram obtidas aproximadamente 25.500 sementes oriundas das plantas transformadas com as fusões EGFP::cDNAs e 35.000 sementes das transformadas com o vetor vazio, produzindo, respectivamente, 750 e 800 plantas tolerantes ao herbicida glufosinato de amônio. Após a seleção, segmentos foliares das plantas foram analisados por microscopia de fluorescência, visando o estabelecimento do padrão de localização de EGFP. Foram observadas 18 plantas transformadas com a fusão EGFP::cDNAs e 16 plantas transformadas com o vetor vazio apresentando expressão detectável de GFP. Uma planta transformada com uma fusão EGFP::cDNA apresentou localização diferenciada da fluorescência, notadamente nas células guarda dos estômatos e nos tricomas. Após seqüenciamento do cDNA fusionado, foi verificado que esta planta apresentava uma inserção similar a uma seqüência codificante de uma quinase, uma classe de enzimas envolvidas na transdução de sinais em resposta à infecção por patógenos.
Resumo:
Este trabalho minera as informações coletadas no processo de vestibular entre 2009 e 2012 para o curso de graduação de administração de empresas da FGV-EAESP, para estimar classificadores capazes de calcular a probabilidade de um novo aluno ter bom desempenho. O processo de KDD (Knowledge Discovery in Database) desenvolvido por Fayyad et al. (1996a) é a base da metodologia adotada e os classificadores serão estimados utilizando duas ferramentas matemáticas. A primeira é a regressão logística, muito usada por instituições financeiras para avaliar se um cliente será capaz de honrar com seus pagamentos e a segunda é a rede Bayesiana, proveniente do campo de inteligência artificial. Este estudo mostre que os dois modelos possuem o mesmo poder discriminatório, gerando resultados semelhantes. Além disso, as informações que influenciam a probabilidade de o aluno ter bom desempenho são a sua idade no ano de ingresso, a quantidade de vezes que ele prestou vestibular da FGV/EAESP antes de ser aprovado, a região do Brasil de onde é proveniente e as notas das provas de matemática fase 01 e fase 02, inglês, ciências humanas e redação. Aparentemente o grau de formação dos pais e o grau de decisão do aluno em estudar na FGV/EAESP não influenciam nessa probabilidade.
Resumo:
The use of the maps obtained from remote sensing orbital images submitted to digital processing became fundamental to optimize conservation and monitoring actions of the coral reefs. However, the accuracy reached in the mapping of submerged areas is limited by variation of the water column that degrades the signal received by the orbital sensor and introduces errors in the final result of the classification. The limited capacity of the traditional methods based on conventional statistical techniques to solve the problems related to the inter-classes took the search of alternative strategies in the area of the Computational Intelligence. In this work an ensemble classifiers was built based on the combination of Support Vector Machines and Minimum Distance Classifier with the objective of classifying remotely sensed images of coral reefs ecosystem. The system is composed by three stages, through which the progressive refinement of the classification process happens. The patterns that received an ambiguous classification in a certain stage of the process were revalued in the subsequent stage. The prediction non ambiguous for all the data happened through the reduction or elimination of the false positive. The images were classified into five bottom-types: deep water; under-water corals; inter-tidal corals; algal and sandy bottom. The highest overall accuracy (89%) was obtained from SVM with polynomial kernel. The accuracy of the classified image was compared through the use of error matrix to the results obtained by the application of other classification methods based on a single classifier (neural network and the k-means algorithm). In the final, the comparison of results achieved demonstrated the potential of the ensemble classifiers as a tool of classification of images from submerged areas subject to the noise caused by atmospheric effects and the water column
Resumo:
Equipment maintenance is the major cost factor in industrial plants, it is very important the development of fault predict techniques. Three-phase induction motors are key electrical equipments used in industrial applications mainly because presents low cost and large robustness, however, it isn t protected from other fault types such as shorted winding and broken bars. Several acquisition ways, processing and signal analysis are applied to improve its diagnosis. More efficient techniques use current sensors and its signature analysis. In this dissertation, starting of these sensors, it is to make signal analysis through Park s vector that provides a good visualization capability. Faults data acquisition is an arduous task; in this way, it is developed a methodology for data base construction. Park s transformer is applied into stationary reference for machine modeling of the machine s differential equations solution. Faults detection needs a detailed analysis of variables and its influences that becomes the diagnosis more complex. The tasks of pattern recognition allow that systems are automatically generated, based in patterns and data concepts, in the majority cases undetectable for specialists, helping decision tasks. Classifiers algorithms with diverse learning paradigms: k-Neighborhood, Neural Networks, Decision Trees and Naïves Bayes are used to patterns recognition of machines faults. Multi-classifier systems are used to improve classification errors. It inspected the algorithms homogeneous: Bagging and Boosting and heterogeneous: Vote, Stacking and Stacking C. Results present the effectiveness of constructed model to faults modeling, such as the possibility of using multi-classifiers algorithm on faults classification
Resumo:
The development and refinement of techniques that make simultaneous localization and mapping (SLAM) for an autonomous mobile robot and the building of local 3-D maps from a sequence of images, is widely studied in scientific circles. This work presents a monocular visual SLAM technique based on extended Kalman filter, which uses features found in a sequence of images using the SURF descriptor (Speeded Up Robust Features) and determines which features can be used as marks by a technique based on delayed initialization from 3-D straight lines. For this, only the coordinates of the features found in the image and the intrinsic and extrinsic camera parameters are avaliable. Its possible to determine the position of the marks only on the availability of information of depth. Tests have shown that during the route, the mobile robot detects the presence of characteristics in the images and through a proposed technique for delayed initialization of marks, adds new marks to the state vector of the extended Kalman filter (EKF), after estimating the depth of features. With the estimated position of the marks, it was possible to estimate the updated position of the robot at each step, obtaining good results that demonstrate the effectiveness of monocular visual SLAM system proposed in this paper
Resumo:
Traditional applications of feature selection in areas such as data mining, machine learning and pattern recognition aim to improve the accuracy and to reduce the computational cost of the model. It is done through the removal of redundant, irrelevant or noisy data, finding a representative subset of data that reduces its dimensionality without loss of performance. With the development of research in ensemble of classifiers and the verification that this type of model has better performance than the individual models, if the base classifiers are diverse, comes a new field of application to the research of feature selection. In this new field, it is desired to find diverse subsets of features for the construction of base classifiers for the ensemble systems. This work proposes an approach that maximizes the diversity of the ensembles by selecting subsets of features using a model independent of the learning algorithm and with low computational cost. This is done using bio-inspired metaheuristics with evaluation filter-based criteria
Resumo:
Although some individual techniques of supervised Machine Learning (ML), also known as classifiers, or algorithms of classification, to supply solutions that, most of the time, are considered efficient, have experimental results gotten with the use of large sets of pattern and/or that they have a expressive amount of irrelevant data or incomplete characteristic, that show a decrease in the efficiency of the precision of these techniques. In other words, such techniques can t do an recognition of patterns of an efficient form in complex problems. With the intention to get better performance and efficiency of these ML techniques, were thought about the idea to using some types of LM algorithms work jointly, thus origin to the term Multi-Classifier System (MCS). The MCS s presents, as component, different of LM algorithms, called of base classifiers, and realized a combination of results gotten for these algorithms to reach the final result. So that the MCS has a better performance that the base classifiers, the results gotten for each base classifier must present an certain diversity, in other words, a difference between the results gotten for each classifier that compose the system. It can be said that it does not make signification to have MCS s whose base classifiers have identical answers to the sames patterns. Although the MCS s present better results that the individually systems, has always the search to improve the results gotten for this type of system. Aim at this improvement and a better consistency in the results, as well as a larger diversity of the classifiers of a MCS, comes being recently searched methodologies that present as characteristic the use of weights, or confidence values. These weights can describe the importance that certain classifier supplied when associating with each pattern to a determined class. These weights still are used, in associate with the exits of the classifiers, during the process of recognition (use) of the MCS s. Exist different ways of calculating these weights and can be divided in two categories: the static weights and the dynamic weights. The first category of weights is characterizes for not having the modification of its values during the classification process, different it occurs with the second category, where the values suffers modifications during the classification process. In this work an analysis will be made to verify if the use of the weights, statics as much as dynamics, they can increase the perfomance of the MCS s in comparison with the individually systems. Moreover, will be made an analysis in the diversity gotten for the MCS s, for this mode verify if it has some relation between the use of the weights in the MCS s with different levels of diversity
Resumo:
In systems that combine the outputs of classification methods (combination systems), such as ensembles and multi-agent systems, one of the main constraints is that the base components (classifiers or agents) should be diverse among themselves. In other words, there is clearly no accuracy gain in a system that is composed of a set of identical base components. One way of increasing diversity is through the use of feature selection or data distribution methods in combination systems. In this work, an investigation of the impact of using data distribution methods among the components of combination systems will be performed. In this investigation, different methods of data distribution will be used and an analysis of the combination systems, using several different configurations, will be performed. As a result of this analysis, it is aimed to detect which combination systems are more suitable to use feature distribution among the components
Resumo:
RePART (Reward/Punishment ART) is a neural model that constitutes a variation of the Fuzzy Artmap model. This network was proposed in order to minimize the inherent problems in the Artmap-based model, such as the proliferation of categories and misclassification. RePART makes use of additional mechanisms, such as an instance counting parameter, a reward/punishment process and a variable vigilance parameter. The instance counting parameter, for instance, aims to minimize the misclassification problem, which is a consequence of the sensitivity to the noises, frequently presents in Artmap-based models. On the other hand, the use of the variable vigilance parameter tries to smoouth out the category proliferation problem, which is inherent of Artmap-based models, decreasing the complexity of the net. RePART was originally proposed in order to minimize the aforementioned problems and it was shown to have better performance (higer accuracy and lower complexity) than Artmap-based models. This work proposes an investigation of the performance of the RePART model in classifier ensembles. Different sizes, learning strategies and structures will be used in this investigation. As a result of this investigation, it is aimed to define the main advantages and drawbacks of this model, when used as a component in classifier ensembles. This can provide a broader foundation for the use of RePART in other pattern recognition applications
Resumo:
The objective of the researches in artificial intelligence is to qualify the computer to execute functions that are performed by humans using knowledge and reasoning. This work was developed in the area of machine learning, that it s the study branch of artificial intelligence, being related to the project and development of algorithms and techniques capable to allow the computational learning. The objective of this work is analyzing a feature selection method for ensemble systems. The proposed method is inserted into the filter approach of feature selection method, it s using the variance and Spearman correlation to rank the feature and using the reward and punishment strategies to measure the feature importance for the identification of the classes. For each ensemble, several different configuration were used, which varied from hybrid (homogeneous) to non-hybrid (heterogeneous) structures of ensemble. They were submitted to five combining methods (voting, sum, sum weight, multiLayer Perceptron and naïve Bayes) which were applied in six distinct database (real and artificial). The classifiers applied during the experiments were k- nearest neighbor, multiLayer Perceptron, naïve Bayes and decision tree. Finally, the performance of ensemble was analyzed comparatively, using none feature selection method, using a filter approach (original) feature selection method and the proposed method. To do this comparison, a statistical test was applied, which demonstrate that there was a significant improvement in the precision of the ensembles
Resumo:
Classifier ensembles are systems composed of a set of individual classifiers and a combination module, which is responsible for providing the final output of the system. In the design of these systems, diversity is considered as one of the main aspects to be taken into account since there is no gain in combining identical classification methods. The ideal situation is a set of individual classifiers with uncorrelated errors. In other words, the individual classifiers should be diverse among themselves. One way of increasing diversity is to provide different datasets (patterns and/or attributes) for the individual classifiers. The diversity is increased because the individual classifiers will perform the same task (classification of the same input patterns) but they will be built using different subsets of patterns and/or attributes. The majority of the papers using feature selection for ensembles address the homogenous structures of ensemble, i.e., ensembles composed only of the same type of classifiers. In this investigation, two approaches of genetic algorithms (single and multi-objective) will be used to guide the distribution of the features among the classifiers in the context of homogenous and heterogeneous ensembles. The experiments will be divided into two phases that use a filter approach of feature selection guided by genetic algorithm
Resumo:
Committees of classifiers may be used to improve the accuracy of classification systems, in other words, different classifiers used to solve the same problem can be combined for creating a system of greater accuracy, called committees of classifiers. To that this to succeed is necessary that the classifiers make mistakes on different objects of the problem so that the errors of a classifier are ignored by the others correct classifiers when applying the method of combination of the committee. The characteristic of classifiers of err on different objects is called diversity. However, most measures of diversity could not describe this importance. Recently, were proposed two measures of the diversity (good and bad diversity) with the aim of helping to generate more accurate committees. This paper performs an experimental analysis of these measures applied directly on the building of the committees of classifiers. The method of construction adopted is modeled as a search problem by the set of characteristics of the databases of the problem and the best set of committee members in order to find the committee of classifiers to produce the most accurate classification. This problem is solved by metaheuristic optimization techniques, in their mono and multi-objective versions. Analyzes are performed to verify if use or add the measures of good diversity and bad diversity in the optimization objectives creates more accurate committees. Thus, the contribution of this study is to determine whether the measures of good diversity and bad diversity can be used in mono-objective and multi-objective optimization techniques as optimization objectives for building committees of classifiers more accurate than those built by the same process, but using only the accuracy classification as objective of optimization