4 resultados para Computing Classification Systems
em Universidade Federal do Rio Grande do Norte(UFRN)
Resumo:
Committees of classifiers may be used to improve the accuracy of classification systems, in other words, different classifiers used to solve the same problem can be combined for creating a system of greater accuracy, called committees of classifiers. To that this to succeed is necessary that the classifiers make mistakes on different objects of the problem so that the errors of a classifier are ignored by the others correct classifiers when applying the method of combination of the committee. The characteristic of classifiers of err on different objects is called diversity. However, most measures of diversity could not describe this importance. Recently, were proposed two measures of the diversity (good and bad diversity) with the aim of helping to generate more accurate committees. This paper performs an experimental analysis of these measures applied directly on the building of the committees of classifiers. The method of construction adopted is modeled as a search problem by the set of characteristics of the databases of the problem and the best set of committee members in order to find the committee of classifiers to produce the most accurate classification. This problem is solved by metaheuristic optimization techniques, in their mono and multi-objective versions. Analyzes are performed to verify if use or add the measures of good diversity and bad diversity in the optimization objectives creates more accurate committees. Thus, the contribution of this study is to determine whether the measures of good diversity and bad diversity can be used in mono-objective and multi-objective optimization techniques as optimization objectives for building committees of classifiers more accurate than those built by the same process, but using only the accuracy classification as objective of optimization
Resumo:
In this paper artificial neural network (ANN) based on supervised and unsupervised algorithms were investigated for use in the study of rheological parameters of solid pharmaceutical excipients, in order to develop computational tools for manufacturing solid dosage forms. Among four supervised neural networks investigated, the best learning performance was achieved by a feedfoward multilayer perceptron whose architectures was composed by eight neurons in the input layer, sixteen neurons in the hidden layer and one neuron in the output layer. Learning and predictive performance relative to repose angle was poor while to Carr index and Hausner ratio (CI and HR, respectively) showed very good fitting capacity and learning, therefore HR and CI were considered suitable descriptors for the next stage of development of supervised ANNs. Clustering capacity was evaluated for five unsupervised strategies. Network based on purely unsupervised competitive strategies, classic "Winner-Take-All", "Frequency-Sensitive Competitive Learning" and "Rival-Penalize Competitive Learning" (WTA, FSCL and RPCL, respectively) were able to perform clustering from database, however this classification was very poor, showing severe classification errors by grouping data with conflicting properties into the same cluster or even the same neuron. On the other hand it could not be established what was the criteria adopted by the neural network for those clustering. Self-Organizing Maps (SOM) and Neural Gas (NG) networks showed better clustering capacity. Both have recognized the two major groupings of data corresponding to lactose (LAC) and cellulose (CEL). However, SOM showed some errors in classify data from minority excipients, magnesium stearate (EMG) , talc (TLC) and attapulgite (ATP). NG network in turn performed a very consistent classification of data and solve the misclassification of SOM, being the most appropriate network for classifying data of the study. The use of NG network in pharmaceutical technology was still unpublished. NG therefore has great potential for use in the development of software for use in automated classification systems of pharmaceutical powders and as a new tool for mining and clustering data in drug development
Resumo:
Nowadays, classifying proteins in structural classes, which concerns the inference of patterns in their 3D conformation, is one of the most important open problems in Molecular Biology. The main reason for this is that the function of a protein is intrinsically related to its spatial conformation. However, such conformations are very difficult to be obtained experimentally in laboratory. Thus, this problem has drawn the attention of many researchers in Bioinformatics. Considering the great difference between the number of protein sequences already known and the number of three-dimensional structures determined experimentally, the demand of automated techniques for structural classification of proteins is very high. In this context, computational tools, especially Machine Learning (ML) techniques, have become essential to deal with this problem. In this work, ML techniques are used in the recognition of protein structural classes: Decision Trees, k-Nearest Neighbor, Naive Bayes, Support Vector Machine and Neural Networks. These methods have been chosen because they represent different paradigms of learning and have been widely used in the Bioinfornmatics literature. Aiming to obtain an improvment in the performance of these techniques (individual classifiers), homogeneous (Bagging and Boosting) and heterogeneous (Voting, Stacking and StackingC) multiclassification systems are used. Moreover, since the protein database used in this work presents the problem of imbalanced classes, artificial techniques for class balance (Undersampling Random, Tomek Links, CNN, NCL and OSS) are used to minimize such a problem. In order to evaluate the ML methods, a cross-validation procedure is applied, where the accuracy of the classifiers is measured using the mean of classification error rate, on independent test sets. These means are compared, two by two, by the hypothesis test aiming to evaluate if there is, statistically, a significant difference between them. With respect to the results obtained with the individual classifiers, Support Vector Machine presented the best accuracy. In terms of the multi-classification systems (homogeneous and heterogeneous), they showed, in general, a superior or similar performance when compared to the one achieved by the individual classifiers used - especially Boosting with Decision Tree and the StackingC with Linear Regression as meta classifier. The Voting method, despite of its simplicity, has shown to be adequate for solving the problem presented in this work. The techniques for class balance, on the other hand, have not produced a significant improvement in the global classification error. Nevertheless, the use of such techniques did improve the classification error for the minority class. In this context, the NCL technique has shown to be more appropriated
Resumo:
Dry eye syndrome is a multifactorial disease of the tear film, resulting from the instability of the lacrimal functional unit that produces volume change, up or tear distribution. In patients in intensive care the cause is enhanced due to various risk factors, such as mechanical ventilation, sedation, lagophthalmos, low temperatures, among others. The study's purpose is to build an assessment tool of Dry Eye Severity in patients in intensive care units based on the systematization of nursing care and their classification systems. The aim of this study is to build an assessment tool of Dry Eye Severity in hospitalized patients in Care Unit Intensiva.Trata is a methodological study conducted in three stages, namely: context analysis, concept analysis, construction of operational definitions and magnitudes of nursing outcome. For the first step we used the methodological framework for Hinds, Chaves and Cypress (1992). For the second step we used the model of Walker and Avant and an integrative review Whitemore seconds, Knalf (2005). This step enabled the identification of the concept of attributes, background and consequent ground and the construction of the settings for the result of nursing severity of dry eye. For the construction of settings and operational magnitudes, it was used Psicometria proposed by Pasquali (1999). As a result of context analysis, visualized from the reflection that the matter should be discussed and that nursing needs to pay attention to the problem of eye injury, so minimizing strategies are created this event with a high prevalence. With the integrative review were located from the crosses 19 853 titles, selected 215, and from the abstracts 96 articles were read in full. From reading 10 were excluded culminating in the sample of 86 articles that were used to analyze the concept and construction of settings. Selected articles were found in greater numbers in the Scopus database (55.82%), performed in the United States (39.53%), and published mainly in the last five years (48.82). Regarding the concept of analysis were identified as antecedents: age, lagophthalmos, environmental factors, medication use, systemic diseases, mechanical ventilation and ophthalmic surgery. As attributes: TBUT <10s, Schimer I test <5 mm in Schimer II test <10mm, reduced osmolarity. As consequential: the ocular surface damage, ocular discomfort, visual instability. The settings were built and added indicators such as: decreased blink mechanism and eyestrain.